How to Filter Data in Python
Learn different ways to filter lists and DataFrames in Python using comprehensions, filter(), and pandas.
Filtering data is a common operation in Python. Here are the main approaches.
Method 1: List Comprehension
A list comprehension is usually the most idiomatic way to filter a list. The condition after `if` decides which items are kept:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Filter even numbers
evens = [x for x in numbers if x % 2 == 0]
print(evens) # [2, 4, 6, 8, 10]
# Filter numbers greater than 5
greater = [x for x in numbers if x > 5]
print(greater) # [6, 7, 8, 9, 10]
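The same pattern extends to combined conditions and to filtering and transforming in one pass. A minimal sketch, reusing the `numbers` list from above (the variable names `even_and_large` and `odd_squares` are illustrative only):

```python
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Combine two conditions with `and`
even_and_large = [x for x in numbers if x % 2 == 0 and x > 5]
print(even_and_large)  # [6, 8, 10]

# Filter and transform together: square only the odd numbers
odd_squares = [x * x for x in numbers if x % 2 == 1]
print(odd_squares)  # [1, 9, 25, 49, 81]
```

The expression before `for` controls what goes into the new list, while the `if` clause controls which items are considered at all.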
Method 2: filter() Function
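Pass filter() a predicate (a named function or a lambda) and an iterable. In Python 3 it returns a lazy iterator, so wrap the result in list() when you need a list: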
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Using lambda
evens = list(filter(lambda x: x % 2 == 0, numbers))
print(evens) # [2, 4, 6, 8, 10]
# Using a defined function
def is_positive(x):
    return x > 0
positives = list(filter(is_positive, [-1, 0, 1, 2, -3]))
print(positives) # [1, 2]
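Two behaviors of filter() are easy to miss: passing `None` as the function keeps only truthy items, and the returned iterator is consumed as you read from it. A short sketch, with made-up sample values:

```python
# Passing None as the function drops every falsy item (0, "", None, [])
values = [0, 1, "", "a", None, [], [2]]
truthy = list(filter(None, values))
print(truthy)  # [1, 'a', [2]]

# filter() is lazy: items are produced only as they are requested
lazy = filter(lambda x: x > 0, [-1, 0, 1, 2, -3])
first = next(lazy)   # pulls the first matching item
rest = list(lazy)    # collects whatever has not been consumed yet
print(first)  # 1
print(rest)   # [2]
```

Because the iterator is exhausted after one pass, convert it to a list if the filtered values are needed more than once.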
Method 3: Filtering DataFrames with pandas
Filter rows in a DataFrame with boolean indexing or the query() method:
import pandas as pd
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'age': [25, 30, 35, 28],
    'city': ['Paris', 'London', 'Paris', 'Berlin']
})
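# Filter by a single condition: rows where age is at least 30
thirty_plus = df[df['age'] >= 30]
print(thirty_plus)  # rows for Bob and Charlie
# Filter by multiple conditions: wrap each condition in parentheses
# and combine them with & (and) or | (or), not the keywords and/or
paris_young = df[(df['city'] == 'Paris') & (df['age'] < 30)]
print(paris_young)  # row for Alice
# Using the query() method: the condition is written as a string
result = df.query('age > 25 and city == "Paris"')
print(result)  # row for Charlie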
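Boolean indexing also supports `|` (or) and `~` (not), and `.loc` can filter rows and select a column in one step. A minimal sketch on the same sample data (the variable names are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'age': [25, 30, 35, 28],
    'city': ['Paris', 'London', 'Paris', 'Berlin']
})

# OR: rows where the city is Paris or the age is at least 30
paris_or_30 = df[(df['city'] == 'Paris') | (df['age'] >= 30)]
print(paris_or_30['name'].tolist())  # ['Alice', 'Bob', 'Charlie']

# NOT: invert a condition with ~
not_paris = df[~(df['city'] == 'Paris')]
print(not_paris['name'].tolist())  # ['Bob', 'Diana']

# .loc: filter rows and pick a single column at the same time
young_names = df.loc[df['age'] < 30, 'name']
print(young_names.tolist())  # ['Alice', 'Diana']
```

The parentheses around each condition matter because `&`, `|`, and `~` bind more tightly than comparison operators such as `==` and `>=`.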
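Method 4: Using isin() for Multiple Values
isin() keeps the rows whose column value appears in a given list, which avoids chaining several == comparisons: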
import pandas as pd
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'city': ['Paris', 'London', 'Berlin']
})
# Filter rows where city is in a list
cities = ['Paris', 'Berlin']
filtered = df[df['city'].isin(cities)]
print(filtered)
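To exclude a set of values instead, the isin() mask can be inverted with `~`. A small sketch on the same sample data (the names `excluded` and `remaining` are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'city': ['Paris', 'London', 'Berlin']
})

# Keep only rows whose city is NOT in the list
excluded = ['Paris', 'Berlin']
remaining = df[~df['city'].isin(excluded)]
print(remaining['name'].tolist())  # ['Bob']
```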
Summary
- Use list comprehensions for simple list filtering
- Use filter() when you already have a predicate function or want a lazy iterator
- Use boolean indexing for pandas DataFrames
- Use query() for readable DataFrame filtering
- Use isin() to match a column against a list of values