How to Filter Data in Python | The School of Code


How to Filter Data in Python

Learn different ways to filter lists and DataFrames in Python using comprehensions, filter(), and pandas.

Tags: Python, Filtering, Data Analysis

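Filtering means keeping only the items that satisfy a condition. Python offers several ways to do it, from built-in tools for lists to pandas for tabular data. Here are the main approaches.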

Method 1: List Comprehension

The most Pythonic way to filter lists:

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Filter even numbers
evens = [x for x in numbers if x % 2 == 0]
print(evens)  # [2, 4, 6, 8, 10]

# Filter numbers greater than 5
greater = [x for x in numbers if x > 5]
print(greater)  # [6, 7, 8, 9, 10]
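The same pattern extends to combined conditions and to dictionaries. The following is a minimal sketch; the `people` records are hypothetical sample data invented for illustration.

```python
# Hypothetical sample records, invented for illustration.
people = [
    {"name": "Alice", "age": 25, "city": "Paris"},
    {"name": "Bob", "age": 30, "city": "London"},
    {"name": "Charlie", "age": 35, "city": "Paris"},
]

# Combine conditions with `and`; keep only the names of matching records.
paris_over_30 = [p["name"] for p in people if p["city"] == "Paris" and p["age"] >= 30]
print(paris_over_30)  # ['Charlie']

# A dict comprehension filters key/value pairs the same way.
ages = {p["name"]: p["age"] for p in people}
under_35 = {name: age for name, age in ages.items() if age < 35}
print(under_35)  # {'Alice': 25, 'Bob': 30}
```

The expression before `for` decides what is kept from each matching item, so filtering and transforming can happen in one pass.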

Method 2: filter() Function

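Pass filter() a predicate (a named function or a lambda) and an iterable. In Python 3 it returns a lazy iterator, so wrap it in list() to get a list: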

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Using lambda
evens = list(filter(lambda x: x % 2 == 0, numbers))
print(evens)  # [2, 4, 6, 8, 10]

# Using a defined function
def is_positive(x):
    return x > 0

positives = list(filter(is_positive, [-1, 0, 1, 2, -3]))
print(positives)  # [1, 2]
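Two related behaviors are worth knowing. Passing `None` as the function keeps only truthy items, and the iterator returned by filter() can be consumed only once. The sketch below also shows `itertools.filterfalse`, the standard-library counterpart that keeps items for which the predicate is false. The sample values are made up for illustration.

```python
from itertools import filterfalse

# Made-up values mixing truthy and falsy items.
values = [0, 1, "", "a", None, [], [2]]

# With None as the function, filter() keeps only truthy items.
lazy = filter(None, values)
truthy = list(lazy)
print(truthy)  # [1, 'a', [2]]

# The iterator is now exhausted, so a second pass yields nothing.
second_pass = list(lazy)
print(second_pass)  # []

# filterfalse() keeps the items for which the predicate returns False.
numbers = [1, 2, 3, 4, 5, 6]
odds = list(filterfalse(lambda x: x % 2 == 0, numbers))
print(odds)  # [1, 3, 5]
```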

Method 3: Filtering DataFrames with pandas

Filter rows in a DataFrame with boolean indexing or the query() method:

import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'age': [25, 30, 35, 28],
    'city': ['Paris', 'London', 'Paris', 'Berlin']
})

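# Filter by a single condition: keep rows where age is 30 or more
age_30_plus = df[df['age'] >= 30]
print(age_30_plus)  # rows for Bob and Charlie

# Filter by multiple conditions: combine them with & and wrap each one
# in parentheses, because & binds more tightly than == and <
paris_young = df[(df['city'] == 'Paris') & (df['age'] < 30)]
print(paris_young)  # row for Alice

# Using the query() method: the condition is written as a string
result = df.query('age > 25 and city == "Paris"')
print(result)  # row for Charlie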
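Boolean masks can also be combined with `|` (or) and inverted with `~` (not), and query() can read a local Python variable through the `@` prefix. This is a short sketch reusing the DataFrame above; the `min_age` threshold is a hypothetical value chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'age': [25, 30, 35, 28],
    'city': ['Paris', 'London', 'Paris', 'Berlin']
})

# OR: rows where the city is Paris or the age is at least 30
paris_or_30 = df[(df['city'] == 'Paris') | (df['age'] >= 30)]
print(paris_or_30['name'].tolist())  # ['Alice', 'Bob', 'Charlie']

# NOT: rows where the city is anything except Paris
not_paris = df[~(df['city'] == 'Paris')]
print(not_paris['name'].tolist())  # ['Bob', 'Diana']

# query() resolves @min_age from the surrounding Python scope
min_age = 28  # hypothetical threshold
at_least = df.query('age >= @min_age')
print(at_least['name'].tolist())  # ['Bob', 'Charlie', 'Diana']
```

Python's `and`, `or`, and `not` keywords do not work on boolean Series; use `&`, `|`, and `~` instead.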

Method 4: Using isin() for Multiple Values

Use isin() to keep rows whose column value matches any item in a list:

import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'city': ['Paris', 'London', 'Berlin']
})

# Filter rows where city is in a list
cities = ['Paris', 'Berlin']
filtered = df[df['city'].isin(cities)]
print(filtered)
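To exclude the listed values instead, the isin() mask can be inverted with `~`. A minimal sketch using the same DataFrame:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'city': ['Paris', 'London', 'Berlin']
})

cities = ['Paris', 'Berlin']

# ~ inverts the boolean mask, keeping rows whose city is NOT in the list
excluded = df[~df['city'].isin(cities)]
print(excluded['name'].tolist())  # ['Bob']
```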

Summary

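  • Use list comprehensions for simple, readable list filtering
  • Use filter() when you already have a predicate function or lambda to apply
  • Use boolean indexing to filter pandas DataFrame rows by one or more conditions
  • Use query() to write DataFrame filters as readable strings
  • Use isin() to match a DataFrame column against several values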