Pandas: Select rows that match a string

Micro tutorial:

Select rows of a Pandas DataFrame that match a (partial) string.

import pandas as pd

# Create sample data
data = {'model': ['Lisa', 'Lisa 2', 'Macintosh 128K', 'Macintosh 512K'],
        'launched': [1983, 1984, 1984, 1984],
        'discontinued': [1986, 1985, 1984, 1986]}

df = pd.DataFrame(data, columns=['model', 'launched', 'discontinued'])
df
            model  launched  discontinued
0            Lisa      1983          1986
1          Lisa 2      1984          1985
2  Macintosh 128K      1984          1984
3  Macintosh 512K      1984          1986

We want to select all rows where the column ‘model’ starts with the string ‘Mac’.

df[df['model'].str.match('Mac')]
            model  launched  discontinued
2  Macintosh 128K      1984          1984
3  Macintosh 512K      1984          1986
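
One detail worth knowing: str.match treats its argument as a regular expression and only matches at the beginning of each string. The sketch below rebuilds the sample data and compares it with str.startswith, which checks a literal prefix instead. The variable names by_match, by_prefix and lisa_numbered are only illustrative, and the pattern 'Lisa \d' is an assumed example, not part of the original tutorial.

```python
import pandas as pd

# Same sample data as above
df = pd.DataFrame({
    'model': ['Lisa', 'Lisa 2', 'Macintosh 128K', 'Macintosh 512K'],
    'launched': [1983, 1984, 1984, 1984],
    'discontinued': [1986, 1985, 1984, 1986],
})

# str.match: the argument is a regular expression, matched from the start of the string
by_match = df[df['model'].str.match('Mac')]

# str.startswith: the argument is a literal prefix, no regex involved
by_prefix = df[df['model'].str.startswith('Mac')]

# Because str.match takes a regex, it can express more than a fixed prefix,
# e.g. "Lisa" followed by a space and a digit (illustrative pattern)
lisa_numbered = df[df['model'].str.match(r'Lisa \d')]

print(by_match)
print(by_prefix)
print(lisa_numbered)
```

For a plain prefix like 'Mac' both approaches select the same rows; str.startswith is probably the safer choice when the search string could contain regex metacharacters such as '.' or '('.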

We can also search less strictly and select all rows where the column ‘model’ contains the string ‘ac’ anywhere (note the difference: contains finds the string at any position, match only at the beginning).

df[df['model'].str.contains('ac')]
            model  launched  discontinued
2  Macintosh 128K      1984          1984
3  Macintosh 512K      1984          1986
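
In real data the column may contain missing values or inconsistent capitalization. The following is a minimal sketch of how str.contains can be adjusted for that, using its case, na and regex parameters. The data here is a hypothetical variation of the sample above (one model name replaced by None), and the names mask, result and literal are just for illustration.

```python
import pandas as pd

# Hypothetical variation of the sample data with one missing model name
df = pd.DataFrame({
    'model': ['Lisa', 'Lisa 2', None, 'Macintosh 512K'],
    'launched': [1983, 1984, 1984, 1984],
})

# case=False ignores upper/lower case;
# na=False makes missing values count as "no match" so the mask is purely boolean
mask = df['model'].str.contains('MAC', case=False, na=False)
result = df[mask]

# regex=False searches for the literal text instead of a regular expression
literal = df[df['model'].str.contains('Lisa 2', regex=False, na=False)]

print(result)
print(literal)
```

Without na=False, the missing entry would produce a missing value in the mask, which cannot be used directly for row selection.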

More info about working with text data: https://pandas.pydata.org/pandas-docs/stable/text.html
