Pandas Tutorial
The replace() method in pandas is a versatile tool that lets you replace values in a Series (or DataFrame). In this tutorial, we'll focus on how to replace text values in a Series.
series.replace() in Pandas
Ensure you have pandas installed:
pip install pandas
import pandas as pd
Let's make a Series with some textual data:
s = pd.Series(['apple', 'banana', 'cherry', 'apple', 'date', 'fig', 'apple'])
print(s)
To replace the word "apple" with "apricot":
s_replaced = s.replace('apple', 'apricot')
print(s_replaced)
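Two properties of the call above are worth checking by hand. The sketch below, using illustrative variable names, suggests that replace() returns a new Series and leaves the original untouched, and that without regex=True it compares whole values only, so a fragment such as 'app' replaces nothing.

```python
import pandas as pd

s = pd.Series(['apple', 'banana', 'cherry', 'apple'])

# replace() returns a new Series; s itself is not modified.
s_replaced = s.replace('apple', 'apricot')
print(s.tolist())           # original values
print(s_replaced.tolist())  # values after replacement

# Without regex=True, only whole values are compared,
# so the fragment 'app' matches nothing.
s_fragment = s.replace('app', 'X')
print(s_fragment.equals(s))
```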
You can replace multiple values by passing two lists: the first one containing the values to find and the second one containing their respective replacements.
To replace "apple" with "apricot" and "banana" with "blueberry":
s_multi_replaced = s.replace(['apple', 'banana'], ['apricot', 'blueberry'])
print(s_multi_replaced)
Alternatively, you can use a dictionary for the same purpose:
replace_dict = {
    'apple': 'apricot',
    'banana': 'blueberry'
}
s_dict_replaced = s.replace(replace_dict)
print(s_dict_replaced)
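As a quick check, the list form and the dictionary form can be compared directly. The sketch below also shows a related variant, a list of targets paired with a single scalar, which maps every listed value to the same replacement. The names by_lists, by_dict, and collapsed are illustrative.

```python
import pandas as pd

s = pd.Series(['apple', 'banana', 'cherry', 'apple'])

# The two-list form and the dictionary form give the same result.
by_lists = s.replace(['apple', 'banana'], ['apricot', 'blueberry'])
by_dict = s.replace({'apple': 'apricot', 'banana': 'blueberry'})
print(by_lists.equals(by_dict))

# A list of targets with one scalar value maps every target to it.
collapsed = s.replace(['apple', 'banana'], 'other')
print(collapsed.tolist())
```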
The replace() method also supports regular expressions. Let's say we want to replace all fruit names that end in the letter 'e' with 'fruit':
s_regex_replaced = s.replace(r'.*e$', 'fruit', regex=True)
print(s_regex_replaced)
In this example, the regular expression .*e$ matches any string ending with the letter 'e'.
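With regex=True, only the text the pattern matches is substituted. The pattern above happens to span the whole string, which is why the entire value is replaced. A minimal sketch of the difference, using a shorter sample Series and an illustrative pattern 'an':

```python
import pandas as pd

s = pd.Series(['apple', 'banana', 'date', 'fig'])

# '.*e$' spans the whole string, so the entire value is replaced.
whole = s.replace(r'.*e$', 'fruit', regex=True)
print(whole.tolist())

# 'an' matches only part of 'banana', so only those characters change.
partial = s.replace(r'an', 'AN', regex=True)
print(partial.tolist())
```

Anchoring a pattern with ^ and $ is a simple way to make sure a regex replacement applies to whole values only.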
The replace() method in pandas is powerful and can handle not just simple replacements but also complex patterns with the help of regular expressions. It's a valuable tool for data cleaning and manipulation when working with textual data in a Series.
Using replace() to substitute values in Pandas Series:
The replace() method in Pandas is a versatile function that allows you to substitute specified values with other values in a Series.
import pandas as pd

# Sample Series
data = pd.Series(['apple', 'banana', 'orange', 'apple'])

# Replace 'apple' with 'pear'
data.replace('apple', 'pear', inplace=True)
Replace specific strings in Pandas Series:
Use the replace() method to replace specific strings in a Pandas Series.
import pandas as pd

# Sample Series
data = pd.Series(['red', 'green', 'blue', 'red'])

# Replace 'red' with 'yellow'
data.replace('red', 'yellow', inplace=True)
String replacement in Pandas using series.replace():
The replace() method can be applied to a Pandas Series to perform string replacement.
import pandas as pd

# Sample Series
data = pd.Series(['cat', 'dog', 'bird', 'cat'])

# Replace 'cat' with 'fish'
data.replace('cat', 'fish', inplace=True)
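Conditional replacement in Pandas Series:
import pandas as pd

# Sample Series
data = pd.Series([10, 20, 30, 40])

# Replace values greater than 30 with 999.
# Pass the matching values as a list: a Series is treated as dict-like,
# which replace() rejects when a separate value is given.
data.replace(data[data > 30].tolist(), 999, inplace=True)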
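When the replacement depends on a condition rather than on specific values, boolean masking is often a more direct fit than replace(). The sketch below swaps in Series.mask() and boolean indexing with .loc for the same task. It is an alternative technique, not a form of replace().

```python
import pandas as pd

data = pd.Series([10, 20, 30, 40])

# mask() replaces values where the condition is True and returns a new Series.
masked = data.mask(data > 30, 999)
print(masked.tolist())

# Equivalent in-place form using boolean indexing.
data.loc[data > 30] = 999
print(data.tolist())
```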
Replace multiple values in Pandas Series:
import pandas as pd

# Sample Series
data = pd.Series(['A', 'B', 'C', 'A'])

# Replace 'A' with 'X' and 'B' with 'Y'
replacements = {'A': 'X', 'B': 'Y'}
data.replace(replacements, inplace=True)
Case-insensitive string replacement in Pandas:
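Series.replace() has no case parameter, so passing case=False raises a TypeError. For a case-insensitive match, use a regular expression with the inline (?i) flag:
import pandas as pd

# Sample Series
data = pd.Series(['Apple', 'banana', 'Orange', 'apple'])

# Replace 'apple' with 'pear' regardless of case
data.replace(r'(?i)^apple$', 'pear', regex=True, inplace=True)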
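A case keyword does exist on the string accessor method Series.str.replace(), which is a different method from Series.replace(). The sketch below shows that route. Note that it works on substrings, so it also rewrites values that merely contain the text. The value 'Pineapple' is added here purely to illustrate that.

```python
import pandas as pd

data = pd.Series(['Apple', 'banana', 'Orange', 'apple', 'Pineapple'])

# Series.str.replace() accepts case=False and replaces matching substrings.
result = data.str.replace('apple', 'pear', case=False, regex=True)
print(result.tolist())
```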
Replace NaN values with a string in Pandas Series:
import pandas as pd

# Sample Series with NaN values
data = pd.Series(['A', 'B', pd.NA, 'D'])

# Replace NaN with 'Unknown'
data.replace(pd.NA, 'Unknown', inplace=True)
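For missing values specifically, Series.fillna() is the more common idiom. A minimal sketch on the same sample data, where the name filled is illustrative:

```python
import pandas as pd

data = pd.Series(['A', 'B', pd.NA, 'D'])

# fillna() returns a new Series with missing entries filled in.
filled = data.fillna('Unknown')
print(filled.tolist())
```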
Regex-based text replacement in Pandas using replace():
import pandas as pd

# Sample Series
data = pd.Series(['apple', 'banana', 'orange', 'pear'])

# Replace words starting with 'a' or 'o' with 'fruit'
data.replace(to_replace=r'^[ao].*', value='fruit', regex=True, inplace=True)
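Several patterns can be combined by passing a dictionary together with regex=True, so each pattern gets its own replacement. The patterns and labels below are made up for illustration.

```python
import pandas as pd

data = pd.Series(['apple', 'banana', 'orange', 'pear'])

# Each key is a regular expression; each value is its replacement.
patterns = {r'^a.*': 'starts-with-a', r'^o.*': 'starts-with-o'}
labelled = data.replace(patterns, regex=True)
print(labelled.tolist())
```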