Concatenating strings in pandas is a common task, especially during data preparation and cleaning. This tutorial will walk you through various ways you can concatenate strings in pandas using DataFrames and Series.
First, ensure you have pandas installed:
pip install pandas
import pandas as pd
df = pd.DataFrame({
    'First_Name': ['John', 'Jane', 'Doe'],
    'Last_Name': ['Doe', 'Smith', 'Johnson']
})
print(df)
Using the + operator: The simplest way to concatenate strings in pandas is the + operator:
df['Full_Name'] = df['First_Name'] + ' ' + df['Last_Name']
print(df)
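The + operator expects strings on both sides. As a small sketch, assuming a hypothetical numeric Age column that is not part of the sample data above, the numbers can be cast with astype(str) before concatenating:

```python
import pandas as pd

df = pd.DataFrame({
    'First_Name': ['John', 'Jane'],
    'Age': [34, 29],  # hypothetical numeric column, for illustration only
})

# Adding a string column to an integer column would fail,
# so cast the numbers to strings first.
df['Label'] = df['First_Name'] + ' (' + df['Age'].astype(str) + ')'
print(df['Label'].tolist())  # ['John (34)', 'Jane (29)']
```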
Using the .str.cat() method: This method is more flexible, especially if you want to concatenate more than two strings:
df['Full_Name'] = df['First_Name'].str.cat(df['Last_Name'], sep=' ')
print(df)
To concatenate more columns, pass a list of Series:
# Assuming there's a 'Middle_Name' column
df['Full_Name'] = df['First_Name'].str.cat([df['Middle_Name'], df['Last_Name']], sep=' ')
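Since the sample DataFrame has no Middle_Name column, here is a self-contained sketch of the multi-column case. The middle names are made up for illustration:

```python
import pandas as pd

# Sample data with an assumed 'Middle_Name' column
df = pd.DataFrame({
    'First_Name': ['John', 'Jane'],
    'Middle_Name': ['Allen', 'Marie'],
    'Last_Name': ['Doe', 'Smith'],
})

# Pass a list of Series to join several columns in one call
df['Full_Name'] = df['First_Name'].str.cat(
    [df['Middle_Name'], df['Last_Name']], sep=' '
)
print(df['Full_Name'].tolist())  # ['John Allen Doe', 'Jane Marie Smith']
```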
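Using the str accessor with na_rep: The .str accessor itself cannot be added with +, but its cat() method accepts an na_rep argument. This is useful if you have missing values and want to handle them explicitly:
df['Full_Name'] = df['First_Name'].str.cat(df['Last_Name'], sep=' ', na_rep='')
print(df)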
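To see how missing values behave in practice, the following sketch compares str.cat() with and without the na_rep argument. The two Series and their values are assumptions made for illustration:

```python
import pandas as pd

first = pd.Series(['John', 'Jane', 'Alex'])
last = pd.Series(['Doe', None, 'Johnson'])  # one missing last name

# Default: a missing value on either side gives a missing result
default = first.str.cat(last, sep=' ')

# na_rep substitutes a replacement for missing values before joining;
# strip() removes the separator left over next to an empty replacement
filled = first.str.cat(last, sep=' ', na_rep='').str.strip()

print(default.isna().tolist())  # [False, True, False]
print(filled.tolist())          # ['John Doe', 'Jane', 'Alex Johnson']
```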
Using apply() with a lambda function: For more complex concatenation, or when you need to incorporate some logic, apply() with a lambda function can be helpful:
df['Full_Name'] = df.apply(lambda row: row['First_Name'] + ' ' + row['Last_Name'], axis=1)
print(df)
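As one possible example of row-wise logic, the sketch below skips name parts that are missing. The Middle_Name column and the helper name build_full_name are hypothetical, and a named function stands in for the lambda only for readability:

```python
import pandas as pd

df = pd.DataFrame({
    'First_Name': ['John', 'Jane'],
    'Middle_Name': ['Allen', None],  # assumed column; one value is missing
    'Last_Name': ['Doe', 'Smith'],
})

def build_full_name(row):
    # Keep only the parts that are present, then join with single spaces
    parts = [row['First_Name'], row['Middle_Name'], row['Last_Name']]
    return ' '.join(p for p in parts if pd.notna(p))

df['Full_Name'] = df.apply(build_full_name, axis=1)
print(df['Full_Name'].tolist())  # ['John Allen Doe', 'Jane Smith']
```

Row-wise apply() calls a Python function for every row, so on large DataFrames the vectorized + operator or .str.cat() is usually the faster choice when no per-row logic is needed.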
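If a column has missing values (NaN), the concatenated result for those rows is also NaN. To handle this, fill the gaps with the fillna() method first. Assigning the result back is more reliable than inplace=True on a selected column, which can trigger chained-assignment warnings in recent pandas versions:
# Assuming some rows have missing values
df['First_Name'] = df['First_Name'].fillna('')
df['Last_Name'] = df['Last_Name'].fillna('')
df['Full_Name'] = df['First_Name'] + ' ' + df['Last_Name']
print(df)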
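A minimal sketch of the difference, using made-up data with one missing first name and one missing last name. The trailing str.strip() is an optional extra that removes the stray space left next to an empty replacement:

```python
import pandas as pd

df = pd.DataFrame({
    'First_Name': ['John', None, 'Alex'],
    'Last_Name': ['Doe', 'Smith', None],
})

# Without fillna, any row with a missing part becomes missing
raw = df['First_Name'] + ' ' + df['Last_Name']

# Fill on the fly, then strip the leftover space
clean = (
    df['First_Name'].fillna('') + ' ' + df['Last_Name'].fillna('')
).str.strip()

print(raw.isna().tolist())  # [False, True, True]
print(clean.tolist())       # ['John Doe', 'Smith', 'Alex']
```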
Concatenating strings in pandas is straightforward, and the library offers several methods to suit different needs. Whether you use simple operators or more flexible tools like apply(), pandas can handle text data efficiently.
Concatenating strings in Pandas Series:
Use the + operator or the .str.cat() method to concatenate strings in a Pandas Series.
import pandas as pd
# Sample Series
data = pd.Series(['Hello', ' ', 'World'])
# Concatenate the individual elements using the + operator
result = data[0] + data[1] + data[2]
String concatenation in Pandas DataFrame:
import pandas as pd
# Sample DataFrame
df = pd.DataFrame({'A': ['Hello', 'Good'], 'B': [' ', 'Morning']})
# Concatenate strings column-wise
result = df['A'] + df['B']
Joining strings in Pandas Series:
Use the .str.join() method to join strings in a Pandas Series.
import pandas as pd
# Sample Series of lists
data = pd.Series([['apple', 'orange'], ['banana', 'grape']])
# Join the strings in each list using ','
result = data.str.join(',')
Combine two string columns in Pandas:
import pandas as pd
# Sample DataFrame
df = pd.DataFrame({'A': ['Hello', 'Good'], 'B': [' ', 'Morning']})
# Combine columns A and B into a new column C
df['C'] = df['A'] + df['B']
Concatenate strings with separator in Pandas:
Use the .str.cat() method.
import pandas as pd
# Sample Series
data = pd.Series(['apple', 'orange', 'banana'])
# Concatenate all elements into one string with ', ' as separator
result = data.str.cat(sep=', ')
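When called without another Series, .str.cat() collapses the whole Series into a single string. The sketch below, with an assumed missing entry in the data, shows that missing values are skipped unless na_rep is given:

```python
import pandas as pd

data = pd.Series(['apple', None, 'banana'])  # assumed sample with a gap

# Missing values are omitted from the joined string by default
skipped = data.str.cat(sep=', ')

# na_rep keeps a placeholder for each missing value instead
marked = data.str.cat(sep=', ', na_rep='?')

print(skipped)  # apple, banana
print(marked)   # apple, ?, banana
```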
String concatenation with conditions in Pandas:
Use the numpy.where() function.
import pandas as pd
import numpy as np
# Sample DataFrame
df = pd.DataFrame({'A': ['apple', 'banana', 'orange'], 'B': [True, False, True]})
# Concatenate ' is a fruit' if B is True, else ''
df['Result'] = np.where(df['B'], df['A'] + ' is a fruit', '')
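The same pattern can choose between two concatenated strings rather than falling back to an empty one. This is only a sketch; the column names Name and Is_Fruit and the sample values are assumptions:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['apple', 'carrot'],
    'Is_Fruit': [True, False],  # hypothetical flag column
})

# Pick one of two concatenated strings per row, based on the flag
df['Label'] = np.where(
    df['Is_Fruit'],
    df['Name'] + ' is a fruit',
    df['Name'] + ' is not a fruit',
)
print(df['Label'].tolist())  # ['apple is a fruit', 'carrot is not a fruit']
```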