Pandas Tutorial
Creating Objects
Viewing Data
Selection
Manipulating Data
Grouping Data
Merging, Joining and Concatenating
Working with Date and Time
Working With Text Data
Working with CSV and Excel files
Operations
Visualization
Applications and Projects
XlsxWriter
is a Python module that can be used to write text, numbers, formulas, and hyperlinks to multiple worksheets in an Excel 2007+ XLSX file. It's very popular for tasks where high customization of the Excel output is desired.
Pandas integrates with XlsxWriter
, allowing for easy saving of DataFrames to Excel while also providing advanced functionalities.
Here's a tutorial on how to work with Pandas and XlsxWriter
:
1. Installation:
First, install the necessary libraries:
pip install pandas xlsxwriter
2. Basic DataFrame to Excel:
import pandas as pd # Sample dataframe df = pd.DataFrame({ 'Name': ['Alice', 'Bob'], 'Age': [25, 30], 'City': ['NY', 'LA'] }) # Save to Excel with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer: df.to_excel(writer, sheet_name='Sheet1')
3. Applying Some Formatting:
Using XlsxWriter
, you can easily apply formatting:
with pd.ExcelWriter('output_formatted.xlsx', engine='xlsxwriter') as writer: df.to_excel(writer, sheet_name='Sheet1', index=False) # Access the XlsxWriter workbook and worksheet objects workbook = writer.book worksheet = writer.sheets['Sheet1'] # Get the dimensions of the dataframe (max_row, max_col) = df.shape # Create a format for a blue cell blue_format = workbook.add_format({'bg_color': 'blue'}) # Set the format of the second column to our blue format worksheet.set_column(1, 1, cell_format=blue_format)
4. Inserting Charts:
XlsxWriter
can also add charts:
with pd.ExcelWriter('output_chart.xlsx', engine='xlsxwriter') as writer: df.to_excel(writer, sheet_name='Sheet1', index=False) workbook = writer.book worksheet = writer.sheets['Sheet1'] # Create a new chart object chart = workbook.add_chart({'type': 'column'}) # Configure the first series using data from the DataFrame chart.add_series({ 'name': 'Sheet1!$B$1', 'categories': '=Sheet1!$A$2:$A$3', 'values': '=Sheet1!$B$2:$B$3', }) # Insert the chart into the worksheet worksheet.insert_chart('D2', chart)
5. Conditional Formatting:
You can also apply conditional formatting to Excel outputs:
with pd.ExcelWriter('output_conditional.xlsx', engine='xlsxwriter') as writer: df.to_excel(writer, sheet_name='Sheet1', index=False) workbook = writer.book worksheet = writer.sheets['Sheet1'] # Apply conditional formatting to age column where age > 28 is red red_format = workbook.add_format({'bg_color': 'red', 'font_color': 'white'}) worksheet.conditional_format('B2:B3', {'type': 'cell', 'criteria': '>', 'value': 28, 'format': red_format})
These are just a few of the numerous features XlsxWriter
offers. With its integration into Pandas, you can handle complex Excel outputs with ease. For more advanced functionalities, you should refer to the XlsxWriter
documentation and Pandas IO docs.
Export Pandas DataFrame to Excel using XlsxWriter:
import pandas as pd # Create DataFrame df = pd.DataFrame({'Column1': [1, 2, 3], 'Column2': ['A', 'B', 'C']}) # Export DataFrame to Excel with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer: df.to_excel(writer, sheet_name='Sheet1', index=False)
Writing Excel files with Pandas and XlsxWriter:
import pandas as pd # Create DataFrames df1 = pd.DataFrame({'Data1': [1, 2, 3]}) df2 = pd.DataFrame({'Data2': ['A', 'B', 'C']}) # Export multiple DataFrames to Excel with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer: df1.to_excel(writer, sheet_name='Sheet1', index=False) df2.to_excel(writer, sheet_name='Sheet2', index=False)
Custom formatting in Excel with Pandas and XlsxWriter:
import pandas as pd # Create DataFrame df = pd.DataFrame({'Data': [10, 20, 30]}) # Export DataFrame to Excel with custom formatting with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer: df.to_excel(writer, sheet_name='Sheet1', index=False) # Get the xlsxwriter workbook and worksheet objects workbook = writer.book worksheet = writer.sheets['Sheet1'] # Add custom formatting (e.g., bold header) header_format = workbook.add_format({'bold': True, 'text_wrap': True, 'valign': 'vcenter'}) worksheet.set_row(0, None, header_format)
Creating Excel charts with Pandas and XlsxWriter:
import pandas as pd # Create DataFrame df = pd.DataFrame({'Data': [10, 20, 30]}) # Export DataFrame to Excel with a line chart with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer: df.to_excel(writer, sheet_name='Sheet1', index=False) # Get the xlsxwriter workbook and worksheet objects workbook = writer.book worksheet = writer.sheets['Sheet1'] # Create a line chart chart = workbook.add_chart({'type': 'line'}) chart.add_series({'values': 'Sheet1!$B$2:$B$4'}) worksheet.insert_chart('D2', chart)
Advanced Excel features with Pandas and XlsxWriter:
import pandas as pd # Create DataFrame df = pd.DataFrame({'Data': [10, 20, 30]}) # Export DataFrame to Excel with data validation with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer: df.to_excel(writer, sheet_name='Sheet1', index=False) # Get the xlsxwriter workbook and worksheet objects workbook = writer.book worksheet = writer.sheets['Sheet1'] # Add data validation to a cell worksheet.data_validation('B2', {'validate': 'integer', 'criteria': 'between', 'minimum': 1, 'maximum': 100})
Styling and theming in Excel with Pandas and XlsxWriter:
import pandas as pd # Create DataFrame df = pd.DataFrame({'Data': [10, 20, 30]}) # Export DataFrame to Excel with styles and theme with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer: df.to_excel(writer, sheet_name='Sheet1', index=False) # Get the xlsxwriter workbook and worksheet objects workbook = writer.book worksheet = writer.sheets['Sheet1'] # Apply styles to cells (e.g., bold header) header_format = workbook.add_format({'bold': True, 'text_wrap': True, 'valign': 'vcenter'}) worksheet.set_row(0, None, header_format) # Apply a theme to the workbook workbook.set_theme(1)
Adding images to Excel using Pandas and XlsxWriter:
import pandas as pd # Create DataFrame df = pd.DataFrame({'Data': [10, 20, 30]}) # Export DataFrame to Excel with an inserted image with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer: df.to_excel(writer, sheet_name='Sheet1', index=False) # Get the xlsxwriter workbook and worksheet objects workbook = writer.book worksheet = writer.sheets['Sheet1'] # Insert an image into the worksheet worksheet.insert_image('D2', 'logo.png', {'x_scale': 0.5, 'y_scale': 0.5})
Conditional formatting in Excel with Pandas and XlsxWriter:
import pandas as pd # Create DataFrame df = pd.DataFrame({'Data': [10, 20, 30]}) # Export DataFrame to Excel with conditional formatting with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer: df.to_excel(writer, sheet_name='Sheet1', index=False) # Get the xlsxwriter workbook and worksheet objects workbook = writer.book worksheet = writer.sheets['Sheet1'] # Add conditional formatting to highlight cells with values > 20 cell_format = workbook.add_format({'bg_color': '#FFC7CE', 'font_color': '#9C0006'}) worksheet.conditional_format('B2:B4', {'type': 'cell', 'criteria': '>', 'value': 20, 'format': cell_format})