Pandas Tutorial

Creating Objects

Viewing Data

Selection

Manipulating Data

Grouping Data

Merging, Joining and Concatenating

Working with Date and Time

Working With Text Data

Working with CSV and Excel files

Operations

Visualization

Applications and Projects

Working with Pandas and XlsxWriter

XlsxWriter is a Python module that can be used to write text, numbers, formulas, and hyperlinks to multiple worksheets in an Excel 2007+ XLSX file. It's very popular for tasks where high customization of the Excel output is desired.

Pandas integrates with XlsxWriter, allowing for easy saving of DataFrames to Excel while also providing advanced functionalities.

Here's a tutorial on how to work with Pandas and XlsxWriter:

1. Installation:

First, install the necessary libraries:

pip install pandas xlsxwriter

2. Basic DataFrame to Excel:

import pandas as pd

# Sample dataframe
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['NY', 'LA']
})

# Save to Excel
with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1')

3. Applying Some Formatting:

Using XlsxWriter, you can easily apply formatting:

with pd.ExcelWriter('output_formatted.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1', index=False)
    
    # Access the XlsxWriter workbook and worksheet objects
    workbook  = writer.book
    worksheet = writer.sheets['Sheet1']
    
    # Get the dimensions of the dataframe
    (max_row, max_col) = df.shape
    
    # Create a format for a blue cell
    blue_format = workbook.add_format({'bg_color': 'blue'})
    
    # Set the format of the second column to our blue format
    worksheet.set_column(1, 1, cell_format=blue_format)

4. Inserting Charts:

XlsxWriter can also add charts:

with pd.ExcelWriter('output_chart.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1', index=False)
    
    workbook  = writer.book
    worksheet = writer.sheets['Sheet1']
    
    # Create a new chart object
    chart = workbook.add_chart({'type': 'column'})
    
    # Configure the first series using data from the DataFrame
    chart.add_series({
        'name': 'Sheet1!$B$1',
        'categories': '=Sheet1!$A$2:$A$3',
        'values': '=Sheet1!$B$2:$B$3',
    })

    # Insert the chart into the worksheet
    worksheet.insert_chart('D2', chart)

5. Conditional Formatting:

You can also apply conditional formatting to Excel outputs:

with pd.ExcelWriter('output_conditional.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1', index=False)

    workbook  = writer.book
    worksheet = writer.sheets['Sheet1']

    # Apply conditional formatting to age column where age > 28 is red
    red_format = workbook.add_format({'bg_color': 'red', 'font_color': 'white'})
    worksheet.conditional_format('B2:B3', {'type': 'cell', 'criteria': '>', 'value': 28, 'format': red_format})

These are just a few of the numerous features XlsxWriter offers. With its integration into Pandas, you can handle complex Excel outputs with ease. For more advanced functionalities, you should refer to the XlsxWriter documentation and Pandas IO docs.

  1. Export Pandas DataFrame to Excel using XlsxWriter:

    • Description: Export a Pandas DataFrame to an Excel file using the XlsxWriter library.
    • Code:
      import pandas as pd
      
      # Create DataFrame
      df = pd.DataFrame({'Column1': [1, 2, 3], 'Column2': ['A', 'B', 'C']})
      
      # Export DataFrame to Excel
      with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
          df.to_excel(writer, sheet_name='Sheet1', index=False)
      
  2. Writing Excel files with Pandas and XlsxWriter:

    • Description: Write data to an Excel file using Pandas and XlsxWriter, with multiple sheets.
    • Code:
      import pandas as pd
      
      # Create DataFrames
      df1 = pd.DataFrame({'Data1': [1, 2, 3]})
      df2 = pd.DataFrame({'Data2': ['A', 'B', 'C']})
      
      # Export multiple DataFrames to Excel
      with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
          df1.to_excel(writer, sheet_name='Sheet1', index=False)
          df2.to_excel(writer, sheet_name='Sheet2', index=False)
      
  3. Custom formatting in Excel with Pandas and XlsxWriter:

    • Description: Apply custom formatting to cells in an Excel file using Pandas and XlsxWriter.
    • Code:
      import pandas as pd
      
      # Create DataFrame
      df = pd.DataFrame({'Data': [10, 20, 30]})
      
      # Export DataFrame to Excel with custom formatting
      with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
          df.to_excel(writer, sheet_name='Sheet1', index=False)
      
          # Get the xlsxwriter workbook and worksheet objects
          workbook = writer.book
          worksheet = writer.sheets['Sheet1']
      
          # Add custom formatting (e.g., bold header)
          header_format = workbook.add_format({'bold': True, 'text_wrap': True, 'valign': 'vcenter'})
          worksheet.set_row(0, None, header_format)
      
  4. Creating Excel charts with Pandas and XlsxWriter:

    • Description: Create Excel charts (e.g., line chart) from a Pandas DataFrame using XlsxWriter.
    • Code:
      import pandas as pd
      
      # Create DataFrame
      df = pd.DataFrame({'Data': [10, 20, 30]})
      
      # Export DataFrame to Excel with a line chart
      with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
          df.to_excel(writer, sheet_name='Sheet1', index=False)
      
          # Get the xlsxwriter workbook and worksheet objects
          workbook = writer.book
          worksheet = writer.sheets['Sheet1']
      
          # Create a line chart
          chart = workbook.add_chart({'type': 'line'})
          chart.add_series({'values': 'Sheet1!$B$2:$B$4'})
          worksheet.insert_chart('D2', chart)
      
  5. Advanced Excel features with Pandas and XlsxWriter:

    • Description: Utilize advanced Excel features, such as data validation, using Pandas and XlsxWriter.
    • Code:
      import pandas as pd
      
      # Create DataFrame
      df = pd.DataFrame({'Data': [10, 20, 30]})
      
      # Export DataFrame to Excel with data validation
      with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
          df.to_excel(writer, sheet_name='Sheet1', index=False)
      
          # Get the xlsxwriter workbook and worksheet objects
          workbook = writer.book
          worksheet = writer.sheets['Sheet1']
      
          # Add data validation to a cell
          worksheet.data_validation('B2', {'validate': 'integer', 'criteria': 'between', 'minimum': 1, 'maximum': 100})
      
  6. Styling and theming in Excel with Pandas and XlsxWriter:

    • Description: Apply styles and themes to an Excel file created with Pandas and XlsxWriter.
    • Code:
      import pandas as pd
      
      # Create DataFrame
      df = pd.DataFrame({'Data': [10, 20, 30]})
      
      # Export DataFrame to Excel with styles and theme
      with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
          df.to_excel(writer, sheet_name='Sheet1', index=False)
      
          # Get the xlsxwriter workbook and worksheet objects
          workbook = writer.book
          worksheet = writer.sheets['Sheet1']
      
          # Apply styles to cells (e.g., bold header)
          header_format = workbook.add_format({'bold': True, 'text_wrap': True, 'valign': 'vcenter'})
          worksheet.set_row(0, None, header_format)
      
          # Apply a theme to the workbook
          workbook.set_theme(1)
      
  7. Adding images to Excel using Pandas and XlsxWriter:

    • Description: Insert images into an Excel file created with Pandas and XlsxWriter.
    • Code:
      import pandas as pd
      
      # Create DataFrame
      df = pd.DataFrame({'Data': [10, 20, 30]})
      
      # Export DataFrame to Excel with an inserted image
      with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
          df.to_excel(writer, sheet_name='Sheet1', index=False)
      
          # Get the xlsxwriter workbook and worksheet objects
          workbook = writer.book
          worksheet = writer.sheets['Sheet1']
      
          # Insert an image into the worksheet
          worksheet.insert_image('D2', 'logo.png', {'x_scale': 0.5, 'y_scale': 0.5})
      
  8. Conditional formatting in Excel with Pandas and XlsxWriter:

    • Description: Apply conditional formatting to cells in an Excel file using Pandas and XlsxWriter.
    • Code:
      import pandas as pd
      
      # Create DataFrame
      df = pd.DataFrame({'Data': [10, 20, 30]})
      
      # Export DataFrame to Excel with conditional formatting
      with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
          df.to_excel(writer, sheet_name='Sheet1', index=False)
      
          # Get the xlsxwriter workbook and worksheet objects
          workbook = writer.book
          worksheet = writer.sheets['Sheet1']
      
          # Add conditional formatting to highlight cells with values > 20
          cell_format = workbook.add_format({'bg_color': '#FFC7CE', 'font_color': '#9C0006'})
          worksheet.conditional_format('B2:B4', {'type': 'cell', 'criteria': '>', 'value': 20, 'format': cell_format})