Mean absolute deviation of the values for the requested axis in Pandas

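The Mean Absolute Deviation (MAD) measures the dispersion of a set of data points: it is the average of the absolute differences between each value and the mean. In pandas versions before 2.0, the mad() method computes the MAD for a Series or along a particular axis of a DataFrame. The method was deprecated in pandas 1.5 and removed in pandas 2.0, so on current versions the same value has to be computed from the definition, for example with (df - df.mean()).abs().mean().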

Here's a tutorial on how to compute the Mean Absolute Deviation using pandas:

1. Setup:

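First, make sure pandas is installed. The .mad() calls in the following steps only exist in releases earlier than 2.0, so pin the version if you want to run them unchanged:

pip install "pandas<2.0"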

2. Import Necessary Libraries:

import pandas as pd

3. Create a Sample DataFrame:

data = {
    'A': [1, 2, 3, 4, 5],
    'B': [5, 6, 7, 8, 10],
    'C': [9, 8, 7, 6, 5]
}
df = pd.DataFrame(data)
print(df)

4. Compute MAD for a Series:

You can compute the MAD for a particular column (Series) like this:

mad_A = df['A'].mad()
print(f"Mean Absolute Deviation of Column 'A': {mad_A}")
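On pandas 2.0 and later, where Series.mad() is no longer available, the same number can be obtained straight from the definition. The following is a minimal sketch that reuses the values of column 'A'; the names series_a and mad_a_manual are only illustrative.

```python
import pandas as pd

# Same values as column 'A' in the sample DataFrame
series_a = pd.Series([1, 2, 3, 4, 5], name="A")

# Average absolute distance of each value from the mean
mad_a_manual = (series_a - series_a.mean()).abs().mean()
print(f"Mean Absolute Deviation of Column 'A': {mad_a_manual}")
```

The mean of the column is 3, the absolute deviations are 2, 1, 0, 1, 2, and their average is 6 / 5 = 1.2.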

5. Compute MAD for a DataFrame:

To compute the MAD for all columns in a DataFrame:

mad_all = df.mad()
print("Mean Absolute Deviation for each column:")
print(mad_all)
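The column-wise result can also be reproduced without .mad(), which helps on pandas 2.0 and later. This sketch rebuilds the sample DataFrame so that it runs on its own; mad_all_manual is an illustrative name, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 6, 7, 8, 10],
    'C': [9, 8, 7, 6, 5],
})

# Subtract each column's mean, take absolute values, then average per column
mad_all_manual = (df - df.mean()).abs().mean()
print("Mean Absolute Deviation for each column:")
print(mad_all_manual)
```

Subtracting df.mean() from df aligns the means with the column labels, so each column is centered on its own mean before the absolute values are averaged.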

6. Explanation:

Here's a simple breakdown of how MAD is calculated:

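For a set of values:

$x_1, x_2, \ldots, x_n$

With a mean of:

$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

The MAD is:

$\mathrm{MAD} = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \bar{x} \right|$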

So, it's the average of the absolute deviations of values from the mean.
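To make the calculation concrete, here is a small sketch in plain Python (no pandas) that follows those steps one at a time for the values of column 'B'. The variable names are illustrative only.

```python
values = [5, 6, 7, 8, 10]  # the values of column 'B'

n = len(values)

# Step 1: the mean of the values
mean = sum(values) / n

# Step 2: the absolute deviation of each value from the mean
abs_devs = [abs(x - mean) for x in values]

# Step 3: the average of those absolute deviations is the MAD
mad = sum(abs_devs) / n

print("mean:", mean)
print("absolute deviations:", abs_devs)
print("MAD:", mad)
```

For these values the mean is 7.2 and the MAD is approximately 1.44; the printed deviations may show small floating-point rounding.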

Summary:

The Mean Absolute Deviation offers a straightforward and interpretable measure of data dispersion. In pandas versions before 2.0, the built-in .mad() method computes it directly; in later versions, where the method has been removed, the expression (df - df.mean()).abs().mean() gives the same result. MAD is a useful metric when you want to understand the average variation in your dataset without squaring the deviations (as is done in variance and standard deviation).

  1. Calculate mean absolute deviation in Pandas DataFrame:

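    • The mean absolute deviation measures the average absolute difference of each data point from the mean. df.mad() returns one value per column; chaining .mean() averages those values into a single number for the whole DataFrame.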
    mad_value = df.mad().mean()
    
  2. Mean absolute deviation by axis in Pandas:

    • Compute MAD along a specific axis (e.g., rows or columns).
    row_mad = df.mad(axis=1)
    column_mad = df.mad(axis=0)
    
  3. Pandas mad() function examples:

    • The .mad() function directly computes mean absolute deviation.
    mad_value = df.mad()
    
  4. Compute MAD for a specific column in Pandas:

    • Calculate MAD for a particular column.
    column_mad = df['Column_Name'].mad()
    
  5. Axis-wise mean absolute deviation in Pandas:

    • Calculate MAD along rows or columns.
    row_mad = df.mad(axis=1)
    column_mad = df.mad(axis=0)
    
  6. Using mad() to calculate mean absolute deviation:

    • Direct application of the .mad() function for MAD computation.
    mad_value = df.mad()
    
  7. Pandas mad vs std for dispersion:

    • Compare MAD and standard deviation for measuring data dispersion.
    mad_value = df.mad()
    std_value = df.std()
    
  8. Custom mean absolute deviation function in Pandas:

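    • Implement MAD from its definition. This version also runs on pandas 2.0 and later, where .mad() has been removed.
    def custom_mad(data):
        # Average absolute distance of each value from the column mean
        return (data - data.mean()).abs().mean()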
    
    mad_value = df.apply(custom_mad)
    
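  9. Calculate robust mean absolute deviation in Pandas:

    • .mad() never accepted a center argument, so a median-centered version has to be computed by hand: subtract each column's median, then average the absolute deviations.
    robust_mad = (df - df.median()).abs().mean()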
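The axis-wise, median-centered, and MAD-versus-std variants from the list above can be gathered into one version-independent sketch. The helper mean_abs_deviation and its center parameter are hypothetical names introduced here for illustration; they are not part of the pandas API, and the helper assumes axis is given as the integer 0 or 1.

```python
import pandas as pd


def mean_abs_deviation(frame, axis=0, center="mean"):
    """Hypothetical helper: mean absolute deviation around the mean or median.

    axis=0 returns one value per column, axis=1 one value per row.
    """
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1")
    if center not in ("mean", "median"):
        raise ValueError("center must be 'mean' or 'median'")
    # Central value per column (axis=0) or per row (axis=1)
    if center == "mean":
        centers = frame.mean(axis=axis)
    else:
        centers = frame.median(axis=axis)
    # Align the centers with the opposite axis before subtracting
    deviations = frame.sub(centers, axis=1 - axis).abs()
    return deviations.mean(axis=axis)


df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 6, 7, 8, 10],
    'C': [9, 8, 7, 6, 5],
})

column_mad = mean_abs_deviation(df, axis=0)           # one value per column
row_mad = mean_abs_deviation(df, axis=1)              # one value per row
robust_mad = mean_abs_deviation(df, center="median")  # deviations from the median
std_value = df.std()                                  # sample standard deviation

print(pd.DataFrame({"mad": column_mad, "mad_median": robust_mad, "std": std_value}))
print(row_mad)
```

Because the median minimizes the average absolute deviation, the median-centered value is never larger than the mean-centered one. The standard deviation is at least as large as the mean-centered MAD, since squaring gives large deviations more weight.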