Convert Series or DataFrame object to NumPy array in Pandas (replacing .as_matrix())

The .as_matrix() method was deprecated in pandas 0.23.0 and removed in pandas 1.0. Use .values or .to_numpy() (available since pandas 0.24.0) instead.

This tutorial shows how to convert a Series or DataFrame object to a NumPy array using the recommended methods.

Convert Series or DataFrame to NumPy Array using Pandas

1. Setup:

First, make sure you have both pandas and numpy installed:

pip install pandas numpy

2. Import Necessary Libraries:

import pandas as pd
import numpy as np

3. Convert Series to NumPy Array:

Using .values:

s = pd.Series([1, 2, 3, 4, 5])
array_values = s.values
print(type(array_values))
print(array_values)

Using .to_numpy():

s = pd.Series([1, 2, 3, 4, 5])
array_to_numpy = s.to_numpy()
print(type(array_to_numpy))
print(array_to_numpy)
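
The .to_numpy() method also accepts optional arguments that the .values attribute does not offer. As a minimal sketch (the variable names and sample values below are illustrative and not part of the original example), dtype requests a specific element type and na_value replaces missing entries during the conversion:

```python
import numpy as np
import pandas as pd

# Request a float array from an integer Series
s = pd.Series([1, 2, 3, 4, 5])
floats = s.to_numpy(dtype="float64")
print(floats.dtype)

# Replace the missing entry with 0.0 while converting
with_gap = pd.Series([1.5, None, 3.0])
filled = with_gap.to_numpy(na_value=0.0)
print(filled)
```

Because these options are only available on the method, .to_numpy() is usually the more flexible choice when the target dtype matters.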

4. Convert DataFrame to NumPy Array:

Using .values:

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
})

array_df_values = df.values
print(type(array_df_values))
print(array_df_values)

Using .to_numpy():

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
})

array_df_to_numpy = df.to_numpy()
print(type(array_df_to_numpy))
print(array_df_to_numpy)
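
The DataFrame above holds only integers, so the resulting array has an integer dtype. A NumPy array has a single dtype, so when columns differ, pandas picks one type that can hold every column. The sketch below uses made-up frames (df_numeric and df_text are hypothetical names) to show the effect:

```python
import numpy as np
import pandas as pd

# Integer and float columns are upcast to a common float dtype
df_numeric = pd.DataFrame({"A": [1, 2], "B": [0.5, 1.5]})
numeric_array = df_numeric.to_numpy()
print(numeric_array.dtype)

# Mixing numbers and text falls back to the generic object dtype
df_text = pd.DataFrame({"A": [1, 2], "C": ["x", "y"]})
text_array = df_text.to_numpy()
print(text_array.dtype)
```

An object array loses most of NumPy's speed advantages, so it is often worth selecting only the numeric columns before converting.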

5. Summary:

The .as_matrix() method was once used to convert pandas objects to NumPy arrays, but it has been deprecated since 0.23.0 and was removed in 1.0. Use .values or the more explicit .to_numpy() method instead. For ordinary NumPy-backed columns, both return the underlying data as a NumPy array, which is useful for operations that need NumPy functionality or for passing data to other libraries.
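
One case where the two approaches differ is data stored in a pandas extension type. As a small sketch, assuming a categorical Series (the name cat and its values are made up for illustration), .values hands back the pandas Categorical object while .to_numpy() returns a NumPy array:

```python
import numpy as np
import pandas as pd

cat = pd.Series(["a", "b", "a"], dtype="category")

via_values = cat.values        # a pandas Categorical, not an ndarray
via_to_numpy = cat.to_numpy()  # a NumPy ndarray

print(type(via_values))
print(type(via_to_numpy))
```

This is one reason .to_numpy() is the safer choice when the calling code strictly needs an ndarray.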

  1. DataFrame to NumPy array using .values in Pandas:

    • Use the .values attribute to convert a DataFrame to a NumPy array.
    import pandas as pd
    
    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    numpy_array = df.values
    
  2. Convert Pandas Series to NumPy array with .to_numpy():

    • Use the .to_numpy() method for Series to NumPy array conversion.
    import pandas as pd
    
    series = pd.Series([1, 2, 3])
    numpy_array = series.to_numpy()
    
  3. Deprecated .as_matrix() replacement in Pandas:

    • .as_matrix() was removed in pandas 1.0, so calling it there raises an AttributeError. Use .to_numpy() as the replacement.
    import pandas as pd
    
    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    numpy_array = df.to_numpy()  # previously: df.as_matrix()
    
  4. NumPy array conversion in Pandas using .values:

    • Another example using the .values attribute for DataFrame to NumPy conversion.
    import pandas as pd
    
    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    numpy_array = df.values
    
  5. Using .to_numpy() for DataFrame to NumPy array conversion:

    • Demonstrate the .to_numpy() method for DataFrame conversion.
    import pandas as pd
    
    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    numpy_array = df.to_numpy()
    
  6. Convert specific column to NumPy array in Pandas:

    • Select a specific column and convert it to a NumPy array.
    import pandas as pd
    
    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    numpy_column = df['A'].values
    
  7. Pandas DataFrame to NumPy array with and without .values:

    • Compare the .values attribute with the .to_numpy() method. For a NumPy-backed DataFrame like this one, both produce the same array.
    import pandas as pd
    
    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    numpy_array_with_values = df.values
    numpy_array_without_values = df.to_numpy()
    
  8. NumPy array conversion in Pandas for machine learning:

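    • A common use case is converting a Pandas DataFrame to NumPy arrays before training a machine learning model (this example requires scikit-learn).
    import pandas as pd
    from sklearn.model_selection import train_test_split
    
    # 'data.csv' and the 'target' column are placeholders for your own dataset
    df = pd.read_csv('data.csv')
    X = df.drop('target', axis=1).values  # feature matrix
    y = df['target'].values               # label vector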
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
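
The example above depends on an external data.csv file and on scikit-learn. A minimal self-contained sketch of the same idea follows, assuming a small made-up DataFrame (the column names feature_1 and feature_2 and all values are hypothetical) and a plain NumPy shuffle in place of train_test_split:

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with two features and a binary target
df = pd.DataFrame({
    "feature_1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "feature_2": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    "target": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})

# Convert features and labels to NumPy arrays
X = df.drop(columns="target").to_numpy()
y = df["target"].to_numpy()

# Shuffle the row indices, then hold out 20% of the rows for testing
rng = np.random.default_rng(0)
order = rng.permutation(len(df))
n_test = int(len(df) * 0.2)
test_idx, train_idx = order[:n_test], order[n_test:]

X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

print(X_train.shape, X_test.shape)
```

Indexing X and y with the same index arrays keeps each feature row aligned with its label, which is the property train_test_split also preserves.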