Pandas Tutorial
The .as_matrix() method was deprecated in pandas 0.23.0 and removed in pandas 1.0. Use .values or .to_numpy() instead.
In this tutorial, I'll show you how to convert a Series or DataFrame object to a NumPy array using the recommended methods.
First, make sure you have both pandas and NumPy installed:
    pip install pandas numpy
Then import both libraries:
    import pandas as pd
    import numpy as np
Converting a Series using .values:
    s = pd.Series([1, 2, 3, 4, 5])
    array_values = s.values
    print(type(array_values))  # <class 'numpy.ndarray'>
    print(array_values)
Converting a Series using .to_numpy():
    s = pd.Series([1, 2, 3, 4, 5])
    array_to_numpy = s.to_numpy()
    print(type(array_to_numpy))  # <class 'numpy.ndarray'>
    print(array_to_numpy)
Converting a DataFrame using .values:
    df = pd.DataFrame({
        'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': [7, 8, 9]
    })
    array_df_values = df.values
    print(type(array_df_values))  # <class 'numpy.ndarray'>
    print(array_df_values)        # one row per DataFrame row
Converting a DataFrame using .to_numpy():
    df = pd.DataFrame({
        'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': [7, 8, 9]
    })
    array_df_to_numpy = df.to_numpy()
    print(type(array_df_to_numpy))  # <class 'numpy.ndarray'>
    print(array_df_to_numpy)        # one row per DataFrame row
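A NumPy array has a single dtype, so a DataFrame whose columns have different types is converted to one common dtype. The sketch below illustrates this with a hypothetical frame (the name `mixed` and its columns are made up for the example); it is meant as a quick check, not a full treatment of pandas type promotion.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing integer, float, and string columns.
mixed = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [0.5, 1.5, 2.5],
    "C": ["x", "y", "z"],
})

# Integer and float columns share a numeric common type.
numeric_only = mixed[["A", "B"]].to_numpy()

# Numbers and strings have no numeric common type, so pandas falls back to object.
everything = mixed.to_numpy()

print(numeric_only.dtype)
print(everything.dtype)
```

If downstream code expects a numeric array, selecting the numeric columns before converting is usually the safer route.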
While .as_matrix() was once used to convert pandas objects to NumPy arrays, it has been deprecated, and you should use .values or the more explicit .to_numpy() method instead. Both return the data stored in a pandas structure as a NumPy array, which is useful for operations that need NumPy functionality or for passing data to other libraries.
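One reason .to_numpy() is described as more explicit is that it accepts the keyword arguments `dtype`, `copy`, and `na_value`, which the .values attribute cannot take. The following is a minimal sketch on a hypothetical Series containing a missing value; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical Series; None is stored as NaN in a float64 Series.
s = pd.Series([1.0, None, 3.0])

# Request a specific dtype for the result.
as_float32 = s.to_numpy(dtype="float32")

# Replace missing values during conversion.
filled = s.to_numpy(na_value=0.0)

# Force a copy so that editing the array cannot touch the Series.
independent = s.to_numpy(copy=True)
independent[0] = 99.0

print(as_float32.dtype)
print(filled)
print(s.iloc[0])  # the Series still holds its original first value
```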
DataFrame to NumPy array using .values in Pandas:
Use the .values attribute to convert a DataFrame to a NumPy array.
    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    numpy_array = df.values  # 2-D array of shape (3, 2)
Convert Pandas Series to NumPy array with .to_numpy():
Use the .to_numpy() method to convert a Series to a NumPy array.
    import pandas as pd

    series = pd.Series([1, 2, 3])
    numpy_array = series.to_numpy()  # 1-D array of shape (3,)
Deprecated .as_matrix() replacement in Pandas:
Use .to_numpy() as a replacement for .as_matrix().
    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    numpy_array = df.to_numpy()  # formerly: df.as_matrix()
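Legacy code sometimes passed a `columns` argument to .as_matrix() to convert only part of a DataFrame. A straightforward way to migrate such calls, sketched here on a hypothetical three-column frame, is to select the columns first and then call .to_numpy().

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

# Legacy call, no longer available on pandas 1.0 and later:
#     df.as_matrix(columns=["A", "C"])
# Equivalent with the current API: select columns, then convert.
subset = df[["A", "C"]].to_numpy()

print(subset)
```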
NumPy array conversion in Pandas using .values:
Use the .values attribute for DataFrame to NumPy conversion.
    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    numpy_array = df.values
Using .to_numpy() for DataFrame to NumPy array conversion:
Use the .to_numpy() method for DataFrame conversion.
    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    numpy_array = df.to_numpy()
Convert specific column to NumPy array in Pandas:
    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    numpy_column = df['A'].values  # selecting one column yields a Series, so the array is 1-D
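When converting a single column, the way the column is selected determines the shape of the result. This short sketch (variable names are illustrative) contrasts selecting with a label, which gives a Series, against selecting with a one-element list, which gives a one-column DataFrame.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# df["A"] is a Series, so the result is one-dimensional.
col_1d = df["A"].to_numpy()

# df[["A"]] is a one-column DataFrame, so the result is two-dimensional.
col_2d = df[["A"]].to_numpy()

print(col_1d.shape)
print(col_2d.shape)
```

The two-dimensional form is often what libraries expecting a feature matrix require, even for a single feature.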
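Pandas DataFrame to NumPy array with and without .values:
Convert a DataFrame to a NumPy array either with the .values attribute or, without it, by calling .to_numpy().
    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    numpy_array_with_values = df.values         # attribute access
    numpy_array_without_values = df.to_numpy()  # method call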
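To confirm that the two routes produce the same data for a simple numeric frame, the results can be compared with `np.array_equal`. This is a small check on the example frame only, not a claim that both behave identically for every dtype.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

via_values = df.values
via_to_numpy = df.to_numpy()

# Element-wise comparison of shape and contents.
same = np.array_equal(via_values, via_to_numpy)

print(same)
print(via_values.dtype == via_to_numpy.dtype)
```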
NumPy array conversion in Pandas for machine learning:
    import pandas as pd
    from sklearn.model_selection import train_test_split

    df = pd.read_csv('data.csv')

    # Features: every column except 'target'; labels: the 'target' column.
    X = df.drop('target', axis=1).values
    y = df['target'].values

    # Hold out 20% of the rows for testing.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
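The example above needs a data.csv file and scikit-learn. For readers who want to try the same feature/label conversion without either, here is a self-contained sketch that builds a small hypothetical DataFrame in memory and does a simple shuffled 80/20 split with NumPy alone. The column names `f1` and `f2`, the seed, and the manual split are assumptions for illustration; the split is a stand-in for `train_test_split`, not a reimplementation of it.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the contents of data.csv.
df = pd.DataFrame({
    "f1": np.arange(10, dtype=float),
    "f2": np.arange(10, 20, dtype=float),
    "target": [0, 1] * 5,
})

# Convert features and labels to NumPy arrays.
X = df.drop("target", axis=1).to_numpy()
y = df["target"].to_numpy()

# Shuffle row positions with a fixed seed, then hold out 20% for testing.
rng = np.random.default_rng(0)
order = rng.permutation(len(df))
n_test = int(len(df) * 0.2)
test_idx, train_idx = order[:n_test], order[n_test:]

X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

print(X_train.shape, X_test.shape)
```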