Timestamp using Pandas

In this tutorial, we'll cover the fundamentals of working with timestamps in pandas.

Timestamp in Pandas

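A Timestamp represents a single point in time. It is pandas' equivalent of Python's native datetime.datetime (it is a subclass, so the two are interchangeable in most cases), but it is stored in the same 64-bit integer form as numpy.datetime64, which keeps columns of timestamps compact and fast to operate on.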

1. Setup:

Make sure you have pandas installed:

pip install pandas

2. Import Necessary Libraries:

import pandas as pd

3. Creating a Timestamp:

You can create a timestamp using pd.Timestamp:

ts = pd.Timestamp('2023-08-31')
print(ts)
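
Besides parsing a string, pd.Timestamp also accepts individual components as keyword arguments, or an existing datetime.datetime object. A minimal sketch of the three routes (the variable names are illustrative only):

```python
import datetime

import pandas as pd

# Three ways to describe the same calendar day
ts_from_string = pd.Timestamp('2023-08-31')
ts_from_parts = pd.Timestamp(year=2023, month=8, day=31)
ts_from_datetime = pd.Timestamp(datetime.datetime(2023, 8, 31))

# All three compare equal
print(ts_from_string == ts_from_parts == ts_from_datetime)  # True

# A Timestamp can be passed wherever a datetime.datetime is expected
print(isinstance(ts_from_string, datetime.datetime))  # True
```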

4. Creating Timestamp from Various Formats:

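Timestamps can be created from various string formats. ISO 8601 strings (year-month-day) are the safest choice; other layouts rely on the parser guessing which field is the day and which is the month:

ts1 = pd.Timestamp('2023-08-31 12:45:30')  # ISO 8601 date and time
ts2 = pd.Timestamp('2023/08/31')           # year first, slash-separated
ts3 = pd.Timestamp('31/08/2023')           # day first; unambiguous only because 31 cannot be a month
ts4 = pd.Timestamp('2023, 31 August')      # free-form text, parsed heuristically
print(ts1, ts2, ts3, ts4, sep="\n")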
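
When a string could be read more than one way, it is usually safer to state the layout explicitly rather than rely on guessing. The sketch below uses pd.to_datetime with its format parameter; the sample string '03/04/2023' is made up for illustration:

```python
import pandas as pd

# '03/04/2023' could mean 3 April or 4 March.
# An explicit format string removes the guesswork.
as_day_first = pd.to_datetime('03/04/2023', format='%d/%m/%Y')
as_month_first = pd.to_datetime('03/04/2023', format='%m/%d/%Y')

print(as_day_first)    # 2023-04-03 00:00:00
print(as_month_first)  # 2023-03-04 00:00:00
```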

5. Current Date and Time:

To get the current date and time (as a timezone-naive value in the machine's local time):

current = pd.Timestamp.now()
print(current)
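
If the current time needs to be comparable across machines, passing a tz argument to pd.Timestamp.now returns a timezone-aware value instead. A small sketch, with illustrative variable names:

```python
import pandas as pd

now_local = pd.Timestamp.now()        # naive, local wall-clock time
now_utc = pd.Timestamp.now(tz='UTC')  # timezone-aware, expressed in UTC

print(now_local.tzinfo)  # None
print(now_utc.tzinfo)

# normalize() resets the time of day to midnight, keeping the date and timezone
today_midnight_utc = now_utc.normalize()
print(today_midnight_utc)
```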

6. Timezones:

By default, Timestamp objects are timezone-naive. To attach a timezone to a naive timestamp without changing its wall-clock time, use tz_localize:

ts_tz = pd.Timestamp('2023-08-31 12:45:30').tz_localize('Asia/Tokyo')
print(ts_tz)

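To convert a timezone-aware timestamp to another timezone (the same instant, shown with a different wall-clock time), use tz_convert: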

ts_tz_ny = ts_tz.tz_convert('America/New_York')
print(ts_tz_ny)
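
The difference between the two methods can be checked directly: a converted timestamp still compares equal to the original because it is the same instant, and calling tz_localize on a timestamp that already has a timezone raises an error. A short sketch reusing the Tokyo example (variable names are illustrative):

```python
import pandas as pd

tokyo = pd.Timestamp('2023-08-31 12:45:30').tz_localize('Asia/Tokyo')
new_york = tokyo.tz_convert('America/New_York')

print(new_york)           # 2023-08-30 23:45:30-04:00
print(tokyo == new_york)  # True: same instant, different wall-clock time

# tz_localize only applies to naive timestamps
try:
    tokyo.tz_localize('UTC')
except TypeError as exc:
    print('Cannot localize an aware timestamp:', exc)
```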

7. Timestamp Attributes and Methods:

Once you have a Timestamp, there are numerous attributes and methods you can access:

  • Attributes:

    print(ts.year)
    print(ts.month)
    print(ts.day)
    print(ts.hour)   # 0, because ts was created from a date with no time part
    
  • Date-related methods:

    print(ts.to_period('D'))  # Convert to a period (in this case, daily frequency)
    print(ts.weekday())       # Returns day of the week (Monday=0, Sunday=6)
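
A few more attributes and methods that tend to be useful in practice are sketched below. The timestamp is redefined with a time of day so the example runs on its own:

```python
import pandas as pd

ts = pd.Timestamp('2023-08-31 12:45:30')

print(ts.day_name())    # Thursday
print(ts.dayofyear)     # 243
print(ts.is_month_end)  # True
print(ts.normalize())   # 2023-08-31 00:00:00 (time of day reset to midnight)
print(ts.isoformat())   # 2023-08-31T12:45:30
```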
    

8. Date Offsets:

You can add or subtract time from a timestamp using date offsets:

week_later = ts + pd.DateOffset(weeks=1)
print(week_later)

two_days_prior = ts - pd.DateOffset(days=2)
print(two_days_prior)
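
DateOffset is calendar-aware, whereas pd.Timedelta is a fixed duration. The two give different results around month boundaries, as this sketch with an illustrative end-of-January date suggests:

```python
import pandas as pd

jan_31 = pd.Timestamp('2023-01-31')

# Calendar-aware: "one month later" is clipped to the last day of February
plus_one_month = jan_31 + pd.DateOffset(months=1)

# Fixed duration: exactly 30 days later, regardless of month lengths
plus_30_days = jan_31 + pd.Timedelta(days=30)

print(plus_one_month)  # 2023-02-28 00:00:00
print(plus_30_days)    # 2023-03-02 00:00:00
```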

9. Timestamp in DataFrame and Series:

Timestamps can be part of DataFrame and Series objects, which allows for more complex operations like time-based indexing and time series analysis:

dates = pd.date_range('20230101', periods=6)
df = pd.DataFrame({'date': dates, 'value': range(6)})
print(df)

# Filter rows by date with a boolean mask (the string is parsed as a date for the comparison)
print(df[df['date'] > '2023-01-03'])
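
To go beyond filtering, the date column can be inspected through the .dt accessor, or moved into the index so that label-based slicing and resampling become available. A minimal sketch on the same six-day frame (the names weekday, indexed, window and two_day_totals are illustrative):

```python
import pandas as pd

dates = pd.date_range('20230101', periods=6)
df = pd.DataFrame({'date': dates, 'value': range(6)})

# The .dt accessor exposes Timestamp attributes and methods for a whole column
df['weekday'] = df['date'].dt.day_name()
print(df)

# With the dates as the index, string labels can slice a date range
# (both endpoints are included)
indexed = df.set_index('date')
window = indexed.loc['2023-01-02':'2023-01-04']
print(window)

# Resampling groups rows into time bins; here, sums over 2-day bins
two_day_totals = indexed['value'].resample('2D').sum()
print(two_day_totals)
```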

10. Summary:

Pandas provides the Timestamp class as a powerful tool for handling date and time data, offering numerous built-in methods and attributes for common operations. It's the foundation for much of pandas' time series functionality, making it essential for anyone working with time-based data in Python.

  1. Create Timestamp in Pandas DataFrame:

    • Description: Use pd.to_datetime() to build a datetime64 column of Timestamps for a Pandas DataFrame.
    • Code:
      import pandas as pd
      
      # Create DataFrame with Timestamps
      df = pd.DataFrame({'timestamp': pd.to_datetime(['2022-01-01', '2022-01-02', '2022-01-03'])})
      
  2. Working with Timestamps in Pandas:

    • Description: Create a pd.Timestamp object and read its individual components through attributes.
    • Code:
      import pandas as pd
      
      # Create Timestamp
      timestamp = pd.Timestamp('2022-01-01 12:30:45')
      
      # Access components
      year = timestamp.year
      month = timestamp.month
      day = timestamp.day
      hour = timestamp.hour
      minute = timestamp.minute
      second = timestamp.second
      
  3. Convert string to Timestamp in Pandas:

    • Description: Use pd.to_datetime() to convert a string to a Pandas Timestamp.
    • Code:
      import pandas as pd
      
      # Convert string to Timestamp
      timestamp = pd.to_datetime('2022-01-01 12:30:45')
      
  4. Indexing and selecting by Timestamp in Pandas:

    • Description: Index and select data based on Timestamps in a Pandas DataFrame.
    • Code:
      import pandas as pd
      
      # Create DataFrame with Timestamps
      df = pd.DataFrame({'value': [10, 20, 30]}, index=pd.to_datetime(['2022-01-01', '2022-01-02', '2022-01-03']))
      
      # Select a date range by label (both endpoints are included)
      selected_data = df.loc['2022-01-02':'2022-01-03']
      
  5. Pandas to_datetime for Timestamp conversion:

    • Description: Use pd.to_datetime() for converting various formats to Pandas Timestamps.
    • Code:
      import pandas as pd
      
      # Convert different formats to Timestamp
      timestamp_1 = pd.to_datetime('2022-01-01')
      timestamp_2 = pd.to_datetime('2022-01-01 12:30:45')
      timestamp_3 = pd.to_datetime(1641000000, unit='s')  # Unix epoch seconds -> 2022-01-01 01:20:00
      
  6. Resampling time series data with Pandas Timestamp:

    • Description: Use the .resample() method to resample time series data based on Timestamps.
    • Code:
      import pandas as pd
      
      # Create DataFrame with Timestamps
      df = pd.DataFrame({'value': [10, 20, 30]}, index=pd.to_datetime(['2022-01-01', '2022-01-02', '2022-01-03']))
      
      # Resample to daily frequency (the data is already daily, so each bin holds a single row)
      resampled_df = df.resample('D').sum()
      
  7. Manipulating Timestamps in Pandas Series:

    • Description: Manipulate Timestamps in a Pandas Series using arithmetic operations.
    • Code:
      import pandas as pd
      
      # Create Series with Timestamps
      series = pd.Series(pd.to_datetime(['2022-01-01', '2022-01-02', '2022-01-03']))
      
      # Add 1 day to Timestamps
      series += pd.Timedelta(days=1)
      
  8. Handling time zones with Pandas Timestamp:

    • Description: Create a timezone-aware Timestamp by passing tz to the constructor (or by calling .tz_localize() on a naive one), then use .tz_convert() to express it in another time zone.
    • Code:
      import pandas as pd
      
      # Create Timestamp with time zone
      timestamp_with_tz = pd.Timestamp('2022-01-01 12:30:45', tz='UTC')
      
      # Convert time zone
      timestamp_converted = timestamp_with_tz.tz_convert('America/New_York')
      
  9. Timestamp arithmetic in Pandas:

    • Description: Perform arithmetic operations on Timestamps in Pandas.
    • Code:
      import pandas as pd
      
      # Create Timestamps
      timestamp_1 = pd.Timestamp('2022-01-01 12:30:45')
      timestamp_2 = pd.Timestamp('2022-01-02 15:00:00')
      
      # Calculate time difference
      time_difference = timestamp_2 - timestamp_1
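
Subtracting two Timestamps yields a pd.Timedelta, which has its own attributes and methods for reading the elapsed time. A short sketch continuing the same two timestamps:

```python
import pandas as pd

timestamp_1 = pd.Timestamp('2022-01-01 12:30:45')
timestamp_2 = pd.Timestamp('2022-01-02 15:00:00')
time_difference = timestamp_2 - timestamp_1

print(time_difference)                  # 1 days 02:29:15
print(time_difference.days)             # 1 (whole days only)
print(time_difference.total_seconds())  # 95355.0

# Dividing by another Timedelta gives the elapsed time in that unit
elapsed_hours = time_difference / pd.Timedelta(hours=1)
print(elapsed_hours)                    # 26.4875
```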