Data Wrangling - Working with Tibbles in R

Tibbles are a modern reimagining of data frames in R, provided by the tibble package, which is part of the tidyverse. They behave like data frames in most respects, but differ in printing, subsetting, and type handling in ways that make them more user-friendly, especially during data wrangling.

1. Installing and Loading Required Packages:

install.packages("tidyverse")
library(tidyverse)

2. Creating Tibbles:

a. From a data frame:

df <- data.frame(x = 1:5, y = letters[1:5])
tb <- as_tibble(df)

b. Using tibble():

tb <- tibble(x = 1:5, y = letters[1:5])

3. Tibble Advantages Over Data Frames:

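  • Printing: For large tibbles, only the first 10 rows and the columns that fit on screen are printed by default, along with each column's type, which keeps console output readable.

  • Subsetting: Subsetting a tibble with single square brackets always returns a tibble, even when you select only one column (tb[, "x"]). With a data frame, df[, "x"] is simplified to a vector.

  • Column Data Types: Tibbles do less implicit conversion than data frames. tibble() never changes the type of its inputs (strings are never converted to factors), never alters column names, and supports list-columns directly.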
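
The subsetting difference listed above can be checked directly in the console. The following is a minimal sketch; the object names (df_demo, tb_demo, df_partial, tb_partial) are made up for illustration. It also shows a related difference: the $ operator does partial name matching on a data frame but not on a tibble.

```r
library(tibble)

df_demo <- data.frame(x = 1:5, y = letters[1:5])
tb_demo <- as_tibble(df_demo)

# Selecting one column with [ , j]:
class(df_demo[, "x"])   # "integer" (simplified to a vector)
class(tb_demo[, "x"])   # "tbl_df" "tbl" "data.frame" (still a tibble)

# Partial matching with $:
df_partial <- data.frame(value = 1:3)
tb_partial <- tibble(value = 1:3)

df_partial$val          # 1 2 3 (partial match on "value")
tb_partial$val          # NULL, with a warning about an unknown column
```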

4. Accessing and Manipulating Tibble Data:

a. Accessing columns:

Using the $ operator:

tb$y

Using the double square bracket:

tb[[2]]
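
Columns can also be extracted by name rather than by position. The sketch below uses a hypothetical tibble named access_demo and additionally shows pull() from dplyr, which does the same job inside a pipeline, and single brackets, which keep the tibble structure.

```r
library(tibble)
library(dplyr)

access_demo <- tibble(x = 1:5, y = letters[1:5])

by_name <- access_demo[["y"]]        # character vector, selected by name
piped   <- access_demo %>% pull(y)   # same vector, pipeline-friendly
kept    <- access_demo["y"]          # one-column tibble, not a vector
```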

b. Adding columns:

tb <- tb %>%
  mutate(z = x * 2)

c. Renaming columns:

tb <- tb %>%
  rename(new_x = x)

d. Removing columns:

tb <- tb %>%
  select(-new_x)
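
Because each of these verbs takes a tibble and returns a tibble, the three steps can be chained into one pipeline. This is a minimal sketch on made-up data; the names scores, raw, scaled, and student_id are hypothetical.

```r
library(tibble)
library(dplyr)

scores <- tibble(id = 1:3, raw = c(10, 20, 30))  # hypothetical data

result <- scores %>%
  mutate(scaled = raw / max(raw)) %>%  # add a column derived from another
  rename(student_id = id) %>%          # rename(new_name = old_name)
  select(-raw)                         # drop a column by name

names(result)   # "student_id" "scaled"
```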

5. Accessing Metadata:

You can inspect a tibble's structure with glimpse() and with the same base R functions that work on data frames:

a. Column data types:

glimpse(tb)

b. Number of rows:

nrow(tb)

c. Number of columns:

ncol(tb)
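
A few other base R functions return the same information in a form that is easier to use programmatically. The sketch below assumes a small hypothetical tibble named meta_demo.

```r
library(tibble)

meta_demo <- tibble(x = 1:5, y = letters[1:5])

dims  <- dim(meta_demo)            # rows and columns together: 5 2
cols  <- names(meta_demo)          # "x" "y"
types <- sapply(meta_demo, class)  # named vector: x "integer", y "character"
```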

6. Working with Row Data:

a. Adding rows:

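You can use add_row() to add new rows. The values you supply must match existing columns; at this point tb contains only y and z, because x was renamed and then removed above:

tb <- add_row(tb, y = "f", z = 12)

b. Removing rows:

tb <- tb %>%
  filter(y != "f")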
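
add_row() also accepts .before and .after to control where the new row lands, and any column you leave out is filled with NA. The following sketch uses a hypothetical tibble named rows_demo and shows slice() from dplyr as a way to remove rows by position instead of by condition.

```r
library(tibble)
library(dplyr)

rows_demo <- tibble(y = c("a", "b", "c"), z = c(2, 4, 6))

# Insert a row at the top instead of appending it
rows_demo <- add_row(rows_demo, y = "start", z = 0, .before = 1)

# z is not supplied here, so it becomes NA in the new row
rows_demo <- add_row(rows_demo, y = "end")

# Drop the first row by position
trimmed <- rows_demo %>% slice(-1)
```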

7. Tibble-specific functions:

a. enframe(): Converts a named vector into a two-column tibble.

vec <- c(a = 1, b = 2, c = 3)
enframe(vec)

b. deframe(): Converts a two-column tibble into a named vector.

deframe(enframe(vec))
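
By default enframe() names its columns name and value; the name and value arguments let you choose something more descriptive. A short sketch with a made-up named vector (prices) follows.

```r
library(tibble)

prices <- c(apple = 1.5, pear = 2.0)  # hypothetical named vector

# Choose the column names for the resulting two-column tibble
price_tbl <- enframe(prices, name = "fruit", value = "price")
names(price_tbl)   # "fruit" "price"

# deframe() uses the first column as names and the second as values
restored <- deframe(price_tbl)
```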

8. Converting Tibbles Back to Data Frames:

In case you need to convert a tibble back to a regular data frame:

df <- as.data.frame(tb)
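
To confirm which kind of object you are holding after a conversion, is_tibble() from the tibble package is a convenient check. The names conv_tb and conv_df below are placeholders.

```r
library(tibble)

conv_tb <- tibble(x = 1:3)
conv_df <- as.data.frame(conv_tb)

is_tibble(conv_tb)   # TRUE
is_tibble(conv_df)   # FALSE
class(conv_df)       # "data.frame"
```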

Conclusion:

Tibbles are an essential tool for data wrangling in the modern R ecosystem, offering various advantages over traditional data frames, especially in terms of usability. As you explore the tidyverse, you'll find that many functions return tibbles by default, making it a valuable structure to understand and use effectively.

  1. Tibble vs data.frame in R:

    • A tibble is a modern, more user-friendly alternative to the traditional data.frame.
    # Creating a data.frame
    df <- data.frame(ID = 1:3, Name = c("Alice", "Bob", "Charlie"))
    
    # Creating a tibble
    library(tibble)
    tbl <- tibble(ID = 1:3, Name = c("Alice", "Bob", "Charlie"))
    
  2. Data wrangling with tibbles examples:

    • dplyr verbs such as filter() and mutate() work on both tibbles and data frames; when the input is a tibble, the result is a tibble too.
    # Using dplyr and tibble for data wrangling
    library(dplyr)
    library(tibble)
    
    df <- data.frame(ID = 1:3, Name = c("Alice", "Bob", "Charlie"))
    tbl <- as_tibble(df)
    
    # Data wrangling with tibbles
    wrangled_tbl <- tbl %>%
      filter(ID > 1) %>%
      mutate(NewColumn = nchar(Name))
    
  3. R tibble functions and operations:

    • Tibbles have specific functions and operations that make data manipulation easier.
    # Using tibble and dplyr functions
    library(tibble)
    library(dplyr)  # select() comes from dplyr, not tibble
    
    tbl <- tibble(ID = 1:3, Name = c("Alice", "Bob", "Charlie"))
    
    # Adding a new column
    tbl <- add_column(tbl, NewColumn = c(10, 20, 30))
    
    # Selecting specific columns
    selected_tbl <- select(tbl, ID, Name)
    
  4. Introduction to tibbles in R:

    • Tibbles are a part of the tidyverse ecosystem, designed for better data manipulation.
    # Creating a tibble
    library(tibble)
    tbl <- tibble(ID = 1:3, Name = c("Alice", "Bob", "Charlie"))
    
  5. Using tibbles for data manipulation in R:

    • Tibbles integrate seamlessly with tidyverse packages for effective data manipulation.
    # Using tibbles with dplyr
    library(dplyr)
    library(tibble)
    
    tbl <- tibble(ID = 1:3, Name = c("Alice", "Bob", "Charlie"))
    
    # Data manipulation with dplyr and tibbles
    manipulated_tbl <- tbl %>%
      filter(ID > 1) %>%
      mutate(NewColumn = nchar(Name))
    
  6. Converting data.frame to tibble in R:

    • Convert a data.frame to a tibble using the as_tibble function.
    # Converting data.frame to tibble
    library(tibble)
    
    df <- data.frame(ID = 1:3, Name = c("Alice", "Bob", "Charlie"))
    tbl <- as_tibble(df)
    
  7. R tidyverse and tibble usage:

    • Tibbles are a key component of the tidyverse, providing a consistent data structure.
    # Using tibbles within the tidyverse
    library(tidyverse)
    
    tbl <- tibble(ID = 1:3, Name = c("Alice", "Bob", "Charlie"))
    
    # Data wrangling with tidyverse
    wrangled_tbl <- tbl %>%
      filter(ID > 1) %>%
      mutate(NewColumn = nchar(Name))