What Are the Tidyverse Packages in R?

The tidyverse is a collection of R packages designed for data science. The packages share an underlying design philosophy, grammar, and data structures, so they work well together. The core packages, which library(tidyverse) attaches in one step, include:

  • ggplot2 - Data visualization.
  • dplyr - Data manipulation.
  • tidyr - Data tidying.
  • readr - Data import.
  • purrr - Functional programming.
  • tibble - Tidy data structure.
  • stringr - String manipulation.
  • forcats - Factor manipulation.

Let's go through a quick tutorial for some of these packages:

  • Installing and Loading the tidyverse:
install.packages("tidyverse")
library(tidyverse)
  • dplyr Basics:
  • Select columns with select():
starwars %>% select(name, height, species)
  • Filter rows with filter():
starwars %>% filter(species == "Droid")
  • Arrange rows with arrange():
starwars %>% arrange(desc(height))
  • Create new columns with mutate():
starwars %>% mutate(height_in_meters = height / 100)
  • Summarize data with summarise():
starwars %>% group_by(species) %>% summarise(avg_mass = mean(mass, na.rm = TRUE))
  • tidyr Basics:
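  • Pivot columns into rows with pivot_longer() (the successor to the superseded gather()):
data <- tibble(x = 1, y = 2, z = 3)
data %>% pivot_longer(everything(), names_to = "key", values_to = "value")
  • Pivot rows into columns with pivot_wider() (the successor to the superseded spread()):
data <- tibble(key = c("x", "y", "z"), value = c(1, 2, 3))
data %>% pivot_wider(names_from = key, values_from = value)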
  • ggplot2 Basics:
  • A simple plot:
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
  • Add layers:
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class)) +
  geom_smooth(mapping = aes(x = displ, y = hwy))
  • readr Basics:
  • Read a CSV file with read_csv():
data <- read_csv("path_to_file.csv")
  • stringr Basics:
  • Find length of a string:
str_length("Hello, World!")
  • Replace in a string:
str_replace("I like cats.", "cats", "dogs")
  • purrr Basics:
  • Map a function to a list:
list(1, 2, 3) %>% map(~ .x * 2)
  1. Core Packages of Tidyverse in R:

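    • The tidyverse is a collection of R packages for data science built around tidy data. Calling library(tidyverse) attaches the core packages in one step, so their functions are available without separate library() calls.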
    # Example: Loading Tidyverse core packages
    library(tidyverse)
    
  2. R Packages Included in Tidyverse:

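    • The tidyverse bundles packages for importing, tidying, transforming, and visualizing data, all designed to work together.
    # Example: the core packages named in this tutorial
    core_packages <- c("dplyr", "ggplot2", "tidyr", "readr", "purrr", "tibble", "stringr", "forcats")
    # tidyverse_packages() lists every package in the installed tidyverse, including non-core ones
    tidyverse_packages()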
    
  3. dplyr Package in Tidyverse:

    • dplyr is a powerful package for data manipulation tasks, providing functions for filtering, arranging, summarizing, and more.
    # Example: Using dplyr functions
    my_data %>%
      filter(condition) %>%
      group_by(variable) %>%
      summarise(mean_value = mean(value))
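
To make the template above concrete, here is a minimal sketch that assumes the built-in mtcars data frame stands in for my_data. The columns used (hp, cyl, mpg) and the 100-horsepower cutoff are illustrative choices, not part of the original example.

```r
library(dplyr)

# mtcars ships with base R; it stands in for the my_data placeholder
mpg_by_cyl <- mtcars %>%
  filter(hp > 100) %>%                       # keep cars with more than 100 horsepower
  group_by(cyl) %>%                          # one group per cylinder count
  summarise(mean_mpg = mean(mpg), n = n())   # average mileage and row count per group

mpg_by_cyl
```

The result has one row per remaining cylinder count, and the n column shows how many cars contributed to each mean.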
    
  4. ggplot2 Package in Tidyverse:

    • ggplot2 is a versatile package for creating data visualizations using a grammar of graphics approach.
    # Example: Creating a ggplot2 plot
    ggplot(data = my_data, aes(x = x_variable, y = y_variable)) +
      geom_point() +
      labs(title = "My Plot")
    
  5. Tibble and tidyr in Tidyverse:

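    • tibble provides a stricter, more predictable data frame, and tidyr provides functions for tidying and reshaping data.
    # Example: Using tibble and tidyr (my_data and id_column are placeholders)
    my_tibble <- as_tibble(my_data)
    # pivot_longer() supersedes gather(); every column except id_column becomes a variable/value pair
    tidied_data <- pivot_longer(my_tibble, cols = -id_column, names_to = "variable", values_to = "value")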
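
As a runnable illustration of reshaping, the sketch below uses pivot_longer() and pivot_wider(), tidyr's current reshaping functions, on a small hypothetical table. The table contents and the names scores_wide, subject, and score are assumptions made up for this example.

```r
library(tibble)
library(tidyr)

# Hypothetical wide table: one row per student, one column per subject
scores_wide <- tibble(
  id   = c(1, 2),
  math = c(90, 75),
  art  = c(80, 85)
)

# Wide to long: every column except id becomes a (subject, score) pair
scores_long <- pivot_longer(scores_wide, cols = -id,
                            names_to = "subject", values_to = "score")

# Long back to wide: restores the original layout
scores_back <- pivot_wider(scores_long, names_from = subject, values_from = score)
```

Going from wide to long and back again is a quick way to check that a reshape kept every value.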
    
  6. Stringr and forcats in Tidyverse:

    • stringr handles string manipulation, and forcats handles factors.
    # Example: Using stringr and forcats (my_string, pattern, replacement, and my_data are placeholders)
    modified_string <- str_replace(my_string, pattern, replacement)
    my_data$factor_variable <- fct_relevel(my_data$factor_variable, "desired_level")
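
One detail worth seeing in practice is that str_replace() changes only the first match in each string, while str_replace_all() changes every match. The sketch below shows the difference; the pets vector is a hypothetical input.

```r
library(stringr)

# Hypothetical input strings for illustration
pets <- c("I like cats.", "cats chase cats", "no pets here")

has_cats    <- str_detect(pets, "cats")               # TRUE where the pattern occurs
first_only  <- str_replace(pets, "cats", "dogs")      # replaces the first match per string
every_match <- str_replace_all(pets, "cats", "dogs")  # replaces every match
```

For the second string, first_only gives "dogs chase cats" and every_match gives "dogs chase dogs".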
    
  7. Working with Factors in Tidyverse:

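    • forcats provides functions for reordering factor levels, recoding them, and making missing values explicit.
    # Example: Turning missing values into an explicit level
    # (fct_na_value_to_level() replaces fct_explicit_na(), which is deprecated as of forcats 1.0.0)
    my_data$factor_variable <- fct_na_value_to_level(my_data$factor_variable, level = "NA")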
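
For a concrete view of level handling, the sketch below reorders the levels of a small factor, first by hand and then by frequency. The answers vector is a hypothetical example.

```r
library(forcats)

# Hypothetical survey answers
answers <- factor(c("low", "high", "medium", "high"))

default_levels <- levels(answers)   # alphabetical by default

# Put the levels in a meaningful order
ordered_answers <- fct_relevel(answers, "low", "medium", "high")

# Order levels by how often they occur, most frequent first
by_frequency <- fct_infreq(answers)
```

Level order matters mostly for plots and model output, where categories appear in the order of the factor levels.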
    
  8. Tidyverse Data Manipulation in R:

    • dplyr verbs chain together with the pipe (%>%), so filtering rows, selecting columns, and deriving a new column read as one top-to-bottom workflow.
    # Example: Tidyverse data manipulation
    my_data %>%
      filter(condition) %>%
      select(columns) %>%
      mutate(new_variable = transform_function(existing_variable))
    
  9. Tidyverse Data Visualization in R:

    • ggplot2 builds a plot in layers: ggplot() receives the piped data and the aesthetic mapping, geoms draw it, and facet_wrap() splits the plot into one panel per value of a variable.
    # Example: Tidyverse data visualization
    my_data %>%
      ggplot(aes(x = x_variable, y = y_variable)) +
      geom_point() +
      facet_wrap(~facet_variable)
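
A runnable version of this pattern, sketched with the mpg dataset that ships with ggplot2. Faceting by drv (drive type) and the label text are illustrative choices.

```r
library(ggplot2)

# mpg ships with ggplot2: displ is engine displacement, hwy is highway mileage
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~drv) +   # one panel per drive type
  labs(title = "Highway mileage by engine size",
       x = "Displacement (L)", y = "Highway mpg")

p  # printing the object draws the plot
```

Storing the plot in a variable makes it easy to add further layers later or to save it with ggsave().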
    
  10. R purrr Package in Tidyverse:

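    • purrr provides functional programming tools such as map(), which applies a function to each element of a list or vector and returns the results as a list.
    # Example: Using purrr functions (my_list and my_function are placeholders)
    my_list %>% map(my_function)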
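
Since map() always returns a list, purrr also offers typed variants that return atomic vectors. The sketch below compares them on a small hypothetical list.

```r
library(purrr)

numbers <- list(1, 2, 3)

doubled_list <- map(numbers, ~ .x * 2)                     # returns a list
doubled_dbl  <- map_dbl(numbers, ~ .x * 2)                 # returns a numeric vector
item_labels  <- map_chr(numbers, ~ paste0("item_", .x))    # returns a character vector
```

The typed variants raise an error if any result has the wrong type or length, which catches mistakes earlier than a plain list would.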
    
  11. Tidyverse readr for Data Import in R:

    • readr reads rectangular text files such as CSV and TSV into tibbles, guessing each column's type from the data unless you specify it.
    # Example: Using readr for data import
    my_data <- read_csv("my_data.csv")
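
To try the import step without an existing file, the sketch below first writes a tiny hypothetical CSV to a temporary path and then reads it back with explicit column types. The file contents and column names are made up for illustration.

```r
library(readr)

# Write a small hypothetical file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "1,Ann,9.5", "2,Bob,7"), csv_path)

# Declare column types instead of relying on guessing
scores <- read_csv(
  csv_path,
  col_types = cols(
    id    = col_integer(),
    name  = col_character(),
    score = col_double()
  )
)
```

Declaring col_types makes the import reproducible: a column does not silently change type if a later file happens to contain different values.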
    
  12. Combining Tidyverse Packages for Data Analysis in R:

    • Because ggplot() takes a data frame as its first argument, a dplyr pipeline can feed straight into a plot. Note that plot layers are added with +, not %>%.
    # Example: Comprehensive Tidyverse data analysis
    my_data %>%
      filter(condition) %>%
      ggplot(aes(x = x_variable, y = y_variable)) +
      geom_point() +
      labs(title = "Analyzing My Data")
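
A runnable sketch of the same workflow, assuming the mpg dataset from ggplot2 in place of my_data. The 2008 filter, the grouping by class, and the plot title are illustrative choices.

```r
library(dplyr)
library(ggplot2)

# Summarise first: mean highway mileage per vehicle class for 2008 models
class_summary <- mpg %>%
  filter(year == 2008) %>%
  group_by(class) %>%
  summarise(mean_hwy = mean(hwy))

# The pipe hands the summary to ggplot(); layers are then added with +
class_plot <- class_summary %>%
  ggplot(aes(x = class, y = mean_hwy)) +
  geom_col() +
  labs(title = "Mean highway mileage by class, 2008 models")

class_plot
```

Keeping the summary in its own variable lets you inspect the numbers before plotting them.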