Charts and Graphs in R

Creating charts and graphs is an integral part of data analysis in R. This tutorial covers the basics of producing common plot types with both base R graphics and the ggplot2 package.

1. Base R Graphics

1.1. Scatter Plot

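# mtcars ships with R; data() loads it into the workspace explicitly
data(mtcars)
# wt is recorded in units of 1000 lbs
plot(mtcars$mpg, mtcars$wt, main="Scatterplot of mpg vs. wt", 
     xlab="Miles Per Gallon", ylab="Weight (1000 lbs)", pch=19, col="blue")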

1.2. Boxplot

boxplot(mpg ~ cyl, data=mtcars, main="Boxplot of MPG by Cylinder Count", 
        xlab="Number of Cylinders", ylab="Miles Per Gallon", col="lightblue")

1.3. Histogram

hist(mtcars$mpg, main="Histogram of mpg", xlab="Miles Per Gallon", 
     col="lightgreen", border="black")
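
The number and width of the bars are controlled by the breaks argument of hist(). As a minimal sketch, the call below asks for bins that are 5 mpg wide (an arbitrary width chosen only for illustration) and inspects the object that hist() returns before drawing it.

```r
# Compute the histogram without drawing it, using explicit bin edges.
# The 5-mpg bin width is an illustrative choice, not a recommendation.
h <- hist(mtcars$mpg, breaks = seq(10, 35, by = 5), plot = FALSE)

h$breaks   # the bin edges that were used
h$counts   # the number of cars that fall into each bin

# Draw the stored histogram; plot() on a "histogram" object reuses its bins.
plot(h, main = "Histogram of mpg (5-mpg bins)", xlab = "Miles Per Gallon",
     col = "lightgreen", border = "black")
```

Passing explicit edges makes histograms of different variables or subsets comparable, because the bins no longer depend on R's automatic choice.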

2. ggplot2 Graphics

First, install and load the ggplot2 package:

install.packages("ggplot2")
library(ggplot2)

2.1. Scatter Plot

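# Convert cyl to a factor so each cylinder count gets its own discrete color
# (a numeric cyl would be drawn with a continuous color gradient)
ggplot(mtcars, aes(x=mpg, y=wt)) + 
  geom_point(aes(color=factor(cyl)), size=3) + 
  labs(title="Scatterplot of mpg vs. wt", x="Miles Per Gallon", y="Weight (1000 lbs)", color="Cylinders")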

2.2. Boxplot

ggplot(mtcars, aes(x=as.factor(cyl), y=mpg)) + 
  geom_boxplot(aes(fill=as.factor(cyl))) + 
  labs(title="Boxplot of MPG by Cylinder Count", x="Number of Cylinders", y="Miles Per Gallon")

2.3. Histogram

ggplot(mtcars, aes(x=mpg)) + 
  geom_histogram(binwidth=2, fill="lightgreen", color="black") + 
  labs(title="Histogram of mpg", x="Miles Per Gallon")

2.4. Bar Chart

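# Count of cars by cylinder; as.data.frame(table(...)) yields the columns
# Var1 (cylinder count) and Freq (number of cars)
cyl_data <- as.data.frame(table(mtcars$cyl))

# geom_col() plots pre-computed bar heights (same as geom_bar(stat="identity"))
ggplot(cyl_data, aes(x=Var1, y=Freq)) + 
  geom_col(fill="steelblue") + 
  labs(title="Number of Cars by Cylinder", x="Number of Cylinders", y="Count")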
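
The example above counts the cars per cylinder before plotting. As an alternative sketch, geom_bar() can do the counting itself, because its default statistic tallies the rows for each x value. The name p_bar is only an illustrative variable name.

```r
library(ggplot2)

# No pre-computed table: geom_bar() defaults to stat = "count",
# so it tallies the rows of mtcars for each cylinder value.
p_bar <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar(fill = "steelblue") +
  labs(title = "Number of Cars by Cylinder",
       x = "Number of Cylinders", y = "Count")

print(p_bar)
```

Use the pre-computed version when the counts already exist in a table, and this version when working from the raw rows.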

2.5. Line Chart

For demonstration, we'll create a simple data frame:

df <- data.frame(x=c(1,2,3,4,5), y=c(5,9,3,11,8))

ggplot(df, aes(x=x, y=y)) + 
  geom_line(aes(group=1), color="blue") + 
  geom_point(size=3) + 
  labs(title="Simple Line Chart", x="X-Axis", y="Y-Axis")

3. Customizing Plots

Both base R and ggplot2 let you customize colors, axis scales, labels, and more. ggplot2's options are particularly extensive: the theme() function controls non-data elements such as fonts, legends, and grid lines, and plots are built by layering graphical elements with the + operator.
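
To make the layering idea concrete, here is a small sketch that starts from a scatter plot of mtcars and adds a manual color palette, a built-in theme, and a few theme() adjustments. The colors, the legend position, and the names p and p_custom are illustrative choices.

```r
library(ggplot2)

# Base plot: map cylinder count to a discrete color.
p <- ggplot(mtcars, aes(x = mpg, y = wt, color = factor(cyl))) +
  geom_point(size = 3) +
  labs(title = "Customized Scatterplot", x = "Miles Per Gallon",
       y = "Weight (1000 lbs)", color = "Cylinders")

# Layer customizations on top: a manual palette, a built-in theme,
# and individual theme() tweaks. All values here are illustrative.
p_custom <- p +
  scale_color_manual(values = c("4" = "steelblue",
                                "6" = "darkorange",
                                "8" = "firebrick")) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", hjust = 0.5),
        legend.position = "bottom")

print(p_custom)
```

Because each customization is a separate layer, p stays unchanged and can be reused with a different theme or palette.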

Conclusion

Both base R and ggplot2 offer extensive capabilities for data visualization. While base R provides quick and straightforward plots, ggplot2 excels in its flexibility, layering system, and aesthetics. For advanced visualizations and customization, ggplot2 is often the preferred choice among R users.

  1. Customizing colors and styles in R plots:

    # Customizing colors and styles in a scatter plot
    data <- data.frame(x = rnorm(100), y = rnorm(100))
    plot(data$x, data$y, col = "blue", pch = 16, main = "Scatter Plot", xlab = "X-axis", ylab = "Y-axis")
    
  2. Combining multiple charts in R:

    # Combining multiple plots
    par(mfrow = c(2, 2))  # Divide the plotting area into a 2x2 grid
    plot(1:10, main = "Plot 1")
    hist(rnorm(100), main = "Histogram")
    boxplot(rnorm(100), main = "Boxplot")
    plot(runif(100), main = "Plot 2")
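
A grid set with par(mfrow = ...) stays active for later plots on the same device. One common pattern, sketched here with the built-in mtcars data, is to save the previous settings and restore them afterwards. The name old_par is only an illustrative variable name.

```r
# par() returns the previous values of the settings it changes,
# so they can be restored once the grid is no longer needed.
old_par <- par(mfrow = c(1, 2))   # 1 row, 2 columns

hist(mtcars$mpg, main = "MPG Histogram", xlab = "Miles Per Gallon")
boxplot(mtcars$mpg, main = "MPG Boxplot", ylab = "Miles Per Gallon")

par(old_par)                      # back to one plot per page
```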
    
  3. Interactive charts in R with Shiny:

    • Shiny is an R package for building interactive web applications, such as dashboards and plots that respond to user input. Example:

    # Install (only needed once) and load Shiny
    install.packages("shiny")
    library(shiny)
    
    # Define a Shiny app whose plot redraws whenever the slider moves
    shinyApp(
      ui = fluidPage(
        sliderInput("n", "Number of points:", min = 10, max = 500, value = 100),
        plotOutput("plot")
      ),
      server = function(input, output) {
        output$plot <- renderPlot({
          # input$n is reactive, so changing the slider re-runs this block
          plot(rnorm(input$n), main = "Interactive Plot")
        })
      }
    )
    
  4. Time series visualization in R:

    • Time series plots can be created using functions like plot() or specialized packages like ggplot2 or dygraphs. Example:
    # Time series plot with base graphics
    ts_data <- ts(rnorm(100), start = c(2022, 1), frequency = 12)
    # The x-axis of a ts plot shows time in years (2022, 2024, ...)
    plot(ts_data, main = "Time Series Plot", xlab = "Year", ylab = "Value")
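
Random data shows the mechanics but has no visible structure. As a further sketch, the block below uses AirPassengers, a monthly series that ships with R, and overlays a 12-month moving average. The equal weights are a simplifying assumption, and trend is only an illustrative variable name.

```r
# AirPassengers is a monthly ts object from the datasets package
# (international airline passengers, in thousands, 1949 to 1960).
plot(AirPassengers, main = "Monthly Airline Passengers",
     xlab = "Year", ylab = "Passengers (thousands)")

# 12-month moving average with equal weights, roughly centered on each month.
# stats::filter() is named explicitly because other packages also
# export a function called filter().
trend <- stats::filter(AirPassengers, rep(1 / 12, 12), sides = 2)

# The ends of the smoothed series are NA, so the line starts and stops early.
lines(trend, col = "red", lwd = 2)

legend("topleft", legend = c("Observed", "12-month average"),
       col = c("black", "red"), lwd = c(1, 2))
```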
    
  5. Using ggplot2 for comprehensive data visualization in R:

    # Scatter plot with ggplot2, reusing the `data` data frame created in example 1
    library(ggplot2)
    ggplot(data, aes(x = x, y = y)) +
      geom_point(color = "red", size = 3) +
      ggtitle("Scatter Plot") +
      xlab("X-axis") +
      ylab("Y-axis")