Boxplots in R Language

Boxplots (or box-and-whisker plots) are a graphical representation of the distribution of data. They show the median, quartiles, and potential outliers for a dataset.
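To make those quantities concrete, here is a minimal sketch that computes them by hand with base R's fivenum(). The vector x is made up for illustration, and the 1.5 multiplier is the default value of the range argument of boxplot().

```r
# Hypothetical sample data, chosen so the numbers are easy to follow
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max
five <- fivenum(x)

# Height of the box (distance between the two hinges)
box_height <- five[4] - five[2]

# Points more than 1.5 box heights beyond a hinge are drawn as outliers
lower_fence <- five[2] - 1.5 * box_height
upper_fence <- five[4] + 1.5 * box_height
outliers <- x[x < lower_fence | x > upper_fence]

print(five)      # 1.0 3.0 5.5 8.0 50.0
print(outliers)  # 50
```

Here the box spans 3 to 8, so the upper fence sits at 15.5 and only the value 50 falls outside it.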

Let's walk through the basics of creating and customizing boxplots in R:

1. Basic Boxplot

Given a dataset:

set.seed(123)  # make the random sample reproducible
data <- rnorm(100)
boxplot(data, main="Basic Boxplot", ylab="Values")

This will give you a simple boxplot of the data.

2. Multiple Boxplots

To compare several groups, pass them as a named list; each element gets its own box, labelled with its name:

group1 <- rnorm(100, mean=5)
group2 <- rnorm(100, mean=7)
group3 <- rnorm(100, mean=6)

data_list <- list(Group1=group1, Group2=group2, Group3=group3)
boxplot(data_list, main="Multiple Boxplots", ylab="Values")
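boxplot() also returns, invisibly, the statistics it computed for each group. A minimal sketch, assuming you only want the numbers: passing plot = FALSE skips the drawing step. The seed and the variable name bp are choices made for this illustration.

```r
set.seed(42)  # assumed seed, only so the numbers are reproducible
data_list <- list(
  Group1 = rnorm(100, mean = 5),
  Group2 = rnorm(100, mean = 7),
  Group3 = rnorm(100, mean = 6)
)

# plot = FALSE computes the summaries without drawing anything
bp <- boxplot(data_list, plot = FALSE)

print(bp$names)  # group labels, taken from the list names
print(bp$n)      # number of observations per group
print(bp$stats)  # one column per group: whisker ends, hinges, and median
```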

3. Coloring the Boxplot

You can customize the colors:

boxplot(data_list, main="Colored Boxplots", ylab="Values", col=c("red", "blue", "green"))

4. Horizontal Boxplot

Change the orientation using the horizontal argument:

boxplot(data, horizontal=TRUE, main="Horizontal Boxplot")

5. Notch

A notch can be added to the boxplot to give a rough indication of the significance of the differences between medians:

boxplot(data_list, notch=TRUE, main="Notched Boxplots", ylab="Values")

If two boxplots have notches that do not overlap, this is 'strong evidence' that their medians differ.
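For reference, the notch limits that base R draws are median ± 1.58 × IQR / sqrt(n), where the IQR is measured between the hinges. A small sketch, with an assumed seed and sample, that reproduces the limits by hand and compares them with the conf element returned by boxplot.stats():

```r
set.seed(1)  # assumed seed for reproducibility
x <- rnorm(100, mean = 5)

five <- fivenum(x)
n <- length(x)

# Notch limits: median +/- 1.58 * (upper hinge - lower hinge) / sqrt(n)
notch_manual <- five[3] + c(-1.58, 1.58) * (five[4] - five[2]) / sqrt(n)

# The same limits as computed by base R
notch_r <- boxplot.stats(x)$conf

print(notch_manual)
print(notch_r)
```

Because the width shrinks with sqrt(n), larger groups get narrower notches.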

6. Plotting Without Outliers

Points that lie beyond the whiskers (by default, more than 1.5 times the interquartile range away from the box) are plotted individually as outliers. To hide them:

boxplot(data, outline=FALSE, main="Boxplot without Outliers")
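Note that outline=FALSE only hides the points; the whiskers and the returned statistics stay the same. The short sketch below, using made-up data, illustrates this and contrasts it with the separate range argument, which changes the outlier rule itself (range = 0 extends the whiskers to the data extremes).

```r
x <- c(2, 4, 4, 5, 6, 7, 8, 9, 10, 40)  # made-up data with one extreme value

# plot = FALSE so that only the computed summaries are compared
hidden  <- boxplot(x, outline = FALSE, plot = FALSE)
default <- boxplot(x, plot = FALSE)
full    <- boxplot(x, range = 0, plot = FALSE)

print(hidden$out)           # 40 is still recorded as an outlier
print(default$stats[5, 1])  # upper whisker stops at 10, the largest non-outlier
print(full$stats[5, 1])     # with range = 0 the whisker reaches the maximum, 40
```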

7. Getting Boxplot Statistics

To extract the statistics that a boxplot is based on, use boxplot.stats():

stats <- boxplot.stats(data)
print(stats$stats)  # Lower whisker end, lower hinge, median, upper hinge, upper whisker end
print(stats$out)    # Print the outliers (points beyond the whiskers)
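The first and last values of stats$stats are the whisker ends, which coincide with the minimum and maximum only when there are no outliers. A minimal sketch with invented values showing the difference:

```r
scores <- c(10, 12, 13, 15, 16, 18, 19, 21, 22, 60)  # invented values

s <- boxplot.stats(scores)

print(range(scores))  # true minimum and maximum: 10 60
print(s$stats)        # whisker ends, hinges, and median: 10 13 17 21 22
print(s$out)          # the point beyond the upper whisker: 60
```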

8. Adding Points to a Boxplot

Sometimes it's helpful to overlay the actual data points:

boxplot(data_list, main="Boxplot with Overlaid Points", ylab="Values")
stripchart(data_list, vertical=TRUE, method="jitter", add=TRUE, pch=21, bg="blue")

9. Customizing the Appearance

You can customize further by passing graphical parameters:

boxplot(data, col="lightblue", border="black", whisklty=2, staplelty=1, 
        main="Customized Boxplot", ylab="Values")

Where:

  • whisklty: line type for the whiskers (2 draws them dashed).
  • staplelty: line type for the staples, the short caps at the ends of the whiskers (1 draws them solid).
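The same naming pattern covers the other parts of the plot. Below is a sketch using a few more of the parameters documented for bxp() (boxlwd, medcol, medlwd, outpch, outcol). The data, seed, and colour choices are arbitrary, and pdf(NULL) is used only so that the snippet runs without opening a window.

```r
set.seed(7)                # arbitrary seed
x <- c(rnorm(100), 4, -4)  # two extreme values appended to the sample

pdf(NULL)  # null device: no window or file; omit when working interactively
bp <- boxplot(x,
              col = "lightblue",
              boxlwd = 2,          # line width of the box outline
              medcol = "darkred",  # colour of the median line
              medlwd = 3,          # line width of the median line
              whisklty = 2,        # dashed whiskers
              outpch = 19,         # solid circles for outlier points
              outcol = "orange",   # colour of the outlier points
              main = "More Graphical Parameters", ylab = "Values")
dev.off()
```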

10. Combining with Other Plots

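You can combine a boxplot with other types of plots. For instance, you can overlay a density curve on a horizontal boxplot, so that both share the same value axis:

d <- density(data)
par(mar=c(5, 4, 4, 4) + 0.1)  # widen the right margin for the density axis
# Horizontal boxplot; in boxplot(), ylim always refers to the value axis
boxplot(data, horizontal=TRUE, ylim=range(d$x), main="Boxplot with Density", xlab="Values")
par(new=TRUE)
# Use the same limits as the boxplot so the curve lines up with the box
plot(d, xlim=range(d$x), col="red", lty=2, lwd=2, axes=FALSE, ann=FALSE)
axis(4)
mtext("Density", side=4, line=3)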

Boxplots are versatile and provide a compact view of the distribution of data, making them a crucial tool for exploratory data analysis.

  1. R Boxplot Example:

    # Create a simple boxplot
    set.seed(123)
    data <- rnorm(100)
    boxplot(data)
    
  2. How to Create Boxplots in R:

    # Create boxplots for multiple groups
    set.seed(123)
    group1 <- rnorm(50, mean = 10, sd = 2)
    group2 <- rnorm(50, mean = 15, sd = 3)
    boxplot(group1, group2, names = c("Group 1", "Group 2"))
    
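  3. Customizing Boxplots with ggplot2 in R:

    # Create and customize a boxplot using ggplot2 (requires the ggplot2 package)
    library(ggplot2)
    set.seed(123)
    data <- data.frame(value = rnorm(100), group = rep(c("A", "B"), each = 50))

    # Fill each box by group, highlight outliers in red, and use a lighter theme
    ggplot(data, aes(x = group, y = value, fill = group)) +
      geom_boxplot(outlier.colour = "red") +
      labs(title = "Boxplot Example", x = "Group", y = "Value") +
      theme_minimal()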
    
  4. Adding Colors to Boxplots in R:

    # Add colors to boxplots
    set.seed(123)
    data <- data.frame(value = rnorm(100), group = rep(c("A", "B"), each = 50))
    
    boxplot(value ~ group, data = data, col = c("lightblue", "lightgreen"))
    
  5. Side-by-Side Boxplots in R:

    # Side-by-side boxplots
    set.seed(123)
    group1 <- rnorm(50, mean = 10, sd = 2)
    group2 <- rnorm(50, mean = 15, sd = 3)
    
    boxplot(group1, group2, names = c("Group 1", "Group 2"), col = c("lightblue", "lightgreen"))
    
  6. Notched Boxplots in R:

    # Notched boxplot
    set.seed(123)
    data <- rnorm(100)
    boxplot(data, notch = TRUE)
    
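  7. Outlier Detection in Boxplots Using R:

    # Outlier detection in boxplots
    set.seed(123)
    data <- rnorm(100)
    bp <- boxplot(data, outline = TRUE)  # outline = TRUE is the default: outliers are drawn as points
    print(bp$out)                        # values flagged as outliers (empty if there are none)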
    
  8. Grouped Boxplots in R:

    # Grouped boxplots
    set.seed(123)
    group <- rep(c("A", "B"), each = 50)
    value <- rnorm(100)
    boxplot(value ~ group)
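
    The formula interface also accepts more than one grouping variable, producing one box per combination. Below is a sketch using the ToothGrowth dataset that ships with R (it is not used elsewhere in this tutorial). plot = FALSE is set only to inspect the groups; drop it to draw the plot.

```r
# ToothGrowth: tooth length (len) for 2 supplement types (supp) x 3 doses (dose)
bp <- boxplot(len ~ supp + dose, data = ToothGrowth, plot = FALSE)

print(bp$names)  # one label per supp/dose combination
print(bp$n)      # number of observations in each box
```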
    