A plot matrix, sometimes called a scatterplot matrix, is a grid of scatter plots used to visualize pairwise relationships between multiple variables. In R, the pairs() function can be used to create this matrix. The ggpairs() function from the GGally package (an extension of ggplot2) is another option that offers enhanced functionality and aesthetics.
Let's go through both methods:
The pairs() function (base R):
Let's use the iris dataset, which is built into R, to illustrate:
# Load dataset
data(iris)
# Create a scatterplot matrix
pairs(iris[, 1:4], main = "Scatterplot Matrix", pch = 19, col = iris$Species)
In the code above:
iris[,1:4] selects the first 4 columns (all numeric) of the iris dataset.
pch=19 specifies a solid circle plotting symbol.
col=iris$Species colors the points by species.
The ggpairs() function from the GGally package:
First, you'll need to install and load the GGally package if you haven't done so:
install.packages("GGally")
library(GGally)
Now, let's create a scatterplot matrix:
# Use the ggpairs() function
p <- ggpairs(iris, mapping = aes(color = Species), columns = 1:4)
# Print the plot
print(p)
In this example:
columns=1:4 specifies that we want to use the first 4 columns of the iris dataset.
aes(color=Species) colors the points by species.
You can further customize the scatterplot matrix. For example, you can change the type of plots in the diagonal, upper, and lower sections of the matrix:
p <- ggpairs(
  iris,
  mapping = aes(color = Species),
  columns = 1:4,
  diag = list(continuous = "densityDiag"),  # Diagonal plots to show density
  upper = list(continuous = "points"),      # Upper matrix to show scatterplots
  lower = list(continuous = "cor")          # Lower matrix to show correlation coefficients
)
print(p)
With GGally, you have fine-grained control over the aesthetics and the type of plot in each part of the matrix, which makes it versatile for a range of data exploration tasks.
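The correlation values drawn by the "cor" panels can be cross-checked numerically in base R. The sketch below assumes the default Pearson coefficient; overall_cor and by_species_cor are hypothetical names introduced only for this illustration.

```r
# Overall Pearson correlations for the four numeric iris columns
overall_cor <- cor(iris[, 1:4])
print(round(overall_cor, 2))

# Correlation of Petal.Length and Petal.Width within each species
# (illustrative choice of variable pair)
by_species_cor <- sapply(
  split(iris, iris$Species),
  function(d) cor(d$Petal.Length, d$Petal.Width)
)
print(round(by_species_cor, 2))
```

Comparing these numbers with the plot is a quick way to confirm that the matrix shows the columns and grouping you intended.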
Create plot matrix in R with scatterplots:
# Create a numeric data frame
data_frame <- data.frame(
  x = c(1, 2, 3),
  y = c(4, 5, 6),
  z = c(7, 8, 9)
)
# Create a scatterplot matrix using the pairs() function
pairs(data_frame)
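Scatterplot matrix ggplot2 in R:
ggplot2 does not provide a dedicated scatterplot matrix function. The code below facets a single x versus y scatter plot by the values of z; for a full pairwise matrix built on ggplot2, use ggpairs() from GGally as shown above.
# Create a numeric data frame
data_frame <- data.frame(
  x = c(1, 2, 3),
  y = c(4, 5, 6),
  z = c(7, 8, 9)
)
# Scatter plot of x versus y, with one panel per value of z, using ggplot2
library(ggplot2)
ggplot(data_frame, aes(x = x, y = y)) +
  geom_point() +
  facet_grid(. ~ z)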
Pairwise scatterplots in R:
# Create a numeric data frame
data_frame <- data.frame(
  x = c(1, 2, 3),
  y = c(4, 5, 6),
  z = c(7, 8, 9)
)
# Create pairwise scatterplots using the pairs() function
pairs(data_frame)
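pairs() also accepts a one-sided formula, which is convenient when only some columns should be plotted. A minimal sketch, assuming the built-in iris dataset rather than the small data frame above; k and n_pairs are hypothetical names used only here.

```r
# Plot only three of the iris measurements by naming them in a formula
pairs(~ Sepal.Length + Petal.Length + Petal.Width, data = iris,
      main = "Selected iris variables")

# k variables give choose(k, 2) distinct pairs; each pair appears twice
# in the grid (once above and once below the diagonal)
k <- 3
n_pairs <- choose(k, 2)
print(n_pairs)
```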
Plot matrix with correlation coefficients in R:
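# Create a numeric data frame
data_frame <- data.frame(
  x = c(1, 2, 3),
  y = c(4, 5, 6),
  z = c(7, 8, 9)
)
# Panel function that prints the correlation coefficient of each pair
panel_cor <- function(x, y, ...) {
  op <- par(usr = c(0, 1, 0, 1))
  on.exit(par(op))
  text(0.5, 0.5, format(cor(x, y), digits = 2))
}
# pairs() has no cor.panel argument, so pass the function as upper.panel
pairs(data_frame, upper.panel = panel_cor)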
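The panel arguments of pairs() (lower.panel, upper.panel, diag.panel) accept any function that draws into the current panel. As a sketch of the same idea applied to the diagonal, the hypothetical helper panel_hist below draws a histogram of each variable, and the built-in panel.smooth adds a smoothed trend line to the lower panels. The iris dataset is used here because three data points are too few for a meaningful histogram.

```r
# Hypothetical diagonal panel: histogram of one variable, scaled to fit
panel_hist <- function(x, ...) {
  usr <- par("usr")
  on.exit(par(usr = usr))              # restore the panel coordinates
  par(usr = c(usr[1:2], 0, 1.5))       # keep the x range, rescale y
  h <- hist(x, plot = FALSE)
  n_breaks <- length(h$breaks)
  heights <- h$counts / max(h$counts)  # tallest bar has height 1
  rect(h$breaks[-n_breaks], 0, h$breaks[-1], heights, col = "grey80")
}

# Histograms on the diagonal, smoothed scatter plots below it
pairs(iris[, 1:4], diag.panel = panel_hist, lower.panel = panel.smooth)
```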
R scatterplot matrix color by group:
# Create a numeric data frame with a grouping variable
data_frame <- data.frame(
  x = c(1, 2, 3),
  y = c(4, 5, 6),
  z = c(7, 8, 9),
  group = c("A", "B", "A")
)
# Scatter plot of x versus y, colored by group and faceted by z, using ggplot2
library(ggplot2)
ggplot(data_frame, aes(x = x, y = y, color = group)) +
  geom_point() +
  facet_grid(. ~ z)
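A full pairwise matrix colored by group can also be drawn with base R. This is a minimal sketch under the assumption that one color per group label is enough; palette_by_group and point_cols are hypothetical names, and the colors are chosen only for illustration.

```r
data_frame <- data.frame(
  x = c(1, 2, 3),
  y = c(4, 5, 6),
  z = c(7, 8, 9),
  group = c("A", "B", "A")
)

# Map each group label to a color (illustrative palette)
palette_by_group <- c(A = "steelblue", B = "tomato")
point_cols <- unname(palette_by_group[as.character(data_frame$group)])

# Plot only the numeric columns and color each point by its group
pairs(data_frame[, c("x", "y", "z")], col = point_cols, pch = 19)
```

Selecting the numeric columns explicitly keeps the group column out of the grid, where it would otherwise be coerced to numbers and plotted as a fourth variable.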
Interactive plot matrix in R:
For interactive plot matrices, you can use packages like plotly to create interactive visualizations.
Example using plotly:
# Install and load the plotly package
# install.packages("plotly")
library(plotly)
# Create a numeric data frame
data_frame <- data.frame(
  x = c(1, 2, 3),
  y = c(4, 5, 6),
  z = c(7, 8, 9)
)
# Create an interactive scatterplot matrix using plotly
# (dimensions belongs to the "splom" trace type, not "scatter")
plot_ly(data = data_frame, type = "splom",
        dimensions = list(
          list(label = "x", values = ~x),
          list(label = "y