R is a powerful tool for statistical analysis and visualization. This tutorial covers the basics of statistics in R, including descriptive statistics, inferential tests (t-test, ANOVA, chi-square), correlation, and linear regression.
Let's dive in.
Descriptive statistics provide a summary of the main aspects of the data.
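data <- c(1, 2, 3, 4, 5)

# Mean
mean(data)

# Median
median(data)

# Mode (base R has no built-in function for the statistical mode)
getmode <- function(v) {
  uniqv <- unique(v)
  uniqv[which.max(tabulate(match(v, uniqv)))]
}
getmode(data)  # every value occurs once here, so the first value is returned

# Variance
var(data)

# Standard deviation
sd(data)

# Range (minimum and maximum)
range(data)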
To get a quick summary of data:
summary(data)
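To see what var() and sd() compute, the following sketch recomputes the sample variance by hand, dividing by n - 1 rather than n. It assumes the same five-value vector used above; the names n, manual_var, and manual_sd are illustrative only.

```r
# Recompute the sample variance and standard deviation by hand
data <- c(1, 2, 3, 4, 5)
n <- length(data)

# Sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((data - mean(data))^2) / (n - 1)
manual_sd <- sqrt(manual_var)

print(manual_var)                          # 2.5
print(all.equal(manual_var, var(data)))    # TRUE
print(all.equal(manual_sd, sd(data)))      # TRUE
```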
Inferential statistics are used to make inferences or draw conclusions about a population based on a sample.
For comparing the means of two groups:
group1 <- c(1, 2, 3, 4, 5)
group2 <- c(5, 6, 7, 8, 9)
t.test(group1, group2)
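t.test() returns an object whose components can be read by name, which is handy when only the p-value or the confidence interval is needed. A minimal sketch, assuming the two groups above; the variable name result is hypothetical.

```r
group1 <- c(1, 2, 3, 4, 5)
group2 <- c(5, 6, 7, 8, 9)

# By default t.test() runs Welch's two-sample test (unequal variances)
result <- t.test(group1, group2)

result$statistic   # t statistic
result$p.value     # two-sided p-value
result$conf.int    # 95% confidence interval for the difference in means
result$estimate    # the two sample means
```

If equal variances can reasonably be assumed, passing var.equal = TRUE switches to the classic pooled-variance test.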
For comparing the means of more than two groups:
group3 <- c(10, 11, 12, 13, 14)
anova_results <- aov(
  data ~ group,
  data = data.frame(
    data = c(group1, group2, group3),
    group = factor(rep(1:3, each = 5))
  )
)
summary(anova_results)
For testing relationships between categorical variables:
table1 <- matrix(c(10, 20, 30, 40), ncol = 2)
chisq.test(table1)
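The chi-square test compares the observed counts with the counts that would be expected if the two variables were independent. The sketch below reuses the table above and recomputes those expected counts by hand; result and manual_expected are illustrative names. For a 2 x 2 table, chisq.test() applies Yates' continuity correction by default.

```r
table1 <- matrix(c(10, 20, 30, 40), ncol = 2)
result <- chisq.test(table1)

# Counts expected under independence, as computed by chisq.test()
result$expected

# The same counts by hand: (row total * column total) / grand total
manual_expected <- outer(rowSums(table1), colSums(table1)) / sum(table1)
manual_expected

result$p.value
```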
To find the correlation between two variables:
x <- c(1, 2, 3, 4, 5)
y <- c(5, 4, 3, 2, 1)
cor(x, y)
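cor() returns only the coefficient. To check whether a correlation differs significantly from zero, cor.test() can be used. The vectors above are perfectly negatively correlated, so this sketch assumes a less degenerate pair of sample vectors; result is a hypothetical name.

```r
# Assumed sample data: strongly, but not perfectly, correlated
x <- c(2, 3, 5, 7, 8)
y <- c(10, 12, 15, 20, 22)

# Pearson correlation test (the default method)
result <- cor.test(x, y)

result$estimate   # correlation coefficient, same value as cor(x, y)
result$p.value    # p-value for the null hypothesis of zero correlation
result$conf.int   # 95% confidence interval for the coefficient
```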
To establish a relationship between two variables:
model <- lm(y ~ x)
# x and y above are perfectly linear, so summary() warns of an essentially perfect fit
summary(model)
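Once a model is fitted, its coefficients can be read with coef() and new values predicted with predict(). A minimal sketch, assuming sample vectors that are not perfectly linear; new_points and predicted are illustrative names.

```r
# Assumed sample data
x <- c(2, 3, 5, 7, 8)
y <- c(10, 12, 15, 20, 22)

model <- lm(y ~ x)

# Intercept and slope of the fitted line
coef(model)   # intercept 5.8, slope 2

# Predict y for new x values; the column name must match the predictor
new_points <- data.frame(x = c(4, 6))
predicted <- predict(model, newdata = new_points)
predicted     # 13.8 and 17.8
```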
This tutorial introduces some common statistical analyses in R. Many more packages and functions are available for advanced statistics and statistical modeling, so explore further based on your specific needs.
Descriptive statistics in R programming:
# Example of descriptive statistics
data <- c(10, 15, 12, 8, 20)
mean_value <- mean(data)
median_value <- median(data)
sd_value <- sd(data)
summary_stats <- summary(data)
Inferential statistics using R:
# Example of inferential statistics (t-test)
group1 <- c(25, 30, 35, 40, 45)
group2 <- c(20, 22, 25, 28, 30)
t_test_result <- t.test(group1, group2)
Statistical tests in R (t-test, ANOVA, chi-square, etc.):
# Example of ANOVA
factor_levels <- rep(c("A", "B", "C"), each = 10)
response_variable <- rnorm(30)
anova_result <- aov(response_variable ~ factor(factor_levels))
Correlation and regression analysis in R:
# Example of correlation and regression
x <- c(2, 3, 5, 7, 8)
y <- c(10, 12, 15, 20, 22)
correlation_coefficient <- cor(x, y)
regression_model <- lm(y ~ x)
Probability distributions in R:
# Example of probability distribution (normal distribution)
data <- rnorm(1000, mean = 0, sd = 1)
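rnorm() is one of four related functions R provides for the normal distribution: dnorm() (density), pnorm() (cumulative probability), qnorm() (quantile), and rnorm() (random draws). The sketch below shows each for the standard normal; the seed value and the name sample_data are arbitrary choices for illustration.

```r
# Density of the standard normal at 0
dnorm(0, mean = 0, sd = 1)

# Cumulative probability P(X <= 1.96), roughly 0.975
pnorm(1.96, mean = 0, sd = 1)

# Quantile: the value below which 97.5% of the distribution lies, roughly 1.96
qnorm(0.975, mean = 0, sd = 1)

# Random draws; set.seed() makes the sample reproducible
set.seed(42)
sample_data <- rnorm(1000, mean = 0, sd = 1)
mean(sample_data)
sd(sample_data)
```

Other distributions follow the same naming pattern, for example dbinom(), pbinom(), qbinom(), and rbinom() for the binomial.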
Data visualization for statistical analysis in R:
# Example of data visualization (boxplot)
boxplot(group1, group2, names = c("Group 1", "Group 2"), col = c("blue", "green"))
Multivariate statistics in R:
# Example of multivariate analysis (principal component analysis)
data_matrix <- matrix(rnorm(100), ncol = 5)
pca_result <- princomp(data_matrix)
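A fitted PCA is usually inspected by looking at how much variance each component explains. A minimal sketch, assuming the same kind of random 20 x 5 matrix as above; the seed and the names variances and prop_var are illustrative.

```r
set.seed(1)  # arbitrary seed so the random matrix is reproducible
data_matrix <- matrix(rnorm(100), ncol = 5)
pca_result <- princomp(data_matrix)

# Variance of each component is the square of its standard deviation
variances <- pca_result$sdev^2

# Proportion of total variance explained by each component
prop_var <- variances / sum(variances)
prop_var
cumsum(prop_var)   # cumulative proportion

# Coordinates of the first few observations in component space
head(pca_result$scores)
```

Because the input here is random noise, no component should be expected to dominate; with real, correlated variables the first few components typically carry most of the variance.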
Time series analysis and forecasting in R:
# Example of time series analysis (ARIMA model)
time_series_data <- ts(rnorm(100), start = 1)
arima_model <- arima(time_series_data, order = c(1, 1, 1))
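Fitting the model is only the first half of forecasting; predict() produces the forecasts themselves. The sketch below is one possible approach under stated assumptions: the series is simulated with arima.sim() using arbitrary coefficients (0.5 and 0.3), the seed is arbitrary, method = "ML" is chosen here for the fit, and forecast_result is a hypothetical name.

```r
set.seed(123)  # arbitrary seed for reproducibility

# Simulate a series that actually follows an ARIMA(1, 1, 1) process
time_series_data <- arima.sim(
  model = list(order = c(1, 1, 1), ar = 0.5, ma = 0.3),
  n = 100
)

# Fit the model by maximum likelihood
arima_model <- arima(time_series_data, order = c(1, 1, 1), method = "ML")

# Forecast the next 5 time steps
forecast_result <- predict(arima_model, n.ahead = 5)
forecast_result$pred   # point forecasts
forecast_result$se     # standard errors of the forecasts
```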