In this tutorial, we'll explore the normal distribution, its properties, and how to work with it in R.
The normal distribution, also known as the Gaussian distribution, is a continuous probability distribution that is symmetric about its mean. It is fully determined by two parameters: the mean (μ) and the standard deviation (σ). It is widely used in statistics and the natural sciences, in part because the central limit theorem makes averages of many independent measurements approximately normal.
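To make the role of the two parameters concrete, the sketch below evaluates the textbook density formula f(x) = exp(-(x - μ)² / (2σ²)) / (σ √(2π)) by hand and compares it with R's built-in dnorm(). The helper name normal_pdf and the example values (mean 1, standard deviation 2) are illustrative assumptions, not part of the original tutorial.

```r
# Hand-written normal density; normal_pdf is a hypothetical helper name
normal_pdf <- function(x, mu = 0, sigma = 1) {
  exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
}

x <- seq(-3, 3, by = 0.5)

# Compare with the built-in density for an arbitrary example (mu = 1, sigma = 2)
manual  <- normal_pdf(x, mu = 1, sigma = 2)
builtin <- dnorm(x, mean = 1, sd = 2)

# all.equal() checks agreement up to a small numerical tolerance
print(all.equal(manual, builtin))
```

Changing mu shifts the curve left or right, while changing sigma stretches or squeezes it around the mean.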
You can generate random numbers from a normal distribution using the rnorm() function.
set.seed(123) # For reproducibility
random_numbers <- rnorm(n=1000, mean=0, sd=1)
hist(random_numbers, main="Histogram of Randomly Generated Numbers", xlab="Value", breaks=50)
Density: The dnorm() function gives the height of the probability density function for the normal distribution.
x <- seq(-4, 4, by=0.1)
y <- dnorm(x, mean=0, sd=1)
plot(x, y, type="l", main="Density of Standard Normal Distribution", ylab="Density", xlab="Value")
Distribution Function: The pnorm() function gives the cumulative distribution function (the probability that a normally distributed random variable is less than or equal to x).
prob <- pnorm(1, mean=0, sd=1)
print(prob) # Probability that X < 1 for standard normal distribution
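Because pnorm() returns a cumulative probability, interval and upper-tail probabilities follow by subtraction or by using the lower.tail argument. The sketch below uses the standard normal distribution to reproduce the familiar 68-95-99.7 rule; the variable names are chosen here for illustration only.

```r
# Probability of falling within k standard deviations of the mean, for k = 1, 2, 3
k <- 1:3
within_k <- pnorm(k) - pnorm(-k)
print(round(within_k, 4))   # roughly 0.68, 0.95 and 0.997

# Upper-tail probability P(Z > 1.5), requested directly with lower.tail = FALSE
p_above <- pnorm(1.5, lower.tail = FALSE)
print(p_above)

# The same value via the complement of the cumulative probability
print(all.equal(p_above, 1 - pnorm(1.5)))
```

Using lower.tail = FALSE is generally preferable to computing 1 - pnorm(x) for values far in the tail, where the subtraction can lose numerical precision.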
Quantile Function: The qnorm() function evaluates the quantile function, which is the inverse of the distribution function: given a probability p, it returns the value x for which P(X ≤ x) = p.
quantile_val <- qnorm(0.95, mean=0, sd=1)
print(quantile_val) # Returns the 95th percentile of standard normal distribution
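The inverse relationship between qnorm() and pnorm() can be checked with a round trip, and the same function gives the bounds of a central interval. This is a small sketch; the distribution with mean 50 and standard deviation 10 is an arbitrary example.

```r
# Round trip: probabilities -> quantiles -> probabilities
p <- c(0.025, 0.5, 0.975)
z <- qnorm(p)
print(z)                        # roughly -1.96, 0, 1.96
print(all.equal(pnorm(z), p))   # the round trip recovers the original probabilities

# Bounds of the central 95% interval for an example N(mean = 50, sd = 10)
bounds <- qnorm(c(0.025, 0.975), mean = 50, sd = 10)
print(bounds)
```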
The shapiro.test() function performs the Shapiro-Wilk test, which tests the null hypothesis that the data are drawn from a normal distribution.
test_result <- shapiro.test(random_numbers)
print(test_result)
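The object returned by shapiro.test() can also be inspected programmatically. The sketch below, which assumes a conventional 0.05 significance level and uses made-up sample names, compares a normal sample with a clearly skewed one. Note that shapiro.test() accepts between 3 and 5000 observations.

```r
set.seed(42)
normal_sample <- rnorm(200)   # drawn from a normal distribution
skewed_sample <- rexp(200)    # right-skewed, clearly not normal

normal_test <- shapiro.test(normal_sample)
skewed_test <- shapiro.test(skewed_sample)

# W close to 1 indicates agreement with normality
print(normal_test$statistic)
print(skewed_test$statistic)

# A p-value below the chosen level leads to rejecting the hypothesis of normality
alpha <- 0.05   # assumed significance level
print(normal_test$p.value)
print(skewed_test$p.value < alpha)
```

A large p-value does not prove that the data are normal; it only means the test found no strong evidence against normality.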
Sometimes, data may need to be transformed to approximate a normal distribution. Common transformations include the log, square root, and Box-Cox transformations.
For instance:
non_normal_data <- rexp(1000)
transformed_data <- log(non_normal_data)
hist(transformed_data, main="Histogram of Log-transformed Data", xlab="Value", breaks=50)
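Which transformation works best depends on the data, so it can help to compare several candidates. The sketch below writes the one-parameter Box-Cox transform by hand (box_cox and w_stat are hypothetical helper names, and lambda = 0.25 is just an example value) and uses the Shapiro-Wilk W statistic as a rough yardstick, where values closer to 1 indicate a more normal-looking sample.

```r
# One-parameter Box-Cox transform for positive data; lambda = 0 is the log transform
box_cox <- function(x, lambda) {
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

# Shapiro-Wilk W statistic as a simple normality score
w_stat <- function(x) unname(shapiro.test(x)$statistic)

set.seed(1)
skewed <- rexp(1000)   # strictly positive, right-skewed example data

candidates <- c(
  raw         = w_stat(skewed),
  sqrt        = w_stat(sqrt(skewed)),
  log         = w_stat(log(skewed)),
  boxcox_0.25 = w_stat(box_cox(skewed, 0.25))
)
print(round(candidates, 3))
```

For exponential data like this, the log transform tends to overcorrect and leave a left-skewed result, which is why an intermediate power can score better than either extreme.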
The functions mentioned above (rnorm(), dnorm(), pnorm(), qnorm()) all accept mean and sd parameters to work with non-standard normal distributions. The standard normal distribution has mean 0 and standard deviation 1.
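Passing mean and sd is equivalent to standardizing the value first and then using the standard normal distribution. A brief sketch, with the values 100, 15 and 120 chosen purely as an example:

```r
mu    <- 100   # example mean
sigma <- 15    # example standard deviation
x     <- 120

# Standardize: how many standard deviations x lies above the mean
z <- (x - mu) / sigma

# Both routes give the same cumulative probability
p_direct   <- pnorm(x, mean = mu, sd = sigma)
p_standard <- pnorm(z)
print(all.equal(p_direct, p_standard))

# The same relationship for quantiles: shift and scale the standard normal quantile
q_direct   <- qnorm(0.9, mean = mu, sd = sigma)
q_standard <- mu + sigma * qnorm(0.9)
print(all.equal(q_direct, q_standard))
```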
Understanding the normal distribution and knowing how to work with it is fundamental in statistics and many applications of data science. R provides a comprehensive set of functions for working with normal distributions, making it easy to generate, analyze, and visualize normally distributed data.
Generating random numbers from a normal distribution in R:
Overview: rnorm() draws random values from a normal distribution with the given mean and standard deviation. Setting a seed first makes the draws reproducible.
Code:
# Generating random numbers from a normal distribution
set.seed(123)
random_numbers <- rnorm(1000, mean = 0, sd = 1)
# Printing the first few random numbers
print("First few random numbers:")
print(head(random_numbers))
R code for plotting normal distribution curve:
Overview: Evaluating dnorm() on a fine grid of x values and drawing the result as a line produces the familiar bell curve.
Code:
# Plotting the normal distribution curve
x <- seq(-3, 3, by = 0.01)
y <- dnorm(x, mean = 0, sd = 1)
plot(x, y, type = "l", col = "blue", lwd = 2, main = "Normal Distribution Curve", xlab = "x", ylab = "Density")
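A common variation is to draw the theoretical curve on top of a histogram of sampled values, which shows how closely the sample follows the distribution. The following is a minimal sketch with arbitrarily chosen parameters (mean 5, standard deviation 2) and illustrative variable names.

```r
set.seed(2024)
samples <- rnorm(2000, mean = 5, sd = 2)   # example parameters, chosen arbitrarily

# freq = FALSE rescales the bars so their total area is 1, matching the density scale
h <- hist(samples, breaks = 40, freq = FALSE,
          main = "Sample Histogram with Theoretical Density",
          xlab = "Value")

# Overlay the theoretical density on the same axes
grid <- seq(min(samples), max(samples), length.out = 200)
lines(grid, dnorm(grid, mean = 5, sd = 2), col = "blue", lwd = 2)
```

Without freq = FALSE the histogram shows counts, and the density curve would appear as a nearly flat line at the bottom of the plot.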
Calculating probabilities for normal distribution in R:
Overview: pnorm() returns the cumulative probability P(X ≤ q). For the standard normal distribution, pnorm(1.96) is approximately 0.975.
Code:
# Calculating probabilities for normal distribution
probability <- pnorm(1.96, mean = 0, sd = 1)
# Printing the probability
print(paste("Probability:", probability))
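The value 1.96 is widely used because it leaves about 2.5% of the probability in each tail of the standard normal distribution. The sketch below computes the lower, upper, and two-sided tail probabilities for that z-value; the variable names are illustrative.

```r
z <- 1.96

lower     <- pnorm(z)                       # P(Z <= 1.96)
upper     <- pnorm(z, lower.tail = FALSE)   # P(Z > 1.96)
two_sided <- 2 * pnorm(-abs(z))             # P(|Z| > 1.96)

# Approximately 0.975, 0.025 and 0.05
print(round(c(lower = lower, upper = upper, two_sided = two_sided), 4))
```

The two-sided form is the one typically used when a z-score is converted into a p-value for a two-tailed test.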
R dnorm function usage for normal distribution:
Overview: dnorm() evaluates the probability density function (PDF) of the normal distribution at a given point. The result is a density (the height of the curve), not a probability.
Code:
# Using dnorm function for normal distribution
density <- dnorm(0, mean = 0, sd = 1)
# Printing the density
print(paste("Density at 0:", density))
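Since dnorm() returns a density, probabilities come from the area under the curve rather than from its height. The sketch below uses base R's integrate() to illustrate this and shows that a density value can exceed 1 when the standard deviation is small; the specific numbers are examples only.

```r
# Total area under the standard normal density is 1 (up to numerical error)
total_area <- integrate(dnorm, lower = -Inf, upper = Inf)$value
print(total_area)

# Area between -1.96 and 1.96 agrees with the difference of cumulative probabilities
area_95 <- integrate(dnorm, lower = -1.96, upper = 1.96)$value
print(all.equal(area_95, pnorm(1.96) - pnorm(-1.96), tolerance = 1e-6))

# With a small standard deviation the peak density is greater than 1
peak <- dnorm(0, mean = 0, sd = 0.1)
print(peak)
```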
Fitting normal distribution to data in R:
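Overview: The fitdist() function from the fitdistrplus package estimates the mean and standard deviation of a normal distribution from data, using maximum likelihood by default.
Code:
# Fitting normal distribution to data
# fitdist() comes from the fitdistrplus package; install it first if needed:
# install.packages("fitdistrplus")
library(fitdistrplus)
data <- rnorm(1000, mean = 2, sd = 1)
fit_params <- fitdist(data, "norm")
# Printing the fitted parameters
print("Fitted Parameters:")
print(fit_params)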
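For the normal distribution the maximum likelihood estimates have a closed form, so they can also be computed with base R alone. This is a sketch under that assumption rather than a replacement for a fitting package; the variable names are illustrative.

```r
set.seed(123)
sample_data <- rnorm(1000, mean = 2, sd = 1)
n <- length(sample_data)

# Maximum likelihood estimate of the mean: the sample mean
mu_hat <- mean(sample_data)

# Maximum likelihood estimate of the standard deviation: divides by n
sigma_hat_mle <- sqrt(mean((sample_data - mu_hat)^2))

# sd() divides by n - 1 instead, so it is slightly larger
sigma_hat_sample <- sd(sample_data)

print(c(mean = mu_hat, sd_mle = sigma_hat_mle, sd_sample = sigma_hat_sample))
```

With 1000 observations the two standard deviation estimates differ only in the fourth decimal place or so; the gap matters mainly for small samples.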
Statistical tests for normality in R:
Overview: The Shapiro-Wilk test, available as shapiro.test(), is a common test of normality. A small p-value suggests the data are not normally distributed.
Code:
# Performing a normality test
data <- rnorm(1000, mean = 0, sd = 1)
normality_test <- shapiro.test(data)
# Printing the test results
print("Normality Test Results:")
print(normality_test)
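Another option in base R is the Kolmogorov-Smirnov test, ks.test(), which compares a sample with a fully specified distribution. In the sketch below the reference parameters (mean 10, standard deviation 2) are assumed to be known in advance; if they were estimated from the same data, the reported p-value would no longer be reliable.

```r
set.seed(7)
normal_sample  <- rnorm(500, mean = 10, sd = 2)
uniform_sample <- runif(500, min = 4, max = 16)   # same center, different shape

# Compare each sample with a N(10, 2) reference distribution
ks_normal  <- ks.test(normal_sample,  "pnorm", mean = 10, sd = 2)
ks_uniform <- ks.test(uniform_sample, "pnorm", mean = 10, sd = 2)

# D is the largest gap between the empirical and the reference CDF
print(ks_normal$statistic)
print(ks_uniform$statistic)

# The uniform sample is rejected at the (assumed) 0.05 level
print(ks_uniform$p.value < 0.05)
```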
R mean and standard deviation for normal distribution:
Overview: mean() and sd() compute the sample mean and sample standard deviation, which estimate the parameters of the underlying normal distribution.
Code:
# Calculating mean and standard deviation for normal distribution
# (uses the random_numbers vector generated earlier)
mean_value <- mean(random_numbers)
sd_value <- sd(random_numbers)
# Printing the results
print(paste("Mean:", mean_value))
print(paste("Standard Deviation:", sd_value))
Normal Q-Q plot in R programming:
Overview: A normal Q-Q plot compares the sample quantiles with theoretical normal quantiles. Points that lie close to the reference line suggest approximately normal data.
Code:
# Creating a Normal Q-Q plot
qqnorm(random_numbers)
qqline(random_numbers, col = "red")
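The visual impression from a Q-Q plot can be summarized numerically by correlating the sample quantiles with the theoretical ones. The sketch below relies on qqnorm() returning its coordinates when plot.it = FALSE; the helper name qq_correlation is hypothetical, and the correlation is only a rough indicator, not a formal test.

```r
# Correlation between theoretical and sample quantiles of a normal Q-Q plot
qq_correlation <- function(x) {
  qq <- qqnorm(x, plot.it = FALSE)   # list with x = theoretical, y = sample quantiles
  cor(qq$x, qq$y)
}

set.seed(99)
normal_sample <- rnorm(300)
skewed_sample <- rexp(300)

# Values very close to 1 correspond to points hugging a straight line
print(qq_correlation(normal_sample))
print(qq_correlation(skewed_sample))
```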