Skewness and Kurtosis in R

Skewness and kurtosis are two important statistics that describe the shape of a distribution.

  • Skewness measures the asymmetry of a distribution. A positive skewness indicates a distribution whose longer tail is on the right (right-skewed), whereas a negative skewness indicates a longer tail on the left (left-skewed).

  • Kurtosis measures the "tailedness" of a distribution. Higher values indicate a distribution with heavier tails, while lower values indicate a distribution with lighter tails.
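
To make the two definitions concrete, here is a minimal base R sketch of the moment-based formulas: the third and fourth central moments scaled by powers of the second. The helper names central_moment(), sample_skewness() and sample_kurtosis() are hypothetical, and packages may apply different small-sample corrections, so their results can differ slightly.

```r
# Moment-based skewness and Pearson's kurtosis, written out in base R.
# All three helper names are hypothetical, chosen for illustration only.
central_moment <- function(x, k) mean((x - mean(x))^k)

sample_skewness <- function(x) {
  central_moment(x, 3) / central_moment(x, 2)^1.5
}

sample_kurtosis <- function(x) {
  # Pearson's kurtosis: about 3 for normal data; subtract 3 for excess kurtosis
  central_moment(x, 4) / central_moment(x, 2)^2
}

symmetric <- c(1, 2, 3, 4, 5)
right_tailed <- c(1, 1, 2, 2, 3, 10)

sample_skewness(symmetric)     # 0: perfectly symmetric
sample_skewness(right_tailed)  # positive: long right tail
sample_kurtosis(symmetric)     # 1.7: lighter tails than a normal distribution
```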

Here's a tutorial on how to calculate skewness and kurtosis in R:

1. Install and Load Required Packages

The moments package in R provides functions to compute skewness and kurtosis:

install.packages("moments")
library(moments)

2. Sample Data

For this tutorial, we'll generate two samples: one from a standard normal distribution, which is symmetric, and one from a Beta(2, 5) distribution, which is right-skewed:

set.seed(123)
data_normal <- rnorm(1000)
data_beta <- rbeta(1000, 2, 5)

3. Calculate Skewness

Using the skewness() function from the moments package, you can compute the skewness of the dataset:

skewness_normal <- skewness(data_normal)
skewness_beta <- skewness(data_beta)

print(paste("Skewness of normal data:", skewness_normal))
print(paste("Skewness of beta data:", skewness_beta))

4. Calculate Kurtosis

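Similarly, use the kurtosis() function to compute the kurtosis. Note that kurtosis() in the moments package returns Pearson's kurtosis, for which a normal distribution scores about 3. Subtract 3 to get the excess kurtosis:

kurtosis_normal <- kurtosis(data_normal)
kurtosis_beta <- kurtosis(data_beta)

print(paste("Kurtosis of normal data:", kurtosis_normal))
print(paste("Kurtosis of beta data:", kurtosis_beta))

# Excess kurtosis is Pearson's kurtosis minus 3 (about 0 for normal data)
print(paste("Excess kurtosis of normal data:", kurtosis_normal - 3))
print(paste("Excess kurtosis of beta data:", kurtosis_beta - 3))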

5. Interpretation

  • Skewness:

    • ~0: The data are fairly symmetrical.
    • > 0: The data are right-skewed (i.e., tail is on the right).
    • < 0: The data are left-skewed (i.e., tail is on the left).
  • Kurtosis (as excess kurtosis, i.e., Pearson's kurtosis minus 3):

    • ~0: The tails are similar to those of a normal distribution, whose Pearson's kurtosis is 3.
    • > 0: The distribution has heavier tails than a normal distribution.
    • < 0: The distribution has lighter tails than a normal distribution.
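
The rules above can be sketched as a small labelling function. describe_shape() is a hypothetical helper, and the tolerance of 0.5 around zero is an arbitrary assumption rather than a standard cutoff. It expects excess kurtosis, so subtract 3 first if your value is Pearson's kurtosis.

```r
# describe_shape() is a hypothetical helper; tol = 0.5 is an arbitrary cutoff
# for "close to zero", not a statistical standard.
describe_shape <- function(skew, excess_kurt, tol = 0.5) {
  skew_label <- if (skew > tol) {
    "right-skewed"
  } else if (skew < -tol) {
    "left-skewed"
  } else {
    "roughly symmetric"
  }
  tail_label <- if (excess_kurt > tol) {
    "heavier tails than normal"
  } else if (excess_kurt < -tol) {
    "lighter tails than normal"
  } else {
    "normal-like tails"
  }
  paste(skew_label, tail_label, sep = ", ")
}

describe_shape(0.6, -0.1)    # "right-skewed, normal-like tails"
describe_shape(-0.02, 0.03)  # "roughly symmetric, normal-like tails"
```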

6. Visualization

It's often helpful to visualize the data alongside these statistics:

hist(data_normal, main="Histogram of Normal Data", breaks=30, col="lightblue", border="black")
hist(data_beta, main="Histogram of Beta Data", breaks=30, col="lightblue", border="black")

Histograms give a visual impression of the asymmetry and tail weight that the skewness and kurtosis values summarize numerically.
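
A kernel density curve and a normal reference curve can make departures from normality easier to see than bars alone. The following is one possible sketch in base R graphics; it regenerates the beta sample so that it runs on its own, and the line styles and legend labels are arbitrary choices.

```r
set.seed(123)
data_beta <- rbeta(1000, 2, 5)

# Draw the histogram on the density scale so curves can be overlaid
hist(data_beta, breaks = 30, freq = FALSE, col = "lightblue", border = "black",
     main = "Beta Data with Density and Normal Reference")

# Kernel density estimate of the sample
lines(density(data_beta), lwd = 2)

# Normal curve with the same mean and standard deviation, for comparison
curve(dnorm(x, mean = mean(data_beta), sd = sd(data_beta)),
      add = TRUE, lty = 2, lwd = 2)

legend("topright", legend = c("Kernel density", "Normal reference"),
       lty = c(1, 2), lwd = 2)
```

Where the solid curve sits to the left of the dashed one and trails off further to the right, the sample is right-skewed.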

7. Wrap-up

Understanding skewness and kurtosis is essential for many statistical analyses since assumptions about the normality of errors can be critical for hypothesis testing. These metrics provide a quantitative way to describe the shape and tails of a distribution.
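
One way the two statistics feed into a formal normality check is the Jarque-Bera statistic, which combines squared skewness and squared excess kurtosis. The sketch below writes it out in base R to show the arithmetic. jb_statistic() is a hypothetical helper, not a package function, and the p-value relies on the asymptotic chi-squared approximation, which can be rough for small samples.

```r
# jb_statistic() is a hypothetical helper, not a package function.
jb_statistic <- function(x) {
  n <- length(x)
  d <- x - mean(x)
  m2 <- mean(d^2)
  skew <- mean(d^3) / m2^1.5
  kurt <- mean(d^4) / m2^2  # Pearson's kurtosis
  stat <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
  # Under normality the statistic is asymptotically chi-squared with 2 df
  p_value <- pchisq(stat, df = 2, lower.tail = FALSE)
  c(statistic = stat, p_value = p_value)
}

set.seed(123)
jb_statistic(rnorm(1000))        # normal sample
jb_statistic(rbeta(1000, 2, 5))  # skewed sample: expect a very small p-value
```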

  1. Skewness and Kurtosis Functions in R:

    • The e1071 package provides skewness() and kurtosis() functions. Its kurtosis() returns excess kurtosis (about 0 for normal data).
    library(e1071)
    
    # Skewness
    skew_value <- skewness(my_data)
    
    # Kurtosis
    kurt_value <- kurtosis(my_data)
    
  2. R Moments Package for Skewness and Kurtosis:

    • The moments package provides the skewness() and kurtosis() functions. Its kurtosis() returns Pearson's kurtosis (about 3 for normal data).
    library(moments)
    
    skew_value <- skewness(my_data)
    kurt_value <- kurtosis(my_data)
    
  3. Descriptive Statistics in R for Skewness and Kurtosis:

    • summary() reports the quartiles and the mean but not skewness or kurtosis; psych::describe() adds skew and kurtosis columns to its overview.
    summary(my_data)
    psych::describe(my_data)
    
  4. Skewness and Kurtosis Tests in R:

    • Conduct tests like Jarque-Bera or Shapiro-Wilk to assess normality. jarque.bera.test() comes from the tseries package; shapiro.test() is part of base R.
    jarque_bera_test <- tseries::jarque.bera.test(my_data)
    shapiro_test <- shapiro.test(my_data)
    
  5. R Packages for Statistical Moments:

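    • Explore packages like e1071 and moments for statistical moments. Both export skewness() and kurtosis(), so if both are attached the one loaded last masks the other; calling them as package::function avoids the ambiguity.
    skewness_value <- moments::skewness(my_data)
    kurtosis_value <- moments::kurtosis(my_data)       # Pearson's kurtosis (about 3 for normal data)
    excess_kurtosis_value <- e1071::kurtosis(my_data)  # excess kurtosis (about 0 for normal data)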
    
  6. Visualizing Skewness and Kurtosis in R:

    • Create histograms, density plots, or boxplots to visualize skewness and kurtosis.
    hist(my_data)
    
  7. R Functions for Normality Tests:

    • Use functions like shapiro.test() (base R) or ad.test() (from the nortest package) for normality tests.
    shapiro_test <- shapiro.test(my_data)
    
  8. Checking Normality Assumptions in R:

    • Assess normality assumptions before applying parametric statistical tests.
    qqnorm(my_data)
    qqline(my_data)
    
  9. Handling Skewed Data in R:

    • Transform data using techniques like log transformation for right-skewed distributions. log() requires strictly positive values.
    log_transformed_data <- log(my_data)
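
To see what the log transformation does to skewness, the sketch below applies it to simulated log-normal data and compares the moment-based skewness before and after. moment_skewness() is a hypothetical base R helper, and the simulated my_data stands in for your own variable.

```r
# Moment-based skewness (hypothetical helper, base R only)
moment_skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

set.seed(123)
# Log-normal data: strictly positive and strongly right-skewed
my_data <- rlnorm(1000, meanlog = 0, sdlog = 1)

# log() is only defined for positive values, so check before transforming
stopifnot(all(my_data > 0))
log_transformed_data <- log(my_data)

moment_skewness(my_data)               # clearly positive
moment_skewness(log_transformed_data)  # much closer to 0
```

If the data contain zeros or negative values, log() returns -Inf or NaN, so a different transformation or a shift is needed first.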