Survival analysis, also known as time-to-event analysis, is used to analyze the expected duration of time until a specific event of interest occurs. Common applications include analyzing time-to-failure for machinery, time until purchase in marketing, or time-to-death in medical studies.
In this tutorial, we'll use R to perform basic survival analysis using the survival package.
install.packages("survival")  # run once if the package is not already installed
library(survival)
The Surv() function creates a survival object, which represents the data in a format suitable for survival analysis.
# Sample data
time <- c(2, 3, 5, 8, 12)   # time until event or censoring
event <- c(1, 1, 0, 1, 0)   # 1 if event occurred, 0 if censored

# Create a survival object
s_obj <- Surv(time, event)
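To see what the survival object actually holds, it can help to print it and pull out its columns. The sketch below reuses the sample data from above; the variable name `n_censored` is introduced here only for illustration.

```r
library(survival)

time  <- c(2, 3, 5, 8, 12)
event <- c(1, 1, 0, 1, 0)
s_obj <- Surv(time, event)

# Printing marks censored observations with a trailing "+"
print(s_obj)

# For right-censored data the object behaves like a two-column matrix
# with columns "time" and "status"
obs_time   <- s_obj[, "time"]
n_censored <- sum(s_obj[, "status"] == 0)
n_censored
```

Here two of the five subjects are censored, so their times are lower bounds on the true event time rather than observed event times.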
The survfit() function estimates survival curves using the Kaplan-Meier method.
fit <- survfit(s_obj ~ 1)
print(fit)

# Plot the survival curve
plot(fit)
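The Kaplan-Meier estimate is a running product of the fraction of at-risk subjects who survive each time point. As a rough check, the sketch below recomputes it by hand for the sample data and compares it with the survfit() result. It assumes there are no tied times, which holds for this sample; `km_manual` and `n_at_risk` are names chosen for illustration.

```r
library(survival)

time  <- c(2, 3, 5, 8, 12)
event <- c(1, 1, 0, 1, 0)
fit   <- survfit(Surv(time, event) ~ 1)

# Manual product-limit estimate (valid here because no times are tied)
ord       <- order(time)
n_at_risk <- length(time) - seq_along(time) + 1   # 5, 4, 3, 2, 1
d         <- event[ord]                           # 1 = event, 0 = censored
km_manual <- cumprod(1 - d / n_at_risk)           # 0.8, 0.6, 0.6, 0.3, 0.3

# Compare with the estimate stored in the survfit object
all.equal(km_manual, fit$surv)
```

Censored observations leave the estimate unchanged at their own time point but shrink the risk set for later events, which is why the curve drops from 0.6 to 0.3 at time 8.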
For analyzing the relationship between survival time and one or more predictor variables, we use the Cox proportional hazards model.
Let's use the lung dataset from the survival package.
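# Load the dataset (in current versions of survival, lung is bundled with the cancer data)
data(cancer, package = "survival")

# Fit the model
cox_model <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

# Summary of the model
summary(cox_model)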
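Cox model coefficients are on the log-hazard scale, so they are usually reported as hazard ratios. The following sketch exponentiates the coefficients and their confidence limits; the names `hr` and `hr_ci` are illustrative.

```r
library(survival)

cox_model <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

# Hazard ratios are the exponentiated coefficients
hr    <- exp(coef(cox_model))
hr_ci <- exp(confint(cox_model))   # 95% confidence limits on the hazard-ratio scale

round(cbind(HR = hr, hr_ci), 3)
```

A hazard ratio below 1 indicates a lower hazard per unit increase in the covariate, and a ratio above 1 a higher hazard, holding the other covariates fixed.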
We can visually inspect whether the proportional hazards assumption holds by plotting the scaled Schoenfeld residuals.
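install.packages("ggplot2")  # run once if the package is not already installed
library(ggplot2)

# Compute scaled Schoenfeld residuals (one row per event, one column per covariate)
zph <- cox.zph(cox_model)

# Plot the residuals for the first covariate (age) against event time
resid_df <- data.frame(time = zph$time, resid = zph$y[, 1])
ggplot(resid_df, aes(x = time, y = resid)) +
  geom_point() +
  geom_smooth(se = FALSE)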
A roughly flat smoothed line around zero suggests the assumption holds for that covariate, while a clear trend over time suggests it may be violated.
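A visual check can be complemented by the score test that cox.zph() reports for each covariate and for the model as a whole. This is a minimal sketch using the same lung model; `p_values` is an illustrative name.

```r
library(survival)

cox_model <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

# Test of the proportional hazards assumption based on scaled Schoenfeld residuals
zph <- cox.zph(cox_model)

# One row per covariate plus a GLOBAL row
zph$table

# Small p-values point to a possible violation for that term
p_values <- zph$table[, "p"]
p_values
```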
You can make predictions using the Cox model:
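# Predict for new data
newdata <- data.frame(age = 65, sex = 1, ph.ecog = 1)

# Relative risk for this covariate profile
# (type = "expected" would also need time and status columns in newdata)
predict(cox_model, newdata, type = "risk")

# Predicted survival curve for this profile, summarized at 180 and 365 days
pred_fit <- survfit(cox_model, newdata = newdata)
summary(pred_fit, times = c(180, 365))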
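predict() on a Cox model can return either the linear predictor or the relative risk, and the two are related by exponentiation. The sketch below illustrates this for three hypothetical patients who differ only in age; the data frame `profiles` is made up for the example.

```r
library(survival)

cox_model <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

# Hypothetical covariate profiles that differ only in age
profiles <- data.frame(age = c(55, 65, 75), sex = 1, ph.ecog = 1)

lp   <- predict(cox_model, profiles, type = "lp")    # linear predictor
risk <- predict(cox_model, profiles, type = "risk")  # exp(linear predictor)

all.equal(exp(lp), risk)
risk
```

Both quantities are relative to a reference profile chosen by the package, so they are best used to compare subjects with each other rather than read as absolute probabilities.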
Right Censoring: In survival analysis, it's common for the event of interest not to be observed for every subject during the study period. Such subjects are "censored": all we know is that their event time exceeds their last follow-up time.
Multiple Events: Some subjects may experience the event more than once. There are extensions of basic survival analysis to handle such "repeated events".
Time-varying Covariates: If covariates change over time, they're "time-varying". Handling such covariates requires advanced methods.
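One common way to represent time-varying covariates is the counting-process format, where each subject contributes one row per interval during which the covariates are constant. The sketch below builds such a data frame by hand; `tv_data` and its columns are hypothetical and only show the layout.

```r
library(survival)

# Hypothetical long-format data: one row per subject per interval
tv_data <- data.frame(
  id     = c(1, 1, 2, 3, 3),
  tstart = c(0, 4, 0, 0, 6),
  tstop  = c(4, 9, 7, 6, 10),
  event  = c(0, 1, 0, 0, 1),     # the event is recorded only on a subject's last row
  dose   = c(10, 20, 10, 10, 30) # covariate value during that interval
)

# Three-argument Surv() creates (start, stop] intervals
tv_surv <- with(tv_data, Surv(tstart, tstop, event))
print(tv_surv)

surv_type <- attr(tv_surv, "type")
surv_type
```

With real data of this shape, a model along the lines of coxph(Surv(tstart, tstop, event) ~ dose, data = tv_data) would use the covariate value in effect during each interval.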
Survival analysis is a powerful technique for analyzing time-to-event data. With R and the survival package, you have the tools to explore, model, and predict such data effectively.
Introduction to Survival Analysis with R:
# Example: Creating a survival object
library(survival)
survival_object <- Surv(time = c(5, 10, 15), event = c(1, 1, 0))
Kaplan-Meier Survival Curves in R:
# Example: Kaplan-Meier curve
km_curve <- survfit(survival_object ~ 1)
plot(km_curve, main = "Kaplan-Meier Survival Curve")
Cox Proportional Hazards Model in R:
# Example: Cox Proportional Hazards Model
cox_model <- coxph(Surv(time, event) ~ age + treatment, data = my_data)
R Survival Analysis for Clinical Data:
# Example: Analyzing clinical survival data
survival_object <- Surv(time = clinical_data$follow_up_time, event = clinical_data$outcome)
Time-to-Event Analysis in R:
# Example: Time-to-event analysis
km_curve <- survfit(Surv(time, event) ~ 1, data = time_data)
Survival Analysis Plotting in R:
# Example: Plotting survival curves
plot(km_curve, main = "Survival Analysis", xlab = "Time", ylab = "Survival Probability")
Log-Rank Test in R:
# Example: Log-rank test
logrank_test <- survdiff(Surv(time, event) ~ group, data = survival_data)
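As a concrete illustration, the log-rank test can be run on the lung dataset that ships with the survival package, comparing survival by sex. The p-value is computed here from the chi-square statistic stored in the result; `df_lr` and `p_value` are illustrative names.

```r
library(survival)

# Log-rank test comparing the two sex groups in the lung data
logrank <- survdiff(Surv(time, status) ~ sex, data = lung)
print(logrank)

# Degrees of freedom = number of groups - 1
df_lr   <- length(logrank$n) - 1
p_value <- pchisq(logrank$chisq, df = df_lr, lower.tail = FALSE)
p_value
```

A small p-value suggests the survival curves of the groups differ, although it says nothing about the size of the difference.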
Comparing Survival Curves in R:
# Example: Comparing survival curves
multi_km_curve <- survfit(Surv(time, event) ~ group, data = survival_data)
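Beyond plotting, a quick numeric way to compare groups is the median survival time of each curve. The sketch below uses the lung dataset from the survival package, where sex is coded 1 for male and 2 for female; `km_by_sex` and `medians` are illustrative names.

```r
library(survival)

# One Kaplan-Meier curve per sex group
km_by_sex <- survfit(Surv(time, status) ~ sex, data = lung)

# The summary table has one row per group, including the median survival time
group_table <- summary(km_by_sex)$table
medians     <- group_table[, "median"]
medians
```

The median is the time at which the estimated survival probability first drops to 0.5 or below, so it is only defined when a curve actually reaches that level.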
Handling Censored Data in Survival Analysis with R:
Censored observations are flagged through the event argument of the Surv object.
# Example: Handling censored data
survival_object <- Surv(time = my_data$time, event = my_data$status)
Stratified Survival Analysis in R:
# Example: Stratified survival analysis
stratified_km_curve <- survfit(Surv(time, event) ~ strata(group), data = survival_data)
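Stratification is also available in Cox models, where each stratum gets its own baseline hazard while the covariate effects are shared. This is a minimal sketch on the lung dataset, stratifying by sex; `strat_model` is an illustrative name.

```r
library(survival)

# Separate baseline hazards for each sex, common effects for age and ph.ecog
strat_model <- coxph(Surv(time, status) ~ age + ph.ecog + strata(sex), data = lung)

# The stratifying variable gets no coefficient of its own
coef_names <- names(coef(strat_model))
coef_names
```

This approach is often used when a variable appears to violate the proportional hazards assumption and its effect does not need to be estimated directly.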
Parametric Survival Models in R:
# Example: Fitting a Weibull survival model
weibull_model <- survreg(Surv(time, event) ~ covariate, data = survival_data, dist = "weibull")
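survreg() fits the Weibull model in accelerated failure time form, so its coefficients describe changes in log survival time rather than log hazard. The sketch below fits the model to the lung dataset and converts the output to the more familiar shape parameter and hazard ratios; the variable names are illustrative.

```r
library(survival)

weibull_fit <- survreg(Surv(time, status) ~ age + sex, data = lung, dist = "weibull")

# survreg reports a scale parameter; the Weibull shape is its reciprocal
shape <- 1 / weibull_fit$scale

# Accelerated failure time view: multiplicative effect on survival time
time_ratio <- exp(coef(weibull_fit)[-1])

# Proportional hazards view for the Weibull model: HR = exp(-coef / scale)
hazard_ratio <- exp(-coef(weibull_fit)[-1] / weibull_fit$scale)

shape
cbind(time_ratio, hazard_ratio)
```

A time ratio above 1 corresponds to longer survival and therefore to a hazard ratio below 1, so the two columns always point in opposite directions.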
Multivariate Survival Analysis in R:
# Example: Multivariate survival analysis
coxph_model <- coxph(Surv(time, event) ~ covariate1 + covariate2, data = survival_data)
Survival Analysis with Competing Risks in R:
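# Example: Competing risks survival analysis
library(cmprsk)
# status: 0 = censored, 1, 2, ... = competing event types
competing_risks_model <- cuminc(ftime = survival_data$time,
                                fstatus = survival_data$status,
                                group = survival_data$group)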