Full text link: http://tecdat.cn/?p=5438
Survival analysis refers to a set of statistical methods for studying the time until an event of interest occurs.
Survival analysis is used in a variety of fields, such as:
Cancer research, to analyze patients’ survival time
Sociology, for “event history analysis”
Engineering, for “failure time analysis”
In cancer research, typical research problems are as follows:
What is the impact of certain clinical characteristics on patient survival?
What is the probability that an individual survives three years?
Are there differences in survival between groups of patients?
Basic concepts
Here, we start by defining the basic terms of survival analysis, including:
Survival time and event
Survival function and hazard function
Survival time and event types in cancer research
There are different types of events, including:
Relapse
Death
The time from the beginning of observation until the event occurs (or observation ends) is commonly referred to as the survival time (or time to event).
The two most important measures in cancer research are: (i) the time to death; and (ii) the relapse-free survival time, which corresponds to the time between response to treatment and recurrence of the disease. The latter is also known as disease-free survival time or event-free survival time.
As mentioned above, survival analysis focuses on the expected duration until the occurrence of an event of interest (recurrence or death).
Kaplan-Meier survival estimate
The Kaplan-Meier (KM) method is a nonparametric method used to estimate the survival probability from observed survival times (Kaplan and Meier, 1958).
The KM survival curve is a plot of the estimated survival probability against time. It provides a useful summary of the data and can be used to estimate measures such as the median survival time.
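To make the estimator concrete, here is a minimal sketch that computes the product-limit estimate S(t) = prod over event times t_i <= t of (1 - d_i / n_i) by hand and cross-checks it against survfit(). The toy data and the helper name km_by_hand are illustrative assumptions, not part of the original article.

```r
library(survival)

# Hypothetical toy data: follow-up times and event indicators
# (1 = event observed, 0 = censored)
time  <- c(5, 8, 8, 12, 15, 20, 22)
event <- c(1, 1, 0,  1,  0,  1,  0)

# Hand-rolled product-limit estimate (km_by_hand is an illustrative helper)
km_by_hand <- function(time, event) {
  event_times <- sort(unique(time[event == 1]))
  # n_i: subjects still at risk just before each event time
  n_at_risk <- sapply(event_times, function(t) sum(time >= t))
  # d_i: events occurring exactly at each event time
  n_events <- sapply(event_times, function(t) sum(time == t & event == 1))
  data.frame(time    = event_times,
             n.risk  = n_at_risk,
             n.event = n_events,
             surv    = cumprod(1 - n_events / n_at_risk))
}

km <- km_by_hand(time, event)
print(km)

# Cross-check against survfit() evaluated at the same event times
fit_toy <- survfit(Surv(time, event) ~ 1)
ref <- summary(fit_toy, times = km$time)$surv
print(all.equal(km$surv, ref))
```

Censored subjects never trigger a drop in the curve; they only shrink the risk set for later event times.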
Survival analysis in R
Summarizing and visualizing survival analysis results
Sample data set
We will use the lung cancer data provided in the survival package.
library(survival)
head(lung)
  inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
1    3  306      2  74   1       1       90       100     1175      NA
2    3  455      2  68   1       0       90        90     1225      15
3    3 1010      1  56   1       0       90        90       NA      15
4    5  210      2  57   1       1       90        60     1150      11
5    1  883      2  60   1       0      100        90       NA       0
6   12 1022      1  74   1       1       50        80      513       0
inst: institution code
time: survival time in days
status: censoring status, 1 = censored, 2 = dead
age: age in years
sex: male = 1, female = 2
ph.ecog: ECOG performance score (0 = normal, 5 = dead)
ph.karno: Karnofsky performance score (poor = 0, normal = 100) rated by the physician
pat.karno: Karnofsky performance score as rated by the patient
meal.cal: calories consumed at meals
wt.loss: weight loss in the last six months
survfit()
We want to compute the probability of survival by sex.
The function survfit() can be used to compute the Kaplan-Meier survival estimate.
It takes a survival object created with the function Surv().
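As a small sketch of what Surv() produces, the snippet below builds the survival object for the lung data and inspects how censoring is encoded. The variable names s and event_flag are illustrative.

```r
library(survival)

# In lung, status 2 = death (event) and status 1 = censored;
# Surv() recodes a 1/2 status internally to 0/1
s <- Surv(lung$time, lung$status)

# Censored observations are printed with a trailing "+"
print(head(s))

# Internal event indicator (1 = event, 0 = censored) for the first six patients
event_flag <- unclass(s)[1:6, "status"]
print(event_flag)
```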
To calculate the survival curve, enter the following:
fit <- survfit(Surv(time, status) ~ sex, data = lung)
print(fit)
        n events median 0.95LCL 0.95UCL
sex=1 138    112    270     212     310
sex=2  90     53    426     348     550
By default, the function print() shows a short summary of the survival curves: the number of observations, the number of events, the median survival and the confidence limits for the median.
To display a more complete summary of the survival curve, enter the following:
# Summary of survival curves
summary(fit)
# Access the short summary table
summary(fit)$table
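Because the short summary table is an ordinary matrix with one row per stratum, individual statistics can be pulled out by column name. A minimal sketch (the variable names tab and medians are illustrative):

```r
library(survival)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# One row per stratum, one column per summary statistic
tab <- summary(fit)$table

# Median survival time (in days) for each sex
medians <- tab[, "median"]
print(medians)
```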
Visualizing survival curves
We will use ggsurvplot() to draw the survival curves for the two groups of subjects.
library(survminer)
ggsurvplot(fit,
           pval = TRUE, conf.int = TRUE,
           risk.table = TRUE,         # add risk table
           risk.table.col = "strata") # change risk table color by group
The plot can be customized further; for example, the argument legend.labs changes the legend labels.
ggsurvplot(
  fit,                      # survfit object with calculated statistics
  pval = TRUE,              # show the p-value of the log-rank test
  conf.int = TRUE,          # show confidence intervals for the point estimates of the survival curves
  conf.int.style = "step",  # customize the confidence interval style
  xlab = "Time in days",    # customize the x-axis label
  break.time.by = 200,      # break the x-axis at intervals of 200
  ggtheme = theme_bw(),     # customize plot and risk table with a theme (theme_bw() assumed)
  risk.table = "abs_pct"    # show absolute number and percentage at risk
)
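The legend.labs argument mentioned above can be sketched as follows. The label text and the legend title are illustrative choices; the labels must follow the order of the strata (sex=1, then sex=2).

```r
library(survival)
library(survminer)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Labels follow the strata order: sex=1 (male) first, sex=2 (female) second
p <- ggsurvplot(fit,
                data = lung,
                legend.title = "Sex",
                legend.labs = c("Male", "Female"))
print(p)
```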
The median survival time of each group is the time at which the survival probability S(t) is 0.5.
The argument xlim can be used to shorten the time range of the survival curves, as follows:
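A minimal sketch of restricting the plotted time range with xlim; the upper limit of 600 days is an arbitrary value chosen for illustration.

```r
library(survival)
library(survminer)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Show only the first 600 days of follow-up (600 is an illustrative cutoff)
p_short <- ggsurvplot(fit,
                      data = lung,
                      conf.int = TRUE,
                      xlim = c(0, 600))
print(p_short)
```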
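Note that the argument fun can be used to specify three frequently used transformations:
"log": log transformation of the survival function
"event": cumulative events, f(y) = 1 - y
"cumhaz": cumulative hazard function, f(y) = -log(y)
The cumulative hazard is commonly used to estimate the hazard probability.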
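The fun argument can be exercised as in the sketch below, which also computes the cumulative hazard directly as H(t) = -log(S(t)) from the fitted curve. The object names are illustrative.

```r
library(survival)
library(survminer)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Plot transformed curves via the fun argument
p_log    <- ggsurvplot(fit, data = lung, fun = "log")     # log of the survival function
p_event  <- ggsurvplot(fit, data = lung, fun = "event")   # cumulative events, 1 - S(t)
p_cumhaz <- ggsurvplot(fit, data = lung, fun = "cumhaz")  # cumulative hazard, -log(S(t))
print(p_cumhaz)

# The same cumulative hazard computed by hand from the KM estimate
cum_hazard <- -log(fit$surv)
print(head(cum_hazard))
```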
Kaplan-Meier life table: summary of survival curves
As mentioned above, you can use the function summary() to obtain a complete summary of the survival curves:
summary(fit)
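When only a few time points matter, summary() accepts a times argument, which gives the estimated survival probability at those times for each group. A small sketch, with one and two years (365 and 730 days) chosen for illustration:

```r
library(survival)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Estimated survival at 1 and 2 years (illustrative time points)
s <- summary(fit, times = c(365, 730))

at_times <- data.frame(strata = s$strata,
                       time   = s$time,
                       surv   = s$surv,
                       lower  = s$lower,
                       upper  = s$upper)
print(at_times)
```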
Log-rank test: survdiff()
The log-rank test is the most widely used method for comparing two or more survival curves. The null hypothesis is that there is no difference in survival between the groups.
survdiff() can be used as follows:
surv_diff <- survdiff(Surv(time, status) ~ sex, data = lung)
surv_diff
        N Observed Expected (O-E)^2/E (O-E)^2/V
sex=1 138      112     91.6      4.55      10.3
sex=2  90       53     73.4      5.68      10.3

 Chisq= 10.3  on 1 degrees of freedom, p= 0.00131
The log-rank test for a difference in survival gives a p-value of p = 0.0013, indicating that survival differs significantly between the male and female groups.
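The p-value can also be recovered programmatically from the chi-square statistic stored in the survdiff object, using one degree of freedom fewer than the number of groups. A minimal sketch (df and p_value are illustrative names):

```r
library(survival)

surv_diff <- survdiff(Surv(time, status) ~ sex, data = lung)

# Degrees of freedom: number of groups minus one
df <- length(surv_diff$n) - 1

# Upper-tail probability of the chi-square statistic
p_value <- pchisq(surv_diff$chisq, df = df, lower.tail = FALSE)
print(p_value)
```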
Complex survival curves
In this section, we will compute survival curves using a combination of multiple factors. Next, we will use ggsurvplot() to plot the result:
ggsurvplot(fit,
           conf.int = TRUE,
           risk.table.col = "strata", # change risk table color by group
           ggtheme = theme_bw())      # change ggplot2 theme
The resulting plot shows the survival curves by sex, faceted according to the values of rx and adhere.
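The article's own code for the multi-factor fit is not shown, so the following is only a sketch under assumptions: it uses the colon data set from the survival package, which contains the variables sex, rx and adhere, and facets the plot by rx and adhere. The names fit2 and ggsurv are illustrative.

```r
library(survival)
library(survminer)

# Assumption: colon data set, curves stratified by sex, rx and adhere
fit2 <- survfit(Surv(time, status) ~ sex + rx + adhere, data = colon)

# Build the base plot, then split it into panels by rx and adhere
ggsurv <- ggsurvplot(fit2, data = colon, conf.int = TRUE, ggtheme = theme_bw())
faceted <- ggsurv$plot +
  theme(legend.position = "right") +
  facet_grid(rx ~ adhere)
print(faceted)
```

Each panel then holds one combination of rx and adhere, with the curves for the two sexes drawn inside it.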
Summary
Survival analysis is a set of statistical approaches to data analysis in which the outcome variable of interest is the time until an event occurs.
In this article, we demonstrated how to use two R packages, survival and survminer, to perform and visualize survival analysis.
