|
|
xii | |
|
|
xiv | |
|
|
xvii | |
Preface |
|
xviii | |
|
1 The fundamentals of survival and event history analysis |
|
|
1 | (17) |
|
1.1 Introduction: what is survival and event history analysis? |
|
|
1 | (1) |
|
1.2 Key concepts and terminology |
|
|
2 | (2) |
|
1.3 Censoring and truncation |
|
|
4 | (3) |
|
|
5 | (1) |
|
|
6 | (1) |
|
|
6 | (1) |
|
1.4 Mathematical expression and relation of basic statistical functions |
|
|
7 | (2) |
|
1.5 Why use survival and event history analysis? |
|
|
9 | (2) |
|
1.5.1 Potential problems that might arise if censored data is ignored |
|
|
9 | (2) |
|
1.5.2 What does survival analysis offer that ordinary regression models do not? |
|
|
11 | (1) |
|
1.6 Overview of survival and event history models and this book |
|
|
11 | (7) |
|
1.6.1 Non-, semi- and parametric models |
|
|
11 | (3) |
|
1.6.2 Outline of this book |
|
|
14 | (3) |
|
|
17 | (1) |
|
2 An introduction to R and data exploration via descriptive statistics and graphics |
|
|
18 | (29) |
|
2.1 An introduction to R and data exploration |
|
|
18 | (2) |
|
2.2 Downloading R on your personal computer |
|
|
20 | (1) |
|
2.3 The R base system and add-on packages |
|
|
21 | (1) |
|
2.3.1 Add-on packages and how to install them |
|
|
21 | (1) |
|
2.3.2 Loading an add-on package |
|
|
22 | (1) |
|
|
22 | (3) |
|
2.4.1 Running R interactively by typing at the > prompt |
|
|
22 | (1) |
|
2.4.2 Running R non-interactively using a script file |
|
|
23 | (1) |
|
2.4.3 Running R using the R Commander graphical user interface |
|
|
24 | (1) |
|
2.5 Determining and setting your working directory |
|
|
25 | (2) |
|
2.5.1 Determining your working directory |
|
|
25 | (1) |
|
2.5.2 Setting a new working directory |
|
|
26 | (1) |
|
2.6 Help and documentation |
|
|
27 | (1) |
|
2.7 Importing data into R |
|
|
27 | (4) |
|
2.7.1 Importing Stata or SPSS data into R |
|
|
28 | (2) |
|
2.7.2 Importing ASCII text or Excel data into R |
|
|
30 | (1) |
|
2.8 Working with data: opening and accessing variables from a data frame |
|
|
31 | (4) |
|
2.8.1 Placing the name of the data within a function |
|
|
32 | (1) |
|
|
32 | (1) |
|
2.8.3 Using (and abusing) the attach function |
|
|
33 | (1) |
|
2.8.4 Using data that is part of an existing library package |
|
|
34 | (1) |
|
|
34 | (1) |
|
2.9 Saving your work and quitting R |
|
|
35 | (2) |
|
2.9.1 Save to file and capture output options |
|
|
35 | (1) |
|
2.9.2 Quitting R and saving your workspace |
|
|
35 | (2) |
|
2.9.3 Saving your history |
|
|
37 | (1) |
|
2.10 Basic descriptive statistics |
|
|
37 | (6) |
|
|
37 | (1) |
|
2.10.2 Descriptive summary statistics |
|
|
38 | (5) |
|
2.11 Descriptive data exploration with graphics |
|
|
43 | (4) |
|
|
45 | (2) |
|
3 Survival and event history data structures |
|
|
47 | (15) |
|
3.1 Introduction: why discuss data structures? |
|
|
47 | (1) |
|
3.2 Sources of event history data |
|
|
48 | (1) |
|
|
49 | (1) |
|
|
50 | (3) |
|
3.4.1 Understanding multi-episode data |
|
|
50 | (1) |
|
3.4.2 Converting single-episode to multi-episode data |
|
|
51 | (2) |
|
3.5 Subject-period (discrete-time) data, episode-splitting and counting process format |
|
|
53 | (5) |
|
3.5.1 Subject-period or discrete-time data |
|
|
53 | (1) |
|
3.5.2 Creating a subject-period file: survSplit in survival library |
|
|
54 | (1) |
|
3.5.3 Creation of a subject-period file: toBinary in eha package |
|
|
55 | (1) |
|
|
56 | (2) |
|
3.5.5 Counting process style of data |
|
|
58 | (1) |
|
|
58 | (4) |
|
|
59 | (1) |
|
3.6.2 Converting date variables to a numeric format |
|
|
59 | (1) |
|
|
60 | (1) |
|
|
61 | (1) |
|
4 Non-parametric methods: the Kaplan-Meier estimator |
|
|
62 | (24) |
|
|
62 | (1) |
|
4.2 The Kaplan-Meier (KM) estimator |
|
|
63 | (1) |
|
4.3 Undertaking KM estimations in R |
|
|
64 | (3) |
|
4.3.1 The survival package in R |
|
|
64 | (2) |
|
4.3.2 Loading RcmdrPlugin.survival to use in the R Commander |
|
|
66 | (1) |
|
4.4 Kaplan-Meier estimation |
|
|
67 | (6) |
|
4.4.1 Producing KM estimates using the R Commander |
|
|
67 | (2) |
|
4.4.2 Producing KM estimates with a script file |
|
|
69 | (2) |
|
4.4.3 Interpretation of KM estimates |
|
|
71 | (2) |
|
4.5 Plotting the Kaplan-Meier survival curve |
|
|
73 | (6) |
|
4.5.1 Plotting a univariate KM survival curve |
|
|
73 | (2) |
|
4.5.2 Comparing two KM survival curves |
|
|
75 | (4) |
|
4.6 Testing differences between two groups using survdiff |
|
|
79 | (4) |
|
4.6.1 The Fleming-Harrington test |
|
|
80 | (1) |
|
4.6.2 The log-rank (Mantel-Haenszel) test |
|
|
80 | (1) |
|
4.6.3 The Peto and Peto test |
|
|
81 | (1) |
|
4.6.4 Comparing tests: which test to choose? |
|
|
82 | (1) |
|
4.7 Stratifying the analysis by a covariate |
|
|
83 | (3) |
|
|
85 | (1) |
|
5 The Cox proportional-hazards regression model |
|
|
86 | (28) |
|
5.1 Introduction: The Cox regression model |
|
|
86 | (5) |
|
5.1.1 The Cox proportional hazard model with fixed covariates |
|
|
87 | (2) |
|
5.1.2 The Cox proportional hazards model with time-varying covariates |
|
|
89 | (1) |
|
5.1.3 Why is the Cox model so popular? |
|
|
90 | (1) |
|
5.2 Estimating and interpreting the Cox model with fixed covariates |
|
|
91 | (9) |
|
|
91 | (1) |
|
5.2.2 Estimating the Cox regression model |
|
|
91 | (2) |
|
5.2.3 Interpreting covariate estimates in the Cox regression model |
|
|
93 | (4) |
|
5.2.4 Significance of the model |
|
|
97 | (1) |
|
5.2.5 Plotting the estimated survival function |
|
|
98 | (1) |
|
5.2.6 Plotting the estimated survival function by a covariate |
|
|
99 | (1) |
|
5.3 The Cox regression model with time-varying covariates |
|
|
100 | (14) |
|
5.3.1 Creating a subject-period file to accommodate time-varying covariates |
|
|
100 | (3) |
|
5.3.2 Modelling time-varying covariates using person-period data |
|
|
103 | (3) |
|
5.3.3 Creating a subject-period file with lagged variables to reduce problems of causal ordering |
|
|
106 | (1) |
|
5.3.4 Lagged time-varying covariates to reduce problems of causal ordering |
|
|
107 | (1) |
|
5.3.5 Interactions with time as time-dependent covariates: episode-splitting at time intervals |
|
|
108 | (5) |
|
|
113 | (1) |
|
|
114 | (27) |
|
|
114 | (1) |
|
6.2 Relationship of the probability density, hazard and survival function |
|
|
115 | (1) |
|
6.3 Proportional hazards (PH) versus accelerated failure time (AFT) models |
|
|
116 | (1) |
|
6.4 Specification of parametric models |
|
|
117 | (8) |
|
6.4.1 Summary of selected parametric survival distributions |
|
|
117 | (1) |
|
6.4.2 The exponential model |
|
|
118 | (3) |
|
6.4.3 Piecewise constant exponential model |
|
|
121 | (1) |
|
|
121 | (3) |
|
6.4.5 Log-logistic and log-normal models |
|
|
124 | (1) |
|
6.4.6 Other parametric models |
|
|
125 | (1) |
|
6.5 Estimating parametric survival models using the survival and eha packages |
|
|
125 | (1) |
|
6.5.1 Estimating parametric models using the survreg function in the survival library |
|
|
125 | (1) |
|
6.5.2 Estimating parametric models using the phreg and aftreg functions in the eha library |
|
|
126 | (1) |
|
6.6 Estimation and interpretation of parametric models |
|
|
126 | (13) |
|
6.6.1 Exponential model: PH parameterization |
|
|
126 | (2) |
|
6.6.2 Exponential model: AFT parameterization |
|
|
128 | (5) |
|
6.6.3 Piecewise exponential model: PH and AFT parameterization |
|
|
133 | (3) |
|
6.6.4 Weibull model: PH parameterization |
|
|
136 | (1) |
|
6.6.5 Weibull model: AFT parameterization |
|
|
136 | (1) |
|
6.6.6 Log-logistic and log-normal models: AFT parameterization |
|
|
137 | (2) |
|
6.7 What happens if a parametric model is specified incorrectly? |
|
|
139 | (2) |
|
|
140 | (1) |
|
7 Model-building and diagnostics |
|
|
141 | (23) |
|
|
141 | (1) |
|
7.2 Model-building and selection of covariates and a model |
|
|
142 | (4) |
|
7.2.1 Purposeful selection of covariates |
|
|
142 | (2) |
|
7.2.2 The decision path to choosing an appropriate model |
|
|
144 | (2) |
|
7.3 Assessing the overall goodness of fit of your model |
|
|
146 | (3) |
|
7.3.1 The log-likelihood and likelihood ratio tests |
|
|
146 | (2) |
|
7.3.2 Akaike information criterion (AIC) and evaluation of standard errors |
|
|
148 | (1) |
|
7.4 Testing overall model adequacy: Cox-Snell residuals |
|
|
149 | (2) |
|
7.5 Testing the proportional hazards assumption: Schoenfeld residuals |
|
|
151 | (6) |
|
7.5.1 Understanding and estimating Schoenfeld residuals |
|
|
151 | (3) |
|
7.5.2 Dealing with non-proportional hazards: introducing an interaction effect |
|
|
154 | (1) |
|
7.5.3 Dealing with non-proportional hazards: stratifying the data |
|
|
155 | (2) |
|
7.6 Checking for influential observations: score residuals |
|
|
157 | (3) |
|
7.6.1 What should be done if influential observations are identified? |
|
|
160 | (1) |
|
7.7 Assessing nonlinearity: martingale residual and component-plus-residual plots |
|
|
160 | (4) |
|
|
163 | (1) |
|
8 Frailty and recurrent event models |
|
|
164 | (15) |
|
|
164 | (2) |
|
8.2 Shared frailty: modelling recurrent events and clustering in groups |
|
|
166 | (3) |
|
|
166 | (1) |
|
8.2.2 Shared clustering in groups |
|
|
167 | (2) |
|
8.3 Additional frailty models: unshared, nested, joint and additive models |
|
|
169 | (2) |
|
8.3.1 Individual (unshared) frailty models |
|
|
169 | (1) |
|
8.3.2 Nested frailty models |
|
|
170 | (1) |
|
8.3.3 Joint and additive frailty models |
|
|
170 | (1) |
|
8.4 Estimating frailty models in R |
|
|
171 | (1) |
|
8.4.1 Using the frailty function |
|
|
171 | (1) |
|
8.4.2 The frailtypack and survrec libraries in R |
|
|
171 | (1) |
|
8.5 Frailty model estimation and interpretation |
|
|
172 | (7) |
|
8.5.1 Description of the data |
|
|
172 | (1) |
|
8.5.2 Frailty model with a gamma distribution |
|
|
173 | (4) |
|
8.5.3 Frailty model with a Gaussian distribution |
|
|
177 | (1) |
|
|
178 | (1) |
|
|
179 | (11) |
|
|
179 | (2) |
|
|
181 | (3) |
|
9.2.1 Specification of the hazard, survival and cumulative probability density functions |
|
|
181 | (1) |
|
9.2.2 Models to estimate discrete-time data: logit, probit and complementary log-log functions |
|
|
182 | (2) |
|
9.3 Restructuring data for discrete-time modelling |
|
|
184 | (1) |
|
9.4 Estimation and interpretation of discrete-time models |
|
|
184 | (5) |
|
9.4.1 Estimation of logit, probit and cloglog discrete-time models |
|
|
184 | (3) |
|
9.4.2 Interpretation and comparison of estimates |
|
|
187 | (2) |
|
9.5 Advantages and disadvantages of discrete-time models |
|
|
189 | (1) |
|
|
189 | (1) |
|
10 Competing risk and multi-state models |
|
|
190 | (23) |
|
|
190 | (1) |
|
10.2 Competing risk models |
|
|
191 | (4) |
|
10.2.1 Three central techniques to model competing risks |
|
|
192 | (1) |
|
10.2.2 The latent or cause-specific approach |
|
|
192 | (1) |
|
10.2.3 The cumulative incidence curve (CIC) |
|
|
193 | (2) |
|
10.3 Estimating competing risks using the latent versus CIC approach |
|
|
195 | (3) |
|
10.3.1 Data preparation and restructuring |
|
|
195 | (2) |
|
10.3.2 Estimating CIC estimates and their standard errors |
|
|
197 | (1) |
|
10.4 Regression analysis with competing risks |
|
|
198 | (4) |
|
|
202 | (2) |
|
10.5.1 A brief introduction to multi-state models and their applications |
|
|
202 | (1) |
|
10.5.2 Markov, semi-Markov and extended Markov model properties |
|
|
203 | (1) |
|
10.6 Estimation of multi-state models |
|
|
204 | (9) |
|
10.6.1 Preparation of data for multi-state models using the mstate package |
|
|
204 | (3) |
|
10.6.2 Estimation of Markov model with stratified hazards |
|
|
207 | (3) |
|
10.6.3 Estimation of Markov model with proportional hazards |
|
|
210 | (1) |
|
10.6.4 Estimation of state arrival extended Markov proportional hazards model |
|
|
211 | (1) |
|
10.6.5 Further predictions and estimation of multi-state models with the cumulative incidence function |
|
|
212 | (1) |
|
|
212 | (1) |
|
|
213 | (14) |
|
11.1 Introduction: sequence analysis |
|
|
213 | (2) |
|
11.1.1 A brief introduction to sequence analysis |
|
|
213 | (2) |
|
11.1.2 Optimal-matching techniques |
|
|
215 | (1) |
|
11.2 Sequence analysis data and estimation using the TraMineR package |
|
|
215 | (2) |
|
|
216 | (1) |
|
11.2.2 The transition from school to work using the mvad data |
|
|
216 | (1) |
|
11.3 Describing and visualizing sequence datasets |
|
|
217 | (4) |
|
11.3.1 Exploring the data, sequence frequency and state distribution plots |
|
|
217 | (2) |
|
11.3.2 Calculating entropy and turbulence |
|
|
219 | (2) |
|
11.4 Measuring similarities and distances between sequences |
|
|
221 | (1) |
|
11.5 Producing typologies of trajectories: cluster analysis |
|
|
221 | (2) |
|
11.6 Event sequence analysis |
|
|
223 | (1) |
|
11.7 Criticisms of the OM approach and the dynamic future of sequence analysis |
|
|
224 | (3) |
|
|
225 | (2) |
Appendix 1 Description of the data used in this book |
|
227 | (5) |
Appendix 2 Survival and event history analysis using Stata |
|
232 | (23) |
Glossary |
|
255 | (6) |
References |
|
261 | (12) |
Index |
|
273 | |