|
Part I Setting the Stage: Rationale Behind and Challenges to Health Data Analysis |
|
|
|
1 Objectives of the Secondary Analysis of Electronic Health Record Data |
|
|
3 | (6) |
|
|
3 | (1) |
|
1.2 Current Research Climate |
|
|
3 | (1) |
|
1.3 Power of the Electronic Health Record |
|
|
4 | (1) |
|
1.4 Pitfalls and Challenges |
|
|
5 | (1) |
|
|
6 | (3) |
|
|
7 | (2) |
|
2 Review of Clinical Databases |
|
|
9 | (8) |
|
|
9 | (1) |
|
|
9 | (1) |
|
2.3 The Medical Information Mart for Intensive Care (MIMIC) Database |
|
|
10 | (2) |
|
|
11 | (1) |
|
2.3.2 Access and Interface |
|
|
12 | (1) |
|
|
12 | (1) |
|
|
12 | (1) |
|
2.4.2 Access and Interface |
|
|
13 | (1) |
|
|
13 | (1) |
|
|
13 | (1) |
|
2.5.2 Access and Interface |
|
|
13 | (1) |
|
2.6 Other Ongoing Research |
|
|
14 | (3) |
|
|
14 | (1) |
|
|
14 | (1) |
|
|
15 | (1) |
|
|
16 | (1) |
|
3 Challenges and Opportunities in Secondary Analyses of Electronic Health Record Data |
|
|
17 | (10) |
|
|
17 | (1) |
|
3.2 Challenges in Secondary Analysis of Electronic Health Records Data |
|
|
17 | (3) |
|
3.3 Opportunities in Secondary Analysis of Electronic Health Records Data |
|
|
20 | (1) |
|
3.4 Secondary EHR Analyses as Alternatives to Randomized Controlled Clinical Trials |
|
|
21 | (1) |
|
3.5 Demonstrating the Power of Secondary EHR Analysis: Examples in Pharmacovigilance and Clinical Care |
|
|
22 | (1) |
|
3.6 A New Paradigm for Supporting Evidence-Based Practice and Ethical Considerations |
|
|
23 | (4) |
|
|
25 | (2) |
|
4 Pulling It All Together: Envisioning a Data-Driven, Ideal Care System |
|
|
27 | (16) |
|
4.1 Use Case Examples Based on Unavoidable Medical Heterogeneity |
|
|
28 | (1) |
|
4.2 Clinical Workflow, Documentation, and Decisions |
|
|
29 | (3) |
|
4.3 Levels of Precision and Personalization |
|
|
32 | (3) |
|
4.4 Coordination, Communication, and Guidance Through the Clinical Labyrinth |
|
|
35 | (1) |
|
4.5 Safety and Quality in an ICS |
|
|
36 | (3) |
|
|
39 | (4) |
|
|
41 | (2) |
|
|
43 | (8) |
|
|
43 | (1) |
|
|
44 | (2) |
|
|
44 | (1) |
|
|
45 | (1) |
|
|
46 | (1) |
|
5.3 Data Merger and Organization |
|
|
46 | (1) |
|
|
47 | (1) |
|
|
47 | (1) |
|
|
48 | (1) |
|
|
48 | (1) |
|
|
49 | (2) |
|
|
49 | (2) |
|
6 Integrating Non-clinical Data with EHRs |
|
|
51 | (10) |
|
|
51 | (1) |
|
6.2 Non-clinical Factors and Determinants of Health |
|
|
51 | (2) |
|
6.3 Increasing Data Availability |
|
|
53 | (1) |
|
6.4 Integration, Application and Calibration |
|
|
54 | (3) |
|
6.5 A Well-Connected Empowerment |
|
|
57 | (1) |
|
|
58 | (3) |
|
|
59 | (2) |
|
7 Using EHR to Conduct Outcome and Health Services Research |
|
|
61 | (10) |
|
|
61 | (1) |
|
7.2 The Rise of EHRs in Health Services Research |
|
|
62 | (2) |
|
7.2.1 The EHR in Outcomes and Observational Studies |
|
|
62 | (1) |
|
7.2.2 The EHR as Tool to Facilitate Patient Enrollment in Prospective Trials |
|
|
63 | (1) |
|
7.2.3 The EHR as Tool to Study and Improve Patient Outcomes |
|
|
64 | (1) |
|
7.3 How to Avoid Common Pitfalls When Using EHR to Do Health Services Research |
|
|
64 | (3) |
|
7.3.1 Step 1: Recognize the Fallibility of the EHR |
|
|
65 | (1) |
|
7.3.2 Step 2: Understand Confounding, Bias, and Missing Data When Using the EHR for Research |
|
|
65 | (2) |
|
7.4 Future Directions for the EHR and Health Services Research |
|
|
67 | (1) |
|
7.4.1 Ensuring Adequate Patient Privacy Protection |
|
|
67 | (1) |
|
7.5 Multidimensional Collaborations |
|
|
67 | (1) |
|
|
68 | (3) |
|
|
68 | (3) |
|
8 Residual Confounding Lurking in Big Data: A Source of Error |
|
|
71 | (10) |
|
|
71 | (1) |
|
8.2 Confounding Variables in Big Data |
|
|
72 | (5) |
|
8.2.1 The Obesity Paradox |
|
|
72 | (1) |
|
|
73 | (1) |
|
8.2.3 Uncertain Pathophysiology |
|
|
74 | (3) |
|
|
77 | (4) |
|
|
77 | (4) |
|
Part II A Cookbook: From Research Question Formulation to Validation of Findings |
|
|
|
9 Formulating the Research Question |
|
|
81 | (12) |
|
|
81 | (1) |
|
9.2 The Clinical Scenario: Impact of Indwelling Arterial Catheters |
|
|
82 | (1) |
|
9.3 Turning Clinical Questions into Research Questions |
|
|
82 | (3) |
|
|
82 | (1) |
|
|
83 | (1) |
|
|
84 | (1) |
|
9.4 Matching Study Design to the Research Question |
|
|
85 | (2) |
|
9.5 Types of Observational Research |
|
|
87 | (2) |
|
9.6 Choosing the Right Database |
|
|
89 | (1) |
|
|
90 | (3) |
|
|
91 | (2) |
|
10 Defining the Patient Cohort |
|
|
93 | (8) |
|
|
93 | (1) |
|
10.2 Part 1: Theoretical Concepts |
|
|
94 | (4) |
|
10.2.1 Exposure and Outcome of Interest |
|
|
94 | (1) |
|
|
95 | (1) |
|
10.2.3 Building the Study Cohort |
|
|
95 | (2) |
|
|
97 | (1) |
|
10.2.5 Data Visualization |
|
|
97 | (1) |
|
10.2.6 Study Cohort Fidelity |
|
|
98 | (1) |
|
10.3 Part 2: Case Study: Cohort Selection |
|
|
98 | (3) |
|
|
100 | (1) |
|
|
101 | (14) |
|
|
101 | (1) |
|
11.2 Part 1: Theoretical Concepts |
|
|
102 | (7) |
|
11.2.1 Categories of Hospital Data |
|
|
102 | (1) |
|
11.2.2 Context and Collaboration |
|
|
103 | (1) |
|
11.2.3 Quantitative and Qualitative Data |
|
|
104 | (1) |
|
11.2.4 Data Files and Databases |
|
|
104 | (3) |
|
|
107 | (2) |
|
11.3 Part 2: Practical Examples of Data Preparation |
|
|
109 | (6) |
|
|
109 | (1) |
|
|
109 | (3) |
|
|
112 | (1) |
|
11.3.4 Ranking Across Rows Using a Window Function |
|
|
113 | (1) |
|
11.3.5 Making Queries More Manageable Using WITH |
|
|
113 | (1) |
|
|
114 | (1) |
|
|
115 | (28) |
|
|
115 | (1) |
|
12.2 Part 1: Theoretical Concepts |
|
|
116 | (5) |
|
|
116 | (2) |
|
|
118 | (1) |
|
12.2.3 Data Transformation |
|
|
119 | (1) |
|
|
120 | (1) |
|
12.3 Part 2: Examples of Data Pre-processing in R |
|
|
121 | (19) |
|
|
121 | (8) |
|
|
129 | (3) |
|
12.3.3 Data Transformation |
|
|
132 | (4) |
|
|
136 | (4) |
|
|
140 | (3) |
|
|
141 | (2) |
|
|
143 | (20) |
|
|
143 | (1) |
|
13.2 Part 1: Theoretical Concepts |
|
|
144 | (9) |
|
13.2.1 Types of Missingness |
|
|
144 | (2) |
|
13.2.2 Proportion of Missing Data |
|
|
146 | (1) |
|
13.2.3 Dealing with Missing Data |
|
|
146 | (6) |
|
13.2.4 Choice of the Best Imputation Method |
|
|
152 | (1) |
|
|
153 | (8) |
|
13.3.1 Proportion of Missing Data and Possible Reasons for Missingness |
|
|
153 | (1) |
|
13.3.2 Univariate Missingness Analysis |
|
|
154 | (5) |
|
13.3.3 Evaluating the Performance of Imputation Methods on Mortality Prediction |
|
|
159 | (2) |
|
|
161 | (2) |
|
|
161 | (2) |
|
|
163 | (22) |
|
|
163 | (1) |
|
14.2 Part 1: Theoretical Concepts |
|
|
164 | (1) |
|
|
165 | (3) |
|
|
166 | (1) |
|
|
166 | (1) |
|
|
166 | (1) |
|
14.3.4 Interquartile Range with Log-Normal Distribution |
|
|
167 | (1) |
|
14.3.5 Ordinary and Studentized Residuals |
|
|
167 | (1) |
|
|
167 | (1) |
|
14.3.7 Mahalanobis Distance |
|
|
168 | (1) |
|
14.4 Proximity Based Models |
|
|
168 | (3) |
|
|
169 | (1) |
|
|
169 | (1) |
|
14.4.3 Criteria for Outlier Detection |
|
|
169 | (2) |
|
14.5 Supervised Outlier Detection |
|
|
171 | (1) |
|
14.6 Outlier Analysis Using Expert Knowledge |
|
|
171 | (1) |
|
14.7 Case Study: Identification of Outliers in the Indwelling Arterial Catheter (IAC) Study |
|
|
171 | (1) |
|
14.8 Expert Knowledge Analysis |
|
|
172 | (1) |
|
|
172 | (5) |
|
14.10 Multivariable Analysis |
|
|
177 | (2) |
|
14.11 Classification of Mortality in IAC and Non-IAC Patients |
|
|
179 | (2) |
|
14.12 Conclusions and Summary |
|
|
181 | (4) |
|
|
182 | (1) |
|
|
183 | (2) |
|
15 Exploratory Data Analysis |
|
|
185 | (20) |
|
|
185 | (1) |
|
15.2 Part 1: Theoretical Concepts |
|
|
186 | (13) |
|
15.2.1 Suggested EDA Techniques |
|
|
186 | (1) |
|
|
187 | (4) |
|
|
191 | (8) |
|
|
199 | (3) |
|
|
199 | (1) |
|
|
200 | (2) |
|
|
202 | (3) |
|
|
202 | (1) |
|
|
203 | (2) |
|
|
205 | (58) |
|
16.1 Introduction to Data Analysis |
|
|
205 | (5) |
|
|
205 | (1) |
|
16.1.2 Identifying Data Types and Study Objectives |
|
|
206 | (3) |
|
|
209 | (1) |
|
|
210 | (14) |
|
|
210 | (1) |
|
|
210 | (3) |
|
|
213 | (7) |
|
16.2.4 Reporting and Interpreting Linear Regression |
|
|
220 | (3) |
|
16.2.5 Caveats and Conclusions |
|
|
223 | (1) |
|
|
224 | (13) |
|
|
224 | (1) |
|
|
225 | (1) |
|
|
225 | (2) |
|
16.3.4 Introducing Logistic Regression |
|
|
227 | (5) |
|
16.3.5 Hypothesis Testing and Model Selection |
|
|
232 | (1) |
|
16.3.6 Confidence Intervals |
|
|
233 | (1) |
|
|
234 | (1) |
|
16.3.8 Presenting and Interpreting Logistic Regression Analysis |
|
|
235 | (1) |
|
16.3.9 Caveats and Conclusions |
|
|
236 | (1) |
|
|
237 | (7) |
|
|
237 | (1) |
|
|
237 | (1) |
|
16.4.3 Kaplan-Meier Survival Curves |
|
|
238 | (2) |
|
16.4.4 Cox Proportional Hazards Models |
|
|
240 | (3) |
|
16.4.5 Caveats and Conclusions |
|
|
243 | (1) |
|
16.5 Case Study and Summary |
|
|
244 | (19) |
|
|
244 | (1) |
|
|
244 | (6) |
|
16.5.3 Logistic Regression Analysis |
|
|
250 | (9) |
|
16.5.4 Conclusion and Summary |
|
|
259 | (2) |
|
|
261 | (2) |
|
17 Sensitivity Analysis and Model Validation |
|
|
263 | (12) |
|
|
263 | (1) |
|
17.2 Part 1: Theoretical Concepts |
|
|
264 | (3) |
|
|
264 | (1) |
|
17.2.2 Common Evaluation Tools |
|
|
265 | (1) |
|
17.2.3 Sensitivity Analysis |
|
|
265 | (1) |
|
|
266 | (1) |
|
17.3 Case Study: Examples of Validation and Sensitivity Analysis |
|
|
267 | (3) |
|
17.3.1 Analysis 1: Varying the Inclusion Criteria of Time to Mechanical Ventilation |
|
|
267 | (1) |
|
17.3.2 Analysis 2: Changing the Caliper Level for Propensity Matching |
|
|
268 | (1) |
|
17.3.3 Analysis 3: Hosmer-Lemeshow Test |
|
|
269 | (1) |
|
17.3.4 Implications for a 'Failing' Model |
|
|
269 | (1) |
|
|
270 | (5) |
|
|
270 | (1) |
|
|
271 | (4) |
|
Part III Case Studies Using MIMIC |
|
|
|
18 Trend Analysis: Evolution of Tidal Volume Over Time for Patients Receiving Invasive Mechanical Ventilation |
|
|
275 | (10) |
|
|
275 | (2) |
|
|
277 | (1) |
|
18.3 Study Pre-processing |
|
|
277 | (1) |
|
|
277 | (1) |
|
|
278 | (2) |
|
|
280 | (1) |
|
|
280 | (1) |
|
|
281 | (4) |
|
|
282 | (1) |
|
|
282 | (3) |
|
19 Instrumental Variable Analysis of Electronic Health Records |
|
|
285 | (10) |
|
|
285 | (2) |
|
|
287 | (4) |
|
|
287 | (1) |
|
|
287 | (3) |
|
|
290 | (1) |
|
|
291 | (1) |
|
|
292 | (1) |
|
|
293 | (2) |
|
|
293 | (1) |
|
|
293 | (2) |
|
20 Mortality Prediction in the ICU Based on MIMIC-II Results from the Super ICU Learner Algorithm (SICULA) Project |
|
|
295 | (20) |
|
|
295 | (2) |
|
20.2 Dataset and Pre-processing |
|
|
297 | (2) |
|
20.2.1 Data Collection and Patient Characteristics |
|
|
297 | (1) |
|
20.2.2 Patient Inclusion and Measures |
|
|
297 | (2) |
|
|
299 | (3) |
|
20.3.1 Prediction Algorithms |
|
|
299 | (2) |
|
20.3.2 Performance Metrics |
|
|
301 | (1) |
|
|
302 | (6) |
|
|
302 | (1) |
|
|
303 | (2) |
|
20.4.3 Super Learner Library |
|
|
305 | (1) |
|
20.4.4 Reclassification Tables |
|
|
305 | (3) |
|
|
308 | (1) |
|
20.6 What Are the Next Steps? |
|
|
309 | (1) |
|
|
309 | (6) |
|
|
310 | (1) |
|
|
311 | (4) |
|
21 Mortality Prediction in the ICU |
|
|
315 | (10) |
|
|
315 | (1) |
|
|
316 | (1) |
|
|
317 | (1) |
|
|
318 | (1) |
|
|
319 | (1) |
|
|
319 | (2) |
|
|
321 | (1) |
|
|
321 | (1) |
|
|
322 | (3) |
|
|
323 | (1) |
|
|
323 | (2) |
|
22 Data Fusion Techniques for Early Warning of Clinical Deterioration |
|
|
325 | (14) |
|
|
325 | (1) |
|
|
326 | (1) |
|
|
327 | (1) |
|
|
328 | (2) |
|
|
330 | (3) |
|
|
333 | (2) |
|
|
335 | (1) |
|
|
335 | (1) |
|
22.9 Personalised Prediction of Deteriorations |
|
|
336 | (3) |
|
|
337 | (1) |
|
|
337 | (2) |
|
23 Comparative Effectiveness: Propensity Score Analysis |
|
|
339 | (12) |
|
23.1 Incentives for Using Propensity Score Analysis |
|
|
339 | (1) |
|
23.2 Concerns for Using Propensity Score |
|
|
340 | (1) |
|
23.3 Different Approaches for Estimating Propensity Scores |
|
|
340 | (1) |
|
23.4 Using Propensity Score to Adjust for Pre-treatment Conditions |
|
|
341 | (2) |
|
23.5 Study Pre-processing |
|
|
343 | (3) |
|
|
346 | (1) |
|
|
346 | (1) |
|
|
347 | (1) |
|
|
347 | (4) |
|
|
348 | (1) |
|
|
348 | (3) |
|
24 Markov Models and Cost Effectiveness Analysis: Applications in Medical Research |
|
|
351 | (18) |
|
|
351 | (1) |
|
24.2 Formalization of Common Markov Models |
|
|
352 | (4) |
|
|
352 | (1) |
|
24.2.2 Exploring Markov Chains with Monte Carlo Simulations |
|
|
353 | (2) |
|
24.2.3 Markov Decision Process and Hidden Markov Models |
|
|
355 | (1) |
|
24.2.4 Medical Applications of Markov Models |
|
|
356 | (1) |
|
24.3 Basics of Health Economics |
|
|
356 | (3) |
|
24.3.1 The Goal of Health Economics: Maximizing Cost-Effectiveness |
|
|
356 | (1) |
|
|
357 | (2) |
|
24.4 Case Study: Monte Carlo Simulations of a Markov Chain for Daily Sedation Holds in Intensive Care, with Cost-Effectiveness Analysis |
|
|
359 | (5) |
|
24.5 Model Validation and Sensitivity Analysis for Cost-Effectiveness Analysis |
|
|
364 | (1) |
|
|
365 | (1) |
|
|
366 | (3) |
|
|
366 | (1) |
|
|
366 | (3) |
|
25 Blood Pressure and the Risk of Acute Kidney Injury in the ICU: Case-Control Versus Case-Crossover Designs |
|
|
369 | (8) |
|
|
369 | (1) |
|
|
370 | (4) |
|
25.2.1 Data Pre-processing |
|
|
370 | (1) |
|
25.2.2 A Case-Control Study |
|
|
370 | (2) |
|
25.2.3 A Case-Crossover Design |
|
|
372 | (2) |
|
|
374 | (1) |
|
|
374 | (3) |
|
|
375 | (1) |
|
|
375 | (2) |
|
26 Waveform Analysis to Estimate Respiratory Rate |
|
|
377 | (14) |
|
|
377 | (1) |
|
|
378 | (2) |
|
|
380 | (1) |
|
|
381 | (3) |
|
|
384 | (1) |
|
|
385 | (1) |
|
|
386 | (1) |
|
|
386 | (1) |
|
26.9 Non-contact Vital Sign Estimation |
|
|
387 | (4) |
|
|
388 | (1) |
|
|
389 | (2) |
|
27 Signal Processing: False Alarm Reduction |
|
|
391 | (14) |
|
|
391 | (2) |
|
|
393 | (1) |
|
27.3 Study Pre-processing |
|
|
394 | (1) |
|
|
395 | (2) |
|
|
397 | (1) |
|
27.6 Study Visualizations |
|
|
398 | (1) |
|
|
399 | (1) |
|
27.8 Next Steps/Potential Follow-Up Studies |
|
|
400 | (5) |
|
|
401 | (4) |
|
28 Improving Patient Cohort Identification Using Natural Language Processing |
|
|
405 | (14) |
|
|
405 | (2) |
|
|
407 | (3) |
|
28.2.1 Study Dataset and Pre-processing |
|
|
407 | (1) |
|
28.2.2 Structured Data Extraction from MIMIC-III Tables |
|
|
408 | (1) |
|
28.2.3 Unstructured Data Extraction from Clinical Notes |
|
|
409 | (1) |
|
|
410 | (1) |
|
|
410 | (3) |
|
|
413 | (1) |
|
|
414 | (5) |
|
|
414 | (1) |
|
|
415 | (4) |
|
29 Hyperparameter Selection |
|
|
419 | |
|
|
419 | (1) |
|
|
420 | (1) |
|
|
420 | (3) |
|
|
423 | (1) |
|
29.5 Study Visualizations |
|
|
424 | (1) |
|
|
425 | (1) |
|
|
425 | (1) |
|
|
426 | |
|
|
427 | |