About the Author | xv
About the Technical Reviewer | xvii
Acknowledgments | xix
Introduction | xxi
Chapter 1 Introduction to R Programming | 1 | (50)
| 1 | (2)
| 3 | (10)
| 4 | (2)
| 6 | (3)
| 9 | (2)
| 11 | (2)
| 13 | (1)
| 13 | (13)
Getting Documentation for Functions | 14 | (2)
Writing Your Own Functions | 16 | (1)
Summarizing and Vector Functions | 17 | (3)
A Quick Look at Control Flow | 20 | (6)
| 26 | (6)
| 32 | (4)
| 36 | (1)
Dealing with Missing Values | 37 | (1)
| 38 | (12)
Writing Pipelines of Function Calls | 39 | (2)
Writing Functions That Work with Pipelines | 41 | (1)
| 42 | (5)
Other Pipeline Operations | 47 | (2)
Coding and Naming Conventions | 49 | (1)
| 50 | (1)
| 50 | (1)
| 50 | (1)
Chapter 2 Reproducible Analysis | 51 | (22)
Literate Programming and Integration of Workflow and Documentation | 52 | (1)
Creating an R Markdown/knitr Document in RStudio | 53 | (4)
| 57 | (2)
| 59 | (7)
| 60 | (4)
| 64 | (1)
| 65 | (1)
Controlling the Output (Templates/Stylesheets) | 66 | (1)
Running R Code in Markdown Documents | 66 | (6)
Using Chunks When Analyzing Data (Without Compiling Documents) | 69 | (1)
| 70 | (1)
| 71 | (1)
| 72 | (1)
Create an R Markdown Document | 72 | (1)
| 72 | (1)
| 72 | (1)
Chapter 3 Data Manipulation | 73 | (48)
| 73 | (2)
| 75 | (2)
| 77 | (2)
Examples of Reading and Formatting Data Sets | 79 | (13)
| 79 | (8)
| 87 | (3)
| 90 | (2)
Manipulating Data with dplyr | 92 | (26)
Some Useful dplyr Functions | 94 | (12)
Breast Cancer Data Manipulation | 106 | (4)
| 110 | (8)
| 118 | (3)
| 118 | (1)
| 119 | (1)
| 119 | (2)
Chapter 4 Visualizing Data | 121 | (40)
| 121 | (7)
The Grammar of Graphics and the ggplot2 Package | 128 | (13)
| 129 | (4)
| 133 | (8)
| 141 | (4)
| 145 | (11)
Themes and Other Graphics Transformations | 151 | (5)
Figures with Multiple Plots | 156 | (4)
| 160 | (1)
Chapter 5 Working with Large Data Sets | 161 | (18)
Subsample Your Data Before You Analyze the Full Data Set | 162 | (2)
Running Out of Memory During an Analysis | 164 | (2)
| 166 | (5)
| 171 | (2)
| 173 | (4)
| 177 | (2)
| 177 | (1)
| 177 | (2)
Chapter 6 Supervised Learning | 179 | (60)
| 179 | (1)
| 180 | (3)
Regression vs. Classification | 181 | (1)
| 182 | (1)
| 183 | (21)
| 183 | (6)
Logistic Regression (Classification, Really) | 189 | (5)
Model Matrices and Formula | 194 | (10)
| 204 | (14)
Evaluating Regression Models | 206 | (3)
Evaluating Classification Models | 209 | (1)
| 210 | (3)
| 213 | (2)
Sensitivity and Specificity | 215 | (1)
| 216 | (2)
| 218 | (1)
| 218 | (11)
Random Permutations of Your Data | 219 | (4)
| 223 | (4)
Selecting Random Training and Testing Data | 227 | (2)
Examples of Supervised Learning Packages | 229 | (6)
| 230 | (2)
| 232 | (1)
| 233 | (2)
| 235 | (1)
| 235 | (1)
| 236 | (3)
| 236 | (1)
Evaluating Different Classification Measures | 236 | (1)
Breast Cancer Classification | 237 | (1)
Leave-One-Out Cross-Validation (Slightly More Difficult) | 237 | (1)
| 237 | (1)
| 237 | (1)
| 238 | (1)
| 238 | (1)
Compare Classification Algorithms | 238 | (1)
Chapter 7 Unsupervised Learning | 239 | (36)
| 239 | (16)
Principal Component Analysis | 240 | (10)
| 250 | (5)
| 255 | (12)
| 255 | (8)
| 263 | (4)
| 267 | (6)
| 273 | (2)
Dealing with Missing Data in the HouseVotes84 Data | 273 | (1)
| 274 | (1)
Chapter 8 Project 1: Hitting the Bottle | 275 | (12)
| 275 | (1)
| 276 | (6)
Distribution of Quality Scores | 276 | (1)
Is This Wine Red or White? | 277 | (5)
| 282 | (3)
| 285 | (2)
| 285 | (1)
Exploring Different Models | 285 | (1)
Analyzing Your Own Data Set | 285 | (2)
Chapter 9 Deeper into R Programming | 287 | (42)
| 287 | (3)
| 287 | (2)
| 289 | (1)
| 290 | (4)
| 291 | (1)
| 291 | (1)
| 292 | (1)
| 292 | (1)
| 293 | (1)
| 294 | (12)
| 294 | (2)
| 296 | (2)
| 298 | (2)
| 300 | (4)
| 304 | (1)
| 305 | (1)
| 305 | (1)
| 306 | (5)
| 306 | (1)
| 307 | (4)
| 311 | (11)
| 312 | (1)
| 313 | (1)
| 314 | (1)
| 315 | (2)
| 317 | (5)
Function Names Are Different from Variable Names | 322 | (1)
| 322 | (3)
| 325 | (4)
| 325 | (1)
| 325 | (1)
| 325 | (1)
| 326 | (1)
| 326 | (1)
Selecting the K Smallest Element | 327 | (2)
Chapter 10 Working with Vectors and Lists | 329 | (20)
Working with Vectors and Vectorizing Functions | 329 | (13)
| 332 | (1)
| 332 | (3)
| 335 | (1)
| 336 | (3)
Nothing Good, It Would Seem | 339 | (1)
| 340 | (2)
| 342 | (1)
| 342 | (5)
| 342 | (1)
| 343 | (1)
| 344 | (3)
How Mutable Is Data Anyway? | 347 | (1)
| 348 | (1)
| 348 | (1)
| 348 | (1)
Chapter 11 Functional Programming | 349 | (24)
| 349 | (2)
| 351 | (6)
Functions Taking Functions As Arguments | 351 | (1)
Functions Returning Functions (and Closures) | 352 | (5)
| 357 | (3)
Functional Programming with purrr | 360 | (3)
Functions As Both Input and Output | 363 | (7)
| 368 | (2)
| 370 | (3)
| 370 | (1)
| 370 | (1)
| 370 | (1)
| 370 | (1)
| 371 | (1)
| 371 | (2)
Chapter 12 Object-Oriented Programming | 373 | (18)
Immutable Objects and Polymorphic Functions | 373 | (1)
| 374 | (2)
Example: Bayesian Linear Model Fitting | 374 | (2)
| 376 | (3)
| 379 | (3)
Defining Your Own Polymorphic Functions | 380 | (2)
| 382 | (6)
Specialization As Interface | 383 | (1)
Specialization in Implementations | 384 | (4)
| 388 | (3)
| 388 | (1)
| 389 | (2)
Chapter 13 Building an R Package | 391 | (18)
| 391 | (2)
| 392 | (1)
The Structure of an R Package | 392 | (1)
| 393 | (1)
| 393 | (6)
| 394 | (1)
| 394 | (1)
| 395 | (1)
| 395 | (1)
| 396 | (1)
| 396 | (1)
| 396 | (1)
| 396 | (1)
Using an Imported Package | 397 | (1)
Using a Suggested Package | 398 | (1)
| 399 | (1)
| 400 | (1)
| 400 | (1)
| 401 | (4)
| 401 | (1)
| 402 | (2)
Package Scope vs. Global Scope | 404 | (1)
| 404 | (1)
| 404 | (1)
Adding Data to Your Package | 405 | (2)
| 406 | (1)
| 407 | (1)
| 407 | (2)
Chapter 14 Testing and Package Checking | 409 | (10)
| 409 | (3)
| 411 | (1)
| 412 | (5)
| 414 | (1)
Using Random Numbers in Tests | 415 | (1)
| 416 | (1)
Checking a Package for Consistency | 417 | (1)
| 417 | (2)
Chapter 15 Version Control | 419 | (22)
Version Control and Repositories | 419 | (1)
| 420 | (14)
| 421 | (1)
Making Changes to Files, Staging Files, and Committing Changes | 422 | (2)
Adding Git to an Existing Project | 424 | (1)
Bare Repositories and Cloning Repositories | 425 | (1)
Pushing Local Changes and Fetching and Pulling Remote Changes | 426 | (2)
| 428 | (1)
| 429 | (3)
Typical Workflows Involve Lots of Branches | 432 | (1)
Pushing Branches to the Global Repository | 433 | (1)
| 434 | (3)
Moving an Existing Repository to GitHub | 436 | (1)
Installing Packages from GitHub | 437 | (1)
| 437 | (3)
| 438 | (1)
Forking Repositories Instead of Cloning | 438 | (2)
| 440 | (1)
Chapter 16 Profiling and Optimizing | 441 | (30)
| 441 | (15)
| 442 | (14)
| 456 | (5)
| 461 | (5)
| 466 | (3)
| 469 | (2)
Chapter 17 Project 2: Bayesian Linear Regression | 471 | (30)
Bayesian Linear Regression | 471 | (7)
Exercises: Priors and Posteriors | 473 | (3)
Predicting Target Variables for New Predictor Values | 476 | (2)
Formulas and Their Model Matrix | 478 | (9)
Working with Model Matrices in R | 480 | (5)
| 485 | (1)
Model Matrices Without Response Variables | 485 | (2)
| 487 | (1)
| 487 | (10)
| 488 | (1)
Updating Distributions: An Example Interface | 489 | (5)
| 494 | (1)
| 494 | (3)
Building an R Package for blm | 497 | (3)
Deciding on the Package Interface | 497 | (1)
Organization of Source Files | 498 | (1)
Document Your Package Interface Well | 498 | (1)
Adding README and NEWS Files to Your Package | 499 | (1)
| 500 | (1)
| 500 | (1)
| 501 | (4)
| 501 | (1)
| 501 | (1)
| 502 | (1)
| 502 | (1)
| 503 | (2)
Index | 505