This in-depth, advanced guide shows you how to conduct data analysis in the popular R language, how practical programming can make your work more efficient through writing functions and packages, and how to automate running code and creating reports to share your results.
This book is designed neither to teach advanced R programming nor to teach the theory behind statistical procedures. Rather, it is a practical guide to moving beyond merely using R to programming in R in order to automate tasks.
The book also shows how to manipulate data in R, including connecting R to databases such as SQLite, PostgreSQL, and MongoDB, and how to run a variety of advanced statistical analyses, including generalized additive models, mixed effects models, multiple imputation, and machine learning techniques.
The book closes with a hands-on section to get R running in the cloud. Each chapter also includes a detailed bibliography with references to research articles and other resources that cover relevant conceptual and theoretical topics.
What You'll Learn:
• How to write and document R functions
• How to make an R package and share it via GitHub or privately
• How to add tests to R code to ensure it works as intended
• How to add automatic package building with GitHub
• How to have R talk directly to databases and do complex data management
• How to conduct advanced analyses in R, including generalized linear models, generalized additive models, and mixed effects models
• How to address missing data using multiple imputation in R
• How to run R in the Amazon cloud
• How to generate presentation-ready tables and reports using R
Audience:
Advanced R: Applied Programming and Data Analysis is intended for working professionals, researchers, and students who are familiar with R and with basic statistical techniques such as linear regression. It is for readers who want to take their R coding and programming to the next level, automate repetitive tasks, and use R to speed up their workflow, for example by reading data directly from the internet or by generating presentation-ready reports and tables directly from their model results.
Table of Contents:
About the Authors
About the Technical Reviewer
Acknowledgments
Introduction
Chapter 1: Programming Basics
• Advanced R Software Choices
• Base Operators and Functions
• Mathematical Operators and Functions
Chapter 2: Programming Utilities
Chapter 3: Programming Automation
• *apply Family of Functions
Chapter 4: Writing Functions
Chapter 5: Writing Classes and Methods
Chapter 6: Writing a Package
• Starting a Package by Using DevTools
• Documentation Using roxygen2
• Building, Installing, and Distributing an R Package
Chapter 7: Introduction to Data Management Using data.table
• Introduction to data.table
• Selecting and Subsetting Data
• Using the Second and Third Formals
• Variable Renaming and Ordering
• Computing on Data and Creating Variables
• Merging and Reshaping Data
Chapter 8: Data Munging with data.table
Chapter 9: Other Tools for Data Management
• Variable Renaming and Ordering
• Computing on Data and Creating Variables
• Merging and Reshaping Data
Chapter 10: Reading Big Data(bases)
• Installing SQLite on Windows
• Installing PostgreSQL on Windows
• Installing MongoDB on Windows
Chapter 11: Getting a Cloud
• Starting Amazon Web Services
• Accessing Your Instance's Command Line
• Uploading Files to Your Instance
Chapter 12: Cloud Ubuntu for Windows Users
• Installing and Using RStudio Server
• Installing Shiny on Your Cloud
Chapter 13: Every Cloud has a Shiny Lining
• Uploading a User File into Shiny
• Hosting Shiny in the Cloud
Chapter 14: Shiny Dashboard Sampler
Chapter 15: Dynamic Reports and the Cloud
• Dynamic Documents and Shiny
References
Index
About the Authors:
Joshua F. Wiley is a lecturer in the Monash Institute for Cognitive and Clinical Neurosciences and the School of Psychological Sciences at Monash University, and a senior partner at Elkhart Group Limited, a statistical consultancy. He earned his PhD from the University of California, Los Angeles, and his research focuses on using advanced quantitative methods to understand the complex interplay of psychological, social, and physiological processes in relation to psychological and physical health. In statistics and data science, Joshua focuses on biostatistics and is interested in reproducible research and graphical displays of data and statistical models. Through consulting at Elkhart Group Limited and former work at the UCLA Statistical Consulting Group, he has supported a wide array of clients, ranging from graduate students to experienced researchers and biotechnology companies. He also develops or co-develops a number of R packages, including varian, a package to conduct Bayesian scale-location structural equation models, and MplusAutomation, a popular package that links R to the commercial Mplus software.
Matt Wiley is a tenured associate professor of mathematics with awards in both mathematics education and honour student engagement. He earned degrees in pure mathematics, computer science, and business administration through the University of California and Texas A&M systems. He serves as director for Victoria College's quality enhancement plan and as managing partner at Elkhart Group Limited, a statistical consultancy. With programming experience in R, C++, Ruby, Fortran, and JavaScript, he has always found ways to meld his passion for writing with his joy of logical problem solving and data science. From the boardroom to the classroom, Matt enjoys finding dynamic ways to partner with interdisciplinary and diverse teams to make complex ideas and projects understandable and solvable.