E-book: R for Everyone: Advanced Analytics and Graphics

  • Format: PDF+DRM
  • Price: 37.43 €*
  • * the price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

Using the open source R language, you can build powerful statistical models to answer many of your most challenging questions. R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone is the solution.

Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you’ll need to accomplish 80 percent of modern data tasks. Lander’s self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You’ll download and install R; navigate and use the R environment; master basic program control, data import, and manipulation; and walk through several essential tests. Then, building on this foundation, you’ll construct several complete models, both linear and nonlinear, and use some data mining techniques.

By the time you’re done, you won’t just know how to write R programs, you’ll be ready to tackle the statistical problems you care about most.
Coverage Includes:

  • Exploring R, RStudio, and R packages
  • Using R for math: variable types, vectors, calling functions, and more
  • Exploiting data structures, including data.frames, matrices, and lists
  • Creating attractive, intuitive statistical graphics
  • Writing user-defined functions
  • Controlling program flow with if, ifelse, and complex checks
  • Improving program efficiency with group manipulations
  • Combining and reshaping multiple datasets
  • Manipulating strings using R’s facilities and regular expressions
  • Creating normal, binomial, and Poisson probability distributions
  • Programming basic statistics: mean, standard deviation, and t-tests
  • Building linear, generalized linear, and nonlinear models
  • Assessing the quality of models and variable selection
  • Preventing overfitting, using the Elastic Net and Bayesian methods
  • Analyzing univariate and multivariate time series data
  • Grouping data via K-means and hierarchical clustering
  • Preparing reports, slideshows, and web pages with knitr
  • Building reusable R packages with devtools and Rcpp
  • Getting involved with the R global community
Table of contents

Foreword
Preface
Acknowledgments
About the Author
1 Getting R
  1.1 Downloading R
  1.2 R Version
  1.3 32-bit vs. 64-bit
  1.4 Installing
  1.5 Microsoft R Open
  1.6 Conclusion
2 The R Environment
  2.1 Command Line Interface
  2.2 RStudio
  2.3 Microsoft Visual Studio
  2.4 Conclusion
3 R Packages
  3.1 Installing Packages
  3.2 Loading Packages
  3.3 Building a Package
  3.4 Conclusion
4 Basics of R
  4.1 Basic Math
  4.2 Variables
  4.3 Data Types
  4.4 Vectors
  4.5 Calling Functions
  4.6 Function Documentation
  4.7 Missing Data
  4.8 Pipes
  4.9 Conclusion
5 Advanced Data Structures
  5.1 data.frames
  5.2 Lists
  5.3 Matrices
  5.4 Arrays
  5.5 Conclusion
6 Reading Data into R
  6.1 Reading CSVs
  6.2 Excel Data
  6.3 Reading from Databases
  6.4 Data from Other Statistical Tools
  6.5 R Binary Files
  6.6 Data Included with R
  6.7 Extract Data from Web Sites
  6.8 Reading JSON Data
  6.9 Conclusion
7 Statistical Graphics
  7.1 Base Graphics
  7.2 ggplot2
  7.3 Conclusion
8 Writing R functions
  8.1 Hello, World!
  8.2 Function Arguments
  8.3 Return Values
  8.4 do.call
  8.5 Conclusion
9 Control Statements
  9.1 if and else
  9.2 switch
  9.3 ifelse
  9.4 Compound Tests
  9.5 Conclusion
10 Loops, the Un-R Way to Iterate
  10.1 for Loops
  10.2 while Loops
  10.3 Controlling Loops
  10.4 Conclusion
11 Group Manipulation
  11.1 Apply Family
  11.2 aggregate
  11.3 plyr
  11.4 data.table
  11.5 Conclusion
12 Faster Group Manipulation with dplyr
  12.1 Pipes
  12.2 tbl
  12.3 select
  12.4 filter
  12.5 slice
  12.6 mutate
  12.7 summarize
  12.8 group_by
  12.9 arrange
  12.10 do
  12.11 dplyr with Databases
  12.12 Conclusion
13 Iterating with purrr
  13.1 map
  13.2 map with Specified Types
  13.3 Iterating over a data.frame
  13.4 map with Multiple Inputs
  13.5 Conclusion
14 Data Reshaping
  14.1 cbind and rbind
  14.2 Joins
  14.3 reshape2
  14.4 Conclusion
15 Reshaping Data in the Tidyverse
  15.1 Binding Rows and Columns
  15.2 Joins with dplyr
  15.3 Converting Data Formats
  15.4 Conclusion
16 Manipulating Strings
  16.1 paste
  16.2 sprintf
  16.3 Extracting Text
  16.4 Regular Expressions
  16.5 Conclusion
17 Probability Distributions
  17.1 Normal Distribution
  17.2 Binomial Distribution
  17.3 Poisson Distribution
  17.4 Other Distributions
  17.5 Conclusion
18 Basic Statistics
  18.1 Summary Statistics
  18.2 Correlation and Covariance
  18.3 T-Tests
  18.4 ANOVA
  18.5 Conclusion
19 Linear Models
  19.1 Simple Linear Regression
  19.2 Multiple Regression
  19.3 Conclusion
20 Generalized Linear Models
  20.1 Logistic Regression
  20.2 Poisson Regression
  20.3 Other Generalized Linear Models
  20.4 Survival Analysis
  20.5 Conclusion
21 Model Diagnostics
  21.1 Residuals
  21.2 Comparing Models
  21.3 Cross-Validation
  21.4 Bootstrap
  21.5 Stepwise Variable Selection
  21.6 Conclusion
22 Regularization and Shrinkage
  22.1 Elastic Net
  22.2 Bayesian Shrinkage
  22.3 Conclusion
23 Nonlinear Models
  23.1 Nonlinear Least Squares
  23.2 Splines
  23.3 Generalized Additive Models
  23.4 Decision Trees
  23.5 Boosted Trees
  23.6 Random Forests
  23.7 Conclusion
24 Time Series and Autocorrelation
  24.1 Autoregressive Moving Average
  24.2 VAR
  24.3 GARCH
  24.4 Conclusion
25 Clustering
  25.1 K-means
  25.2 PAM
  25.3 Hierarchical Clustering
  25.4 Conclusion
26 Model Fitting with Caret
  26.1 Caret Basics
  26.2 Caret Options
  26.3 Tuning a Boosted Tree
  26.4 Conclusion
27 Reproducibility and Reports with knitr
  27.1 Installing a LaTeX Program
  27.2 LaTeX Primer
  27.3 Using knitr with LaTeX
  27.4 Conclusion
28 Rich Documents with RMarkdown
  28.1 Document Compilation
  28.2 Document Header
  28.3 Markdown Primer
  28.4 Markdown Code Chunks
  28.5 htmlwidgets
  28.6 RMarkdown Slideshows
  28.7 Conclusion
29 Interactive Dashboards with Shiny
  29.1 Shiny in RMarkdown
  29.2 Reactive Expressions in Shiny
  29.3 Server and UI
  29.4 Conclusion
30 Building R Packages
  30.1 Folder Structure
  30.2 Package Files
  30.3 Package Documentation
  30.4 Tests
  30.5 Checking, Building and Installing
  30.6 Submitting to CRAN
  30.7 C++ Code
  30.8 Conclusion
A Real-Life Resources
  A.1 Meetups
  A.2 Stack Overflow
  A.3 Twitter
  A.4 Conferences
  A.5 Web Sites
  A.6 Documents
  A.7 Books
  A.8 Conclusion
B Glossary
List of Figures
List of Tables
General Index
Index of Functions
Index of Packages
Index of People
Data Index

Jared P. Lander is the owner of Lander Analytics, a statistical consulting firm based in New York City, the organizer of the New York Open Statistical Programming Meetup, and an adjunct professor of statistics at Columbia University. He is also a tour guide for Scott's Pizza Tours and an advisor to Brewla Bars, a gourmet ice pop startup. With an M.A. in statistics from Columbia University and a B.A. in mathematics from Muhlenberg College, he has experience in both academic research and industry. His work for both large and small organizations spans politics, tech startups, fundraising, music, finance, healthcare, and humanitarian relief efforts. He specializes in data management, multilevel models, machine learning, generalized linear models, visualization, and statistical computing.