E-book: A Computational Approach to Statistical Learning

Michael Kane (Yale University, New Haven, Connecticut, USA), Taylor Arnold (University of Richmond, Richmond, VA, USA)
  • Format: EPUB+DRM
  • Price: 59,79 €*
  • * the price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You will also need to create an Adobe ID (more information here). The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), install this free app: PocketBook Reader (iOS / Android).

    To read on a PC or Mac, install Adobe Digital Editions. (This is a free application designed specifically for reading e-books. It should not be confused with Adobe Reader, which is most likely already installed on your computer.)

    This e-book cannot be read on an Amazon Kindle.

A Computational Approach to Statistical Learning gives a novel introduction to predictive modeling by focusing on the algorithmic and numeric motivations behind popular statistical methods. The text contains annotated code for over 80 original reference functions. These functions provide minimal working implementations of common statistical learning algorithms. Every chapter concludes with a fully worked-out application that illustrates predictive modeling tasks using a real-world dataset.

The text begins with a detailed analysis of linear models and ordinary least squares. Subsequent chapters explore extensions such as ridge regression, generalized linear models, and additive models. The second half focuses on the use of general-purpose algorithms for convex optimization and their application to tasks in statistical learning. Models covered include the elastic net, dense neural networks, convolutional neural networks (CNNs), and spectral clustering. A unifying theme throughout the text is the use of optimization theory in the description of predictive models, with a particular focus on the singular value decomposition (SVD). Through this theme, the computational approach motivates and clarifies the relationships between various predictive models.
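The SVD theme the blurb highlights can be made concrete in a few lines of R. The sketch below is illustrative only and is not taken from the book: the function name ols_svd and the simulated data are hypothetical, but the technique (solving ordinary least squares through the singular value decomposition rather than the normal equations) is the one described above.

    # Minimal sketch: OLS via the singular value decomposition.
    # ols_svd is a hypothetical name, not one of the book's reference functions.
    ols_svd <- function(X, y) {
      s <- svd(X)                        # X = U D V^T
      # beta = V D^{-1} U^T y, the pseudoinverse solution; numerically
      # more stable than forming X^T X beta = X^T y directly.
      s$v %*% (crossprod(s$u, y) / s$d)
    }

    set.seed(1)
    X <- cbind(1, matrix(rnorm(300), ncol = 3))  # intercept + 3 predictors
    y <- X %*% c(2, -1, 0.5, 3) + rnorm(100)
    beta_hat <- ols_svd(X, y)
    # Agrees with R's QR-based least squares solver up to numerical error:
    max(abs(beta_hat - qr.solve(X, y)))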

Reviews

"As best as I can determine, A Computational Approach to Statistical Learning (CASL) is unique among R books devoted to statistical learning and data science. Other popular textscover much of the same ground, and include extensive R code implementing statistical models. What makes CASL different is the unifying mathematical structure underlying the presentation and the focus on the computations themselvesCASLs great strengths are the use linear algebra to provide a coherent, unifying mathematical framework for explaining a wide class of models, a lucid writing style that appeals to geometric intuition, clear explanations of many details that are mostly glossed over in more superficial treatments, the inclusion of historical references, and R code that is tightly integrated into the text. The R code is extensive, concise without being opaque, and in many cases, elegant. The code illustrates Rs advantages for developing statistical algorithms as well as its power to present versatile and compelling visualizationsCASL ought to appeal to anyone working in data science or machine learning seeking a sophisticated understanding of both the theoretical basis and efficient algorithms underlying a modern approach to computational statistics." ~Joe Rickert, RStudio

"The literate programming style is my favorite part of this book (borrowing the term from Don Knuth). It would be well suited for an engineer seeking to understand the implementations and ideas behind these statistical models. Real code beats pseudocode, because one can easily tweak and experiment with itThe other part I especially like is the development of neural nets based on extending the models previously introduced in the text. This takes some of the mystery out of neural nets and makes them more accessible to a statistician studying them for the first time... I would happily buy this book for my own reference and self-study... Im not aware of any books that are written at this level that combines the motivation, the mathematics and the code in such a nice way. If I ever happen to be teaching a course on this material, then I would definitely teach from this book." ~Clark Fitzgerald, University of California, Davis

"I think the book is quite clearly written and covers really important things to consider that can help optimize model building. The book does a really great job of following its theme throughout and explicitly mentioning why they are explaining something the way they explain it. Reading the book, it is clear they considered how all the parts the included (at least the chapters I read) fit into the broader scope of the book's goal." ~Justin Post, North Carolina State University "As best as I can determine, A Computational Approach to Statistical Learning (CASL) is unique among R books devoted to statistical learning and data science. Other popular textscover much of the same ground, and include extensive R code implementing statistical models. What makes CASL different is the unifying mathematical structure underlying the presentation and the focus on the computations themselvesCASLs great strengths are the use linear algebra to provide a coherent, unifying mathematical framework for explaining a wide class of models, a lucid writing style that appeals to geometric intuition, clear explanations of many details that are mostly glossed over in more superficial treatments, the inclusion of historical references, and R code that is tightly integrated into the text. The R code is extensive, concise without being opaque, and in many cases, elegant. The code illustrates Rs advantages for developing statistical algorithms as well as its power to present versatile and compelling visualizationsCASL ought to appeal to anyone working in data science or machine learning seeking a sophisticated understanding of both the theoretical basis and efficient algorithms underlying a modern approach to computational statistics." ~Joe Rickert, RStudio

Table of Contents

Preface
1 Introduction
  1.1 Computational approach
  1.2 Statistical learning
  1.3 Example
  1.4 Prerequisites
  1.5 How to read this book
  1.6 Supplementary materials
  1.7 Formalisms and terminology
  1.8 Exercises
2 Linear Models
  2.1 Introduction
  2.2 Ordinary least squares
  2.3 The normal equations
  2.4 Solving least squares with the singular value decomposition
  2.5 Directly solving the linear system
  2.6 (*) Solving linear models using the QR decomposition
  2.7 (*) Sensitivity analysis
  2.8 (*) Relationship between numerical and statistical error
  2.9 Implementation and notes
  2.10 Application: Cancer incidence rates
  2.11 Exercises
3 Ridge Regression and Principal Component Analysis
  3.1 Variance in OLS
  3.2 Ridge regression
  3.3 (*) A Bayesian perspective
  3.4 Principal component analysis
  3.5 Implementation and notes
  3.6 Application: NYC taxicab data
  3.7 Exercises
4 Linear Smoothers
  4.1 Non-Linearity
  4.2 Basis expansion
  4.3 Kernel regression
  4.4 Local regression
  4.5 Regression splines
  4.6 (*) Smoothing splines
  4.7 (*) B-splines
  4.8 Implementation and notes
  4.9 Application: U.S. census tract data
  4.10 Exercises
5 Generalized Linear Models
  5.1 Classification with linear models
  5.2 Exponential families
  5.3 Iteratively reweighted GLMs
  5.4 (*) Numerical issues
  5.5 (*) Multi-class regression
  5.6 Implementation and notes
  5.7 Application: Chicago crime prediction
  5.8 Exercises
6 Additive Models
  6.1 Multivariate linear smoothers
  6.2 Curse of dimensionality
  6.3 Additive models
  6.4 (*) Additive models as linear models
  6.5 (*) Standard errors in additive models
  6.6 Implementation and notes
  6.7 Application: NYC flights data
  6.8 Exercises
7 Penalized Regression Models
  7.1 Variable selection
  7.2 Penalized regression with the l0- and l1-norms
  7.3 Orthogonal data matrix
  7.4 Convex optimization and the elastic net
  7.5 Coordinate descent
  7.6 (*) Active set screening using the KKT conditions
  7.7 (*) The generalized elastic net model
  7.8 Implementation and notes
  7.9 Application: Amazon product reviews
  7.10 Exercises
8 Neural Networks
  8.1 Dense neural network architecture
  8.2 Stochastic gradient descent
  8.3 Backward propagation of errors
  8.4 Implementing backpropagation
  8.5 Recognizing handwritten digits
  8.6 (*) Improving SGD and regularization
  8.7 (*) Classification with neural networks
  8.8 (*) Convolutional neural networks
  8.9 Implementation and notes
  8.10 Application: Image classification with EMNIST
  8.11 Exercises
9 Dimensionality Reduction
  9.1 Unsupervised learning
  9.2 Kernel functions
  9.3 Kernel principal component analysis
  9.4 Spectral clustering
  9.5 t-Distributed stochastic neighbor embedding (t-SNE)
  9.6 Autoencoders
  9.7 Implementation and notes
  9.8 Application: Classifying and visualizing fashion MNIST
  9.9 Exercises
10 Computation in Practice
  10.1 Reference implementations
  10.2 Sparse matrices
  10.3 Sparse generalized linear models
  10.4 Computation on row chunks
  10.5 Feature hashing
  10.6 Data quality issues
  10.7 Implementation and notes
  10.8 Application
  10.9 Exercises
A Linear Algebra and Matrices
  A.1 Vector spaces
  A.2 Matrices
B Floating Point Arithmetic and Numerical Computation
  B.1 Floating point arithmetic
  B.2 Computational effort
Bibliography
Index
About the Authors

Taylor Arnold is an assistant professor of statistics at the University of Richmond. His work at the intersection of computer vision, natural language processing, and digital humanities has been supported by multiple grants from the National Endowment for the Humanities (NEH) and the American Council of Learned Societies (ACLS). His first book, Humanities Data in R, was published in 2015.

Michael Kane is an assistant professor of biostatistics at Yale University. He is the recipient of grants from the National Institutes of Health (NIH), DARPA, and the Bill and Melinda Gates Foundation. His R package bigmemory won the Chambers prize for statistical software in 2010.

Bryan Lewis is an applied mathematician and author of many popular R packages, including irlba, doRedis, and threejs.