Preface  xv

1 Introduction  1
|
2 The Lasso for Linear Models  7
2.1 Introduction  7
2.2 The Lasso Estimator  8
2.3 Cross-Validation and Inference  13
2.4 Computation of the Lasso Solution  14
2.4.1 Single Predictor: Soft Thresholding  15
2.4.2 Multiple Predictors: Cyclic Coordinate Descent  15
2.4.3 Soft-Thresholding and Orthogonal Bases  17
2.5 Degrees of Freedom  17
2.6 Uniqueness of the Lasso Solutions  19
2.7 A Glimpse at the Theory  20
2.8 The Nonnegative Garrote  20
2.9 ℓq Penalties and Bayes Estimates  22
2.10 Some Perspective  23
Exercises  24
|
3 Generalized Linear Models  29
3.1 Introduction  29
3.2 Logistic Regression  31
3.2.1 Example: Document Classification  32
3.2.2 Algorithms  35
3.3 Multiclass Logistic Regression  36
3.3.1 Example: Handwritten Digits  37
3.3.2 Algorithms  39
3.3.3 Grouped-Lasso Multinomial  39
3.4 Log-Linear Models and the Poisson GLM  40
3.4.1 Example: Distribution Smoothing  40
3.5 Cox Proportional Hazards Models  42
3.5.1 Cross-Validation  43
3.5.2 Pre-Validation  45
3.6 Support Vector Machines  46
3.6.1 Logistic Regression with Separable Data  49
3.7 Computational Details and glmnet  50
Bibliographic Notes  52
Exercises  53
|
4 Generalizations of the Lasso Penalty  55
4.1 Introduction  55
4.2 The Elastic Net  56
4.3 The Group Lasso  58
4.3.1 Computation for the Group Lasso  62
4.3.2 Sparse Group Lasso  64
4.3.3 The Overlap Group Lasso  65
4.4 Sparse Additive Models and the Group Lasso  69
4.4.1 Additive Models and Backfitting  69
4.4.2 Sparse Additive Models and Backfitting  70
4.4.3 Approaches Using Optimization and the Group Lasso  72
4.4.4 Multiple Penalization for Sparse Additive Models  74
4.5 The Fused Lasso  76
4.5.1 Fitting the Fused Lasso  77
4.5.1.1 Reparametrization  78
4.5.1.2 A Path Algorithm  79
4.5.1.3 A Dual Path Algorithm  79
4.5.1.4 Dynamic Programming for the Fused Lasso  80
4.5.2 Trend Filtering  81
4.5.3 Nearly Isotonic Regression  83
4.6 Nonconvex Penalties  84
Bibliographic Notes  86
Exercises  88
|
5 Optimization Methods  95
5.1 Introduction  95
5.2 Convex Optimality Conditions  95
5.2.1 Optimality for Differentiable Problems  95
5.2.2 Nondifferentiable Functions and Subgradients  98
5.3 Gradient Descent  100
5.3.1 Unconstrained Gradient Descent  101
5.3.2 Projected Gradient Methods  102
5.3.3 Proximal Gradient Methods  103
5.3.4 Accelerated Gradient Methods  107
5.4 Coordinate Descent  109
5.4.1 Separability and Coordinate Descent  110
5.4.2 Linear Regression and the Lasso  112
5.4.3 Logistic Regression and Generalized Linear Models  115
5.5 A Simulation Study  117
5.6 Least Angle Regression  118
5.7 Alternating Direction Method of Multipliers  121
5.8 Minorization-Maximization Algorithms  123
5.9 Biconvexity and Alternating Minimization  124
5.10 Screening Rules  127
Bibliographic Notes  131
Appendix  132
Exercises  134
|
6 Statistical Inference  139
6.1 The Bayesian Lasso  139
6.2 The Bootstrap  142
6.3 Post-Selection Inference for the Lasso  147
6.3.1 The Covariance Test  147
6.3.2 A General Scheme for Post-Selection Inference  150
6.3.2.1 Fixed-λ Inference for the Lasso  154
6.3.2.2 The Spacing Test for LAR  156
6.3.3 What Hypothesis Is Being Tested?  157
6.3.4 Back to Forward Stepwise Regression  158
6.4 Inference via a Debiased Lasso  158
6.5 Other Proposals for Post-Selection Inference  160
Bibliographic Notes  161
Exercises  162
|
7 Matrix Decompositions, Approximations, and Completion  167
7.1 Introduction  167
7.2 The Singular Value Decomposition  169
7.3 Missing Data and Matrix Completion  169
7.3.1 The Netflix Movie Challenge  170
7.3.2 Matrix Completion Using Nuclear Norm  174
7.3.3 Theoretical Results for Matrix Completion  177
7.3.4 Maximum Margin Factorization and Related Methods  181
7.4 Reduced-Rank Regression  184
7.5 A General Matrix Regression Framework  185
7.6 Penalized Matrix Decomposition  187
7.7 Additive Matrix Decomposition  190
Bibliographic Notes  195
Exercises  196
|
8 Sparse Multivariate Methods  201
8.1 Introduction  201
8.2 Sparse Principal Components Analysis  202
8.2.1 Some Background  202
8.2.2 Sparse Principal Components  204
8.2.2.1 Sparsity from Maximum Variance  204
8.2.2.2 Methods Based on Reconstruction  206
8.2.3 Higher-Rank Solutions  207
8.2.3.1 Illustrative Application of Sparse PCA  209
8.2.4 Sparse PCA via Fantope Projection  210
8.2.5 Sparse Autoencoders and Deep Learning  210
8.2.6 Some Theory for Sparse PCA  212
8.3 Sparse Canonical Correlation Analysis  213
8.3.1 Example: Netflix Movie Rating Data  215
8.4 Sparse Linear Discriminant Analysis  217
8.4.1 Normal Theory and Bayes' Rule  217
8.4.2 Nearest Shrunken Centroids  218
8.4.3 Fisher's Linear Discriminant Analysis  221
8.4.3.1 Example: Simulated Data with Five Classes  222
8.4.4 Optimal Scoring  225
8.4.4.1 Example: Face Silhouettes  226
8.5 Sparse Clustering  227
8.5.1 Some Background on Clustering  227
8.5.1.1 Example: Simulated Data with Six Classes  228
8.5.2 Sparse Hierarchical Clustering  228
8.5.3 Sparse K-Means Clustering  230
8.5.4 Convex Clustering  231
Bibliographic Notes  232
Exercises  234
|
9 Graphs and Model Selection  241
9.1 Introduction  241
9.2 Basics of Graphical Models  241
9.2.1 Factorization and Markov Properties  241
9.2.1.1 Factorization Property  242
9.2.1.2 Markov Property  243
9.2.1.3 Equivalence of Factorization and Markov Properties  243
9.2.2 Some Examples  244
9.2.2.1 Discrete Graphical Models  244
9.2.2.2 Gaussian Graphical Models  245
9.3 Graph Selection via Penalized Likelihood  246
9.3.1 Global Likelihoods for Gaussian Models  247
9.3.2 Graphical Lasso Algorithm  248
9.3.3 Exploiting Block-Diagonal Structure  251
9.3.4 Theoretical Guarantees for the Graphical Lasso  252
9.3.5 Global Likelihood for Discrete Models  253
9.4 Graph Selection via Conditional Inference  254
9.4.1 Neighborhood-Based Likelihood for Gaussians  255
9.4.2 Neighborhood-Based Likelihood for Discrete Models  256
9.4.3 Pseudo-Likelihood for Mixed Models  259
9.5 Graphical Models with Hidden Variables  261
Bibliographic Notes  261
Exercises  263
|
10 Signal Approximation and Compressed Sensing  269
10.1 Introduction  269
10.2 Signals and Sparse Representations  269
10.2.1 Orthogonal Bases  269
10.2.2 Approximation in Orthogonal Bases  271
10.2.3 Reconstruction in Overcomplete Bases  274
10.3 Random Projection and Approximation  276
10.3.1 Johnson-Lindenstrauss Approximation  277
10.3.2 Compressed Sensing  278
10.4 Equivalence between ℓ0 and ℓ1 Recovery  280
10.4.1 Restricted Nullspace Property  281
10.4.2 Sufficient Conditions for Restricted Nullspace  282
10.4.3 Proofs  284
10.4.3.1 Proof of Theorem 10.1  284
10.4.3.2 Proof of Proposition 10.1  284
Bibliographic Notes  285
Exercises  286
|
11 Theoretical Results for the Lasso  289
11.1 Introduction  289
11.1.1 Types of Loss Functions  289
11.1.2 Types of Sparsity Models  290
11.2 Bounds on Lasso ℓ2-Error  291
11.2.1 Strong Convexity in the Classical Setting  291
11.2.2 Restricted Eigenvalues for Regression  293
11.2.3 A Basic Consistency Result  294
11.3 Bounds on Prediction Error  299
11.4 Support Recovery in Linear Regression  301
11.4.1 Variable-Selection Consistency for the Lasso  301
11.4.1.1 Some Numerical Studies  303
11.4.2 Proof of Theorem 11.3  305
11.5 Beyond the Basic Lasso  310
Bibliographic Notes  311
Exercises  312

Bibliography  315
Author Index  337
Index  343