Preface  xiii
|
|
1 Introduction  1
1.1 What is machine learning?  1
1.2 What kind of problems can be tackled using machine learning?  2
1.3 Some standard learning tasks  3
1.4 Learning stages  4
1.5 Learning scenarios  6
1.6 Generalization  7
|
2 The PAC Learning Framework  9
2.1 The PAC learning model  9
2.2 Guarantees for finite hypothesis sets --- consistent case  15
2.3 Guarantees for finite hypothesis sets --- inconsistent case  19
2.4 Generalities  21
2.4.1 Deterministic versus stochastic scenarios  21
2.4.2 Bayes error and noise  22
2.5 Chapter notes  23
2.6 Exercises  23
|
3 Rademacher Complexity and VC-Dimension  29
3.1 Rademacher complexity  30
3.2 Growth function  34
3.3 VC-dimension  36
3.4 Lower bounds  43
3.5 Chapter notes  48
3.6 Exercises  50
|
|
4 Model Selection  61
4.1 Estimation and approximation errors  61
4.2 Empirical risk minimization (ERM)  62
4.3 Structural risk minimization (SRM)  64
4.4 Cross-validation  68
4.5 n-Fold cross-validation  71
4.6 Regularization-based algorithms  72
4.7 Convex surrogate losses  73
4.8 Chapter notes  77
4.9 Exercises  78
|
5 Support Vector Machines  79
5.1 Linear classification  79
5.2 Separable case  80
5.2.1 Primal optimization problem  81
5.2.2 Support vectors  83
5.2.3 Dual optimization problem  83
5.2.4 Leave-one-out analysis  85
5.3 Non-separable case  87
5.3.1 Primal optimization problem  88
5.3.2 Support vectors  89
5.3.3 Dual optimization problem  90
5.4 Margin theory  91
5.5 Chapter notes  100
5.6 Exercises  100
|
|
6 Kernel Methods  105
6.1 Introduction  105
6.2 Positive definite symmetric kernels  108
6.2.1 Definitions  108
6.2.2 Reproducing kernel Hilbert space  110
6.2.3 Properties  112
6.3 Kernel-based algorithms  116
6.3.1 SVMs with PDS kernels  116
6.3.2 Representer theorem  117
6.3.3 Learning guarantees  117
6.4 Negative definite symmetric kernels  119
6.5 Sequence kernels  121
6.5.1 Weighted transducers  122
6.5.2 Rational kernels  126
6.6 Approximate kernel feature maps  130
6.7 Chapter notes  135
6.8 Exercises  137
|
|
7 Boosting  145
7.1 Introduction  145
7.2 AdaBoost  146
7.2.1 Bound on the empirical error  149
7.2.2 Relationship with coordinate descent  150
7.2.3 Practical use  154
7.3 Theoretical results  154
7.3.1 VC-dimension-based analysis  154
7.3.2 L1-geometric margin  155
7.3.3 Margin-based analysis  157
7.3.4 Margin maximization  161
7.3.5 Game-theoretic interpretation  162
7.4 L1-regularization  165
7.5 Discussion  167
7.6 Chapter notes  168
7.7 Exercises  170
|
|
8 On-Line Learning  177
8.1 Introduction  178
8.2 Prediction with expert advice  178
8.2.1 Mistake bounds and Halving algorithm  179
8.2.2 Weighted majority algorithm  181
8.2.3 Randomized weighted majority algorithm  183
8.2.4 Exponential weighted average algorithm  186
8.3 Linear classification  190
8.3.1 Perceptron algorithm  190
8.3.2 Winnow algorithm  198
8.4 On-line to batch conversion  201
8.5 Game-theoretic connection  204
8.6 Chapter notes  205
8.7 Exercises  206
|
9 Multi-Class Classification  213
9.1 Multi-class classification problem  213
9.2 Generalization bounds  215
9.3 Uncombined multi-class algorithms  221
9.3.1 Multi-class SVMs  221
9.3.2 Multi-class boosting algorithms  222
9.3.3 Decision trees  224
9.4 Aggregated multi-class algorithms  228
9.4.1 One-versus-all  229
9.4.2 One-versus-one  229
9.4.3 Error-correcting output codes  231
9.5 Structured prediction algorithms  233
9.6 Chapter notes  235
9.7 Exercises  237
|
|
10 Ranking  239
10.1 The problem of ranking  240
10.2 Generalization bound  241
10.3 Ranking with SVMs  243
10.4 RankBoost  244
10.4.1 Bound on the empirical error  246
10.4.2 Relationship with coordinate descent  248
10.4.3 Margin bound for ensemble methods in ranking  250
10.5 Bipartite ranking  251
10.5.1 Boosting in bipartite ranking  252
10.5.2 Area under the ROC curve  255
10.6 Preference-based setting  257
10.6.1 Second-stage ranking problem  257
10.6.2 Deterministic algorithm  259
10.6.3 Randomized algorithm  260
10.6.4 Extension to other loss functions  262
10.7 Other ranking criteria  262
10.8 Chapter notes  263
10.9 Exercises  264
|
|
11 Regression  267
11.1 The problem of regression  267
11.2 Generalization bounds  268
11.2.1 Finite hypothesis sets  268
11.2.2 Rademacher complexity bounds  269
11.2.3 Pseudo-dimension bounds  271
11.3 Regression algorithms  275
11.3.1 Linear regression  275
11.3.2 Kernel ridge regression  276
11.3.3 Support vector regression  281
11.3.4 Lasso  285
11.3.5 Group norm regression algorithms  289
11.3.6 On-line regression algorithms  289
11.4 Chapter notes  290
11.5 Exercises  292
|
12 Maximum Entropy Models  295
12.1 Density estimation problem  295
12.1.1 Maximum Likelihood (ML) solution  296
12.1.2 Maximum a Posteriori (MAP) solution  297
12.2 Density estimation problem augmented with features  297
12.3 Maxent principle  298
12.4 Maxent models  299
12.5 Dual problem  299
12.6 Generalization bound  303
12.7 Coordinate descent algorithm  304
12.8 Extensions  306
12.9 L2-regularization  308
12.10 Chapter notes  312
12.11 Exercises  313
|
13 Conditional Maximum Entropy Models  315
13.1 Learning problem  315
13.2 Conditional Maxent principle  316
13.3 Conditional Maxent models  316
13.4 Dual problem  317
13.5 Properties  319
13.5.1 Optimization problem  320
13.5.2 Feature vectors  320
13.5.3 Prediction  321
13.6 Generalization bounds  321
13.7 Logistic regression  325
13.7.1 Optimization problem  325
13.7.2 Logistic model  325
13.8 L2-regularization  326
13.9 Proof of the duality theorem  328
13.10 Chapter notes  330
13.11 Exercises  331
|
|
14 Algorithmic Stability  333
14.1 Definitions  333
14.2 Stability-based generalization guarantee  334
14.3 Stability of kernel-based regularization algorithms  336
14.3.1 Application to regression algorithms: SVR and KRR  339
14.3.2 Application to classification algorithms: SVMs  341
14.3.3 Discussion  342
14.4 Chapter notes  342
14.5 Exercises  343
|
15 Dimensionality Reduction  347
15.1 Principal component analysis  348
15.2 Kernel principal component analysis (KPCA)  349
15.3 KPCA and manifold learning  351
15.3.1 Isomap  351
15.3.2 Laplacian eigenmaps  352
15.3.3 Locally linear embedding (LLE)  353
15.4 Johnson-Lindenstrauss lemma  354
15.5 Chapter notes  356
15.6 Exercises  356
|
16 Learning Automata and Languages  359
16.1 Introduction  359
16.2 Finite automata  360
16.3 Efficient exact learning  361
16.3.1 Passive learning  362
16.3.2 Learning with queries  363
16.3.3 Learning automata with queries  364
16.4 Identification in the limit  369
16.4.1 Learning reversible automata  370
16.5 Chapter notes  375
16.6 Exercises  376
|
17 Reinforcement Learning  379
17.1 Learning scenario  379
17.2 Markov decision process model  380
17.3 Policy  381
17.3.1 Definition  381
17.3.2 Policy value  382
17.3.3 Optimal policies  382
17.3.4 Policy evaluation  385
17.4 Planning algorithms  387
17.4.1 Value iteration  387
17.4.2 Policy iteration  390
17.4.3 Linear programming  392
17.5 Learning algorithms  393
17.5.1 Stochastic approximation  394
17.5.2 TD(0) algorithm  397
17.5.3 Q-learning algorithm  398
17.5.4 SARSA  402
17.5.5 TD(λ) algorithm  402
17.5.6 Large state space  403
17.6 Chapter notes  405

Conclusion  407
|
|
A Linear Algebra Review  409
A.1 Vectors and norms  409
A.1.1 Norms  409
A.1.2 Dual norms  410
A.1.3 Relationship between norms  411
A.2 Matrices  411
A.2.1 Matrix norms  411
A.2.2 Singular value decomposition  412
A.2.3 Symmetric positive semidefinite (SPSD) matrices  412
|
|
B Convex Optimization  415
B.1 Differentiation and unconstrained optimization  415
B.2 Convexity  415
B.3 Constrained optimization  419
B.4 Fenchel duality  422
B.4.1 Subgradients  422
B.4.2 Core  423
B.4.3 Conjugate functions  423
B.5 Chapter notes  426
B.6 Exercises  427
|
|
C Probability Review  429
C.1 Probability  429
C.2 Random variables  429
C.3 Conditional probability and independence  431
C.4 Expectation and Markov's inequality  431
C.5 Variance and Chebyshev's inequality  432
C.6 Moment-generating functions  434
C.7 Exercises  435
|
D Concentration Inequalities  437
D.1 Hoeffding's inequality  437
D.2 Sanov's theorem  438
D.3 Multiplicative Chernoff bounds  439
D.4 Binomial distribution tails: Upper bounds  440
D.5 Binomial distribution tails: Lower bound  440
D.6 Azuma's inequality  441
D.7 McDiarmid's inequality  442
D.8 Normal distribution tails: Lower bound  443
D.9 Khintchine-Kahane inequality  443
D.10 Maximal inequality  444
D.11 Chapter notes  445
D.12 Exercises  445
|
E Notions of Information Theory  449
E.1 Entropy  449
E.2 Relative entropy  450
E.3 Mutual information  453
E.4 Bregman divergences  453
E.5 Chapter notes  456
E.6 Exercises  457

F Notation  459
Bibliography  461
Index  475