Machine Learning with Python for Everyone [Paperback]

  • Format: Paperback / softback, 592 pages, height x width x thickness: 229x178x28 mm, weight: 930 g
  • Series: Addison-Wesley Data & Analytics Series
  • Publication date: 17-Dec-2019
  • Publisher: Addison Wesley
  • ISBN-10: 0134845625
  • ISBN-13: 9780134845623
The Complete Beginner's Guide to Understanding and Building Machine Learning Systems with Python

Machine Learning with Python for Everyone will help you master the processes, patterns, and strategies you need to build effective learning systems, even if you're an absolute beginner. If you can write some Python code, this book is for you, no matter how little college-level math you know. Principal instructor Mark E. Fenner relies on plain-English stories, pictures, and Python examples to communicate the ideas of machine learning.

Mark begins by discussing machine learning and what it can do; introducing key mathematical and computational topics in an approachable manner; and walking you through the first steps in building, training, and evaluating learning systems. Step by step, you'll fill out the components of a practical learning system, broaden your toolbox, and explore some of the field's most sophisticated and exciting techniques. Whether you're a student, analyst, scientist, or hobbyist, this guide's insights will be applicable to every learning system you ever build or use.



  • Understand machine learning algorithms, models, and core machine learning concepts
  • Classify examples with classifiers, and quantify examples with regressors
  • Realistically assess performance of machine learning systems
  • Use feature engineering to smooth rough data into useful forms
  • Chain multiple components into one system and tune its performance
  • Apply machine learning techniques to images and text
  • Connect the core concepts to neural networks and graphical models
  • Leverage the Python scikit-learn library and other powerful tools
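To give a feel for the scikit-learn workflow the book is built around, here is a minimal illustrative sketch (not code from the book) that trains a k-nearest-neighbors classifier and scores it on held-out data. The iris dataset, the n_neighbors=3 setting, and the default 75/25 train/test split are assumptions made only for this sketch.

    # Illustrative sketch only (not from the book): a basic scikit-learn
    # train/test workflow with a k-NN classifier on the bundled iris data.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.metrics import accuracy_score

    iris = load_iris()
    # Hold out a test set so evaluation is not "teaching to the test".
    train_ftrs, test_ftrs, train_tgt, test_tgt = train_test_split(
        iris.data, iris.target, random_state=42)

    knn = KNeighborsClassifier(n_neighbors=3)  # assumed setting for the sketch
    knn.fit(train_ftrs, train_tgt)             # learn from the training split only
    preds = knn.predict(test_ftrs)             # predict categories for unseen examples

    print("accuracy:", accuracy_score(test_tgt, preds))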

Register your book for convenient access to downloads, updates, and/or corrections as they become available. See inside book for details.
Foreword
Preface
About the Author
I First Steps
1 Let's Discuss Learning
1.1 Welcome
1.2 Scope, Terminology, Prediction, and Data
1.2.1 Features
1.2.2 Target Values and Predictions
1.3 Putting the Machine in Machine Learning
1.4 Examples of Learning Systems
1.4.1 Predicting Categories: Examples of Classifiers
1.4.2 Predicting Values: Examples of Regressors
1.5 Evaluating Learning Systems
1.5.1 Correctness
1.5.2 Resource Consumption
1.6 A Process for Building Learning Systems
1.7 Assumptions and Reality of Learning
1.8 End-of-Chapter Material
1.8.1 The Road Ahead
1.8.2 Notes
2 Some Technical Background
2.1 About Our Setup
2.2 The Need for Mathematical Language
2.3 Our Software for Tackling Machine Learning
2.4 Probability
2.4.1 Primitive Events
2.4.2 Independence
2.4.3 Conditional Probability
2.4.4 Distributions
2.5 Linear Combinations, Weighted Sums, and Dot Products
2.5.1 Weighted Average
2.5.2 Sums of Squares
2.5.3 Sum of Squared Errors
2.6 A Geometric View: Points in Space
2.6.1 Lines
2.6.2 Beyond Lines
2.7 Notation and the Plus-One Trick
2.8 Getting Groovy, Breaking the Straight-Jacket, and Nonlinearity
2.9 NumPy versus "All the Maths"
2.9.1 Back to 1D versus 2D
2.10 Floating-Point Issues
2.11 EOC
2.11.1 Summary
2.11.2 Notes
3 Predicting Categories: Getting Started with Classification
3.1 Classification Tasks
3.2 A Simple Classification Dataset
3.3 Training and Testing: Don't Teach to the Test
3.4 Evaluation: Grading the Exam
3.5 Simple Classifier #1: Nearest Neighbors, Long Distance Relationships, and Assumptions
3.5.1 Defining Similarity
3.5.2 The k in k-NN
3.5.3 Answer Combination
3.5.4 k-NN, Parameters, and Nonparametric Methods
3.5.5 Building a k-NN Classification Model
3.6 Simple Classifier #2: Naive Bayes, Probability, and Broken Promises
3.7 Simplistic Evaluation of Classifiers
3.7.1 Learning Performance
3.7.2 Resource Utilization in Classification
3.7.3 Stand-Alone Resource Evaluation
3.8 EOC
3.8.1 Sophomore Warning: Limitations and Open Issues
3.8.2 Summary
3.8.3 Notes
3.8.4 Exercises
4 Predicting Numerical Values: Getting Started with Regression
4.1 A Simple Regression Dataset
4.2 Nearest-Neighbors Regression and Summary Statistics
4.2.1 Measures of Center: Median and Mean
4.2.2 Building a k-NN Regression Model
4.3 Linear Regression and Errors
4.3.1 No Flat Earth: Why We Need Slope
4.3.2 Tilting the Field
4.3.3 Performing Linear Regression
4.4 Optimization: Picking the Best Answer
4.4.1 Random Guess
4.4.2 Random Step
4.4.3 Smart Step
4.4.4 Calculated Shortcuts
4.4.5 Application to Linear Regression
4.5 Simple Evaluation and Comparison of Regressors
4.5.1 Root Mean Squared Error
4.5.2 Learning Performance
4.5.3 Resource Utilization in Regression
4.6 EOC
4.6.1 Limitations and Open Issues
4.6.2 Summary
4.6.3 Notes
4.6.4 Exercises
II Evaluation
5 Evaluating and Comparing Learners
5.1 Evaluation and Why Less Is More
5.2 Terminology for Learning Phases
5.2.1 Back to the Machines
5.2.2 More Technically Speaking
5.3 Major Tom, There's Something Wrong: Overfitting and Underfitting
5.3.1 Synthetic Data and Linear Regression
5.3.2 Manually Manipulating Model Complexity
5.3.3 Goldilocks: Visualizing Overfitting, Underfitting, and "Just Right"
5.3.4 Simplicity
5.3.5 Take-Home Notes on Overfitting
5.4 From Errors to Costs
5.4.1 Loss
5.4.2 Cost
5.4.3 Score
5.5 (Re)Sampling: Making More from Less
5.5.1 Cross-Validation
5.5.2 Stratification
5.5.3 Repeated Train-Test Splits
5.5.4 A Better Way and Shuffling
5.5.5 Leave-One-Out Cross-Validation
5.6 Break-It-Down: Deconstructing Error into Bias and Variance
5.6.1 Variance of the Data
5.6.2 Variance of the Model
5.6.3 Bias of the Model
5.6.4 All Together Now
5.6.5 Examples of Bias-Variance Tradeoffs
5.7 Graphical Evaluation and Comparison
5.7.1 Learning Curves: How Much Data Do We Need?
5.7.2 Complexity Curves
5.8 Comparing Learners with Cross-Validation
5.9 EOC
5.9.1 Summary
5.9.2 Notes
5.9.3 Exercises
6 Evaluating Classifiers
6.1 Baseline Classifiers
6.2 Beyond Accuracy: Metrics for Classification
6.2.1 Eliminating Confusion from the Confusion Matrix
6.2.2 Ways of Being Wrong
6.2.3 Metrics from the Confusion Matrix
6.2.4 Coding the Confusion Matrix
6.2.5 Dealing with Multiple Classes: Multiclass Averaging
6.2.6 F1
6.3 ROC Curves
6.3.1 Patterns in the ROC
6.3.2 Binary ROC
6.3.3 AUC: Area-Under-the-(ROC)-Curve
6.3.4 Multiclass Learners, One-versus-Rest, and ROC
6.4 Another Take on Multiclass: One-versus-One
6.4.1 Multiclass AUC Part Two: The Quest for a Single Value
6.5 Precision-Recall Curves
6.5.1 A Note on Precision-Recall Tradeoff
6.5.2 Constructing a Precision-Recall Curve
6.6 Cumulative Response and Lift Curves
6.7 More Sophisticated Evaluation of Classifiers: Take Two
6.7.1 Binary
6.7.2 A Novel Multiclass Problem
6.8 EOC
6.8.1 Summary
6.8.2 Notes
6.8.3 Exercises
7 Evaluating Regressors
7.1 Baseline Regressors
7.2 Additional Measures for Regression
7.2.1 Creating Our Own Evaluation Metric
7.2.2 Other Built-in Regression Metrics
7.2.3 R²
7.3 Residual Plots
7.3.1 Error Plots
7.3.2 Residual Plots
7.4 A First Look at Standardization
7.5 Evaluating Regressors in a More Sophisticated Way: Take Two
7.5.1 Cross-Validated Results on Multiple Metrics
7.5.2 Summarizing Cross-Validated Results
7.5.3 Residuals
7.6 EOC
7.6.1 Summary
7.6.2 Notes
7.6.3 Exercises
III More Methods and Fundamentals
8 More Classification Methods
8.1 Revisiting Classification
8.2 Decision Trees
8.2.1 Tree-Building Algorithms
8.2.2 Let's Go: Decision Tree Time
8.2.3 Bias and Variance in Decision Trees
8.3 Support Vector Classifiers
8.3.1 Performing SVC
8.3.2 Bias and Variance in SVCs
8.4 Logistic Regression
8.4.1 Betting Odds
8.4.2 Probabilities, Odds, and Log-Odds
8.4.3 Just Do It: Logistic Regression Edition
8.4.4 A Logistic Regression: A Space Oddity
8.5 Discriminant Analysis
8.5.1 Covariance
8.5.2 The Methods
8.5.3 Performing DA
8.6 Assumptions, Biases, and Classifiers
8.7 Comparison of Classifiers: Take Three
8.7.1 Digits
8.8 EOC
8.8.1 Summary
8.8.2 Notes
8.8.3 Exercises
9 More Regression Methods
9.1 Linear Regression in the Penalty Box: Regularization
9.1.1 Performing Regularized Regression
9.2 Support Vector Regression
9.2.1 Hinge Loss
9.2.2 From Linear Regression to Regularized Regression to Support Vector Regression
9.2.3 Just Do It-SVR Style
9.3 Piecewise Constant Regression
9.3.1 Implementing a Piecewise Constant Regressor
9.3.2 General Notes on Implementing Models
9.4 Regression Trees
9.4.1 Performing Regression with Trees
9.5 Comparison of Regressors: Take Three
9.6 EOC
9.6.1 Summary
9.6.2 Notes
9.6.3 Exercises
10 Manual Feature Engineering: Manipulating Data for Fun and Profit
10.1 Feature Engineering Terminology and Motivation
10.1.1 Why Engineer Features?
10.1.2 When Does Engineering Happen?
10.1.3 How Does Feature Engineering Occur?
10.2 Feature Selection and Data Reduction: Taking out the Trash
10.3 Feature Scaling
10.4 Discretization
10.5 Categorical Coding
10.5.1 Another Way to Code and the Curious Case of the Missing Intercept
10.6 Relationships and Interactions
10.6.1 Manual Feature Construction
10.6.2 Interactions
10.6.3 Adding Features with Transformers
10.7 Target Manipulations
10.7.1 Manipulating the Input Space
10.7.2 Manipulating the Target
10.8 EOC
10.8.1 Summary
10.8.2 Notes
10.8.3 Exercises
11 Tuning Hyperparameters and Pipelines
11.1 Models, Parameters, Hyperparameters
11.2 Tuning Hyperparameters
11.2.1 A Note on Computer Science and Learning Terminology
11.2.2 An Example of Complete Search
11.2.3 Using Randomness to Search for a Needle in a Haystack
11.3 Down the Recursive Rabbit Hole: Nested Cross-Validation
11.3.1 Cross-Validation, Redux
11.3.2 GridSearch as a Model
11.3.3 Cross-Validation Nested within Cross-Validation
11.3.4 Comments on Nested CV
11.4 Pipelines
11.4.1 A Simple Pipeline
11.4.2 A More Complex Pipeline
11.5 Pipelines and Tuning Together
11.6 EOC
11.6.1 Summary
11.6.2 Notes
11.6.3 Exercises
IV Adding Complexity
12 Combining Learners
12.1 Ensembles
12.2 Voting Ensembles
12.3 Bagging and Random Forests
12.3.1 Bootstrapping
12.3.2 From Bootstrapping to Bagging
12.3.3 Through the Random Forest
12.4 Boosting
12.4.1 Boosting Details
12.5 Comparing the Tree-Ensemble Methods
12.6 EOC
12.6.1 Summary
12.6.2 Notes
12.6.3 Exercises
13 Models That Engineer Features for Us
13.1 Feature Selection
13.1.1 Single-Step Filtering with Metric-Based Feature Selection
13.1.2 Model-Based Feature Selection
13.1.3 Integrating Feature Selection with a Learning Pipeline
13.2 Feature Construction with Kernels
13.2.1 A Kernel Motivator
13.2.2 Manual Kernel Methods
13.2.3 Kernel Methods and Kernel Options
13.2.4 Kernelized SVCs: SVMs
13.2.5 Take-Home Notes on SVM and an Example
13.3 Principal Components Analysis: An Unsupervised Technique
13.3.1 A Warm Up: Centering
13.3.2 Finding a Different Best Line
13.3.3 A First PCA
13.3.4 Under the Hood of PCA
13.3.5 A Finale: Comments on General PCA
13.3.6 Kernel PCA and Manifold Methods
13.4 EOC
13.4.1 Summary
13.4.2 Notes
13.4.3 Exercises
14 Feature Engineering for Domains: Domain-Specific Learning
14.1 Working with Text
14.1.1 Encoding Text
14.1.2 Example of Text Learning
14.2 Clustering
14.2.1 k-Means Clustering
14.3 Working with Images
14.3.1 Bag of Visual Words
14.3.2 Our Image Data
14.3.3 An End-to-End System
14.3.4 Complete Code of BoVW Transformer
14.4 EOC
14.4.1 Summary
14.4.2 Notes
14.4.3 Exercises
15 Connections, Extensions, and Further Directions
15.1 Optimization
15.2 Linear Regression from Raw Materials
15.2.1 A Graphical View of Linear Regression
15.3 Building Logistic Regression from Raw Materials
15.3.1 Logistic Regression with Zero-One Coding
15.3.2 Logistic Regression with Plus-One Minus-One Coding
15.3.3 A Graphical View of Logistic Regression
15.4 SVM from Raw Materials
15.5 Neural Networks
15.5.1 A NN View of Linear Regression
15.5.2 A NN View of Logistic Regression
15.5.3 Beyond Basic Neural Networks
15.6 Probabilistic Graphical Models
15.6.1 Sampling
15.6.2 A PGM View of Linear Regression
15.6.3 A PGM View of Logistic Regression
15.7 EOC
15.7.1 Summary
15.7.2 Notes
15.7.3 Exercises
A mlwpy.py Listing
Index
Dr. Mark Fenner, owner of Fenner Training and Consulting, LLC, has taught computing and mathematics to diverse adult audiences since 1999, and holds a PhD in computer science. His research has included design, implementation, and performance of machine learning and numerical algorithms; developing learning systems to detect user anomalies; and probabilistic modeling of protein function.