E-book: Feature Engineering and Selection: A Practical Approach for Predictive Models

4.22/5 (80 ratings from Goodreads)

Max Kuhn, Kjell Johnson

Format: 310 pages
Series: Chapman & Hall/CRC Data Science Series
Publication date: 25-Jul-2019
Publisher: CRC Press
ISBN-13: 9781351609470

Other books on the subject: Probability & statistics

Format: PDF+DRM
Price: 59,79 €*
* The price is final; no further discounts apply.
This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage:

Digital rights management (DRM)
The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

Required software
To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you must install Adobe Digital Editions (a free application made specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset of predictors for improving model performance. A variety of example data sets are used to illustrate the techniques, along with R programs for reproducing the results.

Reviews

"The book is timely and needed. The interest in all things 'data science' morphed into everybody pretending to do, or know, Machine Learning. Kuhn and Johnson happen to actually know this, as evidenced by their earlier and still-popular tome entitled Applied Predictive Modeling. The proposed Feature Engineering and Selection builds on this and extends it. I expect it to become as popular with a wide reach as both a textbook, self-study material, and reference." ~Dirk Eddelbuettel, University of Illinois at Urbana-Champaign

"As a reviewer, it has been exciting and edifying to see this book develop into what is likely to become one of the foundational works on feature engineering. It is launching propitiously on the current tide of interest in both interpretable models and AutoML." ~Robert Horton, Microsoft

"In recent years, the statistics literature has featured new developments in modeling and predictive analytics. Approaches such as cross-validation and statistical/machine learning techniques have become widespread. The authors' previous book ("Applied Predictive Modeling", APM) provided a wide-ranging introduction and integration of these methods and suggested a workflow in R to carry out exploratory and confirmation analyses. With this project, the authors have identified an important and interesting component of these methods that describes building better models by focusing on the predictors (feature engineering). The authors focus on the variables that go into the model (and how they are represented) and argue that such issues are as important (or more important) than the particular methods that are applied to an analysis... The proposed book is likely to serve as a textbook (for a number of undergraduate and graduate courses in a variety of disciplines) and reference (for a large number of statisticians seeking principled and well-organized modeling)." ~Nicholas Horton, Amherst College

"I think this book is great and a joy to read. I like the pragmatic and practical approach taken in the book, and the examples given are very illustrative. The emphasis on how and when to use resampling is refreshing and something that the community needs to hear more." ~Andreas C. Muller, Columbia University

Preface

Author Bios

1 Introduction
1.1 A Simple Example
1.2 Important Concepts
1.3 A More Complex Example
1.4 Feature Selection
1.5 An Outline of the Book
1.6 Computing

2 Illustrative Example: Predicting Risk of Ischemic Stroke
2.1 Splitting
2.2 Preprocessing
2.3 Exploration
2.4 Predictive Modeling across Sets
2.5 Other Considerations
2.6 Computing

3 A Review of the Predictive Modeling Process
3.1 Illustrative Example: OkCupid Profile Data
3.2 Measuring Performance
3.3 Data Splitting
3.4 Resampling
3.5 Tuning Parameters and Overfitting
3.6 Model Optimization and Tuning
3.7 Comparing Models Using the Training Set
3.8 Feature Engineering without Overfitting
3.9 Summary
3.10 Computing

4 Exploratory Visualizations
4.1 Introduction to the Chicago Train Ridership Data
4.2 Visualizations for Numeric Data: Exploring Train Ridership Data
4.3 Visualizations for Categorical Data: Exploring the OkCupid Data
4.4 Postmodeling Exploratory Visualizations
4.5 Summary
4.6 Computing

5 Encoding Categorical Predictors
5.1 Creating Dummy Variables for Unordered Categories
5.2 Encoding Predictors with Many Categories
5.3 Approaches for Novel Categories
5.4 Supervised Encoding Methods
5.5 Encodings for Ordered Data
5.6 Creating Features from Text Data
5.7 Factors versus Dummy Variables in Tree-Based Models
5.8 Summary
5.9 Computing

6 Engineering Numeric Predictors
6.1 1:1 Transformations
6.2 1:Many Transformations
6.3 Many:Many Transformations
6.4 Summary
6.5 Computing

7 Detecting Interaction Effects
7.1 Guiding Principles in the Search for Interactions
7.2 Practical Considerations
7.3 The Brute-Force Approach to Identifying Predictive Interactions
7.4 Approaches when Complete Enumeration Is Practically Impossible
7.5 Other Potentially Useful Tools
7.6 Summary
7.7 Computing

8 Handling Missing Data
8.1 Understanding the Nature and Severity of Missing Information
8.2 Models that Are Resistant to Missing Values
8.3 Deletion of Data
8.4 Encoding Missingness
8.5 Imputation Methods
8.6 Special Cases
8.7 Summary
8.8 Computing

9 Working with Profile Data
9.1 Illustrative Data: Pharmaceutical Manufacturing Monitoring
9.2 What Are the Experimental Unit and the Unit of Prediction?
9.3 Reducing Background
9.4 Reducing Other Noise
9.5 Exploiting Correlation
9.6 Impacts of Data Processing on Modeling
9.7 Summary
9.8 Computing

10 Feature Selection Overview
10.1 Goals of Feature Selection
10.2 Classes of Feature Selection Methodologies
10.3 Effect of Irrelevant Features
10.4 Overfitting to Predictors and External Validation
10.5 A Case Study
10.6 Next Steps
10.7 Computing

11 Greedy Search Methods
11.1 Illustrative Data: Predicting Parkinson's Disease
11.2 Simple Filters
11.3 Recursive Feature Elimination
11.4 Stepwise Selection
11.5 Summary
11.6 Computing

12 Global Search Methods
12.1 Naive Bayes Models
12.2 Simulated Annealing
12.3 Genetic Algorithms
12.4 Test Set Results
12.5 Summary
12.6 Computing

Bibliography

Index

Max Kuhn, Ph.D., is a software engineer at RStudio. He worked for 18 years in drug discovery and medical diagnostics, applying predictive models to real data. He has authored numerous R packages for predictive modeling and machine learning.

Kjell Johnson, Ph.D., is the owner and founder of Stat Tenacity, a firm that provides statistical and predictive modeling consulting services. He has taught short courses on predictive modeling for the American Society for Quality, American Chemical Society, International Biometric Society, and for many corporations.

Kuhn and Johnson have also authored Applied Predictive Modeling, which is a comprehensive, practical guide to the process of building a predictive model. The text won the 2014 Technometrics Ziegel Prize for Outstanding Book.

Permalink: https://www.kriso.ee/db/97813516094702e.html