E-book: Automated Machine Learning for Business

Kai R. Larsen (Associate Professor of Information Systems, Leeds School of Business, University of Colorado Boulder), Daniel S. Becker (Data Scientist, Google)
  • Format: 400 pages
  • Publication date: 27-May-2021
  • Publisher: Oxford University Press Inc
  • Language: English
  • ISBN-13: 9780190941680
  • Format: EPUB+DRM
  • Price: 41,51 €*
  • * the price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is most likely already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.


Teaches the machine learning process to business students and professionals using automated machine learning, a new development in data science that takes only a few weeks to learn instead of years of training.


Though the concept of computers learning to solve a problem may still conjure thoughts of futuristic artificial intelligence, the reality is that machine learning algorithms now exist within most major software, including websites and even word processors. These algorithms are transforming society in the most radical way since the Industrial Revolution, primarily through automating tasks such as deciding which users to advertise to, which machines are likely to break down, and which stock to buy and sell. While this work no longer always requires advanced technical expertise, it is crucial that practitioners and students alike understand the world of machine learning.

In this book, Kai R. Larsen and Daniel S. Becker teach the machine learning process using a new development in data science: automated machine learning (AutoML). AutoML, when implemented properly, makes machine learning accessible by removing the need for years of experience in the most arcane aspects of data science, such as math, statistics, and computer science. Larsen and Becker demonstrate how anyone trained in the use of AutoML can use it to test their ideas and support the quality of those ideas during presentations to management and stakeholder groups. Because the requisite investment is a few weeks rather than a few years of training, these tools will likely become a core component of undergraduate and graduate programs alike.

With first-hand examples from the industry-leading DataRobot platform, Automated Machine Learning for Business provides a clear overview of the process and engages with essential tools for the future of data science.
Preface
Automated Machine Learning (AutoML)
A Note to Instructors
Acknowledgments
Book Outline
Dataset Download
Copyrights
SECTION I WHY USE AUTOMATED MACHINE LEARNING?
1 What Is Machine Learning?
1.1 Why Learn This?
1.2 Machine Learning Is Everywhere
1.3 What Is Machine Learning?
1.4 Data for Machine Learning
1.5 Exercises
2 Automating Machine Learning
2.1 What Is Automated Machine Learning?
2.2 What Automated Machine Learning Is Not
2.3 Available Tools and Platforms
2.4 Eight Criteria for AutoML Excellence
2.5 How Do the Fundamental Principles of Machine Learning and Artificial Intelligence Transfer to AutoML? A Point-by-Point Evaluation
2.6 Exercises
SECTION II DEFINING PROJECT OBJECTIVES
3 Specify Business Problem
3.1 Why Start with a Business Problem?
3.2 Problem Statements
3.3 Exercises
4 Acquire Subject Matter Expertise
4.1 Importance of Subject Matter Expertise
4.2 Exercises
5 Define Prediction Target
5.1 What Is a Prediction Target?
5.2 How Is the Target Important for Machine Learning?
5.3 Exercises/Discussion
6 Decide On Unit Of Analysis
6.1 What Is a Unit of Analysis?
6.2 How to Determine Unit of Analysis
6.3 Exercises
7 Success, Risk, And Continuation
7.1 Identify Success Criteria
7.2 Foresee Risks
7.3 Decide Whether to Continue
7.4 Exercises
SECTION III ACQUIRE AND INTEGRATE DATA
8 Accessing And Storing Data
8.1 Track Down Relevant Data
8.2 Examine Data and Remove Columns
8.3 Example Dataset
8.4 Exercises
9 Data Integration
9.1 Joins
9.2 Exercises
10 Data Transformations
10.1 Splitting and Extracting New Columns
10.1.1 IF-THEN Statements and One-hot Encoding
10.1.2 Regular Expressions (RegEx)
10.2 Transformations
10.3 Exercises
11 Summarization
11.1 Summarize
11.2 Crosstab
11.3 Exercises
12 Data Reduction And Splitting
12.1 Unique Rows
12.2 Filtering
12.3 Combining the data
12.4 Exercises
SECTION IV MODEL DATA
13 Startup Processes
13.1 Uploading Data
13.2 Exercise
14 Feature Understanding And Selection
14.1 Descriptive Statistics
14.2 Data Types
14.3 Evaluations of Feature Content
14.4 Missing Values
14.5 Exercises
15 Build Candidate Models
15.1 Starting the Process
15.2 Advanced Options
15.3 Starting the Analytical Process
15.4 Model Selection Process
15.4.1 Tournament Round 1: 32% Sample
15.4.2 Tournament Round 2: 64% Sample
15.4.3 Tournament Round 3: Cross Validation
15.4.4 Tournament Round 4: Blending
15.5 Exercises
16 Understanding The Process
16.1 Learning Curves and Speed
16.2 Accuracy Tradeoffs
16.3 Blueprints
16.3.1 Numeric Data Cleansing (Imputation)
16.3.2 Standardization
16.3.3 One-hot Encoding
16.3.4 Ordinal Encoding
16.3.5 Matrix of Word-gram Occurrences
16.3.6 Classification
16.4 Hyperparameter Optimization (Advanced Content)
16.5 Exercises
17 Evaluate Model Performance
17.1 Introduction
17.2 A Sample Algorithm and Model
17.3 ROC Curve
17.4 Using the Lift Chart and Profit Curve for Business Decisions
17.5 Exercises
18 Comparing Model Pairs
18.1 Model Comparison
18.2 Prioritizing Modeling Criteria and Selecting a Model
18.3 Exercises
SECTION V INTERPRET AND COMMUNICATE
19 Interpret Model
19.1 Feature Impacts on Target
19.2 The Overall Impact of Features on the Target without Consideration of Other Features
19.3 The Overall Impact of a Feature Adjusted for the Impact of Other Features
19.4 The Directional Impact of Features on Target
19.5 The Partial Impact of Features on Target
19.6 The Power of Language
19.7 Hotspots
19.8 Prediction Explanations
19.9 Exercises
20 Communicate Model Insights
20.1 Unlocking Holdout
20.2 Business Problem First
20.3 Pre-processing and Model Quality Metrics
20.4 Areas Where the Model Struggles
20.5 Most Predictive Features
20.6 Not All Features Are Created Equal
20.7 Recommended Business Actions
20.8 Exercises
SECTION VI IMPLEMENT, DOCUMENT AND MAINTAIN
21 Set Up Prediction System
21.1 Retraining Model
21.2 Choose Deployment Strategy
21.3 Exercises
22 Document Modeling Process For Reproducibility
22.1 Model Documentation
22.2 Exercises
23 Create Model Monitoring And Maintenance Plan
23.1 Potential Problems
23.2 Strategies
23.3 Exercises
24 Seven Types Of Target Leakage In Machine Learning And An Exercise
24.1 Types of Target Leakage
24.2 A Hands-on Exercise in Detecting Target Leakage
24.3 Exercises
25 Time-Aware Modeling
25.1 An Example of Time-Aware Modeling
25.1.1 Problem Statement
25.1.2 Data
25.1.3 Initialize Analysis
25.1.4 Time-Aware Modeling Background
25.1.5 Data Preparation
25.1.6 Model Building and Residuals
25.1.7 Candidate Models
25.1.8 Selecting and Examining a Model
25.1.9 A Small Detour into Residuals
25.1.10 Model Value
25.1.11 Learning about Avocado Price Drivers
25.2 Exercises
26 Time-Series Modeling
26.1 The Assumptions of Time-Series Machine Learning
26.2 A Hands-on Exercise in Time-Series Analysis
26.2.1 Problem Context
26.2.2 Loading Data
26.2.3 Specify Time Unit and Generate Features
26.2.3 Examine Candidate Models
26.2.4 Digging into the Preferred Model
26.2.5 Predicting
26.3 Exercises
Appendix A Datasets
A.1 Diabetes Patients Readmissions
Summary
Business Goal
Datasets
Exercises
Rights
A.2 Luxury Shoes
Summary
Business Goal
Datasets
Exercises
A.3 Boston Airbnb
Summary
Business Goal
Datasets
Rights
A.4 Part Backorders
Summary
Business Goal
Datasets
Exercises
Rights
A.5 Student Grades Portuguese
Summary
Business Goal
Datasets
Exercises
Rights
A.6 Lending Club
Summary
Business Goal
Dataset
Rights
A.7 College Starting Salaries
Summary
Business Goal
Datasets
Exercises
Rights
A.8 HR Attrition
Summary
Business Goal
Datasets
Exercises
Rights
A.9 Avocadopocalypse Now?
Summary
Business Goal
Datasets
Exercises
Rights
Appendix B Optimization and Sorting Measures
Appendix C More on Cross Validation
References
Index
Kai R. Larsen is an Associate Professor of Information Systems in the division of Organizational Leadership and Information Analytics, Leeds School of Business, University of Colorado Boulder. He is a courtesy faculty member in the Department of Information Science of the College of Media, Communication and Information, a Research Advisor to Gallup, and a Fellow of the Institute of Behavioral Science.

Daniel S. Becker is a Data Scientist for Google's Kaggle division and founder of Kaggle Learn and Decision.ai.