E-book: Supervised Machine Learning: Optimization Framework and Applications with SAS and R [Taylor & Francis e-book]

  • Format: 160 pages, 59 Tables, black and white; 22 Line drawings, black and white; 22 Illustrations, black and white
  • Publication date: 22-Sep-2020
  • Publisher: Chapman & Hall/CRC
  • ISBN-13: 9780429297595
  • Taylor & Francis e-book
  • Price: 193,88 €*
  • * price that grants access for an unlimited number of concurrent users for an unlimited time
  • Regular price: 276,97 €
  • You save 30%
This book presents an AI framework intended to solve the problem of the bias-variance tradeoff for supervised learning methods in real-life applications. The framework comprises bootstrapping to create multiple training and testing data sets with various characteristics, design and analysis of statistical experiments to identify optimal feature subsets and optimal hyperparameters for ML methods, and data contamination to test the robustness of the classifiers.

Key Features:

  • Using ML methods on their own does not ensure building classifiers that generalize well to new data
  • Identifying optimal feature subsets and hyperparameters of ML methods can be resolved using design and analysis of statistical experiments
  • Using a bootstrapping approach for massive sampling of training and test datasets with various data characteristics (e.g., contaminated training sets) allows dealing with bias
  • Developing a SAS-based table-driven environment allows managing all metadata related to the proposed AI framework and creating interoperability with R libraries to accomplish a variety of statistical and machine-learning tasks
  • Computer programs in R and SAS that create the AI framework are available on GitHub
Acknowledgments
Authors
Introduction: Challenges in the Application of Machine Learning Classification Methods
Part I
1 Introduction to the AI Framework
1.1 Components of the AI Framework and Their Interaction
1.2 AI Framework in Detail
1.2.1 Creating Training and Test Datasets
1.2.2 Design of Experiments for a Classifier
1.2.3 Firth Logistic Regression
1.2.4 Data Contamination
1.2.5 Best Classifiers
1.3 SAS Procedures for the AI Framework Components
1.4 R Libraries for the AI Framework Components
References
2 Supervised Machine Learning and Its Deployment in SAS and R
2.1 Introduction
2.2 Principles of Supervised Machine Learning
2.3 Neural Network
2.3.1 Introduction
2.3.2 Neural Network Components
2.3.2.1 Activation Function
2.3.2.2 Neurons
2.3.2.3 Networks
2.3.3 R for Neural Networks
2.4 Support Vector Machine
2.4.1 Introduction
2.4.2 Kernel
2.4.3 Margin
2.4.4 Optimization
2.4.5 Bias-Variance Trade-off and SVM Hyperparameters
2.4.6 R for SVM
2.5 SVM Modification Using Firth's Regression
2.5.1 Introduction
2.5.2 Logistic Regression
2.5.3 Problem of Separation
2.5.4 R for Firth's Regression
2.5.5 SAS for Firth's Regression
2.6 Summary
References
3 Bootstrap Methods and Their Deployment in SAS and R
3.1 Introduction
3.2 Overview of Bootstrap Methods
3.2.1 The Basic Bootstrap
3.2.2 Hypothesis Tests, Estimates, and Confidence Intervals
3.2.3 Bias Reduction
3.2.4 The Parametric Bootstrap
3.2.5 m-out-of-n Bootstrap
3.2.6 Bootstrap Samples Similarity
3.3 Implementation of Bootstrap in SAS and R
3.3.1 m-out-of-n in SAS
3.3.2 m-out-of-n in R
3.4 Summary
References
4 Outliers Detection and Its Deployment in SAS and R
4.1 Introduction
4.2 Outliers Detection and Treatment
4.2.1 Minimum Covariance Determinant Method
4.2.2 MCD in SAS
4.3 Bias Reduction
4.4 Summary
References
5 Design of Experiments and Its Deployment in SAS and R
5.1 Introduction
5.2 Application of DoE in AI Framework
5.2.1 Terminology of DoE
5.2.1.1 Experiment
5.2.1.2 Experimental Unit
5.2.1.3 Factor
5.2.1.4 Treatment
5.2.2 Principles of DoE
5.2.2.1 Randomization
5.2.2.2 Statistical Replication
5.2.2.3 Blocking
5.2.2.4 Orthogonality
5.2.3 Full-Factorial Experiment
5.2.4 Fractional Factorial Experiment
5.2.5 Linear Mixed Models
5.2.6 Factors and Response Variables in the AI Framework
5.2.7 Example
5.2.8 Analysis of Linear Mixed Model Using SAS
5.2.9 Analysis of Linear Mixed Model Using R
5.3 Summary
References
Part II
6 Introduction to the SAS- and R-Based Table-Driven Environment
6.1 Principles of Code-Free Design
6.2 The Data Dictionary Components for the AI Framework
6.2.1 Relational Model
6.2.2 Table
6.2.3 Data Aspects
6.2.4 Relational Data Structure
6.2.5 Domains
6.2.6 Relations and Tables
6.2.7 Functions
6.2.8 One-to-one Relationship
6.2.9 One-to-many Relationship
6.2.10 Primary Key
6.2.11 Foreign Key
6.2.12 Missing Values
6.2.13 Data Dictionary
6.3 Properties of the Data Dictionary
6.3.1 The Library Table
6.3.2 The Object Table
6.3.3 The Location Table
6.3.4 The Message Table
6.3.5 The Property Table
6.3.6 Meaning
6.3.7 The Link Table
6.3.8 Process of Application Data Model Definition
6.3.9 Features of the Data Dictionary
6.3.10 The Components of the Optimization Framework and Their Definitions in the Data Dictionary
6.4 Deployment of Code-Free Design with SAS and R
6.4.1 How to Generate Application Objects
6.4.2 Generating R Datasets from the Data Dictionary Metadata
6.4.3 SAS and R Interoperability
6.5 Summary
Reference
7 Input Data Component
7.1 Overview of Data Management
7.1.1 Data Dictionary
7.1.1.1 The Input Data Dictionary
7.1.1.2 Input and Structure Tables
7.1.1.3 Outlier_Detection and Bias_Correction Tables
7.1.1.4 Bootstrap Table
7.1.1.5 Output Table
7.1.2 SAS Macro Program
7.1.3 R Program
7.2 Summary
8 Design of Experiment for Machine Learning Component
8.1 Data Dictionary
8.1.1 Experiment Table
8.1.2 Features Table
8.1.3 Metrics Table
8.1.4 ML_Method Table
8.1.5 Hyperparameters_Domain Table
8.1.6 Results Table
8.1.7 Results_Metrics Table
8.2 SAS Macro Program
8.3 R Programs
8.4 Summary
Reference
9 "Contaminated" Training Datasets Component
9.1 Data Dictionary
9.1.1 Contamination Table
9.1.2 Cont_Experiment Table
9.1.3 Cont_Results Table
9.1.4 Cont_Metric Table
9.2 SAS Macro Program
9.3 R Programs
9.4 Summary
Reference
Part III
10 Insurance Industry: Underwriters' Decision-Making Process
10.1 Introduction
10.2 Review of Underwriters' Performance
10.2.1 Metrics of Underwriters' Performance
10.2.1.1 Hit Ratio
10.2.1.2 Conversion Rate
10.2.1.3 Dynamic Conversion Rate
10.2.1.4 Time-to-Deal
10.2.2 Analysis of Underwriters' Performance
10.2.2.1 Data Description
10.2.2.2 Application Flow
10.2.2.3 Dynamic Conversion Rate per Underwriter
10.2.2.4 Time-to-Deal per Underwriter
10.3 Traditional Approach to Knowledge Delivery
10.4 Anatomy of Artificial Intelligence Solution
10.4.1 Data Structure
10.4.2 Classification Approach
10.4.3 Bias-Variance Trade-Off and SVM Hyperparameters
10.4.4 Building the Classifier
10.4.5 "Contamination" of Training Datasets
10.4.6 Experimental Results
10.5 Summary
References
11 Insurance Industry: Claims Modeling and Prediction
11.1 Introduction
11.2 Data
11.3 The Cox Model for Claims Event Analysis
11.4 Application of the Cox Model for Claims Analysis
11.4.1 Data Transformation
11.4.2 Cox Model Assumption Validation
11.4.3 Bayesian Machine Learning Approach
11.4.4 Deployment with SAS
11.4.5 Interpretation of Results
11.5 Summary
References
Index
Tanya Kolosova is a statistician, software engineer, educator, and co-author of two books on statistical analysis and metadata-based applications development using SAS. Tanya is an actionable analytics expert with extensive knowledge of software development methods and technologies, artificial intelligence methods and algorithms, and statistically designed experiments.

Samuel Berestizhevsky is a statistician, researcher, and software engineer. Together with Tanya, Samuel co-authored two books on statistical analysis and metadata-based applications development using SAS. Samuel is an innovator and an expert in automated actionable analytics and artificial intelligence solutions. His extensive knowledge of software development methods, technologies, and algorithms allows him to develop solutions on the cutting edge of science.