E-book: Statistical Data Mining Using SAS Applications, 2nd edition [Taylor & Francis e-book]

George Fernandez (University of Nevada, Reno, USA)

Format: 478 pages, 164 Tables, black and white; 151 Illustrations, black and white
Series: Chapman & Hall/CRC Data Mining and Knowledge Discovery Series
Publication date: 18-Jun-2010
Publisher: CRC Press Inc
ISBN-13: 9780429131875

Other books on the subject:

Databases

Taylor & Francis e-book
Price: 216,96 €*
* price that provides access for an unlimited number of concurrent users for an unlimited time
Regular price: 309,94 €
You save 30%

Statistical Data Mining Using SAS Applications, Second Edition describes statistical data mining concepts and demonstrates the features of user-friendly data mining SAS tools. Integrating the statistical and graphical analysis tools available in SAS systems, the book provides complete statistical data mining solutions without writing SAS program code or using the point-and-click approach. Each chapter emphasizes step-by-step instructions for using SAS macros and interpreting the results. Compiled data mining SAS macro files are available for download on the author's website. By following the step-by-step instructions and downloading the SAS macros, analysts can perform complete data mining analyses quickly and effectively.

New to the Second Edition

General Features

Access to SAS macros directly from desktop
Compatible with SAS version 9, SAS Enterprise Guide, and SAS Learning Edition
Reorganization of all help files to an appendix
Ability to create publication quality graphics
Macro-call error check

New Features in These SAS-Specific Macro Applications

Converting PC data files to SAS data (EXLSAS2 macro)
Randomly splitting data (RANSPLIT2)
Frequency analysis (FREQ2)
Univariate analysis (UNIVAR2)
PCA and factor analysis (FACTOR2)
Multiple linear regressions (REGDIAG2)
Logistic regression (LOGIST2)
CHAID analysis (CHAID2)

Requiring no experience with SAS programming, this resource supplies instructions and tools for quickly performing exploratory statistical methods, regression analysis, logistic regression, multivariate methods, and classification analysis. It presents an accessible, SAS macro-oriented approach while offering comprehensive data mining solutions.

Preface
Acknowledgments
About the Author

1 Data Mining: A Gentle Introduction
1.1 Introduction
1.2 Data Mining: Why It Is Successful in the IT World
1.2.1 Availability of Large Databases: Data Warehousing
1.2.2 Price Drop in Data Storage and Efficient Computer Processing
1.2.3 New Advancements in Analytical Methodology
1.3 Benefits of Data Mining
1.4 Data Mining: Users
1.5 Data Mining: Tools
1.6 Data Mining: Steps
1.6.1 Identification of Problem and Defining the Data Mining Study Goal
1.6.2 Data Processing
1.6.3 Data Exploration and Descriptive Analysis
1.6.4 Data Mining Solutions: Unsupervised Learning Methods
1.6.5 Data Mining Solutions: Supervised Learning Methods
1.6.6 Model Validation
1.6.7 Interpret and Make Decisions
1.7 Problems in the Data Mining Process
1.8 SAS Software: The Leader in Data Mining
1.8.1 SEMMA: The SAS Data Mining Process
1.8.2 SAS Enterprise Miner for Comprehensive Data Mining Solution
1.9 Introduction of User-Friendly SAS Macros for Statistical Data Mining
1.9.1 Limitations of These SAS Macros
1.10 Summary
References

2 Preparing Data for Data Mining
2.1 Introduction
2.2 Data Requirements in Data Mining
2.3 Ideal Structures of Data for Data Mining
2.4 Understanding the Measurement Scale of Variables
2.5 Entire Database or Representative Sample
2.6 Sampling for Data Mining
2.6.1 Sample Size
2.7 User-Friendly SAS Applications Used in Data Preparation
2.7.1 Preparing PC Data Files before Importing into SAS Data
2.7.2 Converting PC Data Files to SAS Datasets Using the SAS Import Wizard
2.7.3 EXLSAS2 SAS Macro Application to Convert PC Data Formats to SAS Datasets
2.7.4 Steps Involved in Running the EXLSAS2 Macro
2.7.5 Case Study 1: Importing an Excel File Called "Fraud" to a Permanent SAS Dataset Called "Fraud"
2.7.6 SAS Macro Applications - RANSPLIT2: Random Sampling from the Entire Database
2.7.7 Steps Involved in Running the RANSPLIT2 Macro
2.7.8 Case Study 2: Drawing Training (400), Validation (300), and Test (All Left-Over Observations) Samples from the SAS Data Called "Fraud"
2.8 Summary
References

3 Exploratory Data Analysis
3.1 Introduction
3.2 Exploring Continuous Variables
3.2.1 Descriptive Statistics
3.2.1.1 Measures of Location or Central Tendency
3.2.1.2 Robust Measures of Location
3.2.1.3 Five-Number Summary Statistics
3.2.1.4 Measures of Dispersion
3.2.1.5 Standard Errors and Confidence Interval Estimates
3.2.1.6 Detecting Deviation from Normally Distributed Data
3.2.2 Graphical Techniques Used in EDA of Continuous Data
3.3 Data Exploration: Categorical Variable
3.3.1 Descriptive Statistical Estimates of Categorical Variables
3.3.2 Graphical Displays for Categorical Data
3.4 SAS Macro Applications Used in Data Exploration
3.4.1 Exploring Categorical Variables Using the SAS Macro FREQ2
3.4.1.1 Steps Involved in Running the FREQ2 Macro
3.4.2 Case Study 1: Exploring Categorical Variables in a SAS Dataset
3.4.3 EDA Analysis of Continuous Variables Using SAS Macro UNIVAR2
3.4.3.1 Steps Involved in Running the UNIVAR2 Macro
3.4.4 Case Study 2: Data Exploration of a Continuous Variable Using UNIVAR2
3.4.5 Case Study 3: Exploring Continuous Data by a Group Variable Using UNIVAR2
3.4.5.1 Data Descriptions
3.5 Summary
References

4 Unsupervised Learning Methods
4.1 Introduction
4.2 Applications of Unsupervised Learning Methods
4.3 Principal Component Analysis
4.3.1 PCA Terminology
4.4 Exploratory Factor Analysis
4.4.1 Exploratory Factor Analysis versus Principal Component Analysis
4.4.2 Exploratory Factor Analysis Terminology
4.4.2.1 Communalities and Uniqueness
4.4.2.2 Heywood Case
4.4.2.3 Cronbach Coefficient Alpha
4.4.2.4 Factor Analysis Methods
4.4.2.5 Sampling Adequacy Check in Factor Analysis
4.4.2.6 Estimating the Number of Factors
4.4.2.7 Eigenvalues
4.4.2.8 Factor Loadings
4.4.2.9 Factor Rotation
4.4.2.10 Confidence Intervals and the Significance of Factor Loading Converge
4.4.2.11 Standardized Factor Score
4.5 Disjoint Cluster Analysis
4.5.1 Types of Cluster Analysis
4.5.2 FASTCLUS: SAS Procedure to Perform Disjoint Cluster Analysis
4.6 Biplot Display of PCA, EFA, and DCA Results
4.7 PCA and EFA Using SAS Macro FACTOR2
4.7.1 Steps Involved in Running the FACTOR2 Macro
4.7.2 Case Study 1: Principal Component Analysis of 1993 Car Attribute Data
4.7.2.1 Study Objectives
4.7.2.2 Data Descriptions
4.7.3 Case Study 2: Maximum Likelihood FACTOR Analysis with VARIMAX Rotation of 1993 Car Attribute Data
4.7.3.1 Study Objectives
4.7.3.2 Data Descriptions
4.7.3 Case Study 3: Maximum Likelihood FACTOR Analysis with VARIMAX Rotation Using a Multivariate Data in the Form of Correlation Matrix
4.7.3.1 Study Objectives
4.7.3.2 Data Descriptions
4.8 Disjoint Cluster Analysis Using SAS Macro DISJCLS2
4.8.1 Steps Involved in Running the DISJCLS2 Macro
4.8.2 Case Study 4: Disjoint Cluster Analysis of 1993 Car Attribute Data
4.8.2.1 Study Objectives
4.8.2.2 Data Descriptions
4.9 Summary
References

5 Supervised Learning Methods: Prediction
5.1 Introduction
5.2 Applications of Supervised Predictive Methods
5.3 Multiple Linear Regression Modeling
5.3.1 Multiple Linear Regressions: Key Concepts and Terminology
5.3.2 Model Selection in Multiple Linear Regression
5.3.2.1 Best Candidate Models Selected Based on AICC and SBC
5.3.2.2 Model Selection Based on the New SAS PROC GLMSELECT
5.3.3 Exploratory Analysis Using Diagnostic Plots
5.3.4 Violations of Regression Model Assumptions
5.3.4.1 Model Specification Error
5.3.4.2 Serial Correlation among the Residual
5.3.4.3 Influential Outliers
5.3.4.4 Multicollinearity
5.3.4.5 Heteroscedasticity in Residual Variance
5.3.4.6 Nonnormality of Residuals
5.3.5 Regression Model Validation
5.3.6 Robust Regression
5.3.7 Survey Regression
5.4 Binary Logistic Regression Modeling
5.4.1 Terminology and Key Concepts
5.4.2 Model Selection in Logistic Regression
5.4.3 Exploratory Analysis Using Diagnostic Plots
5.4.3.1 Interpretation
5.4.3.2 Two-Factor Interaction Plots between Continuous Variables
5.4.4 Checking for Violations of Regression Model Assumptions
5.4.4.1 Model Specification Error
5.4.4.2 Influential Outlier
5.4.4.3 Multicollinearity
5.4.4.4 Overdispersion
5.5 Ordinal Logistic Regression
5.6 Survey Logistic Regression
5.7 Multiple Linear Regression Using SAS Macro REGDIAG2
5.7.1 Steps Involved in Running the REGDIAG2 Macro
5.8 Lift Chart Using SAS Macro LIFT2
5.8.1 Steps Involved in Running the LIFT2 Macro
5.9 Scoring New Regression Data Using the SAS Macro RSCORE2
5.9.1 Steps Involved in Running the RSCORE2 Macro
5.10 Logistic Regression Using SAS Macro LOGIST2
5.11 Scoring New Logistic Regression Data Using the SAS Macro RSCORE
5.12 Case Study 1: Modeling Multiple Linear Regressions
5.12.1 Study Objectives
5.12.1.1 Step 1: Preliminary Model Selection
5.12.1.2 Step 2: Graphical Exploratory Analysis and Regression Diagnostic Plots
5.12.1.3 Step 3: Fitting the Regression Model and Checking for the Violations of Regression Assumptions
5.12.1.4 Remedial Measure: Robust Regression to Adjust the Regression Parameter Estimates to Extreme Outliers
5.13 Case Study 2: If-Then Analysis and Lift Charts
5.13.1 Data Descriptions
5.14 Case Study 3: Modeling Multiple Linear Regression with Categorical Variables
5.14.1 Study Objectives
5.14.2 Data Descriptions
5.15 Case Study 4: Modeling Binary Logistic Regression
5.15.1 Study Objectives
5.15.2 Data Descriptions
5.15.2.1 Step 1: Best Candidate Model Selection
5.15.2.2 Step 2: Exploratory Analysis/Diagnostic Plots
5.15.2.3 Step 3: Fitting Binary Logistic Regression
5.16 Case Study 5: Modeling Binary Multiple Logistic Regression
5.16.1 Study Objectives
5.16.2 Data Descriptions
5.17 Case Study 6: Modeling Ordinal Multiple Logistic Regression
5.17.1 Study Objectives
5.17.2 Data Descriptions
5.18 Summary
References

6 Supervised Learning Methods: Classification
6.1 Introduction
6.2 Discriminant Analysis
6.3 Stepwise Discriminant Analysis
6.4 Canonical Discriminant Analysis
6.4.1 Canonical Discriminant Analysis Assumptions
6.4.2 Key Concepts and Terminology in Canonical Discriminant Analysis
6.5 Discriminant Function Analysis
6.5.1 Key Concepts and Terminology in Discriminant Function Analysis
6.6 Applications of Discriminant Analysis
6.7 Classification Tree Based on CHAID
6.7.1 Key Concepts and Terminology in Classification Tree Methods
6.8 Applications of CHAID
6.9 Discriminant Analysis Using SAS Macro DISCRIM2
6.9.1 Steps Involved in Running the DISCRIM2 Macro
6.10 Decision Tree Using SAS Macro CHAID2
6.10.1 Steps Involved in Running the CHAID2 Macro
6.11 Case Study 1: Canonical Discriminant Analysis and Parametric Discriminant Function Analysis
6.11.1 Study Objectives
6.11.2 Case Study 1: Parametric Discriminant Analysis
6.11.2.1 Canonical Discriminant Analysis (CDA)
6.12 Case Study 2: Nonparametric Discriminant Function Analysis
6.12.1 Study Objectives
6.12.2 Data Descriptions
6.13 Case Study 3: Classification Tree Using CHAID
6.13.1 Study Objectives
6.13.2 Data Descriptions
6.14 Summary
References

7 Advanced Analytics and Other SAS Data Mining Resources
7.1 Introduction
7.2 Artificial Neural Network Methods
7.3 Market Basket Analysis
7.3.1 Benefits of MBA
7.3.2 Limitations of Market Basket Analysis
7.4 SAS Software: The Leader in Data Mining
7.5 Summary
References

Appendix I: Instruction for Using the SAS Macros
Appendix II: Data Mining SAS Macro Help Files
Appendix III: Instruction for Using the SAS Macros with Enterprise Guide Code Window
Index

George Fernandez is a professor of applied statistical methods and the director of the Center for Research Design and Analysis at the University of Nevada in Reno.

Permalink: https://www.kriso.ee/db/9780429131875_pe.html
