About the Authors  vi
Preface for the Second Edition  vii
Preface for the First Edition  ix

1 Introduction to Decision Trees  1
1.4 Knowledge Discovery in Databases (KDD)  4
1.5 Taxonomy of Data Mining Methods  8
1.8 Characteristics of Classification Trees  12
1.8.2 The Hierarchical Nature of Decision Trees  15
1.9 Relation to Rule Induction  15

2 Training Decision Trees  17
2.2 Preparing the Training Set  17
2.3 Training the Decision Tree  19

3 A Generic Algorithm for Top-Down Induction of Decision Trees  23
3.2 Definition of the Classification Problem  25
3.4 Probability Estimation in Decision Trees  26
3.5 Algorithmic Framework for Decision Trees  28

4 Evaluation of Classification Trees  31
4.2.1 Theoretical Estimation of Generalization Error  32
4.2.2 Empirical Estimation of Generalization Error  32
4.2.3 Alternatives to the Accuracy Measure  34
4.2.6 Classifier Evaluation under Limited Resources  37
4.2.6.3 Qrecall (Quota Recall)  40
4.2.6.5 Pearson Correlation Coefficient  41
4.2.6.6 Area Under Curve (AUC)  43
4.2.6.9 Potential Extract Measure (PEM)  45
4.2.7 Which Decision Tree Classifier is Better?  48
4.2.7.2 A Test for the Difference of Two Proportions  50
4.2.7.3 The Resampled Paired t Test  51
4.2.7.4 The k-fold Cross-validated Paired t Test  51
4.3 Computational Complexity  52
4.5 Scalability to Large Datasets  53
4.8 Interestingness Measures  56
4.9 Overfitting and Underfitting  57
4.10 "No Free Lunch" Theorem  58

5 Splitting Criteria  61
5.1 Univariate Splitting Criteria  61
5.1.2 Impurity-based Criteria  61
5.1.5 Likelihood Ratio Chi-squared Statistics  63
5.1.7 Normalized Impurity-based Criteria  63
5.1.12 Orthogonal Criterion  65
5.1.13 Kolmogorov-Smirnov Criterion  66
5.1.14 AUC Splitting Criteria  66
5.1.15 Other Univariate Splitting Criteria  66
5.1.16 Comparison of Univariate Splitting Criteria  66
5.2 Handling Missing Values  67

6 Pruning Trees  69
6.2.2 Cost Complexity Pruning  70
6.2.3 Reduced Error Pruning  70
6.2.4 Minimum Error Pruning (MEP)  71
6.2.5 Pessimistic Pruning  71
6.2.6 Error-Based Pruning (EBP)  72
6.2.7 Minimum Description Length (MDL) Pruning  73
6.2.8 Other Pruning Methods  73
6.2.9 Comparison of Pruning Methods  73

7 Popular Decision Trees Induction Algorithms  77
7.7 Reference to Other Algorithms  80
7.8 Advantages and Disadvantages of Decision Trees  81

8 Beyond Classification Tasks  85
8.4.2 Minkowski: Distance Measures for Numeric Attributes  90
8.4.2.1 Distance Measures for Binary Attributes  90
8.4.2.2 Distance Measures for Nominal Attributes  91
8.4.2.3 Distance Metrics for Ordinal Attributes  91
8.4.2.4 Distance Metrics for Mixed-Type Attributes  92
8.4.3 Similarity Functions  92
8.4.3.2 Pearson Correlation Measure  93
8.4.3.3 Extended Jaccard Measure  93
8.4.3.4 Dice Coefficient Measure  93
8.5 Hidden Markov Model Trees  94

9 Decision Forests  99
9.3.1.2 Performance Weighting  109
9.3.1.3 Distribution Summation  109
9.3.1.4 Bayesian Combination  109
9.3.1.8 Entropy Weighting  110
9.3.1.9 Density-based Weighting  111
9.3.1.10 DEA Weighting Method  111
9.3.1.11 Logarithmic Opinion Pool  111
9.3.1.13 Order Statistics  113
9.3.2 Meta-combination Methods  113
9.4 Classifier Dependency  118
9.4.1.1 Model-guided Instance Selection  118
9.4.1.2 Incremental Batch Learning  122
9.4.2 Independent Methods  122
9.4.2.5 Cross-validated Committees  129
9.5.1 Manipulating the Inducer  131
9.5.1.1 Manipulation of the Inducer's Parameters  131
9.5.1.2 Starting Point in Hypothesis Space  132
9.5.1.3 Hypothesis Space Traversal  132
9.5.1.3.1 Random-based Strategy  132
9.5.1.3.2 Collective-Performance-based Strategy  132
9.5.2 Manipulating the Training Samples  133
9.5.3 Manipulating the Target Attribute Representation  134
9.5.4 Partitioning the Search Space  136
9.5.4.1 Divide and Conquer  136
9.5.4.2 Feature Subset-based Ensemble Methods  137
9.5.4.2.1 Random-based Strategy  138
9.5.4.2.2 Reduct-based Strategy  138
9.5.4.2.3 Collective-Performance-based Strategy  139
9.5.4.2.4 Feature Set Partitioning  139
9.5.6 Measuring the Diversity  143
9.6.1 Selecting the Ensemble Size  144
9.6.2 Pre-selection of the Ensemble Size  145
9.6.3 Selection of the Ensemble Size while Training  145
9.6.4 Pruning: Post Selection of the Ensemble Size  146
9.6.4.1 Pre-combining Pruning  146
9.6.4.2 Post-combining Pruning  146
9.8 Multistrategy Ensemble Learning  148
9.9 Which Ensemble Method Should be Used?  148
9.10 Open Source for Decision Trees Forests  149

10 A Walk-through Guide for Using Decision Trees Software  151
10.2.1 Training a Classification Tree  153
10.3.3 Other Types of Trees  163

11 Advanced Decision Trees  167
11.1 Oblivious Decision Trees  167
11.2 Online Adaptive Decision Trees  168
11.6 Oblique Decision Trees  172
11.7 Incremental Learning of Decision Trees  175
11.7.1 The Motives for Incremental Learning  175
11.7.2 The Inefficiency Challenge  176
11.7.3 The Concept Drift Challenge  177
11.8 Decision Trees Inducers for Large Datasets  179
11.8.1 Accelerating Tree Induction  180
11.8.2 Parallel Induction of Trees  182

12 Cost-sensitive Active and Proactive Learning of Decision Trees  183
12.4 Induction of Cost-sensitive Decision Trees  188
12.6 Proactive Data Mining  196
12.6.1 Changing the Input Data  197
12.6.2 Attribute Changing Cost and Benefit Functions  198
12.6.3 Maximizing Utility  199
12.6.4 An Algorithmic Framework for Proactive Data Mining  200

13 Feature Selection  203
13.2 The "Curse of Dimensionality"  203
13.3 Techniques for Feature Selection  206
13.3.1.3 Using a Learning Algorithm as a Filter  207
13.3.1.4 An Information Theoretic Feature Filter  208
13.3.1.5 RELIEF Algorithm  208
13.3.1.6 Simba and G-flip  208
13.3.1.7 Contextual Merit (CM) Algorithm  209
13.3.2 Using Traditional Statistics for Filtering  209
13.3.2.2 AIC, BIC and F-ratio  209
13.3.2.3 Principal Component Analysis (PCA)  210
13.3.2.4 Factor Analysis (FA)  210
13.3.2.5 Projection Pursuit (PP)  210
13.3.3.1 Wrappers for Decision Tree Learners  211
13.4 Feature Selection as a Means of Creating Ensembles  211
13.5 Ensemble Methodology for Improving Feature Selection  213
13.5.1 Independent Algorithmic Framework  215
13.5.2 Combining Procedure  216
13.5.2.1 Simple Weighted Voting  216
13.5.2.2 Using Artificial Contrasts  218
13.5.3 Feature Ensemble Generator  220
13.5.3.1 Multiple Feature Selectors  220
13.6 Using Decision Trees for Feature Selection  221
13.7 Limitation of Feature Selection Methods  222

14 Fuzzy Decision Trees  225
14.3 Fuzzy Classification Problems  227
14.4 Fuzzy Set Operations  228
14.5 Fuzzy Classification Rules  229
14.6 Creating a Fuzzy Decision Tree  230
14.6.1 Fuzzifying Numeric Attributes  230
14.6.2 Inducing a Fuzzy Decision Tree  232
14.7 Simplifying the Decision Tree  234
14.8 Classification of New Instances  234
14.9 Other Fuzzy Decision Tree Inducers  234

15 Hybridization of Decision Trees with Other Techniques  237
15.2 A Framework for Instance-Space Decomposition  237
15.2.3 Split Validation Examinations  241
15.3 The Contrasted Population Miner (CPOM) Algorithm  242
15.3.2 The Grouped Gain Ratio Splitting Rule  244
15.4 Induction of Decision Trees by an Evolutionary Algorithm (EA)  246

16 Decision Trees and Recommender Systems  251
16.2 Using Decision Trees for Recommending Items  252
16.2.1 RS-Adapted Decision Tree  253
16.2.2 Least Probable Intersections  257
16.3 Using Decision Trees for Preference Elicitation  259
16.3.2 Dynamic Methods and Decision Trees  262
16.3.3 SVD-based CF Method  263
16.3.4 Pairwise Comparisons  264
16.3.5 Profile Representation  266
16.3.6 Selecting the Next Pairwise Comparison  267
16.3.7 Clustering the Items  269
16.3.8 Training a Lazy Decision Tree  270

Bibliography  273
Index  303