E-book: Feature Selection for High-Dimensional Data

  • Format: PDF+DRM
  • Price: 55.56 €*
  • * The price is final, i.e. no further discounts apply.
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

This book offers a coherent and comprehensive approach to feature subset selection in the scope of classification problems, explaining the foundations, real application problems and the challenges of feature selection for high-dimensional data.

The authors first focus on the analysis and synthesis of feature selection algorithms, presenting a comprehensive review of basic concepts and experimental results of the most well-known algorithms.

They then address different real scenarios with high-dimensional data, showing the use of feature selection algorithms in contexts with different requirements and information: microarray data, intrusion detection, tear film lipid layer classification and cost-based features. The book then turns to the "big dimension" scenario, focusing on problems that become critical in high-dimensional spaces, such as scalability, distributed processing and real-time processing, which open up new challenges for researchers.

The book is useful for practitioners, researchers and graduate students in the areas of machine learning and data mining.

Table of contents

1 Introduction to High-Dimensionality
1.1 The Need for Feature Selection
1.2 When Features Are Born
1.3 Intrinsic Characteristics of Data
1.3.1 Small Sample Size
1.3.2 Class Imbalance
1.3.3 Data Complexity
1.3.4 Dataset Shift
1.3.5 Noisy Data
1.3.6 Outliers
1.3.7 Feature Cost
1.4 A Guide for the Reader
References

2 Foundations of Feature Selection
2.1 Feature Selection
2.1.1 Feature Relevance
2.1.2 Feature Redundancy
2.2 Feature Selection Methods
2.2.1 Filter Methods
2.2.2 Embedded Methods
2.2.3 Wrapper Methods
2.2.4 Other Approaches
2.3 Summary
References

3 A Critical Review of Feature Selection Methods
3.1 Existing Reviews of Feature Selection Methods
3.2 Experimental Settings
3.3 Experimental Results
3.3.1 Dealing with Correlation and Redundancy: CorrAL
3.3.2 Dealing with Nonlinearity: XOR and Parity
3.3.3 Dealing with Noise in the Inputs: Led
3.3.4 Dealing with Noise in the Target: Monk3
3.3.5 Dealing with a Complex Dataset: Madelon
3.4 Case Studies
3.4.1 Case Study I: Different Kernels for SVM-RFE
3.4.2 Case Study II: mRMR vs Md
3.4.3 Case Study III: Subset Filters
3.4.4 Case Study IV: Different Levels of Noise in the Input
3.5 Analysis and Discussion
3.5.1 Analysis of Success Index
3.5.2 Analysis of Classification Accuracy
3.6 Summary
References

4 Feature Selection in DNA Microarray Classification
4.1 Background: The Problem and First Attempts
4.2 Intrinsic Characteristics of Microarray Data
4.2.1 Small Sample Size
4.2.2 Class Imbalance
4.2.3 Data Complexity
4.2.4 Dataset Shift
4.2.5 Outliers
4.3 Algorithms for Feature Selection on Microarray Data: A Review
4.3.1 Filters
4.3.2 Wrappers
4.3.3 Embedded
4.3.4 Other Algorithms
4.4 A Framework for Feature Selection Evaluation in Microarray Datasets
4.4.1 Validation Techniques
4.4.2 On the Datasets Characteristics
4.4.3 Feature Selection Methods
4.4.4 Evaluation Measures
4.5 A Practical Evaluation: Analysis of Results
4.5.1 Holdout Validation Study
4.5.2 Cross-validation Study
4.6 Summary
References

5 Application of Feature Selection to Real Problems
5.1 Classification in Intrusion Detection Systems
5.1.1 Results on the Binary Case
5.1.2 Results on the Multiple Class Case
5.2 Tear Film Lipid Layer Classification
5.2.1 Classification Accuracy
5.2.2 Robustness to Noise
5.2.3 Feature Extraction Time
5.2.4 Overall Analysis
5.2.5 The Concatenation of All Methods with CFS: A Case Study
5.3 Cost-Based Feature Selection
5.3.1 Description of the Method
5.3.2 Experimental Results
5.4 Summary
References

6 Emerging Challenges
6.1 Millions of Dimensions
6.2 Scalability
6.3 Distributed Feature Selection
6.4 Real-Time Processing
6.5 Summary
References

A Experimental Framework Used in This Book
A.1 Software Tools
A.2 Datasets
A.2.1 Data Repositories
A.2.2 Synthetic Datasets
A.2.3 DNA Microarray Datasets
A.3 Validation Techniques
A.3.1 k-Fold Cross-validation
A.3.2 Leave-One-Out Cross-validation
A.3.3 Bootstrap
A.3.4 Holdout Validation
A.4 Statistical Tests
A.5 Discretization Algorithms
A.6 Classification Algorithms
A.6.1 Support Vector Machine, SVM
A.6.2 Proximal Support Vector Machine, PSVM
A.6.3 C4.5
A.6.4 Naive Bayes, NB
A.6.5 k-Nearest Neighbors, k-NN
A.6.6 One-Layer Feedforward Neural Network, One-Layer NN
A.7 Evaluation Measures
A.7.1 Multiple-Criteria Decision-Making
References

Dr. Verónica Bolón-Canedo received her PhD in Computer Science from the University of A Coruña, where she is currently a postdoctoral researcher. Her research interests include data mining, feature selection and machine learning.

Dr. Noelia Sánchez-Maroño received her PhD in 2005 from the University of A Coruña, where she is currently a lecturer. Her research interests include agent-based modeling, machine learning and feature selection.

Prof. Amparo Alonso-Betanzos received her PhD in 1988 from the University of Santiago de Compostela. She is a Chair Professor in the Dept. of Computer Science at the University of A Coruña (Spain) and coordinator of the Laboratory for Research and Development in Artificial Intelligence. Her areas of expertise are machine learning, feature selection, knowledge-based systems, and their applications to fields such as predictive maintenance in engineering and gene expression prediction in bioinformatics.