Preface |
|
xi | |
|
|
1 | (98) |
|
Chapter 1 Explanatory Tools for Machine Learning in the Symbolic Data Analysis Framework |
|
|
3 | (28) |
|
|
|
4 | (2) |
|
1.2 Introduction to Symbolic Data Analysis |
|
|
6 | (4) |
|
1.2.1 What are complex data? |
|
|
6 | (1) |
|
1.2.2 What are "classes" and "class of complex data"? |
|
|
7 | (1) |
|
1.2.3 Which kind of class variability? |
|
|
7 | (1) |
|
1.2.4 What are "symbolic variables" and "symbolic data tables"? |
|
|
7 | (2) |
|
1.2.5 Symbolic Data Analysis (SDA) |
|
|
9 | (1) |
|
1.3 Symbolic data tables from Dynamic Clustering Method and EM |
|
|
10 | (6) |
|
1.3.1 The "dynamical clustering method" (DCM) |
|
|
10 | (1) |
|
1.3.2 Examples of DCM applications |
|
|
10 | (2) |
|
1.3.3 Clustering methods by mixture decomposition |
|
|
12 | (1) |
|
1.3.4 Symbolic data tables from clustering |
|
|
13 | (2) |
|
1.3.5 A general way to compare results of clustering methods by the "explanatory power" of their associated symbolic data table |
|
|
15 | (1) |
|
1.3.6 Quality criteria of classes and variables based on the cells of the symbolic data table containing intervals or inferred distributions |
|
|
15 | (1) |
|
1.4 Criteria for ranking individuals, classes and their bar chart descriptive symbolic variables |
|
|
16 | (7) |
|
1.4.1 A theoretical framework for SDA |
|
|
16 | (2) |
|
1.4.2 Characterization of a category and a class by a measure of discordance |
|
|
18 | (1) |
|
1.4.3 Link between a characterization by the criteria W and the standard Tf-Idf |
|
|
19 | (2) |
|
1.4.4 Ranking the individuals, the symbolic variables and the classes of a bar chart symbolic data table |
|
|
21 | (2) |
|
1.5 Two directions of research |
|
|
23 | (4) |
|
1.5.1 Parametrization of concordance and discordance criteria |
|
|
23 | (2) |
|
1.5.2 Improving the explanatory power of any machine learning tool by a filtering process |
|
|
25 | (2) |
|
|
27 | (1) |
|
|
28 | (3) |
|
Chapter 2 Likelihood in the Symbolic Context |
|
|
31 | (18) |
|
|
|
|
31 | (1) |
|
2.2 Probabilistic setting |
|
|
32 | (6) |
|
2.2.1 Description variable and class variable |
|
|
32 | (1) |
|
2.2.2 Conditional distributions |
|
|
33 | (1) |
|
|
33 | (2) |
|
|
35 | (2) |
|
2.2.5 Probability measures on (C, 𝒞), likelihood |
|
|
37 | (1) |
|
2.3 Parametric models for p = 1 |
|
|
38 | (7) |
|
|
38 | (3) |
|
|
41 | (1) |
|
2.3.3 Interval-valued variables |
|
|
42 | (1) |
|
2.3.4 Probability vectors and histogram-valued variables |
|
|
42 | (3) |
|
2.4 Nonparametric estimation for p = 1 |
|
|
45 | (1) |
|
2.4.1 Multihistograms and multivariate polygons |
|
|
45 | (1) |
|
2.4.2 Dirichlet kernel mixtures |
|
|
45 | (1) |
|
2.4.3 Dirichlet Process Mixture (DPM) |
|
|
45 | (1) |
|
2.5 Density models for p ≥ 2 |
|
|
46 | (1) |
|
|
46 | (1) |
|
|
47 | (2) |
|
Chapter 3 Dimension Reduction and Visualization of Symbolic Interval-Valued Data Using Sliced Inverse Regression |
|
|
49 | (30) |
|
|
|
|
|
49 | (2) |
|
3.2 PCA for interval-valued data and the sliced inverse regression |
|
|
51 | (2) |
|
3.2.1 PCA for interval-valued data |
|
|
51 | (1) |
|
|
52 | (1) |
|
3.3 SIR for interval-valued data |
|
|
53 | (5) |
|
3.3.1 Quantification approaches |
|
|
54 | (2) |
|
3.3.2 Distributional approaches |
|
|
56 | (2) |
|
3.4 Projections and visualization in DR subspace |
|
|
58 | (3) |
|
3.4.1 Linear combinations of intervals |
|
|
58 | (1) |
|
3.4.2 The graphical representation of the projected intervals in the 2D DR subspace |
|
|
59 | (2) |
|
3.5 Some computational issues |
|
|
61 | (2) |
|
3.5.1 Standardization of interval-valued data |
|
|
61 | (1) |
|
3.5.2 The slicing schemes for iSIR |
|
|
62 | (1) |
|
3.5.3 The evaluation of DR components |
|
|
62 | (1) |
|
|
63 | (2) |
|
3.6.1 Scenario 1: aggregated data |
|
|
63 | (1) |
|
3.6.2 Scenario 2: data based on interval arithmetic |
|
|
63 | (1) |
|
|
64 | (1) |
|
3.7 A real data example: face recognition data |
|
|
65 | (8) |
|
3.8 Conclusion and discussion |
|
|
73 | (1) |
|
|
74 | (5) |
|
Chapter 4 On the "Complexity" of Social Reality. Some Reflections About the Use of Symbolic Data Analysis in Social Sciences |
|
|
79 | (20) |
|
|
|
79 | (1) |
|
4.2 Social sciences facing "complexity" |
|
|
80 | (3) |
|
4.2.1 The total social fact, a designation of "complexity" in social sciences |
|
|
80 | (1) |
|
4.2.2 Two families of answers |
|
|
80 | (1) |
|
4.2.3 The contemporary deepening of the two approaches, "reductionist" and "encompassing" |
|
|
81 | (1) |
|
4.2.4 Issues of scale and heterogeneity |
|
|
82 | (1) |
|
4.3 Symbolic data analysis in the social sciences: an example |
|
|
83 | (12) |
|
4.3.1 Symbolic data analysis |
|
|
83 | (1) |
|
4.3.2 An exploratory case study on European data |
|
|
83 | (11) |
|
4.3.3 A sociological interpretation |
|
|
94 | (1) |
|
|
95 | (1) |
|
|
96 | (3) |
|
|
99 | (40) |
|
Chapter 5 A Spatial Dependence Measure and Prediction of Georeferenced Data Streams Summarized by Histograms |
|
|
101 | (18) |
|
|
|
|
101 | (2) |
|
|
103 | (1) |
|
|
104 | (2) |
|
5.4 Online summarization of a data stream through CluStream for Histogram data |
|
|
106 | (1) |
|
5.5 Spatial dependence monitoring: a variogram for histogram data |
|
|
107 | (3) |
|
5.6 Ordinary kriging for histogram data |
|
|
110 | (2) |
|
5.7 Experimental results on real data |
|
|
112 | (4) |
|
|
116 | (1) |
|
|
116 | (3) |
|
Chapter 6 Incremental Calculation Framework for Complex Data |
|
|
119 | (20) |
|
|
|
|
|
119 | (3) |
|
|
122 | (2) |
|
6.2.1 The basic data space |
|
|
122 | (1) |
|
6.2.2 Sample covariance matrix |
|
|
123 | (1) |
|
6.3 Incremental calculation of complex data |
|
|
124 | (7) |
|
6.3.1 Transformation of complex data |
|
|
124 | (1) |
|
6.3.2 Online decomposition of covariance matrix |
|
|
125 | (3) |
|
|
128 | (3) |
|
|
131 | (4) |
|
6.4.1 Functional linear regression |
|
|
131 | (2) |
|
|
133 | (2) |
|
|
135 | (1) |
|
|
135 | (1) |
|
|
135 | (4) |
|
|
139 | (48) |
|
Chapter 7 Recommender Systems and Attributed Networks |
|
|
141 | (28) |
|
Françoise Fogelman-Soulié |
|
|
|
|
|
|
|
|
|
141 | (1) |
|
|
142 | (8) |
|
|
143 | (2) |
|
7.2.2 Model-based collaborative filtering |
|
|
145 | (1) |
|
7.2.3 Neighborhood-based collaborative filtering |
|
|
145 | (3) |
|
|
148 | (2) |
|
|
150 | (4) |
|
|
150 | (1) |
|
7.3.2 Definition of a social network |
|
|
150 | (1) |
|
7.3.3 Properties of social networks |
|
|
151 | (1) |
|
|
152 | (1) |
|
7.3.5 Multilayer networks |
|
|
153 | (1) |
|
7.4 Using social networks for recommendation |
|
|
154 | (2) |
|
|
154 | (1) |
|
7.4.2 Extension to use attributes |
|
|
155 | (1) |
|
|
156 | (1) |
|
|
156 | (7) |
|
7.5.1 Performance evaluation |
|
|
156 | (1) |
|
|
157 | (1) |
|
7.5.3 Analysis of one-mode projected networks |
|
|
158 | (2) |
|
|
160 | (1) |
|
|
160 | (3) |
|
|
163 | (1) |
|
|
163 | (6) |
|
Chapter 8 Attributed Networks Partitioning Based on Modularity Optimization |
|
|
169 | (18) |
|
|
|
|
Françoise Fogelman-Soulié |
|
|
|
|
169 | (2) |
|
|
171 | (1) |
|
8.3 Inertia based modularity |
|
|
172 | (2) |
|
|
174 | (2) |
|
8.5 Incremental computation of the modularity gain |
|
|
176 | (3) |
|
8.6 Evaluation of I-Louvain method |
|
|
179 | (2) |
|
8.6.1 Performance of I-Louvain on artificial datasets |
|
|
179 | (1) |
|
8.6.2 Run-time of I-Louvain |
|
|
180 | (1) |
|
|
181 | (1) |
|
|
182 | (5) |
|
|
187 | (42) |
|
Chapter 9 A Novel Clustering Method with Automatic Weighting of Tables and Variables |
|
|
189 | (20) |
|
|
Francisco de Assis Tenorio de Carvalho |
|
|
|
|
189 | (1) |
|
|
190 | (1) |
|
9.3 Definitions, notations and objective |
|
|
191 | (5) |
|
9.3.1 Choice of distances |
|
|
192 | (1) |
|
9.3.2 Criterion W measures the homogeneity of the partition P on the set of tables |
|
|
193 | (2) |
|
9.3.3 Optimization of the criterion W |
|
|
195 | (1) |
|
9.4 Hard clustering with automated weighting of tables and variables |
|
|
196 | (5) |
|
9.4.1 Clustering algorithms MND-W and MND-WT |
|
|
196 | (5) |
|
9.5 Applications: UCI data sets |
|
|
201 | (5) |
|
9.5.1 Application I: Iris plant |
|
|
201 | (3) |
|
9.5.2 Application II: multi-features dataset |
|
|
204 | (2) |
|
|
206 | (1) |
|
|
206 | (3) |
|
Chapter 10 Clustering and Generalized ANOVA for Symbolic Data Constructed from Open Data |
|
|
209 | (20) |
|
|
|
|
|
209 | (1) |
|
10.2 Data description based on discrete (membership) distributions |
|
|
210 | (2) |
|
|
212 | (9) |
|
10.3.1 TIMSS – study of teaching approaches |
|
|
215 | (2) |
|
10.3.2 Clustering countries based on age–sex distributions of their populations |
|
|
217 | (4) |
|
|
221 | (4) |
|
|
225 | (1) |
|
|
226 | (3) |
List of Authors |
|
229 | (4) |
Index |
|
233 | |