
E-book: Advances in Data Science: Symbolic, Complex, and Network Data

Edited by Huiwen Wang, Edwin Diday (Universite de Paris IX - Dauphine, France), Gilbert Saporta and Rong Guan

Other formats

Other digital carrier (Price: 172,90 €) - 15-Jan-2020

Format: EPUB+DRM
Publication date: 09-Jan-2020
Publisher: ISTE Ltd and John Wiley & Sons Inc
Language: eng
ISBN-13: 9781119694960


Price: 171,60 €*
* The price is final; no further discounts apply.
This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

Copying (copy/paste): not allowed

Printing: not allowed

Usage:

Digital rights management (DRM)
The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

Required software
To read on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

Data science unifies statistics, data analysis and machine learning to achieve a better understanding of the masses of data which are produced today, and to improve prediction. Special kinds of data (symbolic, network, complex, compositional) are increasingly frequent in data science. These data require specific methodologies, but there is a lack of reference works in this field.

Advances in Data Science fills this gap. It presents a collection of up-to-date contributions by eminent scholars following two international workshops held in Beijing and Paris. The 10 chapters are organized into four parts: Symbolic Data, Complex Data, Network Data and Clustering. They include fundamental contributions, as well as applications to several domains, including business and the social sciences.

Preface

Part 1 Symbolic Data


Chapter 1 Explanatory Tools for Machine Learning in the Symbolic Data Analysis Framework

Edwin Diday

1.1 Introduction

1.2 Introduction to Symbolic Data Analysis

1.2.1 What are complex data?

1.2.2 What are "classes" and "class of complex data"?

1.2.3 Which kind of class variability?

1.2.4 What are "symbolic variables" and "symbolic data tables"?

1.2.5 Symbolic Data Analysis (SDA)

1.3 Symbolic data tables from Dynamic Clustering Method and EM

1.3.1 The "dynamical clustering method" (DCM)

1.3.2 Examples of DCM applications

1.3.3 Clustering methods by mixture decomposition

1.3.4 Symbolic data tables from clustering

1.3.5 A general way to compare results of clustering methods by the "explanatory power" of their associated symbolic data table

1.3.6 Quality criteria of classes and variables based on the cells of the symbolic data table containing intervals or inferred distributions

1.4 Criteria for ranking individuals, classes and their bar chart descriptive symbolic variables

1.4.1 A theoretical framework for SDA

1.4.2 Characterization of a category and a class by a measure of discordance

1.4.3 Link between a characterization by the criteria W and the standard Tf-Idf

1.4.4 Ranking the individuals, the symbolic variables and the classes of a bar chart symbolic data table

1.5 Two directions of research

1.5.1 Parametrization of concordance and discordance criteria

1.5.2 Improving the explanatory power of any machine learning tool by a filtering process

1.6 Conclusion

1.7 References

Chapter 2 Likelihood in the Symbolic Context

Richard Emilion

Edwin Diday

2.1 Introduction

2.2 Probabilistic setting

2.2.1 Description variable and class variable

2.2.2 Conditional distributions

2.2.3 Symbolic variables

2.2.4 Examples

2.2.5 Probability measures on (C, C), likelihood

2.3 Parametric models for p = 1

2.3.1 LDA model

2.3.2 BLS method

2.3.3 Interval-valued variables

2.3.4 Probability vectors and histogram-valued variables

2.4 Nonparametric estimation for p = 1

2.4.1 Multihistograms and multivariate polygons

2.4.2 Dirichlet kernel mixtures

2.4.3 Dirichlet Process Mixture (DPM)

2.5 Density models for p ≥ 2

2.6 Conclusion

2.7 References

Chapter 3 Dimension Reduction and Visualization of Symbolic Interval-Valued Data Using Sliced Inverse Regression

Han-Ming Wu

Chiun-How Kao

Chun-houh Chen

3.1 Introduction

3.2 PCA for interval-valued data and the sliced inverse regression

3.2.1 PCA for interval-valued data

3.2.2 Classic SIR

3.3 SIR for interval-valued data

3.3.1 Quantification approaches

3.3.2 Distributional approaches

3.4 Projections and visualization in DR subspace

3.4.1 Linear combinations of intervals

3.4.2 The graphical representation of the projected intervals in the 2D DR subspace

3.5 Some computational issues

3.5.1 Standardization of interval-valued data

3.5.2 The slicing schemes for iSIR

3.5.3 The evaluation of DR components

3.6 Simulation studies

3.6.1 Scenario 1: aggregated data

3.6.2 Scenario 2: data based on interval arithmetic

3.6.3 Results

3.7 A real data example: face recognition data

3.8 Conclusion and discussion

3.9 References

Chapter 4 On the "Complexity" of Social Reality. Some Reflections About the Use of Symbolic Data Analysis in Social Sciences

Frederic Lebaron

4.1 Introduction

4.2 Social sciences facing "complexity"

4.2.1 The total social fact, a designation of "complexity" in social sciences

4.2.2 Two families of answers

4.2.3 The contemporary deepening of the two approaches, "reductionist" and "encompassing"

4.2.4 Issues of scale and heterogeneity

4.3 Symbolic data analysis in the social sciences: an example

4.3.1 Symbolic data analysis

4.3.2 An exploratory case study on European data

4.3.3 A sociological interpretation

4.4 Conclusion

4.5 References

Part 2 Complex Data


Chapter 5 A Spatial Dependence Measure and Prediction of Georeferenced Data Streams Summarized by Histograms

Rosanna Verde

Antonio Balzanella

5.1 Introduction

5.2 Processing setup

5.3 Main definitions

5.4 Online summarization of a data stream through CluStream for Histogram data

5.5 Spatial dependence monitoring: a variogram for histogram data

5.6 Ordinary kriging for histogram data

5.7 Experimental results on real data

5.8 Conclusion

5.9 References

Chapter 6 Incremental Calculation Framework for Complex Data

Huiwen Wang

Yuan Wei

Siyang Wang

6.1 Introduction

6.2 Basic data

6.2.1 The basic data space

6.2.2 Sample covariance matrix

6.3 Incremental calculation of complex data

6.3.1 Transformation of complex data

6.3.2 Online decomposition of covariance matrix

6.3.3 Adopted algorithms

6.4 Simulation studies

6.4.1 Functional linear regression

6.4.2 Compositional PCA

6.5 Conclusion

6.6 Acknowledgment

6.7 References

Part 3 Network Data


Chapter 7 Recommender Systems and Attributed Networks

Francoise Fogelman-Soulie

Lanxiang Mei

Jianyu Zhang

Yiming Li

Wen Ge

Yinglan Li

Qiaofei Ye

7.1 Introduction

7.2 Recommender systems

7.2.1 Data used

7.2.2 Model-based collaborative filtering

7.2.3 Neighborhood-based collaborative filtering

7.2.4 Hybrid models

7.3 Social networks

7.3.1 Non-independence

7.3.2 Definition of a social network

7.3.3 Properties of social networks

7.3.4 Bipartite networks

7.3.5 Multilayer networks

7.4 Using social networks for recommendation

7.4.1 Social filtering

7.4.2 Extension to use attributes

7.4.3 Remarks

7.5 Experiments

7.5.1 Performance evaluation

7.5.2 Datasets

7.5.3 Analysis of one-mode projected networks

7.5.4 Models evaluated

7.5.5 Results

7.6 Perspectives

7.7 References

Chapter 8 Attributed Networks Partitioning Based on Modularity Optimization

David Combe

Christine Largeron

Baptiste Jeudy

Francoise Fogelman-Soulie

Jing Wang

8.1 Introduction

8.2 Related work

8.3 Inertia based modularity

8.4 I-Louvain

8.5 Incremental computation of the modularity gain

8.6 Evaluation of I-Louvain method

8.6.1 Performance of I-Louvain on artificial datasets

8.6.2 Run-time of I-Louvain

8.7 Conclusion

8.8 References

Part 4 Clustering


Chapter 9 A Novel Clustering Method with Automatic Weighting of Tables and Variables

Rodrigo C. de Araujo

Francisco de Assis Tenorio de Carvalho

Yves Lechevallier

9.1 Introduction

9.2 Related Work

9.3 Definitions, notations and objective

9.3.1 Choice of distances

9.3.2 Criterion W measures the homogeneity of the partition P on the set of tables

9.3.3 Optimization of the criterion W

9.4 Hard clustering with automated weighting of tables and variables

9.4.1 Clustering algorithms MND-W and MND-WT

9.5 Applications: UCI data sets

9.5.1 Application I: Iris plant

9.5.2 Application II: multi-features dataset

9.6 Conclusion

9.7 References

Chapter 10 Clustering and Generalized ANOVA for Symbolic Data Constructed from Open Data

Simona Korenjak-Cerne

Natasa Kejzar

Vladimir Batagelj

10.1 Introduction

10.2 Data description based on discrete (membership) distributions

10.3 Clustering

10.3.1 TIMSS – study of teaching approaches

10.3.2 Clustering countries based on age-sex distributions of their populations

10.4 Generalized ANOVA

10.5 Conclusion

10.6 References

List of Authors


Index


Edwin Diday is Emeritus Professor at Paris-Dauphine University-PSL. He helped to introduce the symbolic data analysis paradigm and the dynamic clustering method (opening the path to local models), as well as pyramidal clustering for spatial representation of overlapping clusters.

Rong Guan is Associate Professor at the School of Statistics and Mathematics, Central University of Finance and Economics, Beijing. Her research covers complex and symbolic data analysis and financial distress diagnosis.

Gilbert Saporta is Emeritus Professor at Conservatoire National des Arts et Métiers, France. His current research focuses on functional data analysis and clusterwise and sparse methods. He is Honorary President of the French Statistical Society.

Huiwen Wang is Professor at the School of Economics and Management, Beihang University, Beijing. Her research covers dimension reduction, PLS regression, symbolic data analysis, compositional data analysis, functional data analysis and statistical modeling methods for mixed data.


Permalink: https://www.kriso.ee/db/97811196949606e.html
