E-book: Symbolic Data Analysis: Conceptual Statistics and Data Mining [Wiley Online]

Lynne Billard (University of Georgia, USA), Edwin Diday (Université de Paris IX - Dauphine, France)
  • Wiley Online
  • Price: 134,28 €*
  • * Price that provides access for an unlimited number of simultaneous users for an unlimited time
Billard (statistics, U. of Georgia) and Diday (computer science and mathematics, U. Paris Dauphine) offer a textbook introducing symbolic data and methods for analyzing it. The readers they have in mind are primarily working statisticians and data analysts, but also scientists working with large volumes of data from a range of disciplines and graduate students in statistical data analysis courses. The example data-sets and the computational software are both available free on the internet. Annotation ©2007 Book News, Inc., Portland, OR (booknews.com)

With the advent of computers, very large datasets have become routine. Standard statistical methods don’t have the power or flexibility to analyse these efficiently, and extract the required knowledge. An alternative approach is to summarize a large dataset in such a way that the resulting summary dataset is of a manageable size and yet retains as much of the knowledge in the original dataset as possible. One consequence of this is that the data may no longer be formatted as single values, but be represented by lists, intervals, distributions, etc. The summarized data have their own internal structure, which must be taken into account in any analysis.
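The summarization idea described above can be sketched in a few lines. This is only an illustrative sketch, not the book's method or the SODAS software: the records, the field names (`city`, `age`, `transport`), and the `summarize` helper are all hypothetical. It assumes one simple aggregation scheme, where each group of individual records is collapsed into one summary row, a numeric variable becomes an interval [min, max], and a categorical variable becomes a distribution of relative frequencies.

```python
from collections import Counter, defaultdict

# Hypothetical classical dataset: one row per individual, single values only.
records = [
    {"city": "A", "age": 23, "transport": "bus"},
    {"city": "A", "age": 41, "transport": "car"},
    {"city": "A", "age": 35, "transport": "bus"},
    {"city": "B", "age": 52, "transport": "car"},
    {"city": "B", "age": 47, "transport": "car"},
]


def summarize(rows, group_key, numeric_key, categorical_key):
    """Collapse rows into one summary description per group.

    The numeric variable is reduced to an interval (min, max) and the
    categorical variable to a dict of relative frequencies.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    summary = {}
    for name, members in groups.items():
        values = [m[numeric_key] for m in members]
        counts = Counter(m[categorical_key] for m in members)
        summary[name] = {
            # Interval-valued: the range of values observed in the group.
            numeric_key: (min(values), max(values)),
            # Distribution-valued: share of the group in each category.
            categorical_key: {k: c / len(members) for k, c in counts.items()},
        }
    return summary


symbolic = summarize(records, "city", "age", "transport")
for city, description in symbolic.items():
    print(city, description)
```

Five individual rows become two summary rows, and each cell now holds an interval or a distribution rather than a single value. That internal structure is what a subsequent analysis has to take into account.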

This text presents a unified account of symbolic data, how they arise, and how they are structured. The reader is introduced to symbolic analytic methods described in the consistent statistical framework required to carry out such a summary and subsequent analysis.

  • Presents a detailed overview of the methods and applications of symbolic data analysis.
  • Includes numerous real examples, taken from a variety of application areas, ranging from health and social sciences, to economics and computing.
  • Features exercises at the end of each chapter, enabling the reader to develop their understanding of the theory.
  • Provides a supplementary website featuring links to download the SODAS software developed exclusively for symbolic data analysis, data sets, and further material.

Primarily aimed at statisticians and data analysts, Symbolic Data Analysis is also ideal for scientists working on problems involving large volumes of data from a range of disciplines, including computer science, health and the social sciences. There is also much of use to graduate students of statistical data analysis courses.

1. Introduction.
References.
2. Symbolic Data.
2.1 Symbolic and Classical Data.
2.1.1 Types of Data.
2.1.2 Dependencies in the Data.
2.2 Categories, Concepts and Symbolic Objects.
2.2.1 Preliminaries.
2.2.2 Descriptions, Assertions, Extents.
2.2.3 Concepts of Concepts.
2.2.4 Some Philosophical Aspects.
2.2.5 Fuzzy, Imprecise, and Conjunctive Data.
2.3 Comparison of Symbolic and Classical Analysis.
Exercises.
References.
Tables.
Figures.
3. Basic Descriptive Statistics: One Variate.
3.1 Some Preliminaries.
3.2 Multi-valued Variables.
3.3 Interval-valued Variables.
3.4 Multi-valued Modal variables.
3.5 Interval-valued Modal Variables.
Exercises.
References.
Tables.
Figures.
4. Basic Descriptive Statistics: Two or More Variates.
4.1 Multi-valued Variables.
4.2 Interval-valued Variables.
4.3 Modal Multi-valued Variables.
4.4 Modal Interval-valued Variables.
4.5 Baseball Interval-valued Dataset.
4.5.1 The Data: Actual and Virtual Datasets.
4.5.2 Joint Histograms.
4.5.3 Guiding Principles.
4.6 Measures of Dependence.
4.6.1 Moment Dependence.
4.6.2 Spearman’s rho and copulas.
Exercises.
References.
Tables.
Figures.
5. Principal Component Analysis.
5.1 Vertices Method.
5.2 Centers Method.
5.3 Comparison of the Methods.
Exercises.
References.
Tables.
Figures.
6. Regression Analysis.
6.1 Classical Multiple Regression Model.
6.2 Multi-valued Variables.
6.2.1 Single Dependent Variable.
6.2.2 Multi-valued Dependent Variable.
6.3 Interval-valued Variables.
6.4 Histogram-valued Variables.
6.5 Taxonomy Variables.
6.6 Hierarchical Variables.
Exercises.
References.
Tables.
Figures.
7. Cluster Analysis.
7.1 Dissimilarity and Distance Measures.
7.1.1 Basic Definitions.
7.1.2 Multi-valued Variables.
7.1.3 Interval-valued Variables.
7.1.4 Mixed-valued Variables.
7.2 Clustering Structures.
7.2.1 Types of Clusters: Definitions.
7.2.2 Construction of Clusters: Building Algorithms.
7.3 Partitions.
7.4 Hierarchy-Divisive Clustering.
7.4.1 Some Basics.
7.4.2 Multi-valued Variables.
7.4.3 Interval-valued Variables.
7.5 Hierarchy-Pyramid Clusters.
7.5.1 Some Basics.
7.5.2 Comparison of Hierarchy and Pyramid Structures.
7.5.3 Construction of Pyramids.
Exercises.
References.
Tables.
Figures.


Lynne Billard is a multi-award-winning University Professor of Statistics at the University of Georgia, USA. Her areas of interest include epidemic theory, AIDS, time series, sequential analysis, and symbolic data. A former President of the American Statistical Association as well as the ENAR Regional President and International President of the International Biometric Society, Professor Billard has co-edited 6 books, published over 150 papers and been actively involved in many statistical societies and national committees.

Edwin Diday is a Professor in Computer Science and Mathematics at the Université Paris Dauphine, France. He is the author or editor of 14 previous books. He is also the founder of the symbolic data analysis field, and has led numerous international research teams in the area.