
E-book: Robust Methods for Data Reduction

5.00/5 (1 rating from Goodreads)

Alessio Farcomeni (Sapienza -- University of Rome, Rome, Italy), Luca Greco (University of Sannio, Benevento, Italy)

Format: 297 pages
Publication date: 13-Jan-2016
Publisher: CRC Press Inc
Language: eng
ISBN-13: 9781466590632


Format: PDF+DRM
Price: 59.79 €*
* The price is final, i.e. no further discounts apply.
This e-book is intended for personal use only. E-books cannot be returned.


DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage:

Digital rights management (DRM)
The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

Required software
To read on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

Robust Methods for Data Reduction gives a non-technical overview of robust data reduction techniques, encouraging the use of these important and useful methods in practical applications. The main areas covered include principal components analysis, sparse principal component analysis, canonical correlation analysis, factor analysis, clustering, double clustering, and discriminant analysis.

The first part of the book illustrates how dimension reduction techniques synthesize available information by reducing the dimensionality of the data. The second part focuses on cluster and discriminant analysis. The authors explain how to perform sample reduction by finding groups in the data.

Despite considerable theoretical achievements, robust methods are not often used in practice. This book fills the gap between theoretical robust techniques and the analysis of real data sets in the area of data reduction. Using real examples, the authors show how to implement the procedures in R. The code and data for the examples are available on the book’s CRC Press web page.

Reviews

"... this book tries to avoid technicalities and focuses on illustrating the power of robust techniques in action. Additionally, it covers some novel techniques, involving data reduction ... An important concept addressed in Part 2 of the book is independent cell-wise contamination. A large number of variables and a relatively small number of cases are commonplace in modern statistical applications. The proposed snipping methodology is tailored to be applied in the presence of cell-wise contamination, and from my point of view, is one of the principal methodological contributions of the book. In summary, this book is interesting and useful. The book is not an attempt to systematically review all the literature in robust data reduction. However, it proposes a selection of techniques that are simple to understand or to use in practice." Luis Angel García Escudero, Dpto. de Estadística e I. O., Universidad de Valladolid, in Biometrics, June 2017

"'Robust Methods for Data Reduction' makes it easy for practitioners of big-data analytics to conduct robust and efficient data reduction. It is a timely topic in which recently prescribed algorithms and methodological research findings are properly assimilated and presented in a lucid fashion. The book serves as a good introductory book that motivates and teaches the art of developing robust frameworks for synthesis and reduction of large, complex datasets ... The most appealing aspect of this book is that all of the concepts and algorithms described are inspired by real-data examples. All of the methods presented in this book are accompanied by extensive codes and exhaustive documentation on how to implement them in the R computing environment. Readers can download the data and the computer code used in the book from the publisher's webpage ... The collection of data examples and the pedagogical writing style make it an ideal text for instructors aiming to quickly train students on proper data-reduction techniques ... This book will be particularly useful for courses with R labs. It is bound to find a wide and enduring readership and will be a valuable addition to the library of any data scientist." Gourab Mukherjee, University of Southern California, in Journal of the American Statistical Association, Volume 111, 2016

Preface
Authors
List of Figures
List of Tables
List of Examples and R illustrations
Symbol Description

1 Introduction and Overview
1.1 What is contamination?
1.2 Evaluating robustness
1.2.1 Consistency
1.2.2 Local robustness: the influence function
1.2.3 Global robustness: the breakdown point
1.2.4 Global robustness: the maximum bias
1.3 What is data reduction?
1.3.1 Dimension reduction
1.3.2 Sample reduction
1.4 An overview of robust dimension reduction
1.5 An overview of robust sample reduction
1.6 Example datasets
1.6.1 G8 macroeconomic data
1.6.2 Handwritten digits data
1.6.3 Automobile data
1.6.4 Metallic oxide data
1.6.5 Spam detection data
1.6.6 Video surveillance data
1.6.7 Water treatment plant data

2 Multivariate Estimation Methods
2.1 Robust univariate methods
2.1.1 M estimators
2.1.2 Huber estimator
2.1.3 Redescending M estimators
2.1.4 Scale estimators
2.1.5 Measuring outlyingness
2.2 Classical multivariate estimation
2.3 Robust multivariate estimation
2.3.1 Multivariate M estimators
2.3.2 Multivariate S estimators
2.3.3 Multivariate MM estimators
2.3.4 Minimum Covariance Determinant
2.3.5 Reweighted MCD
2.3.6 Other multivariate estimators
2.4 Identification of multivariate outliers
2.4.1 Multiple testing strategy
2.5 Examples
2.5.1 Italian demographics data
2.5.2 Star cluster CYG OB1 data
2.5.3 Butterfly data

Part I Dimension Reduction
Introduction to Dimension Reduction

3 Principal Component Analysis
3.1 Classical PCA
3.2 PCA based on robust covariance estimation
3.3 PCA based on projection pursuit
3.4 Spherical PCA
3.5 PCA in high dimensions
3.6 Outlier identification using principal components
3.7 Examples
3.7.1 Automobile data
3.7.2 Octane data
3.7.3 Video surveillance data

4 Sparse Robust PCA
4.1 Basic concepts and sPCA
4.2 Robust sPCA
4.3 Choice of the degree of sparsity
4.4 Sparse projection pursuit
4.5 Examples
4.5.1 Automobile data
4.5.2 Octane data

5 Canonical Correlation Analysis
5.1 Classical canonical correlation analysis
5.1.1 Interpretation of the results
5.1.2 Selection of the number of canonical variables
5.2 CCA based on robust covariance estimation
5.3 Other methods
5.4 Examples
5.4.1 Linnerud data
5.4.2 Butterfly data

6 Factor Analysis
6.1 The FA model
6.1.1 Fitting the FA model
6.2 Robust factor analysis
6.3 Examples
6.3.1 Automobile data
6.3.2 Butterfly data

Part II Sample Reduction
Introduction to Sample Reduction

7 k-means and Model-Based Clustering
7.1 A brief overview of applications of cluster analysis
7.2 Basic concepts
7.3 k-means
7.4 Model-based clustering
7.4.1 Likelihood inference
7.4.2 Distribution of component densities
7.4.3 Examples of model-based clustering
7.5 Choosing the number of clusters

8 Robust Clustering
8.1 Partitioning Around Medoids
8.2 Trimmed k-means
8.2.1 The double minimization problem involved with trimmed k-means
8.3 Snipped k-means
8.3.1 Snipping and the component-wise contamination model
8.3.2 Minimization of the loss function for snipped k-means
8.4 Choosing the trimming and snipping levels
8.5 Examples
8.5.1 Metallic oxide data
8.5.2 Handwritten digits data

9 Robust Model-Based Clustering
9.1 Robust heterogeneous clustering based on trimming
9.1.1 A robust CEM for model estimation: the tclust algorithm
9.1.2 Properties
9.2 Robust heterogeneous clustering based on snipping
9.2.1 A robust CEM for model estimation: the sclust algorithm
9.2.2 Properties
9.3 Examples
9.3.1 Metallic oxide data
9.3.2 Water treatment plant data

10 Double Clustering
10.1 Double k-means
10.2 Trimmed double k-means
10.3 Snipped double k-means
10.4 Robustness properties

11 Discriminant Analysis
11.1 Classical discriminant analysis
11.2 Robust discriminant analysis

A Use of the Software R for Data Reduction
A.1 Multivariate estimation methods
A.2 Robust PCA
A.3 Sparse robust PCA
A.4 Canonical correlation analysis
A.5 Factor analysis
A.6 Classical k-means and model based clustering
A.7 Robust clustering
A.8 Robust double clustering
A.9 Discriminant analysis

Bibliography
Index

Alessio Farcomeni is an assistant professor in the Department of Public Health and Infectious Diseases at the University of Rome Sapienza. His work focuses on robust statistics, longitudinal models, categorical data analysis, cluster analysis, and multiple testing. He also is involved in clinical, ecological, and econometric research.

Luca Greco is an assistant professor in the Department of Law, Economics, Management and Quantitative Methods at the University of Sannio. His research interests include robust statistics, likelihood asymptotics, pseudolikelihood functions, and skew elliptical distributions.


Permalink: https://www.kriso.ee/db/97814665906322e.html
