
E-book: Multivariate Kernel Smoothing and Its Applications

(Universidad de Extremadura, Departamento de Matemáticas, Badajoz, Spain), (University of Paris-North - Paris 13, Computer Science Laboratory, Villetaneuse, France)
  • Format: PDF+DRM
  • Price: 61,10 €*
  • * the price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You also need to create an Adobe ID (more information here). The e-book can be read by 1 user and downloaded to up to 6 devices (all authorised with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, install Adobe Digital Editions. (This is a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer.)

    This e-book cannot be read on an Amazon Kindle.

Kernel smoothing has greatly evolved since its inception to become an essential methodology in the data science tool kit for the 21st century. Its widespread adoption is due to its fundamental role for multivariate exploratory data analysis, as well as the crucial role it plays in composite solutions to complex data challenges.

Multivariate Kernel Smoothing and Its Applications offers a comprehensive overview of both aspects. It begins with a thorough exposition of the approaches to achieve the two basic goals of estimating probability density functions and their derivatives. The focus then turns to the applications of these approaches to more complex data analysis goals, many with a geometric/topological flavour, such as level set estimation, clustering (unsupervised learning), principal curves, and feature significance. Other topics, while not direct applications of density (derivative) estimation but sharing many commonalities with the previous settings, include classification (supervised learning), nearest neighbour estimation, and deconvolution for data observed with error.

For the data scientist, each chapter contains illustrative open-data examples analysed with the most appropriate kernel smoothing method. The emphasis is always on an intuitive understanding of the data, supported by the accompanying statistical visualisations. For readers wishing to investigate the underlying statistical reasoning in more detail, a graduated exposition of a unified theoretical framework is provided. Algorithms for efficient software implementation are also discussed.
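
As an indication of what such an analysis looks like in practice, the following is a minimal sketch (not taken from the book) of a bivariate kernel density estimate with an unconstrained plug-in bandwidth matrix, using the R package ks maintained by one of the authors; the simulated data and contour levels are illustrative assumptions only.

    # Minimal sketch, not from the book: bivariate kernel density estimation
    # with a full (unconstrained) plug-in bandwidth matrix from the ks package.
    library(ks)

    set.seed(1)
    # Simulated two-component sample (assumption, for illustration only)
    x <- rbind(matrix(rnorm(200, mean = 0), ncol = 2),
               matrix(rnorm(200, mean = 3), ncol = 2))

    H <- Hpi(x)                        # plug-in selector for a full bandwidth matrix
    fhat <- kde(x = x, H = H)          # kernel density estimate on a grid
    plot(fhat, cont = c(25, 50, 75))   # probability contours at quartile levels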

José E. Chacón is an associate professor at the Department of Mathematics of the Universidad de Extremadura in Spain. Tarn Duong is a Senior Data Scientist for a start-up which provides short distance carpooling services in France.

Both authors have made important contributions to kernel smoothing research over the last couple of decades.

Reviews

"I am very impressed with this book. It addresses issues that are not discussed in any detail in any other book on density estimation. Furthermore, it is very well-written and contains a wealth of interesting examples. In fact, this is probably one of the best books I have seen on density estimation. Some topics in this book that are not covered in detail in any other book include: multivariate bandwidth matrices, details of the asymptotic MSE for general bandwidth matrices, derivative estimation, level sets, density clustering and significance testing for modal regions. This makes the book unique. The authors have written the book in such a way that it can be used by two different types of readers: data analysts who are not interested in the mathematical details, and students/researchers who do want the details. The `how to read this monograph' is very useful." ~Larry Wasserman, Carnegie Mellon University

"This book provides a comprehensive overview of the fundamental issues and the numerous extensions of multivariate kernel density estimation. There are three core aspects that are discussed. Firstly, the method of kernel density estimation is thoroughly described in the multivariate setting. Secondly, the problem of selecting a bandwidth matrix is discussed, with a comparison of numerous alternatives. Thirdly, the performance and asymptotic properties of the estimators and bandwidth selections are comprehensively reviewed: there is an abundance of information on the (asymptotic) mean (integrated) squared error of various combinations of estimators and bandwidths.

Having examined the above fundamentals, the authors discuss numerous extensions of multivariate kernel density estimation. These include density derivative estimation, level set estimation, density-based clustering, density ridge estimation, feature significance, density difference estimation, and classification. For all of these methods, there is a strong focus on asymptotic performance. There is also advice on, and examples of, providing effective visual communication of results. Guidance on the application of the methods is limited to descriptions of the R commands available in the ks package.

The structure of the book means that all the above methods have accessible explanations, while detailed and thorough mathematical exposition is maintained. Each chapter or section is structured such that the methods are first described and then illustrated, and then the technical mathematical details (including proofs of theorems) are supplied. This brings about the authors' stated aim of the book being useful for data analysts, undergraduates or postgraduates." ~Andrew Duncan A. C. Smith

"Overall, it was a great joy for me to review this book. It was written beautifully. The authors offered many valuable insights on multivariate kernel smoothing, which I found helpful. I am looking forward to having a copy onmy bookshelf and I have no doubt that it will be my research reference book in the future." ~QingWang, Wellesley College "I am very impressed with this book. It addresses issues that are not discussed in any detail in any other book on density estimation. Furthermore, it is very well-written and contains a wealth of interesting examples. In fact, this is probably one of the best books I have seen on density estimation. Some topics in this book that are not covered in detail in any other book include: multivariate bandwidth matrices, details of the asymptotic MSE for general bandwidth matrices, derivative estimation, level sets, density clustering and significance testing for modal regions. This makes the book unique. The authors have written the book in such a way that it can be used by two different types of readers: data analysts who are not interested in the mathematical details, and students/researchers who do want the details. The `how to read this monograph' is very useful." ~Larry Wasserman, Carnegie Mellon University

Table of contents

Preface xiii
List of Figures xv
List of Tables xix
List of Algorithms xxi
1 Introduction 1(10)
1.1 Exploratory data analysis with density estimation 1(3)
1.2 Exploratory data analysis with density derivatives estimation 4(1)
1.3 Clustering/unsupervised learning 5(1)
1.4 Classification/supervised learning 6(1)
1.5 Suggestions on how to read this monograph 7(4)
2 Density estimation 11(32)
2.1 Histogram density estimation 11(3)
2.2 Kernel density estimation 14(5)
2.2.1 Probability contours as multivariate quantiles 16(3)
2.2.2 Contour colour scales 19(1)
2.3 Gains from unconstrained bandwidth matrices 19(4)
2.4 Advice for practical bandwidth selection 23(3)
2.5 Squared error analysis 26(4)
2.6 Asymptotic squared error formulas 30(5)
2.7 Optimal bandwidths 35(1)
2.8 Convergence of density estimators 36(1)
2.9 Further mathematical analysis of density estimators 37(6)
2.9.1 Asymptotic expansion of the mean integrated squared error 37(2)
2.9.2 Asymptotically optimal bandwidth 39(1)
2.9.3 Vector versus vector half parametrisations 40(3)
3 Bandwidth selectors for density estimation 43(24)
3.1 Normal scale bandwidths 44(1)
3.2 Maximal smoothing bandwidths 45(1)
3.3 Normal mixture bandwidths 46(1)
3.4 Unbiased cross validation bandwidths 46(3)
3.5 Biased cross validation bandwidths 49(1)
3.6 Plug-in bandwidths 49(3)
3.7 Smoothed cross validation bandwidths 52(2)
3.8 Empirical comparison of bandwidth selectors 54(6)
3.9 Theoretical comparison of bandwidth selectors 60(1)
3.10 Further mathematical analysis of bandwidth selectors 61(6)
3.10.1 Relative convergence rates of bandwidth selectors 61(3)
3.10.2 Optimal pilot bandwidth selectors 64(1)
3.10.3 Convergence rates with data-based bandwidths 65(2)
4 Modified density estimation 67(22)
4.1 Variable bandwidth density estimators 67(6)
4.1.1 Balloon density estimators 68(1)
4.1.2 Sample point density estimators 69(1)
4.1.3 Bandwidth selectors for variable kernel estimation 70(3)
4.2 Transformation density estimators 73(3)
4.3 Boundary kernel density estimators 76(5)
4.3.1 Beta boundary kernels 76(1)
4.3.2 Linear boundary kernels 77(4)
4.4 Kernel choice 81(2)
4.5 Higher order kernels 83(1)
4.6 Further mathematical analysis of modified density estimators 84(5)
4.6.1 Asymptotic error for sample point variable bandwidth estimators 84(2)
4.6.2 Asymptotic error for linear boundary estimators 86(3)
5 Density derivative estimation 89(38)
5.1 Kernel density derivative estimators 89(7)
5.1.1 Density gradient estimators 90(2)
5.1.2 Density Hessian estimators 92(1)
5.1.3 General density derivative estimators 93(3)
5.2 Gains from unconstrained bandwidth matrices 96(4)
5.3 Advice for practical bandwidth selection 100(2)
5.4 Empirical comparison of bandwidths of different derivative orders 102(1)
5.5 Squared error analysis 103(5)
5.6 Bandwidth selection for density derivative estimators 108(9)
5.6.1 Normal scale bandwidths 109(1)
5.6.2 Normal mixture bandwidths 110(1)
5.6.3 Unbiased cross validation bandwidths 111(1)
5.6.4 Plug-in bandwidths 112(3)
5.6.5 Smoothed cross validation bandwidths 115(2)
5.7 Relative convergence rates of bandwidth selectors 117(1)
5.8 Case study: The normal density 118(6)
5.8.1 Exact MISE 118(1)
5.8.2 Curvature matrix 119(1)
5.8.3 Asymptotic MISE 120(1)
5.8.4 Normal scale bandwidth 121(1)
5.8.5 Asymptotic MSE for curvature estimation 122(2)
5.9 Further mathematical analysis of density derivative estimators 124(3)
5.9.1 Taylor expansions for vector-valued functions 124(1)
5.9.2 Relationship between multivariate normal moments 124(3)
6 Applications related to density and density derivative estimation 127(28)
6.1 Level set estimation 127(8)
6.1.1 Modal region and bump estimation 129(3)
6.1.2 Density support estimation 132(3)
6.2 Density-based clustering 135(8)
6.2.1 Stable/unstable manifolds 136(1)
6.2.2 Mean shift clustering 137(6)
6.2.3 Choice of the normalising matrix in the mean shift 143(1)
6.3 Density ridge estimation 143(6)
6.4 Feature significance 149(6)
7 Supplementary topics in data analysis 155(26)
7.1 Density difference estimation and significance testing 155(4)
7.2 Classification 159(4)
7.3 Density estimation for data measured with error 163(8)
7.3.1 Classical density deconvolution estimation 164(2)
7.3.2 Weighted density deconvolution estimation 166(4)
7.3.3 Manifold estimation 170(1)
7.4 Nearest neighbour estimation 171(7)
7.5 Further mathematical analysis 178(3)
7.5.1 Squared error analysis for deconvolution kernel density estimators 178(1)
7.5.2 Optimal selection of the number of nearest neighbours 179(2)
8 Computational algorithms 181(18)
8.1 R implementation 181(4)
8.2 Approximate binned estimation 185(6)
8.2.1 Approximate density estimation 185(5)
8.2.2 Approximate density derivative and functional estimation 190(1)
8.3 Recursive computation of the normal density derivatives 191(4)
8.4 Recursive computation of the normal functionals 195(2)
8.5 Numerical optimisation over matrix spaces 197(2)
A Notation 199(6)
B Matrix algebra 205(2)
B.1 The Kronecker product 205(1)
B.2 The vec operator 206(1)
B.3 The commutation matrix 206(1)
Bibliography 207(18)
Index 225