
E-book: Multivariate Kernel Smoothing and Its Applications

(Universidad de Extremadura, Departamento de Matemáticas, Badajoz, Spain), (University of Paris-North - Paris 13, Computer Science Laboratory, Villetaneuse, France)
  • Format: PDF+DRM
  • Price: 61,10 €*
  • * the price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You also need to create an Adobe ID (more information here). The e-book can be read by 1 user and downloaded to up to 6 devices (all authorised with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, install Adobe Digital Editions. (This is a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer.)

    This e-book cannot be read on an Amazon Kindle.

Kernel smoothing has greatly evolved since its inception to become an essential methodology in the data science tool kit for the 21st century. Its widespread adoption is due to its fundamental role for multivariate exploratory data analysis, as well as the crucial role it plays in composite solutions to complex data challenges.

Multivariate Kernel Smoothing and Its Applications offers a comprehensive overview of both aspects. It begins with a thorough exposition of the approaches to achieve the two basic goals of estimating probability density functions and their derivatives. The focus then turns to the applications of these approaches to more complex data analysis goals, many with a geometric/topological flavour, such as level set estimation, clustering (unsupervised learning), principal curves, and feature significance. Other topics, while not direct applications of density (derivative) estimation but sharing many commonalities with the previous settings, include classification (supervised learning), nearest neighbour estimation, and deconvolution for data observed with error.

For the data scientist, each chapter contains illustrative open-data examples analysed with the most appropriate kernel smoothing method. The emphasis is always on an intuitive understanding of the data, supported by the accompanying statistical visualisations. For readers wishing to investigate the underlying statistical reasoning in more detail, a graduated exposition of a unified theoretical framework is provided. Algorithms for efficient software implementation are also discussed.
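
As an indication of what such an analysis looks like in practice, the following is a minimal sketch (not taken from the book) of a bivariate kernel density estimate with an unconstrained plug-in bandwidth matrix, using the R package ks maintained by one of the authors; the simulated data and contour levels are illustrative assumptions only.

    # Minimal sketch, not from the book: bivariate kernel density estimation
    # with a full (unconstrained) plug-in bandwidth matrix from the ks package.
    library(ks)

    set.seed(1)
    # Simulated two-component sample (assumption, for illustration only)
    x <- rbind(matrix(rnorm(200, mean = 0), ncol = 2),
               matrix(rnorm(200, mean = 3), ncol = 2))

    H <- Hpi(x)                        # plug-in selector for a full bandwidth matrix
    fhat <- kde(x = x, H = H)          # kernel density estimate on a grid
    plot(fhat, cont = c(25, 50, 75))   # probability contours at quartile levels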

José E. Chacón is an associate professor at the Department of Mathematics of the Universidad de Extremadura in Spain. Tarn Duong is a Senior Data Scientist for a start-up which provides short distance carpooling services in France.

Both authors have made important contributions to kernel smoothing research over the last couple of decades.

Reviews

"I am very impressed with this book. It addresses issues that are not discussed in any detail in any other book on density estimation. Furthermore, it is very well-written and contains a wealth of interesting examples. In fact, this is probably one of the best books I have seen on density estimation. Some topics in this book that are not covered in detail in any other book include: multivariate bandwidth matrices, details of the asymptotic MSE for general bandwidth matrices, derivative estimation, level sets, density clustering and significance testing for modal regions. This makes the book unique. The authors have written the book in such a way that it can be used by two different types of readers: data analysts who are not interested in the mathematical details, and students/researchers who do want the details. The `how to read this monograph' is very useful." ~Larry Wasserman, Carnegie Mellon University

"This book provides a comprehensive overview of the fundamental issues and the numerous extensions of multivariate kernel density estimation. There are three core aspects that are discussed. Firstly, the method of kernel density estimation is thoroughly described in the multivariate setting. Secondly, the problem of selecting a bandwidth matrix is discussed, with a comparison of numerous alternatives. Thirdly, the performance and asymptotic properties of the estimators and bandwidth selections are comprehensively reviewed: there is an abundance of information on the (asymptotic) mean (integrated) squared error of various combinations of estimators and bandwidths.

Having examined the above fundamentals, the authors discuss numerous extensions of multivariate kernel density estimation. These include density derivative estimation, level set estimation, density-based clustering, density ridge estimation, feature significance, density difference estimation, and classification. For all of these methods, there is a strong focus on asymptotic performance. There is also advice on, and examples of, providing effective visual communication of results. Guidance on the application of the methods is limited to descriptions of the R commands available in the ks package.

The structure of the book means that all the above methods have accessible explanations, while detailed and thorough mathematical exposition is maintained. Each chapter or section is structured such that the methods are first described and then illustrated, and then the technical mathematical details (including proofs of theorems) are supplied. This brings about the authors' stated aim of the book being useful for data analysts, undergraduates or postgraduates." ~Andrew Duncan A. C. Smith

"Overall, it was a great joy for me to review this book. It was written beautifully. The authors offered many valuable insights on multivariate kernel smoothing, which I found helpful. I am looking forward to having a copy onmy bookshelf and I have no doubt that it will be my research reference book in the future." ~QingWang, Wellesley College "I am very impressed with this book. It addresses issues that are not discussed in any detail in any other book on density estimation. Furthermore, it is very well-written and contains a wealth of interesting examples. In fact, this is probably one of the best books I have seen on density estimation. Some topics in this book that are not covered in detail in any other book include: multivariate bandwidth matrices, details of the asymptotic MSE for general bandwidth matrices, derivative estimation, level sets, density clustering and significance testing for modal regions. This makes the book unique. The authors have written the book in such a way that it can be used by two different types of readers: data analysts who are not interested in the mathematical details, and students/researchers who do want the details. The `how to read this monograph' is very useful." ~Larry Wasserman, Carnegie Mellon University

Table of contents

Preface xiii
List of Figures xv
List of Tables xix
List of Algorithms xxi
1 Introduction 1(10)
1.1 Exploratory data analysis with density estimation 1(3)
1.2 Exploratory data analysis with density derivatives estimation 4(1)
1.3 Clustering/unsupervised learning 5(1)
1.4 Classification/supervised learning 6(1)
1.5 Suggestions on how to read this monograph 7(4)
2 Density estimation 11(32)
2.1 Histogram density estimation 11(3)
2.2 Kernel density estimation 14(5)
2.2.1 Probability contours as multivariate quantiles 16(3)
2.2.2 Contour colour scales 19(1)
2.3 Gains from unconstrained bandwidth matrices 19(4)
2.4 Advice for practical bandwidth selection 23(3)
2.5 Squared error analysis 26(4)
2.6 Asymptotic squared error formulas 30(5)
2.7 Optimal bandwidths 35(1)
2.8 Convergence of density estimators 36(1)
2.9 Further mathematical analysis of density estimators 37(6)
2.9.1 Asymptotic expansion of the mean integrated squared error 37(2)
2.9.2 Asymptotically optimal bandwidth 39(1)
2.9.3 Vector versus vector half parametrisations 40(3)
3 Bandwidth selectors for density estimation 43(24)
3.1 Normal scale bandwidths 44(1)
3.2 Maximal smoothing bandwidths 45(1)
3.3 Normal mixture bandwidths 46(1)
3.4 Unbiased cross validation bandwidths 46(3)
3.5 Biased cross validation bandwidths 49(1)
3.6 Plug-in bandwidths 49(3)
3.7 Smoothed cross validation bandwidths 52(2)
3.8 Empirical comparison of bandwidth selectors 54(6)
3.9 Theoretical comparison of bandwidth selectors 60(1)
3.10 Further mathematical analysis of bandwidth selectors 61(6)
3.10.1 Relative convergence rates of bandwidth selectors 61(3)
3.10.2 Optimal pilot bandwidth selectors 64(1)
3.10.3 Convergence rates with data-based bandwidths 65(2)
4 Modified density estimation 67(22)
4.1 Variable bandwidth density estimators 67(6)
4.1.1 Balloon density estimators 68(1)
4.1.2 Sample point density estimators 69(1)
4.1.3 Bandwidth selectors for variable kernel estimation 70(3)
4.2 Transformation density estimators 73(3)
4.3 Boundary kernel density estimators 76(5)
4.3.1 Beta boundary kernels 76(1)
4.3.2 Linear boundary kernels 77(4)
4.4 Kernel choice 81(2)
4.5 Higher order kernels 83(1)
4.6 Further mathematical analysis of modified density estimators 84(5)
4.6.1 Asymptotic error for sample point variable bandwidth estimators 84(2)
4.6.2 Asymptotic error for linear boundary estimators 86(3)
5 Density derivative estimation 89(38)
5.1 Kernel density derivative estimators 89(7)
5.1.1 Density gradient estimators 90(2)
5.1.2 Density Hessian estimators 92(1)
5.1.3 General density derivative estimators 93(3)
5.2 Gains from unconstrained bandwidth matrices 96(4)
5.3 Advice for practical bandwidth selection 100(2)
5.4 Empirical comparison of bandwidths of different derivative orders 102(1)
5.5 Squared error analysis 103(5)
5.6 Bandwidth selection for density derivative estimators 108(9)
5.6.1 Normal scale bandwidths 109(1)
5.6.2 Normal mixture bandwidths 110(1)
5.6.3 Unbiased cross validation bandwidths 111(1)
5.6.4 Plug-in bandwidths 112(3)
5.6.5 Smoothed cross validation bandwidths 115(2)
5.7 Relative convergence rates of bandwidth selectors 117(1)
5.8 Case study: The normal density 118(6)
5.8.1 Exact MISE 118(1)
5.8.2 Curvature matrix 119(1)
5.8.3 Asymptotic MISE 120(1)
5.8.4 Normal scale bandwidth 121(1)
5.8.5 Asymptotic MSE for curvature estimation 122(2)
5.9 Further mathematical analysis of density derivative estimators 124(3)
5.9.1 Taylor expansions for vector-valued functions 124(1)
5.9.2 Relationship between multivariate normal moments 124(3)
6 Applications related to density and density derivative estimation 127(28)
6.1 Level set estimation 127(8)
6.1.1 Modal region and bump estimation 129(3)
6.1.2 Density support estimation 132(3)
6.2 Density-based clustering 135(8)
6.2.1 Stable/unstable manifolds 136(1)
6.2.2 Mean shift clustering 137(6)
6.2.3 Choice of the normalising matrix in the mean shift 143(1)
6.3 Density ridge estimation 143(6)
6.4 Feature significance 149(6)
7 Supplementary topics in data analysis 155(26)
7.1 Density difference estimation and significance testing 155(4)
7.2 Classification 159(4)
7.3 Density estimation for data measured with error 163(8)
7.3.1 Classical density deconvolution estimation 164(2)
7.3.2 Weighted density deconvolution estimation 166(4)
7.3.3 Manifold estimation 170(1)
7.4 Nearest neighbour estimation 171(7)
7.5 Further mathematical analysis 178(3)
7.5.1 Squared error analysis for deconvolution kernel density estimators 178(1)
7.5.2 Optimal selection of the number of nearest neighbours 179(2)
8 Computational algorithms 181(18)
8.1 R implementation 181(4)
8.2 Approximate binned estimation 185(6)
8.2.1 Approximate density estimation 185(5)
8.2.2 Approximate density derivative and functional estimation 190(1)
8.3 Recursive computation of the normal density derivatives 191(4)
8.4 Recursive computation of the normal functionals 195(2)
8.5 Numerical optimisation over matrix spaces 197(2)
A Notation 199(6)
B Matrix algebra 205(2)
B.1 The Kronecker product 205(1)
B.2 The vec operator 206(1)
B.3 The commutation matrix 206(1)
Bibliography 207(18)
Index 225