|
|
1 Probability Review | 1
1.1 Sample Spaces | 1
1.2 Conditional Probability and Independence | 4
1.3 Density Functions | 4
1.4 Expected Value | 6
1.5 Variance | 7
1.6 Joint, Marginal, and Conditional Distributions | 8
1.7 Bayes' Rule | 10
1.7.1 Model Given Data | 11
1.8 Bayesian Inference | 14
Exercises | 19
2 Convergence and Sampling | 23
2.1 Sampling and Estimation | 23
2.2 Probably Approximately Correct (PAC) | 26
2.3 Concentration of Measure | 26
2.3.1 Markov Inequality | 27
2.3.2 Chebyshev Inequality | 28
2.3.3 Chernoff-Hoeffding Inequality | 29
2.3.4 Union Bound and Examples | 31
2.4 Importance Sampling | 34
2.4.1 Sampling Without Replacement with Priority Sampling | 39
Exercises | 41
3 Linear Algebra Review | 43
3.1 Vectors and Matrices | 43
3.2 Addition and Multiplication | 46
3.3 Norms | 49
3.4 Linear Independence | 51
3.5 Rank | 52
3.6 Square Matrices and Properties | 53
3.7 Orthogonality | 55
Exercises | 57
4 Distances and Nearest Neighbors | 59
4.1 Metrics | 59
4.2 Lp Distances and their Relatives | 60
4.2.1 Lp Distances | 60
4.2.2 Mahalanobis Distance | 63
4.2.3 Cosine and Angular Distance | 64
4.2.4 KL Divergence | 65
4.3 Distances for Sets and Strings | 66
4.3.1 Jaccard Distance | 67
4.3.2 Edit Distance | 69
4.4 Modeling Text with Distances | 70
4.4.1 Bag-of-Words Vectors | 70
4.4.2 k-Grams | 73
4.5 Similarities | 76
4.5.1 Set Similarities | 76
4.5.2 Normed Similarities | 77
4.5.3 Normed Similarities between Sets | 78
4.6 Locality Sensitive Hashing | 80
4.6.1 Properties of Locality Sensitive Hashing | 82
4.6.2 Prototypical Tasks for LSH | 83
4.6.3 Banding to Amplify LSH | 84
4.6.4 LSH for Angular Distance | 87
4.6.5 LSH for Euclidean Distance | 89
4.6.6 Min Hashing as LSH for Jaccard Distance | 90
Exercises | 93
5 Linear Regression | 95
5.1 Simple Linear Regression | 95
5.2 Linear Regression with Multiple Explanatory Variables | 99
5.3 Polynomial Regression | 102
5.4 Cross Validation | 104
5.4.1 Other ways to Evaluate Linear Regression Models | 108
5.5 Regularized Regression | 109
5.5.1 Tikhonov Regularization for Ridge Regression | 110
5.5.2 Lasso | 112
5.5.3 Dual Constrained Formulation | 113
5.5.4 Matching Pursuit | 115
Exercises | 122
6 Gradient Descent | 125
6.1 Functions | 125
6.2 Gradients | 128
6.3 Gradient Descent | 129
6.3.1 Learning Rate | 129
6.4 Fitting a Model to Data | 135
6.4.1 Least Mean Squares Updates for Regression | 136
6.4.2 Decomposable Functions | 137
Exercises | 141
7 Dimensionality Reduction | 143
7.1 Data Matrices | 143
7.1.1 Projections | 145
7.1.2 Sum of Squared Errors Goal | 146
7.2 Singular Value Decomposition | 147
7.2.1 Best Rank-k Approximation of a Matrix | 152
7.3 Eigenvalues and Eigenvectors | 155
7.4 The Power Method | 157
7.5 Principal Component Analysis | 160
7.6 Multidimensional Scaling | 161
7.6.1 Why does Classical MDS work? | 163
7.7 Linear Discriminant Analysis | 166
7.8 Distance Metric Learning | 167
7.9 Matrix Completion | 169
7.10 Random Projections | 171
Exercises | 173
8 Clustering | 177
8.1 Voronoi Diagrams | 177
8.1.1 Delaunay Triangulation | 180
8.1.2 Connection to Assignment-Based Clustering | 182
8.2 Gonzalez's Algorithm for k-Center Clustering | 183
8.3 Lloyd's Algorithm for k-Means Clustering | 185
8.3.1 Lloyd's Algorithm | 186
8.3.2 k-Means++ | 191
8.3.3 k-Medoid Clustering | 192
8.3.4 Soft Clustering | 193
8.4 Mixture of Gaussians | 194
8.4.1 Expectation-Maximization | 196
8.5 Hierarchical Clustering | 196
8.6 Density-Based Clustering and Outliers | 199
8.6.1 Outliers | 200
8.7 Mean Shift Clustering | 201
Exercises | 203
9 Classification | 207
9.1 Linear Classifiers | 207
9.1.1 Loss Functions | 210
9.1.2 Cross-Validation and Regularization | 212
9.2 Perceptron Algorithm | 213
9.3 Support Vector Machines and Kernels | 217
9.3.1 The Dual: Mistake Counter | 218
9.3.2 Feature Expansion | 219
9.3.3 Support Vector Machines | 221
9.4 Learnability and VC dimension | 222
9.5 kNN Classifiers | 225
9.6 Decision Trees | 225
9.7 Neural Networks | 228
9.7.1 Training with Back-propagation | 230
Exercises | 233
10 Graph Structured Data | 237
10.1 Markov Chains | 239
10.1.1 Ergodic Markov Chains | 242
10.1.2 Metropolis Algorithm | 245
10.2 PageRank | 246
10.3 Spectral Clustering on Graphs | 249
10.3.1 Laplacians and their Eigen-Structures | 250
10.4 Communities in Graphs | 254
10.4.1 Preferential Attachment | 256
10.4.2 Betweenness | 256
10.4.3 Modularity | 257
Exercises | 259
11 Big Data and Sketching | 261
11.1.2 Reservoir Sampling | 264
11.2 Frequent Items | 265
11.2.2 Misra-Gries Algorithm | 269
11.2.3 Count-Min Sketch | 270
11.2.4 Count Sketch | 272
11.3 Matrix Sketching | 273
11.3.1 Covariance Matrix Summation | 274
11.3.2 Frequent Directions | 275
11.3.3 Row Sampling | 277
11.3.4 Random Projections and Count Sketch Hashing | 278
Exercises | 280
Index | 283