
E-book: Statistical Analysis Techniques in Particle Physics: Fits, Density Estimation and Supervised Learning

Ilya Narsky (MathWorks, Natick, USA), Frank C. Porter (California Institute of Technology, Pasadena, USA)
  • Format: EPUB+DRM
  • Publication date: 24-Oct-2013
  • Publisher: Blackwell Verlag GmbH
  • Language: English
  • ISBN-13: 9783527677290
  • Price: 116,03 €*
  • * the price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you need to install special software to read it. You also need to create an Adobe ID (more information here). The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, install Adobe Digital Editions (this is a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

Assuming that readers are already familiar with basic probability theory and basic methods in parameter estimation such as maximum likelihood, Narsky and Porter introduce physicists to the tools of statistics that have been developed in an environment of virtually unlimited computing power. They focus on supervised machine learning, and within it on classification rather than regression. Among the topics are parameter likelihood fits, linear transformations and dimensionality reduction, assessing classifier performance, local learning and kernel expansion, and bump hunting in multivariate data. Annotation ©2014 Book News, Inc., Portland, OR (booknews.com)

Modern analysis of HEP data needs advanced statistical tools to separate signal from background. This is the first book on the subject to focus on machine learning techniques. It will be of interest to almost every high-energy physicist and, thanks to its broad coverage, is also suitable for students.
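As a flavor of the material, the sketch below shows an unbinned maximum-likelihood fit of the kind treated in Chapter 2. It is a minimal illustration written for this listing, not code taken from the book, and it assumes only Python with NumPy and SciPy available.

    # Minimal sketch of an unbinned maximum-likelihood fit (illustrative only,
    # not taken from the book). Assumes NumPy and SciPy are installed.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(42)
    data = rng.exponential(scale=2.0, size=1000)   # toy lifetime sample, true tau = 2.0

    def negative_log_likelihood(params, x):
        # pdf f(x; tau) = exp(-x / tau) / tau, so -ln L = sum(x / tau + ln tau)
        (tau,) = params
        if tau <= 0:
            return np.inf                          # keep the fit inside the physical region
        return np.sum(x / tau + np.log(tau))

    fit = minimize(negative_log_likelihood, x0=[1.0], args=(data,), method="Nelder-Mead")
    print(f"ML estimate of tau: {fit.x[0]:.3f}")   # should come out close to 2.0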
Table of Contents

Acknowledgements xiii
Notation and Vocabulary xv
1 Why We Wrote This Book and How You Should Read It 1(4)
2 Parametric Likelihood Fits 5(34)
2.1 Preliminaries 5(7)
2.1.1 Example: CP Violation via Mixing 7(2)
2.1.2 The Exponential Family 9(1)
2.1.3 Confidence Intervals 10(1)
2.1.4 Hypothesis Tests 11(1)
2.2 Parametric Likelihood Fits 12(9)
2.2.1 Nuisance Parameters 16(1)
2.2.2 Confidence Intervals from Pivotal Quantities 17(2)
2.2.3 Asymptotic Inference 19(1)
2.2.4 Profile Likelihood 20(1)
2.2.5 Conditional Likelihood 20(1)
2.3 Fits for Small Statistics 21(5)
2.3.1 Sample Study of Coverage at Small Statistics 22(3)
2.3.2 When the pdf Goes Negative 25(1)
2.4 Results Near the Boundary of a Physical Region 26(2)
2.5 Likelihood Ratio Test for Presence of Signal 28(3)
2.6 sPlots 31(4)
2.7 Exercises 35(4)
References 37(2)
3 Goodness of Fit 39(24)
3.1 Binned Goodness of Fit Tests 41(5)
3.2 Statistics Converging to Chi-Square 46(3)
3.3 Univariate Unbinned Goodness of Fit Tests 49(3)
3.3.1 Kolmogorov–Smirnov 49(1)
3.3.2 Anderson–Darling 50(1)
3.3.3 Watson 51(1)
3.3.4 Neyman Smooth 51(1)
3.4 Multivariate Tests 52(7)
3.4.1 Energy Tests 53(1)
3.4.2 Transformations to a Uniform Distribution 54(1)
3.4.3 Local Density Tests 55(1)
3.4.4 Kernel-based Tests 56(1)
3.4.5 Mixed Sample Tests 57(1)
3.4.6 Using a Classifier 58(1)
3.5 Exercises 59(4)
References 61(2)
4 Resampling Techniques 63(26)
4.1 Permutation Sampling 63(2)
4.2 Bootstrap 65(5)
4.2.1 Bootstrap Confidence Intervals 68(2)
4.2.2 Smoothed Bootstrap 70(1)
4.2.3 Parametric Bootstrap 70(1)
4.3 Jackknife 70(6)
4.4 BCa Confidence Intervals 76(2)
4.5 Cross-Validation 78(4)
4.6 Resampling Weighted Observations 82(4)
4.7 Exercises 86(3)
References 86(3)
5 Density Estimation 89(32)
5.1 Empirical Density Estimate 90(1)
5.2 Histograms 90(2)
5.3 Kernel Estimation 92(1)
5.3.1 Multivariate Kernel Estimation 92(1)
5.4 Ideogram 93(1)
5.5 Parametric vs. Nonparametric Density Estimation 93(1)
5.6 Optimization 94(6)
5.6.1 Choosing Histogram Binning 97(3)
5.7 Estimating Errors 100(2)
5.8 The Curse of Dimensionality 102(1)
5.9 Adaptive Kernel Estimation 103(2)
5.10 Naive Bayes Classification 105(1)
5.11 Multivariate Kernel Estimation 106(2)
5.12 Estimation Using Orthogonal Series 108(3)
5.13 Using Monte Carlo Models 111(1)
5.14 Unfolding 112(8)
5.14.1 Unfolding: Regularization 116(4)
5.15 Exercises 120(1)
References 120(1)
6 Basic Concepts and Definitions of Machine Learning 121(8)
6.1 Supervised, Unsupervised, and Semi-Supervised 121(2)
6.2 Tall and Wide Data 123(1)
6.3 Batch and Online Learning 124(1)
6.4 Parallel Learning 125(2)
6.5 Classification and Regression 127(2)
References 128(1)
7 Data Preprocessing 129(16)
7.1 Categorical Variables 129(3)
7.2 Missing Values 132(7)
7.2.1 Likelihood Optimization 134(1)
7.2.2 Deletion 135(2)
7.2.3 Augmentation 137(1)
7.2.4 Imputation 137(2)
7.2.5 Other Methods 139(1)
7.3 Outliers 139(2)
7.4 Exercises 141(4)
References 142(3)
8 Linear Transformations and Dimensionality Reduction 145(20)
8.1 Centering, Scaling, Reflection and Rotation 145(1)
8.2 Rotation and Dimensionality Reduction 146(1)
8.3 Principal Component Analysis (PCA) 147(11)
8.3.1 Theory 148(1)
8.3.2 Numerical Implementation 149(1)
8.3.3 Weighted Data 150(1)
8.3.4 How Many Principal Components Are Enough? 151(3)
8.3.5 Example: Apply PCA and Choose the Optimal Number of Components 154(4)
8.4 Independent Component Analysis (ICA) 158(5)
8.4.1 Theory 158(3)
8.4.2 Numerical Implementation 161(1)
8.4.3 Properties 162(1)
8.5 Exercises 163(2)
References 163(2)
9 Introduction to Classification 165(30)
9.1 Loss Functions: Hard Labels and Soft Scores 165(3)
9.2 Bias, Variance, and Noise 168(5)
9.3 Training, Validating and Testing: The Optimal Splitting Rule 173(4)
9.4 Resampling Techniques: Cross-Validation and Bootstrap 177(5)
9.4.1 Cross-Validation 177(2)
9.4.2 Bootstrap 179(2)
9.4.3 Sampling with Stratification 181(1)
9.5 Data with Unbalanced Classes 182(8)
9.5.1 Adjusting Prior Probabilities 183(1)
9.5.2 Undersampling the Majority Class 184(1)
9.5.3 Oversampling the Minority Class 185(1)
9.5.4 Example: Classification of Forest Cover Type Data 186(4)
9.6 Learning with Cost 190(1)
9.7 Exercises 191(4)
References 192(3)
10 Assessing Classifier Performance 195(26)
10.1 Classification Error and Other Measures of Predictive Power 195(1)
10.2 Receiver Operating Characteristic (ROC) and Other Curves 196(14)
10.2.1 Empirical ROC Curve 196(2)
10.2.2 Other Performance Measures 198(1)
10.2.3 Optimal Operating Point 198(2)
10.2.4 Area Under Curve 200(1)
10.2.5 Smooth ROC Curves 200(5)
10.2.6 Confidence Bounds for ROC Curves 205(5)
10.3 Testing Equivalence of Two Classification Models 210(5)
10.4 Comparing Several Classifiers 215(2)
10.5 Exercises 217(4)
References 218(3)
11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression 221(30)
11.1 Discriminant Analysis 221(10)
11.1.1 Estimating the Covariance Matrix 223(2)
11.1.2 Verifying Discriminant Analysis Assumptions 225(1)
11.1.3 Applying LDA When LDA Assumptions Are Invalid 226(2)
11.1.4 Numerical Implementation 228(1)
11.1.5 Regularized Discriminant Analysis 228(1)
11.1.6 LDA for Variable Transformation 229(2)
11.2 Logistic Regression 231(4)
11.2.1 Binomial Logistic Regression: Theory and Numerical Implementation 231(2)
11.2.2 Properties of the Binomial Model 233(1)
11.2.3 Verifying Model Assumptions 233(1)
11.2.4 Logistic Regression with Multiple Classes 234(1)
11.3 Classification by Linear Regression 235(1)
11.4 Partial Least Squares Regression 236(3)
11.5 Example: Linear Models for MAGIC Telescope Data 239(8)
11.6 Choosing a Linear Classifier for Your Analysis 247(1)
11.7 Exercises 247(4)
References 248(3)
12 Neural Networks 251(14)
12.1 Perceptrons 251(3)
12.2 The Feed-Forward Neural Network 254(2)
12.3 Backpropagation 256(4)
12.4 Bayes Neural Networks 260(2)
12.5 Genetic Algorithms 262(1)
12.6 Exercises 263(2)
References 263(2)
13 Local Learning and Kernel Expansion 265(42)
13.1 From Input Variables to the Feature Space 266(4)
13.1.1 Kernel Regression 269(1)
13.2 Regularization 270(8)
13.2.1 Kernel Ridge Regression 274(4)
13.3 Making and Choosing Kernels 278(1)
13.4 Radial Basis Functions 279(4)
13.4.1 Example: RBF Classification for the MAGIC Telescope Data 280(3)
13.5 Support Vector Machines (SVM) 283(10)
13.5.1 SVM with Weighted Data 286(2)
13.5.2 SVM with Probabilistic Outputs 288(1)
13.5.3 Numerical Implementation 288(5)
13.5.4 Multiclass Extensions 293(1)
13.6 Empirical Local Methods 293(9)
13.6.1 Classification by Probability Density Estimation 294(1)
13.6.2 Locally Weighted Regression 295(3)
13.6.3 Nearest Neighbors and Fuzzy Rules 298(4)
13.7 Kernel Methods: The Good, the Bad and the Curse of Dimensionality 302(1)
13.8 Exercises 303(4)
References 304(3)
14 Decision Trees 307(24)
14.1 Growing Trees 308(4)
14.2 Predicting by Decision Trees 312(1)
14.3 Stopping Rules 312(1)
14.4 Pruning Trees 313(6)
14.4.1 Example: Pruning a Classification Tree 317(2)
14.5 Trees for Multiple Classes 319(1)
14.6 Splits on Categorical Variables 320(1)
14.7 Surrogate Splits 321(2)
14.8 Missing Values 323(1)
14.9 Variable Importance 324(3)
14.10 Why Are Decision Trees Good (or Bad)? 327(1)
14.11 Exercises 328(3)
References 329(2)
15 Ensemble Learning 331(40)
15.1 Boosting 332(26)
15.1.1 Early Boosting 332(1)
15.1.2 AdaBoost for Two Classes 333(3)
15.1.3 Minimizing Convex Loss by Stagewise Additive Modeling 336(7)
15.1.4 Maximizing the Minimal Margin 343(8)
15.1.5 Nonconvex Loss and Robust Boosting 351(6)
15.1.6 Boosting for Multiple Classes 357(1)
15.2 Diversifying the Weak Learner: Bagging, Random Subspace and Random Forest 358(7)
15.2.1 Measures of Diversity 359(2)
15.2.2 Bagging and Random Forest 361(2)
15.2.3 Random Subspace 363(1)
15.2.4 Example: K/π Separation for BaBar PID 364(1)
15.3 Choosing an Ensemble for Your Analysis 365(2)
15.4 Exercises 367(4)
References 367(4)
16 Reducing Multiclass to Binary 371(10)
16.1 Encoding 372(3)
16.2 Decoding 375(3)
16.3 Summary: Choosing the Right Design 378(3)
References 379(2)
17 How to Choose the Right Classifier for Your Analysis and Apply It Correctly 381(4)
17.1 Predictive Performance and Interpretability 381(1)
17.2 Matching Classifiers and Variables 382(1)
17.3 Using Classifier Predictions 382(1)
17.4 Optimizing Accuracy 383(1)
17.5 CPU and Memory Requirements 383(2)
18 Methods for Variable Ranking and Selection 385(32)
18.1 Definitions 386(3)
18.1.1 Variable Ranking and Selection 386(1)
18.1.2 Strong and Weak Relevance 386(3)
18.2 Variable Ranking 389(12)
18.2.1 Filters: Correlation and Mutual Information 390(4)
18.2.2 Wrappers: Sequential Forward Selection (SFS), Sequential Backward Elimination (SBE), and Feature-based Sensitivity of Posterior Probabilities (FSPP) 394(6)
18.2.3 Embedded Methods: Estimation of Variable Importance by Decision Trees, Neural Networks, Nearest Neighbors, and Linear Models 400(1)
18.3 Variable Selection 401(12)
18.3.1 Optimal-Set Search Strategies 401(2)
18.3.2 Multiple Testing: Backward Elimination by Change in Margin (BECM) 403(7)
18.3.3 Estimation of the Reference Distribution by Permutations: Artificial Contrasts with Ensembles (ACE) Algorithm 410(3)
18.4 Exercises 413(4)
References 414(3)
19 Bump Hunting in Multivariate Data 417(8)
19.1 Voronoi Tessellation and SLEUTH Algorithm 418(2)
19.2 Identifying Box Regions by PRIM and Other Algorithms 420(2)
19.3 Bump Hunting Through Supervised Learning 422(3)
References 423(2)
20 Software Packages for Machine Learning 425(6)
20.1 Tools Developed in HEP 425(1)
20.2 R 426(1)
20.3 Matlab 427(1)
20.4 Tools for Java and Python 428(1)
20.5 What Software Tool Is Right for You? 429(2)
References 430(1)
Appendix A Optimization Algorithms 431(4)
A.1 Line Search 431(1)
A.2 Linear Programming (LP) 432(3)
Index 435
The authors are experts in the use of statistics in particle physics data analysis. Frank C. Porter is Professor of Physics at the California Institute of Technology and has lectured extensively at Caltech, the SLAC Laboratory at Stanford, and elsewhere. Ilya Narsky is Senior Matlab Developer at The MathWorks, a leading developer of technical computing software for engineers and scientists, and the initiator of StatPatternRecognition, a C++ package for statistical analysis of HEP data. Together, they have taught courses for graduate students and postdocs.