E-book: Advanced Statistics with Applications in R

Eugene Demidenko (Dartmouth Medical School, Lebanon, NH, USA)

Format: EPUB+DRM
Series: Wiley Series in Probability and Statistics
Publication date: 26-Nov-2019
Publisher: John Wiley & Sons Inc
Language: eng
ISBN-13: 9781118594612

Other books on the subject:

Probability & statistics

Format - EPUB+DRM
Price: 135,79 €*
* The price is final, i.e. no further discounts apply.
This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage:

Digital rights management (DRM)
The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

Required software
To read on mobile devices (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

Advanced Statistics with Applications in R fills the gap between several excellent theoretical statistics textbooks and many applied statistics books where teaching reduces to using existing packages. This book looks at what is under the hood. Many issues in statistics, including the recent crisis over the p-value, are caused by misunderstanding of statistical concepts owing to the poor theoretical background of practitioners and applied statisticians. This book is the product of forty years of experience teaching probability and statistics and their applications to solving real-life problems.

There are more than 442 examples in the book: essentially every probability or statistics concept is illustrated with an example accompanied by R code. Many of the examples are intriguing, such as "Who said π?", "What team is better?", "The fall of the Roman empire", "James Bond chase problem", "Black Friday shopping", and "Free fall equation: Aristotle or Galilei". These examples cover biostatistics, finance, physics and engineering, text and image analysis, epidemiology, spatial statistics, sociology, and more.

Advanced Statistics with Applications in R teaches students to use theory for solving real-life problems through computations: there are about 500 R code samples and 100 datasets. The data can be freely downloaded from the author's website, dartmouth.edu/~eugened.

This book is suitable as a text for senior undergraduate students majoring in statistics or data science, or for graduate students. Many researchers who apply statistics on a regular basis will find explanations of many fundamental concepts from a theoretical perspective, illustrated by concrete real-world applications.

Why I Wrote This Book

1 Discrete random variables

1.1 Motivating example

1.2 Bernoulli random variable

1.3 General discrete random variable

1.4 Mean and variance

1.4.1 Mechanical interpretation of the mean

1.4.2 Variance

1.5 R basics

1.5.1 Scripts/functions

1.5.2 Text editing in R

1.5.3 Saving your R code

1.5.4 for loop

1.5.5 Vectorized computations

1.5.6 Graphics

1.5.7 Coding and help in R

1.6 Binomial distribution

1.7 Poisson distribution

1.8 Random number generation using sample

1.8.1 Generation of a discrete random variable

1.8.2 Random Sudoku

2 Continuous random variables

2.1 Distribution and density functions

2.1.1 Cumulative distribution function

2.1.2 Empirical cdf

2.1.3 Density function

2.2 Mean, variance, and other moments

2.2.1 Quantiles, quartiles, and the median

2.2.2 The tight confidence range

2.3 Uniform distribution

2.4 Exponential distribution

2.4.1 Laplace or double-exponential distribution

2.4.2 R functions

2.5 Moment generating function

2.5.1 Fourier transform and characteristic function

2.6 Gamma distribution

2.6.1 Relationship to Poisson distribution

2.6.2 Computing the gamma distribution in R

2.6.3 The tight confidence range

2.7 Normal distribution

2.8 Chebyshev's inequality

2.9 The law of large numbers

2.9.1 Four types of stochastic convergence

2.9.2 Integral approximation using simulations

2.10 The central limit theorem

2.10.1 Why the normal distribution is the most natural symmetric distribution

2.10.2 CLT on the relative scale

2.11 Lognormal distribution

2.11.1 Computation of the tight confidence range

2.12 Transformations and the delta method

2.12.1 The delta method

2.13 Random number generation

2.13.1 Cauchy distribution

2.14 Beta distribution

2.15 Entropy

2.16 Benford's law: the distribution of the first digit

2.16.1 Distributions that almost obey Benford's law

2.17 The Pearson family of distributions

2.18 Major univariate continuous distributions

3 Multivariate random variables

3.1 Joint cdf and density

3.1.1 Expectation

3.1.2 Bivariate discrete distribution

3.2 Independence

3.2.1 Convolution

3.3 Conditional density

3.3.1 Conditional mean and variance

3.3.2 Mixture distribution and Bayesian statistics

3.3.3 Random sum

3.3.4 Cancer tumors grow exponentially

3.4 Correlation and linear regression

3.5 Bivariate normal distribution

3.5.1 Regression as conditional mean

3.5.2 Variance decomposition and coefficient of determination

3.5.3 Generation of dependent normal observations

3.5.4 Copula

3.6 Joint density upon transformation

3.7 Geometric probability

3.7.1 Meeting problem

3.7.2 Random objects on the square

3.8 Optimal portfolio allocation

3.8.1 Stocks do not correlate

3.8.2 Correlated stocks

3.8.3 Markowitz bullet

3.8.4 Probability bullet

3.9 Distribution of order statistics

3.10 Multidimensional random vectors

3.10.1 Multivariate conditional distribution

3.10.2 Multivariate MGF

3.10.3 Multivariate delta method

3.10.4 Multinomial distribution

4 Four important distributions in statistics

4.1 Multivariate normal distribution

4.1.1 Generation of multivariate normal variables

4.1.2 Conditional distribution

4.1.3 Multivariate CLT

4.2 Chi-square distribution

4.2.1 Noncentral chi-square distribution

4.2.2 Expectations and variances of quadratic forms

4.2.3 Kronecker product and covariance matrix

4.3 t-distribution

4.3.1 Noncentral t-distribution

4.4 F-distribution

5 Preliminary data analysis and visualization

5.1 Comparison of random variables using the cdf

5.1.1 ROC curve

5.1.2 Survival probability

5.2 Histogram

5.3 Q-Q plot

5.3.1 The q-q confidence bands

5.4 Box plot

5.5 Kernel density estimation

5.5.1 Density movie

5.5.2 3D scatterplots

5.6 Bivariate normal kernel density

5.6.1 Bivariate kernel smoother for images

5.6.2 Smoothed scatterplot

5.6.3 Spatial statistics for disease mapping

6 Parameter estimation

6.1 Statistics as inverse probability

6.2 Method of moments

6.2.1 Generalized method of moments

6.3 Method of quantiles

6.4 Statistical properties of an estimator

6.4.1 Unbiasedness

6.4.2 Mean Square Error

6.4.3 Multidimensional MSE

6.4.4 Consistency of estimators

6.5 Linear estimation

6.5.1 Estimation of the mean using linear estimator

6.5.2 Vector representation

6.6 Estimation of variance and correlation coefficient

6.6.1 Quadratic estimation of the variance

6.6.2 Estimation of the covariance and correlation coefficient

6.7 Least squares for simple linear regression

6.7.1 Gauss-Markov theorem

6.7.2 Statistical properties of the OLS estimator under the normal assumption

6.7.3 The lm function and prediction by linear regression

6.7.4 Misinterpretation of the coefficient of determination

6.8 Sufficient statistics and the exponential family of distributions

6.8.1 Uniformly minimum-variance unbiased estimator

6.8.2 Exponential family of distributions

6.9 Fisher information and the Cramer-Rao bound

6.9.1 One parameter

6.9.2 Multiple parameters

6.10 Maximum likelihood

6.10.1 Basic definitions and examples

6.10.2 Circular statistics and the von Mises distribution

6.10.3 Maximum likelihood, sufficient statistics and the exponential family

6.10.4 Asymptotic properties of ML

6.10.5 When maximum likelihood breaks down

6.10.6 Algorithms for log-likelihood function maximization

6.11 Estimating equations and the M-estimator

6.11.1 Robust statistics

7 Hypothesis testing and confidence intervals

7.1 Fundamentals of statistical testing

7.1.1 The p-value and its interpretation

7.1.2 Ad hoc statistical testing

7.2 Simple hypothesis

7.3 The power function of the Z-test

7.3.1 Type II error and the power function

7.3.2 Optimal significance level and the ROC curve

7.3.3 One-sided hypothesis

7.4 The t-test for the means

7.4.1 One-sample t-test

7.4.2 Two-sample t-test

7.4.3 One-sided t-test

7.4.4 Paired versus unpaired t-test

7.4.5 Parametric versus nonparametric tests

7.5 Variance test

7.5.1 Two-sided variance test

7.5.2 One-sided variance test

7.6 Inverse-cdf test

7.6.1 General formulation

7.6.2 The F-test for variances

7.6.3 Binomial proportion

7.6.4 Poisson rate

7.7 Testing for correlation coefficient

7.8 Confidence interval

7.8.1 Unbiased CI and its connection to hypothesis testing

7.8.2 Inverse cdf CI

7.8.3 CI for the normal variance and SD

7.8.4 CI for other major statistical parameters

7.8.5 Confidence region

7.9 Three asymptotic tests and confidence intervals

7.9.1 Pearson chi-square test

7.9.2 Handwritten digit recognition

7.10 Limitations of classical hypothesis testing and the d-value

7.10.1 What the p-value means?

7.10.2 Why α = 0.05?

7.10.3 The null hypothesis is always rejected with a large enough sample size

7.10.4 Parameter-based inference

7.10.5 The d-value for individual inference

8 Linear model and its extensions

8.1 Basic definitions and linear least squares

8.1.1 Linear model with the intercept term

8.1.2 The vector-space geometry of least squares

8.1.3 Coefficient of determination

8.2 The Gauss-Markov theorem

8.2.1 Estimation of regression variance

8.3 Properties of OLS estimators under the normal assumption

8.3.1 The sensitivity of statistical inference to violation of the normal assumption

8.4 Statistical inference with linear models

8.4.1 Confidence interval and region

8.4.2 Linear hypothesis testing and the F-test

8.4.3 Prediction by linear regression and simultaneous confidence band

8.4.4 Testing the null hypothesis and the coefficient of determination

8.4.5 Is X fixed or random?

8.5 The one-sided p- and d-value for regression coefficients

8.5.1 The one-sided p-value for interpretation on the population level

8.5.2 The d-value for interpretation on the individual level

8.6 Examples and pitfalls

8.6.1 Kids drinking and alcohol movie watching

8.6.2 My first false discovery

8.6.3 Height, foot, and nose regression

8.6.4 A geometric interpretation of adding a new predictor

8.6.5 Contrast coefficient of determination against spurious regression

8.7 Dummy variable approach and ANOVA

8.7.1 Dummy variables for categories

8.7.2 Unpaired and paired t-test

8.7.3 Modeling longitudinal data

8.7.4 One-way ANOVA model

8.7.5 Two-way ANOVA

8.8 Generalized linear model

8.8.1 MLE estimation of GLM

8.8.2 Logistic and probit regressions for binary outcome

8.8.3 Poisson regression

9 Nonlinear regression

9.1 Definition and motivating examples

9.2 Nonlinear least squares

9.3 Gauss-Newton algorithm

9.4 Statistical properties of the NLS estimator

9.4.1 Large sample properties

9.4.2 Small sample properties

9.4.3 Asymptotic confidence intervals and hypothesis testing

9.4.4 Three methods of statistical inference in large sample

9.5 The nls function and examples

9.5.1 NLS-cdf estimator

9.6 Studying small sample properties through simulations

9.6.1 Normal distribution approximation

9.6.2 Statistical tests

9.6.3 Confidence region

9.6.4 Confidence intervals

9.7 Numerical complications of the nonlinear least squares

9.7.1 Criteria for existence

9.7.2 Criteria for uniqueness

9.8 Optimal design of experiments with nonlinear regression

9.8.1 Motivating examples

9.8.2 Optimal designs with nonlinear regression

9.9 The Michaelis-Menten model

9.9.1 The NLS solution

9.9.2 The exact solution

10 Appendix

10.1 Notation

10.2 Basics of matrix algebra

10.2.1 Preliminaries and matrix inverse

10.2.2 Determinant

10.2.3 Partition matrices

10.3 Eigenvalues and eigenvectors

10.3.1 Jordan spectral matrix decomposition

10.3.2 SVD: Singular value decomposition of a rectangular matrix

10.4 Quadratic forms and positive definite matrices

10.4.1 Quadratic forms

10.4.2 Positive and nonnegative definite matrices

10.5 Vector and matrix calculus

10.5.1 Differentiation of a scalar-valued function with respect to a vector

10.5.2 Differentiation of a vector-valued function with respect to a vector

10.5.3 Kronecker product

10.5.4 vec operator

10.6 Optimization

10.6.1 Convex and concave functions

10.6.2 Criteria for unconstrained minimization

10.6.3 Gradient algorithms

10.6.4 Constrained optimization: Lagrange multiplier technique

Bibliography

Index

Professor Eugene Demidenko works at Dartmouth College in the Department of Biomedical Science. He teaches statistics to undergraduate students in the Mathematics Department and to graduate students in Quantitative Biomedical Sciences at the Geisel School of Medicine. He has broad experience in theoretical and applied statistics, including epidemiology and biostatistics, statistical analysis of images, tumor regrowth, ill-posed inverse problems in engineering and technology, and optimal portfolio allocation, among others. His first book with Wiley, Mixed Models: Theory and Applications with R, gained much popularity among researchers and graduate/PhD students. Prof. Demidenko is the author of a controversial paper, The P-value You Can't Buy, published in 2016 in The American Statistician.

Permalink: https://www.kriso.ee/db/97811185946126e.html
