E-book: Advanced Statistics with Applications in R

Eugene Demidenko (Dartmouth Medical School, Lebanon, NH, USA)
  • Format: PDF+DRM
  • Price: 135.79 €*
  • * The price is final, i.e. no further discounts apply.
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste): not allowed

  • Printing: not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You also need to create an Adobe ID. More information here. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, install Adobe Digital Editions. (This is a free application designed specifically for reading e-books. It should not be confused with Adobe Reader, which is most likely already installed on your computer.)

    This e-book cannot be read on an Amazon Kindle.

Advanced Statistics with Applications in R fills the gap between several excellent theoretical statistics textbooks and the many applied statistics books in which teaching reduces to using existing packages. This book looks at what is under the hood. Many problems in statistics, including the recent p-value crisis, are caused by misunderstanding of statistical concepts due to the poor theoretical background of practitioners and applied statisticians. The book is the product of forty years of experience in teaching probability and statistics and in applying them to real-life problems.

There are more than 442 examples in the book: essentially every probability or statistics concept is illustrated with an example accompanied by R code. Many of the examples are intriguing, such as Who said π?, What team is better?, The fall of the Roman Empire, the James Bond chase problem, Black Friday shopping, and Free fall equation: Aristotle or Galilei. The examples cover biostatistics, finance, physics and engineering, text and image analysis, epidemiology, spatial statistics, sociology, and more.

Advanced Statistics with Applications in R teaches students to use theory for solving real-life problems through computation: the book includes about 500 R programs and 100 datasets, which can be freely downloaded from the author's website dartmouth.edu/~eugened.
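
To give a flavor of this concept-plus-code approach, here is a minimal R sketch in the same spirit (an illustration written for this listing, not an excerpt from the book). It checks the central limit theorem, covered in Section 2.10, by simulating means of exponential samples; the simulation parameters below are arbitrary choices.

    # Illustrative sketch (not from the book): the CLT by simulation.
    # Means of n iid Exp(1) draws are approximately N(1, 1/n) for large n.
    set.seed(2024)                                    # make the simulation reproducible
    nsim <- 10000; n <- 50                            # number of simulations, sample size
    xbar <- replicate(nsim, mean(rexp(n, rate = 1)))  # simulated sample means
    hist(xbar, breaks = 50, freq = FALSE)             # empirical density of the means
    curve(dnorm(x, mean = 1, sd = 1/sqrt(n)), add = TRUE, lwd = 2)  # normal approximation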

The book is suitable as a text for senior undergraduate students majoring in statistics or data science and for graduate students. Researchers who apply statistics on a regular basis will find fundamental concepts explained from the theoretical perspective and illustrated by concrete real-world applications.

Table of Contents

Why I Wrote This Book
1 Discrete random variables
1.1 Motivating example
1.2 Bernoulli random variable
1.3 General discrete random variable
1.4 Mean and variance
1.4.1 Mechanical interpretation of the mean
1.4.2 Variance
1.5 R basics
1.5.1 Scripts/functions
1.5.2 Text editing in R
1.5.3 Saving your R code
1.5.4 for loop
1.5.5 Vectorized computations
1.5.6 Graphics
1.5.7 Coding and help in R
1.6 Binomial distribution
1.7 Poisson distribution
1.8 Random number generation using sample
1.8.1 Generation of a discrete random variable
1.8.2 Random Sudoku
2 Continuous random variables
2.1 Distribution and density functions
2.1.1 Cumulative distribution function
2.1.2 Empirical cdf
2.1.3 Density function
2.2 Mean, variance, and other moments
2.2.1 Quantiles, quartiles, and the median
2.2.2 The tight confidence range
2.3 Uniform distribution
2.4 Exponential distribution
2.4.1 Laplace or double-exponential distribution
2.4.2 R functions
2.5 Moment generating function
2.5.1 Fourier transform and characteristic function
2.6 Gamma distribution
2.6.1 Relationship to Poisson distribution
2.6.2 Computing the gamma distribution in R
2.6.3 The tight confidence range
2.7 Normal distribution
2.8 Chebyshev's inequality
2.9 The law of large numbers
2.9.1 Four types of stochastic convergence
2.9.2 Integral approximation using simulations
2.10 The central limit theorem
2.10.1 Why the normal distribution is the most natural symmetric distribution
2.10.2 CLT on the relative scale
2.11 Lognormal distribution
2.11.1 Computation of the tight confidence range
2.12 Transformations and the delta method
2.12.1 The delta method
2.13 Random number generation
2.13.1 Cauchy distribution
2.14 Beta distribution
2.15 Entropy
2.16 Benford's law: the distribution of the first digit
2.16.1 Distributions that almost obey Benford's law
2.17 The Pearson family of distributions
2.18 Major univariate continuous distributions
3 Multivariate random variables
3.1 Joint cdf and density
3.1.1 Expectation
3.1.2 Bivariate discrete distribution
3.2 Independence
3.2.1 Convolution
3.3 Conditional density
3.3.1 Conditional mean and variance
3.3.2 Mixture distribution and Bayesian statistics
3.3.3 Random sum
3.3.4 Cancer tumors grow exponentially
3.4 Correlation and linear regression
3.5 Bivariate normal distribution
3.5.1 Regression as conditional mean
3.5.2 Variance decomposition and coefficient of determination
3.5.3 Generation of dependent normal observations
3.5.4 Copula
3.6 Joint density upon transformation
3.7 Geometric probability
3.7.1 Meeting problem
3.7.2 Random objects on the square
3.8 Optimal portfolio allocation
3.8.1 Stocks do not correlate
3.8.2 Correlated stocks
3.8.3 Markowitz bullet
3.8.4 Probability bullet
3.9 Distribution of order statistics
3.10 Multidimensional random vectors
3.10.1 Multivariate conditional distribution
3.10.2 Multivariate MGF
3.10.3 Multivariate delta method
3.10.4 Multinomial distribution
4 Four important distributions in statistics
4.1 Multivariate normal distribution
4.1.1 Generation of multivariate normal variables
4.1.2 Conditional distribution
4.1.3 Multivariate CLT
4.2 Chi-square distribution
4.2.1 Noncentral chi-square distribution
4.2.2 Expectations and variances of quadratic forms
4.2.3 Kronecker product and covariance matrix
4.3 t-distribution
4.3.1 Noncentral t-distribution
4.4 F-distribution
5 Preliminary data analysis and visualization
5.1 Comparison of random variables using the cdf
5.1.1 ROC curve
5.1.2 Survival probability
5.2 Histogram
5.3 Q-Q plot
5.3.1 The q-q confidence bands
5.4 Box plot
5.5 Kernel density estimation
5.5.1 Density movie
5.5.2 3D scatterplots
5.6 Bivariate normal kernel density
5.6.1 Bivariate kernel smoother for images
5.6.2 Smoothed scatterplot
5.6.3 Spatial statistics for disease mapping
6 Parameter estimation
6.1 Statistics as inverse probability
6.2 Method of moments
6.2.1 Generalized method of moments
6.3 Method of quantiles
6.4 Statistical properties of an estimator
6.4.1 Unbiasedness
6.4.2 Mean Square Error
6.4.3 Multidimensional MSE
6.4.4 Consistency of estimators
6.5 Linear estimation
6.5.1 Estimation of the mean using linear estimator
6.5.2 Vector representation
6.6 Estimation of variance and correlation coefficient
6.6.1 Quadratic estimation of the variance
6.6.2 Estimation of the covariance and correlation coefficient
6.7 Least squares for simple linear regression
6.7.1 Gauss-Markov theorem
6.7.2 Statistical properties of the OLS estimator under the normal assumption
6.7.3 The lm function and prediction by linear regression
6.7.4 Misinterpretation of the coefficient of determination
6.8 Sufficient statistics and the exponential family of distributions
6.8.1 Uniformly minimum-variance unbiased estimator
6.8.2 Exponential family of distributions
6.9 Fisher information and the Cramer-Rao bound
6.9.1 One parameter
6.9.2 Multiple parameters
6.10 Maximum likelihood
6.10.1 Basic definitions and examples
6.10.2 Circular statistics and the von Mises distribution
6.10.3 Maximum likelihood, sufficient statistics and the exponential family
6.10.4 Asymptotic properties of ML
6.10.5 When maximum likelihood breaks down
6.10.6 Algorithms for log-likelihood function maximization
6.11 Estimating equations and the M-estimator
6.11.1 Robust statistics
7 Hypothesis testing and confidence intervals
7.1 Fundamentals of statistical testing
7.1.1 The p-value and its interpretation
7.1.2 Ad hoc statistical testing
7.2 Simple hypothesis
7.3 The power function of the Z-test
7.3.1 Type II error and the power function
7.3.2 Optimal significance level and the ROC curve
7.3.3 One-sided hypothesis
7.4 The t-test for the means
7.4.1 One-sample t-test
7.4.2 Two-sample t-test
7.4.3 One-sided t-test
7.4.4 Paired versus unpaired t-test
7.4.5 Parametric versus nonparametric tests
7.5 Variance test
7.5.1 Two-sided variance test
7.5.2 One-sided variance test
7.6 Inverse-cdf test
7.6.1 General formulation
7.6.2 The F-test for variances
7.6.3 Binomial proportion
7.6.4 Poisson rate
7.7 Testing for correlation coefficient
7.8 Confidence interval
7.8.1 Unbiased CI and its connection to hypothesis testing
7.8.2 Inverse cdf CI
7.8.3 CI for the normal variance and SD
7.8.4 CI for other major statistical parameters
7.8.5 Confidence region
7.9 Three asymptotic tests and confidence intervals
7.9.1 Pearson chi-square test
7.9.2 Handwritten digit recognition
7.10 Limitations of classical hypothesis testing and the d-value
7.10.1 What the p-value means
7.10.2 Why α = 0.05?
7.10.3 The null hypothesis is always rejected with a large enough sample size
7.10.4 Parameter-based inference
7.10.5 The d-value for individual inference
8 Linear model and its extensions
8.1 Basic definitions and linear least squares
8.1.1 Linear model with the intercept term
8.1.2 The vector-space geometry of least squares
8.1.3 Coefficient of determination
8.2 The Gauss-Markov theorem
8.2.1 Estimation of regression variance
8.3 Properties of OLS estimators under the normal assumption
8.3.1 The sensitivity of statistical inference to violation of the normal assumption
8.4 Statistical inference with linear models
8.4.1 Confidence interval and region
8.4.2 Linear hypothesis testing and the F-test
8.4.3 Prediction by linear regression and simultaneous confidence band
8.4.4 Testing the null hypothesis and the coefficient of determination
8.4.5 Is X fixed or random?
8.5 The one-sided p- and d-value for regression coefficients
8.5.1 The one-sided p-value for interpretation on the population level
8.5.2 The d-value for interpretation on the individual level
8.6 Examples and pitfalls
8.6.1 Kids drinking and alcohol movie watching
8.6.2 My first false discovery
8.6.3 Height, foot, and nose regression
8.6.4 A geometric interpretation of adding a new predictor
8.6.5 Contrast coefficient of determination against spurious regression
8.7 Dummy variable approach and ANOVA
8.7.1 Dummy variables for categories
8.7.2 Unpaired and paired t-test
8.7.3 Modeling longitudinal data
8.7.4 One-way ANOVA model
8.7.5 Two-way ANOVA
8.8 Generalized linear model
8.8.1 MLE estimation of GLM
8.8.2 Logistic and probit regressions for binary outcome
8.8.3 Poisson regression
9 Nonlinear regression
9.1 Definition and motivating examples
9.2 Nonlinear least squares
9.3 Gauss-Newton algorithm
9.4 Statistical properties of the NLS estimator
9.4.1 Large sample properties
9.4.2 Small sample properties
9.4.3 Asymptotic confidence intervals and hypothesis testing
9.4.4 Three methods of statistical inference in large sample
9.5 The nls function and examples
9.5.1 NLS-cdf estimator
9.6 Studying small sample properties through simulations
9.6.1 Normal distribution approximation
9.6.2 Statistical tests
9.6.3 Confidence region
9.6.4 Confidence intervals
9.7 Numerical complications of the nonlinear least squares
9.7.1 Criteria for existence
9.7.2 Criteria for uniqueness
9.8 Optimal design of experiments with nonlinear regression
9.8.1 Motivating examples
9.8.2 Optimal designs with nonlinear regression
9.9 The Michaelis-Menten model
9.9.1 The NLS solution
9.9.2 The exact solution
10 Appendix
10.1 Notation
10.2 Basics of matrix algebra
10.2.1 Preliminaries and matrix inverse
10.2.2 Determinant
10.2.3 Partition matrices
10.3 Eigenvalues and eigenvectors
10.3.1 Jordan spectral matrix decomposition
10.3.2 SVD: Singular value decomposition of a rectangular matrix
10.4 Quadratic forms and positive definite matrices
10.4.1 Quadratic forms
10.4.2 Positive and nonnegative definite matrices
10.5 Vector and matrix calculus
10.5.1 Differentiation of a scalar-valued function with respect to a vector
10.5.2 Differentiation of a vector-valued function with respect to a vector
10.5.3 Kronecker product
10.5.4 vec operator
10.6 Optimization
10.6.1 Convex and concave functions
10.6.2 Criteria for unconstrained minimization
10.6.3 Gradient algorithms
10.6.4 Constrained optimization: Lagrange multiplier technique
Bibliography
Index
PROFESSOR EUGENE DEMIDENKO works at Dartmouth College in the Department of Biomedical Science. He teaches statistics to undergraduate students in the Mathematics Department and to graduate students in Quantitative Biomedical Sciences at the Geisel School of Medicine. He has broad experience in theoretical and applied statistics, including epidemiology and biostatistics, statistical analysis of images, tumor regrowth, ill-posed inverse problems in engineering and technology, and optimal portfolio allocation. His first book with Wiley, Mixed Models: Theory and Applications with R, gained much popularity among researchers and graduate/PhD students. Prof. Demidenko is the author of the controversial paper "The P-value You Can't Buy," published in 2016 in The American Statistician.