E-book: Statistical Inference and Machine Learning for Big Data

  • Format: EPUB+DRM
  • Price: 148,19 €*
  • * The price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital Rights Management (DRM)
    The publisher has issued this e-book in encrypted form, which means you must install special software to read it. You must also create an Adobe ID (more info here). The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, install Adobe Digital Editions. (This is a free application designed specifically for reading e-books. It should not be confused with Adobe Reader, which is probably already installed on your computer.)

    This e-book cannot be read on an Amazon Kindle.

This book presents a variety of advanced statistical methods at a level suitable for advanced undergraduate and graduate students, as well as for anyone else interested in familiarizing themselves with these important subjects. It illustrates these methods in the context of real-life applications in areas such as genetics, medicine, and environmental problems.

The book begins in Part I by outlining various data types and indicating how they are normally represented graphically and subsequently analyzed. Part II introduces the basic tools of probability and statistics, with special reference to symbolic data analysis; the results most useful and relevant to this book are retained. Part III focuses on the tools of machine learning, while Part IV presents the computational aspects of big data.

This book would serve as a handy desk reference for statistical methods at the undergraduate and graduate level, and would also be useful in courses that aim to provide an overview of modern statistics and its applications.
I Introduction to Big Data
1 Examples of Big Data
1.1 Multivariate Data
1.2 Categorical Data
1.3 Environmental Data
1.4 Genetic Data
1.5 Time Series Data
1.6 Ranking Data
1.7 Social Network Data
1.8 Symbolic Data
1.9 Image Data
II Statistical Inference for Big Data
2 Basic Concepts in Probability
2.1 Pearson System of Distributions
2.2 Modes of Convergence
2.3 Multivariate Central Limit Theorem
2.4 Markov Chains
3 Basic Concepts in Statistics
3.1 Parametric Estimation
3.2 Hypothesis Testing
3.3 Classical Bayesian Statistics
4 Multivariate Methods
4.1 Matrix Algebra
4.2 Multivariate Analysis as a Generalization of Univariate Analysis
4.2.1 The General Linear Model
4.2.2 One-Sample Problem
4.2.3 Two-Sample Problem
4.3 Structure in Multivariate Data Analysis
4.3.1 Principal Component Analysis
4.3.2 Factor Analysis
4.3.3 Canonical Correlation
4.3.4 Linear Discriminant Analysis
4.3.5 Multidimensional Scaling
4.3.6 Copula Methods
5 Nonparametric Statistics
5.1 Goodness-of-Fit Tests
5.2 Linear Rank Statistics
5.3 U Statistics
5.4 Hoeffding's Combinatorial Central Limit Theorem
5.5 Nonparametric Tests
5.5.1 One-Sample Tests of Location
5.5.2 Confidence Interval for the Median
5.5.3 Wilcoxon Signed Rank Test
5.6 Multi-Sample Tests
5.6.1 Two-Sample Tests for Location
5.6.2 Multi-Sample Test for Location
5.6.3 Tests for Dispersion
5.7 Compatibility
5.8 Tests for Ordered Alternatives
5.9 A Unified Theory of Hypothesis Testing
5.9.1 Umbrella Alternatives
5.9.2 Tests for Trend in Proportions
5.10 Randomized Block Designs
5.11 Density Estimation
5.11.1 Univariate Kernel Density Estimation
5.11.2 The Rank Transform
5.11.3 Multivariate Kernel Density Estimation
5.12 Spatial Data Analysis
5.12.1 Spatial Prediction
5.12.2 Point Poisson Kriging of Areal Data
5.13 Efficiency
5.13.1 Pitman Efficiency
5.13.2 Application of Le Cam's Lemmas
5.14 Permutation Methods
6 Exponential Tilting and Its Applications
6.1 Neyman Smooth Tests
6.2 Smooth Models for Discrete Distributions
6.3 Rejection Sampling
6.4 Tweedie's Formula: Univariate Case
6.5 Tweedie's Formula: Multivariate Case
6.6 The Saddlepoint Approximation and Notions of Information
7 Counting Data Analysis
7.1 Inference for Generalized Linear Models
7.2 Inference for Contingency Tables
7.3 Two-Way Ordered Classifications
7.4 Survival Analysis
7.4.1 Kaplan-Meier Estimator
7.4.2 Modeling Survival Data
8 Time Series Methods
8.1 Classical Methods of Analysis
8.2 State Space Modeling
9 Estimating Equations
9.1 Composite Likelihood
9.2 Empirical Likelihood
9.2.1 Application to One-Sample Ranking Problems
9.2.2 Application to Two-Sample Ranking Problems
10 Symbolic Data Analysis
10.1 Introduction
10.2 Some Examples
10.3 Interval Data
10.3.1 Frequency
10.3.2 Sample Mean and Sample Variance
10.3.3 Realization in SODAS
10.4 Multi-nominal Data
10.4.1 Frequency
10.5 Symbolic Regression
10.5.1 Symbolic Regression for Interval Data
10.5.2 Symbolic Regression for Modal Data
10.5.3 Symbolic Regression in SODAS
10.6 Cluster Analysis
10.7 Factor Analysis
10.8 Factorial Discriminant Analysis
10.9 Application to Parkinson's Disease
10.9.1 Data Processing
10.9.2 Result Analysis
10.9.2.1 Viewer
10.9.2.2 Descriptive Statistics
10.9.2.3 Symbolic Regression Analysis
10.9.2.4 Symbolic Clustering
10.9.2.5 Principal Component Analysis
10.9.3 Comparison with Classical Method
10.10 Application to Cardiovascular Disease Analysis
10.10.1 Results of the Analysis
10.10.2 Comparison with the Classical Method
III Machine Learning for Big Data
11 Tools for Machine Learning
11.1 Regression Models
11.2 Simple Linear Regression
11.2.1 Least Squares Method
11.2.2 Statistical Inference on Regression Coefficients
11.2.3 Verifying the Assumptions on the Error Terms
11.3 Multiple Linear Regression
11.3.1 Multiple Linear Regression Model
11.3.2 Normal Equations
11.3.3 Statistical Inference on Regression Coefficients
11.3.4 Model Fit Evaluation
11.4 Regression in Machine Learning
11.4.1 Optimization for Linear Regression in Machine Learning
11.4.1.1 Gradient Descent
11.4.1.2 Feature Standardization
11.4.1.3 Computing Cost on a Test Set
11.5 Classification Models
11.5.1 Logistic Regression
11.5.1.1 Optimization with Maximum Likelihood for Logistic Regression
11.5.1.2 Statistical Inference
11.5.2 Logistic Regression for Binary Classification
11.5.2.1 Kullback-Leibler Divergence
11.5.3 Logistic Regression with Multiple Response Classes
11.5.4 Regularization for Regression Models in Machine Learning
11.5.4.1 Ridge Regression
11.5.4.2 Lasso Regression
11.5.4.3 The Choice of Regularization Method
11.5.5 Support Vector Machines (SVM)
11.5.5.1 Introduction
11.5.5.2 Finding the Optimal Hyperplane
11.5.5.3 SVM for Nonlinearly Separable Data Sets
11.5.5.4 Illustrating SVM
12 Neural Networks
12.1 Feed-Forward Networks
12.1.1 Motivation
12.1.2 Introduction to Neural Networks
12.1.3 Building a Deep Feed-Forward Network
12.1.4 Learning in Deep Networks
12.1.4.1 Quantitative Model
12.1.4.2 Binary Classification Model
12.1.5 Generalization
12.1.5.1 A Machine Learning Approach to Generalization
12.2 Recurrent Neural Networks
12.2.1 Building a Recurrent Neural Network
12.2.2 Learning in Recurrent Networks
12.2.3 Most Common Design Structures of RNNs
12.2.4 Deep RNN
12.2.5 Bidirectional RNN
12.2.6 Long-Term Dependencies and LSTM RNN
12.2.7 Reduction for Exploding Gradients
12.3 Convolutional Neural Networks
12.3.1 Convolution Operator for Arrays
12.3.1.1 Properties of the Convolution Operator
12.3.2 Convolution Layers
12.3.3 Pooling Layers
12.4 Text Analytics
12.4.1 Introduction
12.4.2 General Architecture
IV Computational Methods for Statistical Inference
13 Bayesian Computation Methods
13.1 Data Augmentation Methods
13.2 Metropolis-Hastings Algorithm
13.3 Gibbs Sampling
13.4 EM Algorithm
13.4.1 Application to Ranking
13.4.2 Extension to Several Populations
13.5 Variational Bayesian Methods
13.5.1 Optimization of the Variational Distribution
13.6 Bayesian Nonparametric Methods
13.6.1 Dirichlet Prior
13.6.2 The Poisson-Dirichlet Prior
13.6.3 Simulation of Bayesian Posterior Distributions
13.6.4 Other Applications
Index