E-book: Introduction to Machine Learning, fourth edition

(Özyegin University)
  • Format: EPUB+DRM
  • Price: 82.08 €*
  • * The price is final, i.e. no further discounts apply.
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste): not allowed
  • Printing: not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You also need to create an Adobe ID (more information here). The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, install Adobe Digital Editions. (This is a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is most likely already installed on your computer.)

    This e-book cannot be read on an Amazon Kindle.

A substantially revised fourth edition of a comprehensive textbook, including new coverage of recent advances in deep learning and neural networks.

The goal of machine learning is to program computers to use example data or past experience to solve a given problem. Machine learning underlies such exciting new technologies as self-driving cars, speech recognition, and translation applications. This substantially revised fourth edition of a comprehensive, widely used machine learning textbook offers new coverage of recent advances in the field in both theory and practice, including developments in deep learning and neural networks.

The book covers a broad array of topics not usually included in introductory machine learning texts, including supervised learning, Bayesian decision theory, parametric methods, semiparametric methods, nonparametric methods, multivariate analysis, hidden Markov models, reinforcement learning, kernel machines, graphical models, Bayesian estimation, and statistical testing. The fourth edition offers a new chapter on deep learning that discusses training, regularizing, and structuring deep neural networks such as convolutional and generative adversarial networks; new material in the chapter on reinforcement learning that covers the use of deep networks, the policy gradient methods, and deep reinforcement learning; new material in the chapter on multilayer perceptrons on autoencoders and the word2vec network; and discussion of a popular method of dimensionality reduction, t-SNE. New appendixes offer background material on linear algebra and optimization. End-of-chapter exercises help readers to apply concepts learned. Introduction to Machine Learning can be used in courses for advanced undergraduate and graduate students and as a reference for professionals.
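To make the book's opening idea concrete (programming a computer to use example data to solve a problem), here is a minimal, illustrative Python sketch, not taken from the book: a 1-nearest-neighbor classifier, the simplest of the nonparametric methods covered in chapter 8. The toy data and function names are our own.

    # Hypothetical toy example: predict the label of a new point by
    # copying the label of the closest training example (1-NN).
    import math

    # Training set: (feature vector, class label) pairs.
    examples = [
        ((1.0, 1.0), "A"),
        ((1.2, 0.8), "A"),
        ((4.0, 4.2), "B"),
        ((3.8, 4.0), "B"),
    ]

    def predict(x):
        """Return the label of the training example nearest to x."""
        _, label = min(examples, key=lambda ex: math.dist(x, ex[0]))
        return label

    print(predict((1.1, 0.9)))  # -> A
    print(predict((4.1, 3.9)))  # -> B

This brute-force lookup is the k-nearest-neighbor estimator of section 8.2.3 with k = 1.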



Preface xix
Notations xxiii
1 Introduction
1(22)
1.1 What Is Machine Learning?
1(3)
1.2 Examples of Machine Learning Applications
4(9)
1.2.1 Association Rules
4(1)
1.2.2 Classification
4(5)
1.2.3 Regression
9(2)
1.2.4 Unsupervised Learning
11(1)
1.2.5 Reinforcement Learning
12(1)
1.3 History
13(2)
1.4 Related Topics
15(3)
1.4.1 High-Performance Computing
15(1)
1.4.2 Data Privacy and Security
16(1)
1.4.3 Model Interpretability and Trust
17(1)
1.4.4 Data Science
18(1)
1.5 Exercises
18(2)
1.6 References
20(3)
2 Supervised Learning
23(194)
2.1 Learning a Class from Examples
23(6)
2.2 Vapnik-Chervonenkis Dimension
29(2)
2.3 Probably Approximately Correct Learning
31(1)
2.4 Noise
32(2)
2.5 Learning Multiple Classes
34(156)
8.2 Nonparametric Density Estimation
190(6)
8.2.1 Histogram Estimator
191(1)
8.2.2 Kernel Estimator
192(2)
8.2.3 k-Nearest Neighbor Estimator
194(2)
8.3 Generalization to Multivariate Data
196(1)
8.4 Nonparametric Classification
197(1)
8.5 Condensed Nearest Neighbor
198(2)
8.6 Distance-Based Classification
200(3)
8.7 Outlier Detection
203(2)
8.8 Nonparametric Regression: Smoothing Models
205(3)
8.8.1 Running Mean Smoother
205(2)
8.8.2 Kernel Smoother
207(1)
8.8.3 Running Line Smoother
208(1)
8.9 How to Choose the Smoothing Parameter
208(1)
8.10 Notes
209(3)
8.11 Exercises
212(2)
8.12 References
214(3)
9 Decision Trees
217(26)
9.1 Introduction
217(2)
9.2 Univariate Trees
219(7)
9.2.1 Classification Trees
220(4)
9.2.2 Regression Trees
224(2)
9.3 Pruning
226(3)
9.4 Rule Extraction from Trees
229(1)
9.5 Learning Rules from Data
230(4)
9.6 Multivariate Trees
234(2)
9.7 Notes
236(3)
9.8 Exercises
239(2)
9.9 References
241(2)
10 Linear Discrimination
243(28)
10.1 Introduction
243(2)
10.2 Generalizing the Linear Model
245(1)
10.3 Geometry of the Linear Discriminant
246(4)
10.3.1 Two Classes
246(2)
10.3.2 Multiple Classes
248(2)
10.4 Pairwise Separation
250(1)
10.5 Parametric Discrimination Revisited
251(1)
10.6 Gradient Descent
252(2)
10.7 Logistic Discrimination
254(10)
10.7.1 Two Classes
254(3)
10.7.2 Multiple Classes
257(6)
10.7.3 Multiple Labels
263(1)
10.8 Learning to Rank
264(1)
10.9 Notes
265(2)
10.10 Exercises
267(2)
10.11 References
269(2)
11 Multilayer Perceptrons
271(42)
11.1 Introduction
271(4)
11.1.1 Understanding the Brain
272(1)
11.1.2 Neural Networks as a Paradigm for Parallel Processing
273(2)
11.2 The Perceptron
275(3)
11.3 Training a Perceptron
278(4)
11.4 Learning Boolean Functions
282(1)
11.5 Multilayer Perceptrons
283(3)
11.6 MLP as a Universal Approximator
286(2)
11.7 Backpropagation Algorithm
288(7)
11.7.1 Nonlinear Regression
288(3)
11.7.2 Two-Class Discrimination
291(1)
11.7.3 Multiclass Discrimination
292(2)
11.7.4 Multilabel Discrimination
294(1)
11.8 Overtraining
295(1)
11.9 Learning Hidden Representations
296(5)
11.10 Autoencoders
301(2)
11.11 Word2vec Architecture
303(4)
11.12 Notes
307(2)
11.13 Exercises
309(1)
11.14 References
310(3)
12 Deep Learning
313(48)
12.1 Introduction
313(4)
12.2 How to Train Multiple Hidden Layers
317(4)
12.2.1 Rectified Linear Unit
317(1)
12.2.2 Initialization
317(1)
12.2.3 Generalizing Backpropagation to Multiple Hidden Layers
318(3)
12.3 Improving Training Convergence
321(4)
12.3.1 Momentum
321(1)
12.3.2 Adaptive Learning Factor
322(1)
12.3.3 Batch Normalization
323(2)
12.4 Regularization
325(6)
12.4.1 Hints
325(2)
12.4.2 Weight Decay
327(3)
12.4.3 Dropout
330(1)
12.5 Convolutional Layers
331(9)
12.5.1 The Idea
331(2)
12.5.2 Formalization
333(4)
12.5.3 Examples: LeNet-5 and AlexNet
337(1)
12.5.4 Extensions
338(2)
12.5.5 Multimodal Deep Networks
340(1)
12.6 Tuning the Network Structure
340(4)
12.6.1 Structure and Hyperparameter Search
340(2)
12.6.2 Skip Connections
342(1)
12.6.3 Gating Units
343(1)
12.7 Learning Sequences
344(6)
12.7.1 Example Tasks
344(1)
12.7.2 Time-Delay Neural Networks
345(1)
12.7.3 Recurrent Networks
345(3)
12.7.4 Long Short-Term Memory Unit
348(1)
12.7.5 Gated Recurrent Unit
349(1)
12.8 Generative Adversarial Network
350(3)
12.9 Notes
353(1)
12.10 Exercises
354(2)
12.11 References
356(5)
13 Local Models
361(34)
13.1 Introduction
361(1)
13.2 Competitive Learning
362(8)
13.2.1 Online k-Means
362(5)
13.2.2 Adaptive Resonance Theory
367(1)
13.2.3 Self-Organizing Maps
368(2)
13.3 Radial Basis Functions
370(6)
13.4 Incorporating Rule-Based Knowledge
376(1)
13.5 Normalized Basis Functions
377(2)
13.6 Competitive Basis Functions
379(3)
13.7 Learning Vector Quantization
382(1)
13.8 The Mixture of Experts
382(4)
13.8.1 Cooperative Experts
385(1)
13.8.2 Competitive Experts
386(1)
13.9 Hierarchical Mixture of Experts and Soft Decision Trees
386(2)
13.10 Notes
388(1)
13.11 Exercises
389(3)
13.12 References
392(3)
14 Kernel Machines
395(38)
14.1 Introduction
395(2)
14.2 Optimal Separating Hyperplane
397(4)
14.3 The Nonseparable Case: Soft Margin Hyperplane
401(3)
14.4 ν-SVM
404(1)
14.5 Kernel Trick
405(2)
14.6 Vectorial Kernels
407(3)
14.7 Defining Kernels
410(1)
14.8 Multiple Kernel Learning
411(2)
14.9 Multiclass Kernel Machines
413(1)
14.10 Kernel Machines for Regression
414(5)
14.11 Kernel Machines for Ranking
419(1)
14.12 One-Class Kernel Machines
420(3)
14.13 Large Margin Nearest Neighbor Classifier
423(2)
14.14 Kernel Dimensionality Reduction
425(1)
14.15 Notes
426(2)
14.16 Exercises
428(1)
14.17 References
429(4)
15 Graphical Models
433(30)
15.1 Introduction
433(2)
15.2 Canonical Cases for Conditional Independence
435(7)
15.3 Generative Models
442(3)
15.4 d-Separation
445(1)
15.5 Belief Propagation
445(8)
15.5.1 Chains
446(2)
15.5.2 Trees
448(2)
15.5.3 Polytrees
450(2)
15.5.4 Junction Trees
452(1)
15.6 Undirected Graphs: Markov Random Fields
453(3)
15.7 Learning the Structure of a Graphical Model
456(1)
15.8 Influence Diagrams
457(1)
15.9 Notes
458(1)
15.10 Exercises
459(2)
15.11 References
461(2)
16 Hidden Markov Models
463(28)
16.1 Introduction
463(1)
16.2 Discrete Markov Processes
464(3)
16.3 Hidden Markov Models
467(2)
16.4 Three Basic Problems of HMMs
469(1)
16.5 Evaluation Problem
469(4)
16.6 Finding the State Sequence
473(2)
16.7 Learning Model Parameters
475(3)
16.8 Continuous Observations
478(1)
16.9 The HMM as a Graphical Model
479(3)
16.10 Model Selection in HMMs
482(2)
16.11 Notes
484(2)
16.12 Exercises
486(3)
16.13 References
489(2)
17 Bayesian Estimation
491(42)
17.1 Introduction
491(4)
17.2 Bayesian Estimation of the Parameters of a Discrete Distribution
495(2)
17.2.1 K > 2 States: Dirichlet Distribution
495(1)
17.2.2 K = 2 States: Beta Distribution
496(1)
17.3 Bayesian Estimation of the Parameters of a Gaussian Distribution
497(5)
17.3.1 Univariate Case: Unknown Mean, Known Variance
497(2)
17.3.2 Univariate Case: Unknown Mean, Unknown Variance
499(2)
17.3.3 Multivariate Case: Unknown Mean, Unknown Covariance
501(1)
17.4 Bayesian Estimation of the Parameters of a Function
502(10)
17.4.1 Regression
502(4)
17.4.2 Regression with Prior on Noise Precision
506(1)
17.4.3 The Use of Basis/Kernel Functions
507(2)
17.4.4 Bayesian Classification
509(3)
17.5 Choosing a Prior
512(1)
17.6 Bayesian Model Comparison
513(3)
17.7 Bayesian Estimation of a Mixture Model
516(3)
17.8 Nonparametric Bayesian Modeling
519(1)
17.9 Gaussian Processes
520(4)
17.10 Dirichlet Processes and Chinese Restaurants
524(2)
17.11 Latent Dirichlet Allocation
526(2)
17.12 Beta Processes and Indian Buffets
528(1)
17.13 Notes
529(1)
17.14 Exercises
530(1)
17.15 References
531(2)
18 Combining Multiple Learners
533(30)
18.1 Rationale
533(1)
18.2 Generating Diverse Learners
534(3)
18.3 Model Combination Schemes
537(1)
18.4 Voting
538(4)
18.5 Error-Correcting Output Codes
542(2)
18.6 Bagging
544(1)
18.7 Boosting
545(3)
18.8 The Mixture of Experts Revisited
548(2)
18.9 Stacked Generalization
550(1)
18.10 Fine-Tuning an Ensemble
551(2)
18.10.1 Choosing a Subset of the Ensemble
552(1)
18.10.2 Constructing Metalearners
552(1)
18.11 Cascading
553(2)
18.12 Notes
555(2)
18.13 Exercises
557(2)
18.14 References
559(4)
19 Reinforcement Learning
563(34)
19.1 Introduction
563(2)
19.2 Single State Case: K-Armed Bandit
565(1)
19.3 Elements of Reinforcement Learning
566(3)
19.4 Model-Based Learning
569(2)
19.4.1 Value Iteration
569(1)
19.4.2 Policy Iteration
570(1)
19.5 Temporal Difference Learning
571(6)
19.5.1 Exploration Strategies
571(1)
19.5.2 Deterministic Rewards and Actions
572(1)
19.5.3 Nondeterministic Rewards and Actions
573(3)
19.5.4 Eligibility Traces
576(1)
19.6 Generalization
577(3)
19.7 Partially Observable States
580(7)
19.7.1 The Setting
580(2)
19.7.2 Example: The Tiger Problem
582(5)
19.8 Deep Q Learning
587(1)
19.9 Policy Gradients
588(3)
19.10 Learning to Play Backgammon and Go
591(1)
19.11 Notes
592(1)
19.12 Exercises
593(2)
19.13 References
595(2)
20 Design and Analysis of Machine Learning Experiments
597(46)
20.1 Introduction
597(3)
20.2 Factors, Response, and Strategy of Experimentation
600(3)
20.3 Response Surface Design
603(1)
20.4 Randomization, Replication, and Blocking
604(1)
20.5 Guidelines for Machine Learning Experiments
605(3)
20.6 Cross-Validation and Resampling Methods
608(3)
20.6.1 K-Fold Cross-Validation
609(1)
20.6.2 5 × 2 Cross-Validation
610(1)
20.6.3 Bootstrapping
611(1)
20.7 Measuring Classifier Performance
611(3)
20.8 Interval Estimation
614(4)
20.9 Hypothesis Testing
618(2)
20.10 Assessing a Classification Algorithm's Performance
620(3)
20.10.1 Binomial Test
621(1)
20.10.2 Approximate Normal Test
622(1)
20.10.3 T Test
622(1)
20.11 Comparing Two Classification Algorithms
623(3)
20.11.1 McNemar's Test
623(1)
20.11.2 K-Fold Cross-Validated Paired t Test
623(1)
20.11.3 5 × 2 cv Paired t Test
624(1)
20.11.4 5 × 2 cv Paired F Test
625(1)
20.12 Comparing Multiple Algorithms: Analysis of Variance
626(4)
20.13 Comparison over Multiple Datasets
630(4)
20.13.1 Comparing Two Algorithms
631(2)
20.13.2 Multiple Algorithms
633(1)
20.14 Multivariate Tests
634(3)
20.14.1 Comparing Two Algorithms
635(1)
20.14.2 Comparing Multiple Algorithms
636(1)
20.15 Notes
637(1)
20.16 Exercises
638(2)
20.17 References
640(3)
A Probability
643(12)
A.1 Elements of Probability
643(2)
A.1.1 Axioms of Probability
644(1)
A.1.2 Conditional Probability
644(1)
A.2 Random Variables
645(4)
A.2.1 Probability Distribution and Density Functions
645(1)
A.2.2 Joint Distribution and Density Functions
646(1)
A.2.3 Conditional Distributions
646(1)
A.2.4 Bayes' Rule
647(1)
A.2.5 Expectation
647(1)
A.2.6 Variance
648(1)
A.2.7 Weak Law of Large Numbers
649(1)
A.3 Special Random Variables
649(4)
A.3.1 Bernoulli Distribution
649(1)
A.3.2 Binomial Distribution
650(1)
A.3.3 Multinomial Distribution
650(1)
A.3.4 Uniform Distribution
650(1)
A.3.5 Normal (Gaussian) Distribution
651(1)
A.3.6 Chi-Square Distribution
652(1)
A.3.7 T Distribution
653(1)
A.3.8 F Distribution
653(1)
A.4 References
653(2)
B Linear Algebra
655(10)
B.1 Vectors
655(2)
B.2 Matrices
657(1)
B.3 Similarity of Vectors
658(1)
B.4 Square Matrices
659(1)
B.5 Linear Dependence and Ranks
659(1)
B.6 Inverses
660(1)
B.7 Positive Definite Matrices
660(1)
B.8 Trace and Determinant
660(1)
B.9 Eigenvalues and Eigenvectors
661(1)
B.10 Spectral Decomposition
662(1)
B.11 Singular Value Decomposition
662(1)
B.12 References
663(2)
C Optimization
665(8)
C.1 Introduction
665(2)
C.2 Linear Optimization
667(1)
C.3 Convex Optimization
667(1)
C.4 Duality
668(2)
C.5 Local Optimization
670(1)
C.6 References
671(2)
Index 673