Preface | xiii
1 Introduction to Data Science on AWS | 1 | (28)
Benefits of Cloud Computing | 1 | (3)
Data Science Pipelines and Workflows | 4 | (3)
| 7 | (3)
Amazon AI Services and AutoML with Amazon SageMaker | 10 | (3)
Data Ingestion, Exploration, and Preparation in AWS | 13 | (5)
Model Training and Tuning with Amazon SageMaker | 18 | (3)
Model Deployment with Amazon SageMaker and AWS Lambda Functions | 21 | (1)
Streaming Analytics and Machine Learning on AWS | 21 | (2)
AWS Infrastructure and Custom-Built Hardware | 23 | (3)
Reduce Cost with Tags, Budgets, and Alerts | 26 | (1)
| 26 | (3)
| 29 | (46)
Innovation Across Every Industry | 29 | (1)
Personalized Product Recommendations | 30 | (6)
Detect Inappropriate Videos with Amazon Rekognition | 36 | (2)
| 38 | (4)
Identify Fake Accounts with Amazon Fraud Detector | 42 | (1)
Enable Privacy-Leak Detection with Amazon Macie | 43 | (1)
Conversational Devices and Voice Assistants | 44 | (1)
Text Analysis and Natural Language Processing | 45 | (5)
Cognitive Search and Natural Language Understanding | 50 | (1)
Intelligent Customer Support Centers | 51 | (1)
Industrial AI Services and Predictive Maintenance | 52 | (1)
Home Automation with AWS IoT and Amazon SageMaker | 53 | (1)
Extract Medical Information from Healthcare Documents | 54 | (1)
Self-Optimizing and Intelligent Cloud Infrastructure | 55 | (1)
Cognitive and Predictive Business Intelligence | 56 | (4)
Educating the Next Generation of AI and ML Developers | 60 | (5)
Program Nature's Operating System with Quantum Computing | 65 | (5)
Increase Performance and Reduce Cost | 70 | (3)
| 73 | (2)
3 Automated Machine Learning | 75 | (22)
Automated Machine Learning with SageMaker Autopilot | 76 | (2)
Track Experiments with SageMaker Autopilot | 78 | (1)
Train and Deploy a Text Classifier with SageMaker Autopilot | 78 | (13)
Automated Machine Learning with Amazon Comprehend | 91 | (4)
| 95 | (2)
4 Ingest Data into the Cloud | 97 | (30)
| 98 | (7)
Query the Amazon S3 Data Lake with Amazon Athena | 105 | (4)
Continuously Ingest New Data with AWS Glue Crawler | 109 | (2)
Build a Lake House with Amazon Redshift Spectrum | 111 | (7)
Choose Between Amazon Athena and Amazon Redshift | 118 | (1)
Reduce Cost and Increase Performance | 119 | (7)
| 126 | (1)
| 127 | (46)
Tools for Exploring Data in AWS | 128 | (1)
Visualize Our Data Lake with SageMaker Studio | 129 | (13)
| 142 | (8)
Create Dashboards with Amazon QuickSight | 150 | (1)
Detect Data-Quality Issues with Amazon SageMaker and Apache Spark | 151 | (8)
Detect Bias in Our Dataset | 159 | (7)
Detect Different Types of Drift with SageMaker Clarify | 166 | (2)
Analyze Our Data with AWS Glue DataBrew | 168 | (2)
Reduce Cost and Increase Performance | 170 | (2)
| 172 | (1)
6 Prepare the Dataset for Model Training | 173 | (34)
Perform Feature Selection and Engineering | 173 | (14)
Scale Feature Engineering with SageMaker Processing Jobs | 187 | (7)
Share Features Through SageMaker Feature Store | 194 | (4)
Ingest and Transform Data with SageMaker Data Wrangler | 198 | (1)
Track Artifact and Experiment Lineage with Amazon SageMaker | 199 | (5)
Ingest and Transform Data with AWS Glue DataBrew | 204 | (2)
| 206 | (1)
| 207 | (70)
Understand the SageMaker Infrastructure | 207 | (5)
Deploy a Pre-Trained BERT Model with SageMaker JumpStart | 212 | (2)
Develop a SageMaker Model | 214 | (2)
A Brief History of Natural Language Processing | 216 | (3)
BERT Transformer Architecture | 219 | (2)
Training BERT from Scratch | 221 | (2)
Fine Tune a Pre-Trained BERT Model | 223 | (3)
Create the Training Script | 226 | (6)
Launch the Training Script from a SageMaker Notebook | 232 | (7)
| 239 | (6)
Debug and Profile Model Training with SageMaker Debugger | 245 | (4)
Interpret and Explain Model Predictions | 249 | (6)
Detect Model Bias and Explain Predictions | 255 | (4)
More Training Options for BERT | 259 | (9)
Reduce Cost and Increase Performance | 268 | (6)
| 274 | (3)
8 Train and Optimize Models at Scale | 277 | (24)
Automatically Find the Best Model Hyper-Parameters | 277 | (7)
Use Warm Start for Additional SageMaker Hyper-Parameter Tuning Jobs | 284 | (4)
Scale Out with SageMaker Distributed Training | 288 | (8)
Reduce Cost and Increase Performance | 296 | (4)
| 300 | (1)
9 Deploy Models to Production | 301 | (68)
Choose Real-Time or Batch Predictions | 301 | (1)
Real-Time Predictions with SageMaker Endpoints | 302 | (8)
Auto-Scale SageMaker Endpoints Using Amazon CloudWatch | 310 | (5)
Strategies to Deploy New and Updated Models | 315 | (4)
Testing and Comparing New Models | 319 | (12)
Monitor Model Performance and Detect Drift | 331 | (4)
Monitor Data Quality of Deployed SageMaker Endpoints | 335 | (6)
Monitor Model Quality of Deployed SageMaker Endpoints | 341 | (4)
Monitor Bias Drift of Deployed SageMaker Endpoints | 345 | (3)
Monitor Feature Attribution Drift of Deployed SageMaker Endpoints | 348 | (3)
Perform Batch Predictions with SageMaker Batch Transform | 351 | (5)
AWS Lambda Functions and Amazon API Gateway | 356 | (1)
Optimize and Manage Models at the Edge | 357 | (1)
Deploy a PyTorch Model with TorchServe | 357 | (3)
TensorFlow-BERT Inference with AWS Deep Java Library | 360 | (2)
Reduce Cost and Increase Performance | 362 | (5)
| 367 | (2)
| 369 | (40)
Machine Learning Operations | 369 | (2)
| 371 | (1)
Machine Learning Pipelines | 371 | (4)
Pipeline Orchestration with SageMaker Pipelines | 375 | (11)
Automation with SageMaker Pipelines | 386 | (5)
| 391 | (9)
Human-in-the-Loop Workflows | 400 | (6)
Reduce Cost and Improve Performance | 406 | (1)
| 407 | (2)
11 Streaming Analytics and Machine Learning | 409 | (34)
Online Learning Versus Offline Learning | 410 | (1)
| 410 | (1)
Windowed Queries on Streaming Data | 411 | (4)
Streaming Analytics and Machine Learning on AWS | 415 | (2)
Classify Real-Time Product Reviews with Amazon Kinesis, AWS Lambda, and Amazon SageMaker | 417 | (1)
Implement Streaming Data Ingest Using Amazon Kinesis Data Firehose | 418 | (4)
Summarize Real-Time Product Reviews with Streaming Analytics | 422 | (2)
Setting Up Amazon Kinesis Data Analytics | 424 | (8)
Amazon Kinesis Data Analytics Applications | 432 | (7)
Classify Product Reviews with Apache Kafka, AWS Lambda, and Amazon SageMaker | 439 | (1)
Reduce Cost and Improve Performance | 440 | (2)
| 442 | (1)
12 Secure Data Science on AWS | 443 | (44)
Shared Responsibility Model Between AWS and Customers | 443 | (1)
Applying AWS Identity and Access Management | 444 | (8)
Isolating Compute and Network Environments | 452 | (3)
Securing Amazon S3 Data Access | 455 | (8)
| 463 | (4)
| 467 | (2)
Securing SageMaker Notebook Instances | 469 | (2)
Securing SageMaker Studio | 471 | (2)
Securing SageMaker Jobs and Models | 473 | (4)
Securing AWS Lake Formation | 477 | (1)
Securing Database Credentials with AWS Secrets Manager | 478 | (1)
| 478 | (3)
| 481 | (2)
Reduce Cost and Improve Performance | 483 | (2)
| 485 | (2)
Index | 487