E-book: Official Google Cloud Certified Professional Data Engineer Study Guide

  • Format: PDF+DRM
  • Publication date: 18-May-2020
  • Publisher: Sybex Inc., U.S.
  • Language: English
  • ISBN-13: 9781119618447
  • Price: 48.75 €*
  • * The price is final; no further discounts apply.
  • This e-book is for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

The proven Study Guide that prepares you for this new Google Cloud exam

The Google Cloud Certified Professional Data Engineer Study Guide provides everything you need to prepare for this important exam and master the skills necessary to land that coveted Google Cloud Professional Data Engineer certification. The book opens with an assessment quiz to evaluate what you know before you begin. Each chapter features exam objectives and review questions, and the online learning environment includes additional complete practice tests.

Written by Dan Sullivan, a popular and experienced online course author for machine learning, big data, and cloud topics, the Google Cloud Certified Professional Data Engineer Study Guide is your ace in the hole for deploying and managing analytics and machine learning applications.

•    Build and operationalize storage systems, pipelines, and compute infrastructure

•    Understand machine learning models and learn how to select pre-built models

•    Monitor and troubleshoot machine learning models

•    Design analytics and machine learning applications that are secure, scalable, and highly available

This exam guide is designed to help you develop an in-depth understanding of data engineering and machine learning on Google Cloud Platform.

Introduction
Assessment Test
Chapter 1 Selecting Appropriate Storage Technologies
  From Business Requirements to Storage Systems
    Ingest
    Store
    Process and Analyze
    Explore and Visualize
  Technical Aspects of Data: Volume, Velocity, Variation, Access, and Security
    Volume
    Velocity
    Variation in Structure
    Data Access Patterns
    Security Requirements
  Types of Structure: Structured, Semi-Structured, and Unstructured
    Structured: Transactional vs. Analytical
    Semi-Structured: Fully Indexed vs. Row Key Access
    Unstructured Data
  Google's Storage Decision Tree
  Schema Design Considerations
    Relational Database Design
    NoSQL Database Design
  Exam Essentials
  Review Questions
Chapter 2 Building and Operationalizing Storage Systems
  Cloud SQL
    Configuring Cloud SQL
    Improving Read Performance with Read Replicas
    Importing and Exporting Data
  Cloud Spanner
    Configuring Cloud Spanner
    Replication in Cloud Spanner
    Database Design Considerations
    Importing and Exporting Data
  Cloud Bigtable
    Configuring Bigtable
    Database Design Considerations
    Importing and Exporting
  Cloud Firestore
    Cloud Firestore Data Model
    Indexing and Querying
    Importing and Exporting
  BigQuery
    BigQuery Datasets
    Loading and Exporting Data
    Clustering, Partitioning, and Sharding Tables
    Streaming Inserts
    Monitoring and Logging in BigQuery
    BigQuery Cost Considerations
    Tips for Optimizing BigQuery
  Cloud Memorystore
  Cloud Storage
    Organizing Objects in a Namespace
    Storage Tiers
    Cloud Storage Use Cases
    Data Retention and Lifecycle Management
  Unmanaged Databases
  Exam Essentials
  Review Questions
Chapter 3 Designing Data Pipelines
  Overview of Data Pipelines
    Data Pipeline Stages
    Types of Data Pipelines
  GCP Pipeline Components
    Cloud Pub/Sub
    Cloud Dataflow
    Cloud Dataproc
    Cloud Composer
  Migrating Hadoop and Spark to GCP
  Exam Essentials
  Review Questions
Chapter 4 Designing a Data Processing Solution
  Designing Infrastructure
    Choosing Infrastructure
    Availability, Reliability, and Scalability of Infrastructure
    Hybrid Cloud and Edge Computing
  Designing for Distributed Processing
    Distributed Processing: Messaging
    Distributed Processing: Services
  Migrating a Data Warehouse
    Assessing the Current State of a Data Warehouse
    Designing the Future State of a Data Warehouse
    Migrating Data, Jobs, and Access Controls
    Validating the Data Warehouse
  Exam Essentials
  Review Questions
Chapter 5 Building and Operationalizing Processing Infrastructure
  Provisioning and Adjusting Processing Resources
    Provisioning and Adjusting Compute Engine
    Provisioning and Adjusting Kubernetes Engine
    Provisioning and Adjusting Cloud Bigtable
    Provisioning and Adjusting Cloud Dataproc
    Configuring Managed Serverless Processing Services
  Monitoring Processing Resources
    Stackdriver Monitoring
    Stackdriver Logging
    Stackdriver Trace
  Exam Essentials
  Review Questions
Chapter 6 Designing for Security and Compliance
  Identity and Access Management with Cloud IAM
    Predefined Roles
    Custom Roles
    Using Roles with Service Accounts
    Access Control with Policies
  Using IAM with Storage and Processing Services
    Cloud Storage and IAM
    Cloud Bigtable and IAM
    BigQuery and IAM
    Cloud Dataflow and IAM
  Data Security
    Encryption
    Key Management
  Ensuring Privacy with the Data Loss Prevention API
    Detecting Sensitive Data
    Running Data Loss Prevention Jobs
    Inspection Best Practices
  Legal Compliance
    Health Insurance Portability and Accountability Act (HIPAA)
    Children's Online Privacy Protection Act
    FedRAMP
    General Data Protection Regulation
  Exam Essentials
  Review Questions
Chapter 7 Designing Databases for Reliability, Scalability, and Availability
  Designing Cloud Bigtable Databases for Scalability and Reliability
    Data Modeling with Cloud Bigtable
    Designing Row-keys
    Designing for Time Series
    Use Replication for Availability and Scalability
  Designing Cloud Spanner Databases for Scalability and Reliability
    Relational Database Features
    Interleaved Tables
    Primary Keys and Hotspots
    Database Splits
    Secondary Indexes
    Query Best Practices
  Designing BigQuery Databases for Data Warehousing
    Schema Design for Data Warehousing
    Clustered and Partitioned Tables
    Querying Data in BigQuery
    External Data Access
    BigQuery ML
  Exam Essentials
  Review Questions
Chapter 8 Understanding Data Operations for Flexibility and Portability
  Cataloging and Discovery with Data Catalog
    Searching in Data Catalog
    Tagging in Data Catalog
  Data Preprocessing with Dataprep
    Cleansing Data
    Discovering Data
    Enriching Data
    Importing and Exporting Data
    Structuring and Validating Data
  Visualizing with Data Studio
    Connecting to Data Sources
    Visualizing Data
    Sharing Data
  Exploring Data with Cloud Datalab
    Jupyter Notebooks
    Managing Cloud Datalab Instances
    Adding Libraries to Cloud Datalab Instances
  Orchestrating Workflows with Cloud Composer
    Airflow Environments
    Creating DAGs
    Airflow Logs
  Exam Essentials
  Review Questions
Chapter 9 Deploying Machine Learning Pipelines
  Structure of ML Pipelines
    Data Ingestion
    Data Preparation
    Data Segregation
    Model Training
    Model Evaluation
    Model Deployment
    Model Monitoring
  GCP Options for Deploying Machine Learning Pipeline
    Cloud AutoML
    BigQuery ML
    Kubeflow
    Spark Machine Learning
  Exam Essentials
  Review Questions
Chapter 10 Choosing Training and Serving Infrastructure
  Hardware Accelerators
    Graphics Processing Units
    Tensor Processing Units
    Choosing Between CPUs, GPUs, and TPUs
  Distributed and Single Machine Infrastructure
    Single Machine Model Training
    Distributed Model Training
    Serving Models
  Edge Computing with GCP
    Edge Computing Overview
    Edge Computing Components and Processes
    Edge TPU
    Cloud IoT
  Exam Essentials
  Review Questions
Chapter 11 Measuring, Monitoring, and Troubleshooting Machine Learning Models
  Three Types of Machine Learning Algorithms
    Supervised Learning
    Unsupervised Learning
    Anomaly Detection
    Reinforcement Learning
  Deep Learning
  Engineering Machine Learning Models
    Model Training and Evaluation
    Operationalizing ML Models
  Common Sources of Error in Machine Learning Models
    Data Quality
    Unbalanced Training Sets
    Types of Bias
  Exam Essentials
  Review Questions
Chapter 12 Leveraging Prebuilt Models as a Service
  Sight
    Vision AI
    Video AI
  Conversation
    Dialogflow
    Cloud Text-to-Speech API
    Cloud Speech-to-Text API
  Language
    Translation
    Natural Language
  Structured Data
    Recommendations AI API
    Cloud Inference API
  Exam Essentials
  Review Questions
Appendix Answers to Review Questions
  Chapter 1 Selecting Appropriate Storage Technologies
  Chapter 2 Building and Operationalizing Storage Systems
  Chapter 3 Designing Data Pipelines
  Chapter 4 Designing a Data Processing Solution
  Chapter 5 Building and Operationalizing Processing Infrastructure
  Chapter 6 Designing for Security and Compliance
  Chapter 7 Designing Databases for Reliability, Scalability, and Availability
  Chapter 8 Understanding Data Operations for Flexibility and Portability
  Chapter 9 Deploying Machine Learning Pipelines
  Chapter 10 Choosing Training and Serving Infrastructure
  Chapter 11 Measuring, Monitoring, and Troubleshooting Machine Learning Models
  Chapter 12 Leveraging Prebuilt Models as a Service
Index
DAN SULLIVAN is a software architect specializing in data architecture, machine learning, and cloud computing. Dan is a Google Cloud Certified Professional Data Engineer, Professional Architect, and Associate Cloud Engineer. Dan is the author of six books and numerous articles. He is an instructor with LinkedIn Learning and Udemy for Business.