E-book: Azure Data Factory by Example: Practical Implementation for Data Engineers

  • Format: PDF+DRM
  • Publication date: 09-Jun-2021
  • Publisher: Apress
  • Language: English
  • ISBN-13: 9781484270295
  • Price: €67.91*
  • * The price is final; no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste): not allowed
  • Printing: not allowed
  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, so you must install special software to read it. You must also create an Adobe ID. The e-book can be read by one user and downloaded to up to six devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), install this free app: PocketBook Reader (iOS / Android).

    To read on a PC or Mac, install Adobe Digital Editions (a free application designed specifically for reading e-books; not to be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

Intermediate user level

Data engineers who need to hit the ground running will use this book to build skills in Azure Data Factory v2 (ADF). The tutorial-first approach to ADF taken in this book gets you working from the first chapter, explaining key ideas naturally as you encounter them. From creating your first data factory to building complex, metadata-driven nested pipelines, the book guides you through essential concepts in Microsoft’s cloud-based ETL/ELT platform. It introduces components indispensable for the movement and transformation of data in the cloud. Then it demonstrates the tools necessary to orchestrate, monitor, and manage those components.

The hands-on introduction to ADF found in this book is as well suited to data engineers embracing their first ETL/ELT toolset as it is to seasoned veterans of Microsoft’s SQL Server Integration Services (SSIS). The example-driven approach leads you through ADF pipeline construction from the ground up, introducing important ideas and making learning natural and engaging. SSIS users will find concepts with familiar parallels, while ADF-first readers will quickly master those concepts as the book builds knowledge steadily across successive chapters. Summaries of key concepts at the end of each chapter provide a ready reference that you can return to again and again.


What You Will Learn
  • Create pipelines, activities, datasets, and linked services
  • Build reusable components using variables, parameters, and expressions
  • Move data into and around Azure services automatically
  • Transform data natively using ADF data flows and Power Query data wrangling
  • Master flow-of-control and triggers for tightly orchestrated pipeline execution
  • Publish and monitor pipelines easily and with confidence


Who This Book Is For

Data engineers and ETL developers taking their first steps in Azure Data Factory, SQL Server Integration Services users making the transition toward doing ETL in Microsoft’s Azure cloud, and SQL Server database administrators involved in data warehousing and ETL operations.

Table of Contents

About the Author
About the Technical Reviewer
Acknowledgments
Introduction
Chapter 1: Creating an Azure Data Factory Instance
  • Get Started in Azure
  • Create a Free Azure Account
  • Explore the Azure Portal
  • Create a Resource Group
  • Create an Azure Data Factory
  • Explore the Azure Data Factory User Experience
  • Navigation Header Bar
  • Navigation Sidebar
  • Link to a Git Repository
  • Create a Git Repository in Azure Repos
  • Link the Data Factory to the Git Repository
  • The ADF UX as a Web-Based IDE
  • Review
  • Key Concepts
  • For SSIS Developers
Chapter 2: Your First Pipeline
  • Work with Azure Storage
  • Create an Azure Storage Account
  • Explore Azure Storage
  • Upload Sample Data
  • Use the Copy Data Tool
  • Explore Your Pipeline
  • Linked Services
  • Datasets
  • Pipelines
  • Activities
  • Integration Runtimes
  • Factory Resources in Git
  • Debug Your Pipeline
  • Run the Pipeline in Debug Mode
  • Inspect Execution Results
  • Review
  • Key Concepts
  • For SSIS Developers
Chapter 3: The Copy Data Activity
  • Prepare an Azure SQL Database
  • Create the Database
  • Create Database Objects
  • Import Structured Data into Azure SQL DB
  • Create the Basic Pipeline
  • Process Multiple Files
  • Truncate Before Load
  • Map Source and Sink Schemas
  • Create a New Source Dataset
  • Create a New Pipeline
  • Configure Schema Mapping
  • Import Semi-structured Data into Azure SQL DB
  • Create a JSON File Dataset
  • Create the Pipeline
  • Configure Schema Mapping
  • Set the Collection Reference
  • The Effect of Schema Drift
  • Understanding Type Conversion
  • Transform JSON Files into Parquet
  • Create a New JSON Dataset
  • Create a Parquet Dataset
  • Create and Run the Transformation Pipeline
  • Performance Settings
  • Data Integration Unit
  • Degree of Copy Parallelism
  • Review
  • Key Concepts
  • Azure Data Factory User Experience (ADF UX)
  • For SSIS Developers
Chapter 4: Expressions
  • Explore the Expression Builder
  • Use System Variables
  • Enable Storage of Audit Information
  • Create a New Pipeline
  • Add New Source Columns
  • Run the Pipeline
  • Access Activity Run Properties
  • Create Database Objects
  • Add Stored Procedure Activity
  • Run the Pipeline
  • Use the Lookup Activity
  • Create Database Objects
  • Configure the Lookup Activity
  • Use Breakpoints
  • Use the Lookup Value
  • Update the Stored Procedure Activity
  • Run the Pipeline
  • User Variables
  • Create a Variable
  • Set a Variable
  • Use the Variable
  • Array Variables
  • Concatenate Strings
  • Infix Operators
  • String Interpolation
  • Escaping @
  • Review
  • Key Concepts
  • For SSIS Developers
Chapter 5: Parameters
  • Set Up an Azure Key Vault
  • Create a Key Vault
  • Create a Key Vault Secret
  • Grant Access to the Key Vault
  • Create a Key Vault ADF Linked Service
  • Create a New Storage Account Linked Service
  • Use Dataset Parameters
  • Create a Parameterized Dataset
  • Use the Parameterized Dataset
  • Reuse the Parameterized Dataset
  • Use Linked Service Parameters
  • Create a Parameterized Linked Service
  • Increase Dataset Reusability
  • Use the New Dataset
  • Why Parameterize Linked Services?
  • Use Pipeline Parameters
  • Create a Parameterized Pipeline
  • Run the Parameterized Pipeline
  • Use the Execute Pipeline Activity
  • Parallel Execution
  • Global Parameters
  • Review
  • Key Concepts
  • For SSIS Developers
Chapter 6: Controlling Flow
  • Create a Per-File Pipeline
  • Use Activity Dependency Conditions
  • Explore Dependency Condition Interactions
  • Understand Pipeline Outcome
  • Raise Errors
  • Use Conditional Activities
  • Divert Error Rows
  • Load Error Rows
  • Understand the Switch Activity
  • Use Iteration Activities
  • Use the Get Metadata Activity
  • Use the ForEach Activity
  • Ensure Parallelizability
  • Understand the Until Activity
  • Review
  • Key Concepts
  • For SSIS Developers
Chapter 7: Data Flows
  • Build a Data Flow
  • Enable Data Flow Debugging
  • Add a Data Flow Transformation
  • Use the Filter Transformation
  • Use the Lookup Transformation
  • Use the Derived Column Transformation
  • Use the Select Transformation
  • Use the Sink Transformation
  • Execute the Data Flow
  • Maintain a Product Dimension
  • Create a Dimension Table
  • Create Supporting Datasets
  • Build the Product Maintenance Data Flow
  • Execute the Dimension Data Flow
  • Review
  • Key Concepts
  • For SSIS Developers
Chapter 8: Integration Runtimes
  • Azure Integration Runtime
  • Inspect the AutoResolveIntegrationRuntime
  • Create a New Azure Integration Runtime
  • Use the New Azure Integration Runtime
  • Self-Hosted Integration Runtime
  • Create a Shared Data Factory
  • Create a Self-Hosted Integration Runtime
  • Link to a Self-Hosted Integration Runtime
  • Use the Self-Hosted Integration Runtime
  • Azure-SSIS Integration Runtime
  • Create an Azure-SSIS Integration Runtime
  • Deploy SSIS Packages to the Azure-SSIS IR
  • Run an SSIS Package in ADF
  • Stop the Azure-SSIS IR
  • Review
  • Key Concepts
  • For SSIS Developers
Chapter 9: Power Query in ADF
  • Create a Power Query Mashup
  • Explore the Power Query Editor
  • Wrangle Data
  • Run the Power Query Activity
  • Review
Chapter 10: Publishing to ADF
  • Publish to Your Factory Instance
  • Trigger a Pipeline from the ADF UX
  • Publish Factory Resources
  • Inspect Published Pipeline Run Outcome
  • Publish to Another Data Factory
  • Prepare a Production Environment
  • Export ARM Template from Your Development Factory
  • Import ARM Template into Your Production Factory
  • Understand Deployment Parameters
  • Automate Publishing to Another Factory
  • Create a DevOps Service Connection
  • Create an Azure DevOps Pipeline
  • Trigger an Automatic Deployment
  • Feature Branch Workflow
  • Azure Data Factory Utilities
  • Publish Resources as JSON
  • Review
Chapter 11: Triggers
  • Use a Schedule Trigger
  • Create a Schedule Trigger
  • Reuse a Trigger
  • Inspect Trigger Definitions
  • Publish the Trigger
  • Monitor Trigger Runs
  • Deactivate the Trigger
  • Advanced Recurrence Options
  • Use an Event-Based Trigger
  • Register the Event Grid Resource Provider
  • Create an Event-Based Trigger
  • Cause the Trigger to Run
  • Trigger-Scoped System Variables
  • Use a Tumbling Window Trigger
  • Prepare Data
  • Create a Windowed Copy Pipeline
  • Create a Tumbling Window Trigger
  • Monitor Trigger Runs
  • Advanced Features
  • Publishing Triggers Automatically
  • Triggering Pipelines Programmatically
  • Review
  • Key Concepts
  • For SSIS Developers
Chapter 12: Monitoring
  • Generate Factory Activity
  • Inspect Factory Logs
  • Inspect Trigger Runs
  • Inspect Pipeline Runs
  • Add Metadata to the Log
  • Inspect Factory Metrics
  • Export Logs and Metrics
  • Create a Log Analytics Workspace
  • Configure Diagnostic Settings
  • Inspect Logs in Blob Storage
  • Use the Log Analytics Workspace
  • Query Logs
  • Use a Log Analytics Workbook
  • Receive Alerts
  • Configure Metric-Based Alerts
  • Configure Log-Based Alerts
  • Deactivate ADF Triggers
  • Review
  • Key Concepts
  • For SSIS Developers
Index

About the Author

Richard Swinbank is a data engineer and Microsoft Data Platform MVP. He specializes in building and automating analytics platforms using Microsoft technologies from the SQL Server stack to the Azure cloud. He is a fervent advocate of DataOps, with a technical focus on bringing automation to both analytics development and operations. An active member of the data community and keen knowledge-sharer, Richard is a volunteer, organizer, speaker, blogger, open source contributor, and author. He holds a PhD in computer science from the University of Birmingham (UK).