Preface |
|
xv | |
|
|
1 | (20) |
|
Journey Map from Raw Data to Insights |
|
|
3 | (7) |
|
|
4 | (2) |
|
|
6 | (1) |
|
|
7 | (1) |
|
|
8 | (2) |
|
Defining Your Time-to-Insight Scorecard |
|
|
10 | (5) |
|
Build Your Self-Service Data Roadmap |
|
|
15 | (6) |
|
Part I Self-Service Data Discovery |
|
|
|
2 Metadata Catalog Service |
|
|
21 | (14) |
|
|
22 | (2) |
|
|
23 | (1) |
|
|
23 | (1) |
|
|
24 | (1) |
|
Minimizing Time to Interpret |
|
|
24 | (2) |
|
Extracting Technical Metadata |
|
|
24 | (1) |
|
Extracting Operational Metadata |
|
|
25 | (1) |
|
|
26 | (1) |
|
|
26 | (3) |
|
Technical Metadata Extractor Requirements |
|
|
27 | (1) |
|
Operational Metadata Requirements |
|
|
28 | (1) |
|
Team Knowledge Aggregator Requirements |
|
|
28 | (1) |
|
|
29 | (4) |
|
Source-Specific Connectors Pattern |
|
|
29 | (2) |
|
Lineage Correlation Pattern |
|
|
31 | (1) |
|
|
32 | (1) |
|
|
33 | (2) |
|
|
35 | (16) |
|
|
35 | (2) |
|
Determining Feasibility of the Business Problem |
|
|
36 | (1) |
|
Selecting Relevant Datasets for Data Prep |
|
|
36 | (1) |
|
Reusing Existing Artifacts for Prototyping |
|
|
36 | (1) |
|
|
37 | (1) |
|
Indexing Datasets and Artifacts |
|
|
37 | (1) |
|
|
37 | (1) |
|
|
38 | (1) |
|
|
38 | (3) |
|
|
39 | (1) |
|
|
40 | (1) |
|
Access Control Requirements |
|
|
40 | (1) |
|
Nonfunctional Requirements |
|
|
40 | (1) |
|
|
41 | (8) |
|
Push-Pull Indexer Pattern |
|
|
42 | (2) |
|
Hybrid Search Ranking Pattern |
|
|
44 | (2) |
|
Catalog Access Control Pattern |
|
|
46 | (3) |
|
|
49 | (2) |
|
|
51 | (12) |
|
|
52 | (1) |
|
Finding Available Features |
|
|
53 | (1) |
|
|
53 | (1) |
|
Feature Pipeline for Online Inference |
|
|
53 | (1) |
|
Minimize Time to Featurize |
|
|
53 | (2) |
|
|
54 | (1) |
|
|
54 | (1) |
|
|
55 | (2) |
|
|
55 | (1) |
|
|
56 | (1) |
|
Nonfunctional Requirements |
|
|
57 | (1) |
|
|
57 | (5) |
|
Hybrid Feature Computation Pattern |
|
|
58 | (2) |
|
|
60 | (2) |
|
|
62 | (1) |
|
|
63 | (14) |
|
|
63 | (1) |
|
Aggregating Data Across Sources |
|
|
63 | (1) |
|
Moving Raw Data to Specialized Query Engines |
|
|
64 | (1) |
|
Moving Processed Data to Serving Stores |
|
|
64 | (1) |
|
Exploratory Analysis Across Sources |
|
|
64 | (1) |
|
Minimizing Time to Data Availability |
|
|
64 | (2) |
|
Data Ingestion Configuration and Change Management |
|
|
65 | (1) |
|
|
65 | (1) |
|
Data Quality Verification |
|
|
65 | (1) |
|
|
66 | (4) |
|
|
66 | (2) |
|
Transformation Requirements |
|
|
68 | (1) |
|
|
68 | (1) |
|
Verification Requirements |
|
|
69 | (1) |
|
Nonfunctional Requirements |
|
|
69 | (1) |
|
|
70 | (6) |
|
|
70 | (2) |
|
Change Data Capture Ingestion Pattern |
|
|
72 | (3) |
|
Event Aggregation Pattern |
|
|
75 | (1) |
|
|
76 | (1) |
|
6 Clickstream Tracking Service |
|
|
77 | (16) |
|
|
78 | (1) |
|
Minimizing Time to Click Metrics |
|
|
79 | (3) |
|
|
80 | (1) |
|
|
81 | (1) |
|
|
82 | (1) |
|
|
82 | (2) |
|
Instrumentation Requirements Checklist |
|
|
82 | (1) |
|
Enrichment Requirements Checklist |
|
|
83 | (1) |
|
|
84 | (5) |
|
|
84 | (1) |
|
Rule-Based Enrichment Patterns |
|
|
85 | (2) |
|
|
87 | (2) |
|
|
89 | (4) |
|
Part II Self-Service Data Prep |
|
|
|
7 Data Lake Management Service |
|
|
93 | (14) |
|
|
94 | (3) |
|
Primitive Life Cycle Management |
|
|
95 | (1) |
|
|
96 | (1) |
|
Managing Batching and Streaming Data Flows |
|
|
96 | (1) |
|
Minimizing Time to Data Lake Management |
|
|
97 | (5) |
|
|
97 | (5) |
|
|
102 | (4) |
|
Data Life Cycle Primitives Pattern |
|
|
103 | (1) |
|
|
104 | (1) |
|
Advanced Data Management Pattern |
|
|
105 | (1) |
|
|
106 | (1) |
|
|
107 | (8) |
|
|
108 | (1) |
|
Minimizing Time to Wrangle |
|
|
109 | (2) |
|
|
110 | (1) |
|
|
110 | (1) |
|
|
111 | (1) |
|
|
111 | (1) |
|
|
111 | (3) |
|
Exploratory Data Analysis Patterns |
|
|
112 | (1) |
|
Analytical Transformation Patterns |
|
|
113 | (1) |
|
|
114 | (1) |
|
9 Data Rights Governance Service |
|
|
115 | (16) |
|
|
117 | (1) |
|
Executing Data Rights Requests |
|
|
117 | (1) |
|
|
118 | (1) |
|
|
118 | (1) |
|
Minimizing Time to Comply |
|
|
118 | (1) |
|
Tracking the Customer Data Life Cycle |
|
|
118 | (1) |
|
Executing Customer Data Rights Requests |
|
|
119 | (1) |
|
|
119 | (1) |
|
|
119 | (3) |
|
Current Pain Point Questionnaire |
|
|
120 | (1) |
|
|
120 | (1) |
|
|
121 | (1) |
|
Nonfunctional Requirements |
|
|
122 | (1) |
|
|
122 | (5) |
|
Sensitive Data Discovery and Classification Pattern |
|
|
123 | (1) |
|
Data Lake Deletion Pattern |
|
|
124 | (1) |
|
Use Case-Dependent Access Control |
|
|
125 | (2) |
|
|
127 | (4) |
|
Part III Self-Service Build |
|
|
|
10 Data Virtualization Service |
|
|
131 | (12) |
|
|
132 | (1) |
|
|
132 | (1) |
|
Picking a Processing Cluster |
|
|
132 | (1) |
|
|
133 | (1) |
|
Picking the Execution Environment |
|
|
133 | (1) |
|
Formulating Polyglot Queries |
|
|
133 | (1) |
|
Joining Data Across Silos |
|
|
134 | (1) |
|
|
134 | (2) |
|
Current Pain Point Analysis |
|
|
134 | (1) |
|
|
135 | (1) |
|
|
135 | (1) |
|
Nonfunctional Requirements |
|
|
135 | (1) |
|
|
136 | (5) |
|
Automatic Query Routing Pattern |
|
|
137 | (1) |
|
|
138 | (2) |
|
|
140 | (1) |
|
|
141 | (2) |
|
11 Data Transformation Service |
|
|
143 | (10) |
|
|
144 | (1) |
|
Production Dashboard and ML Pipelines |
|
|
144 | (1) |
|
|
144 | (1) |
|
Minimizing Time to Transform |
|
|
144 | (1) |
|
Transformation Implementation |
|
|
144 | (1) |
|
|
145 | (1) |
|
Transformation Operations |
|
|
145 | (1) |
|
|
145 | (2) |
|
Current State Questionnaire |
|
|
146 | (1) |
|
|
146 | (1) |
|
Nonfunctional Requirements |
|
|
147 | (1) |
|
|
147 | (5) |
|
|
148 | (3) |
|
|
151 | (1) |
|
|
152 | (1) |
|
12 Model Training Service |
|
|
153 | (14) |
|
|
154 | (2) |
|
|
154 | (1) |
|
|
155 | (1) |
|
|
156 | (1) |
|
|
156 | (2) |
|
|
156 | (1) |
|
|
157 | (1) |
|
|
157 | (1) |
|
|
158 | (3) |
|
|
158 | (2) |
|
|
160 | (1) |
|
|
160 | (1) |
|
Nonfunctional Requirements |
|
|
160 | (1) |
|
|
161 | (5) |
|
Distributed Training Orchestrator Pattern |
|
|
162 | (1) |
|
|
163 | (1) |
|
Data-Aware Continuous Training |
|
|
164 | (2) |
|
|
166 | (1) |
|
13 Continuous Integration Service |
|
|
167 | (10) |
|
|
168 | (1) |
|
Collaborating on an ML Pipeline |
|
|
168 | (1) |
|
|
168 | (1) |
|
Validating Schema Changes |
|
|
169 | (1) |
|
Minimizing Time to Integrate |
|
|
169 | (1) |
|
|
169 | (1) |
|
|
170 | (1) |
|
|
170 | (1) |
|
|
170 | (2) |
|
Experiment Tracking Module |
|
|
171 | (1) |
|
Pipeline Packaging Module |
|
|
171 | (1) |
|
Testing Automation Module |
|
|
172 | (1) |
|
|
172 | (3) |
|
Programmable Tracking Pattern |
|
|
173 | (1) |
|
Reproducible Project Pattern |
|
|
174 | (1) |
|
|
175 | (2) |
|
|
177 | (12) |
|
|
179 | (2) |
|
Minimizing Time to A/B Test |
|
|
181 | (2) |
|
|
182 | (1) |
|
|
182 | (1) |
|
|
183 | (1) |
|
|
183 | (3) |
|
Experiment Specification Pattern |
|
|
184 | (1) |
|
Metrics Definition Pattern |
|
|
185 | (1) |
|
Automated Experiment Optimization |
|
|
185 | (1) |
|
|
186 | (3) |
|
Part IV Self-Service Operationalize |
|
|
|
15 Query Optimization Service |
|
|
189 | (14) |
|
|
190 | (1) |
|
|
190 | (1) |
|
Resolving Runtime Query Issues |
|
|
190 | (1) |
|
|
191 | (1) |
|
Minimizing Time to Optimize |
|
|
191 | (3) |
|
|
191 | (1) |
|
|
192 | (1) |
|
|
193 | (1) |
|
|
194 | (2) |
|
Current Pain Points Questionnaire |
|
|
194 | (1) |
|
|
195 | (1) |
|
Functionality Requirements |
|
|
195 | (1) |
|
Nonfunctional Requirements |
|
|
195 | (1) |
|
|
196 | (5) |
|
|
196 | (2) |
|
Operational Insights Pattern |
|
|
198 | (2) |
|
|
200 | (1) |
|
|
201 | (2) |
|
16 Pipeline Orchestration Service |
|
|
203 | (12) |
|
|
204 | (1) |
|
Invoke Exploratory Pipelines |
|
|
205 | (1) |
|
|
205 | (1) |
|
Minimizing Time to Orchestrate |
|
|
205 | (1) |
|
Defining Job Dependencies |
|
|
205 | (1) |
|
|
206 | (1) |
|
|
206 | (1) |
|
|
206 | (3) |
|
Current Pain Points Questionnaire |
|
|
207 | (1) |
|
|
207 | (1) |
|
|
208 | (1) |
|
Nonfunctional Requirements |
|
|
208 | (1) |
|
|
209 | (4) |
|
Dependency Authoring Patterns |
|
|
209 | (2) |
|
Orchestration Observability Patterns |
|
|
211 | (1) |
|
Distributed Execution Pattern |
|
|
212 | (1) |
|
|
213 | (2) |
|
|
215 | (12) |
|
|
216 | (1) |
|
Model Deployment in Production |
|
|
216 | (1) |
|
Model Maintenance and Upgrade |
|
|
216 | (1) |
|
Minimizing Time to Deploy |
|
|
217 | (1) |
|
|
217 | (1) |
|
|
217 | (1) |
|
|
218 | (1) |
|
|
218 | (3) |
|
|
218 | (2) |
|
Model Scaling and Performance |
|
|
220 | (1) |
|
|
221 | (1) |
|
Nonfunctional Requirements |
|
|
221 | (1) |
|
|
221 | (5) |
|
Universal Deployment Pattern |
|
|
222 | (2) |
|
Autoscaling Deployment Pattern |
|
|
224 | (1) |
|
Model Drift Tracking Pattern |
|
|
225 | (1) |
|
|
226 | (1) |
|
18 Quality Observability Service |
|
|
227 | (12) |
|
|
228 | (1) |
|
Daily Data Quality Monitoring Reports |
|
|
228 | (1) |
|
|
228 | (1) |
|
Handling Low-Quality Data Records |
|
|
229 | (1) |
|
Minimizing Time to Insight Quality |
|
|
229 | (2) |
|
Verify the Accuracy of the Data |
|
|
229 | (1) |
|
|
230 | (1) |
|
Prevent Data Quality Issues |
|
|
231 | (1) |
|
|
231 | (2) |
|
Detecting and Handling Data Quality Issues |
|
|
232 | (1) |
|
|
232 | (1) |
|
Nonfunctional Requirements |
|
|
233 | (1) |
|
|
233 | (5) |
|
|
234 | (1) |
|
Profiling-Based Anomaly Detection Pattern |
|
|
235 | (1) |
|
|
236 | (2) |
|
|
238 | (1) |
|
19 Cost Management Service |
|
|
239 | (12) |
|
|
240 | (1) |
|
|
240 | (1) |
|
Continuous Cost Optimization |
|
|
241 | (1) |
|
Minimizing Time to Optimize Cost |
|
|
241 | (2) |
|
Expenditure Observability |
|
|
241 | (1) |
|
Matching Supply and Demand |
|
|
242 | (1) |
|
Continuous Cost Optimization |
|
|
242 | (1) |
|
|
243 | (1) |
|
Pain Points Questionnaire |
|
|
243 | (1) |
|
|
243 | (1) |
|
Nonfunctional Requirements |
|
|
244 | (1) |
|
|
244 | (5) |
|
Continuous Cost Monitoring Pattern |
|
|
245 | (1) |
|
Automated Scaling Pattern |
|
|
246 | (2) |
|
|
248 | (1) |
|
|
249 | (2) |
Index |
|
251 | |