Muutke küpsiste eelistusi

E-raamat: Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling

4.17/5 (1138 hinnangut Goodreads-ist)
(Kimball Group), (Kimball Group)
  • Formaat: PDF+DRM
  • Ilmumisaeg: 01-Jul-2013
  • Kirjastus: John Wiley & Sons Inc
  • Keel: eng
  • ISBN-13: 9781118530771
Teised raamatud teemal:
  • Formaat - PDF+DRM
  • Hind: 61,75 €*
  • * hind on lõplik, st. muud allahindlused enam ei rakendu
  • Lisa ostukorvi
  • Lisa soovinimekirja
  • See e-raamat on mõeldud ainult isiklikuks kasutamiseks. E-raamatuid ei saa tagastada.
  • Formaat: PDF+DRM
  • Ilmumisaeg: 01-Jul-2013
  • Kirjastus: John Wiley & Sons Inc
  • Keel: eng
  • ISBN-13: 9781118530771
Teised raamatud teemal:

DRM piirangud

  • Kopeerimine (copy/paste):

    ei ole lubatud

  • Printimine:

    ei ole lubatud

  • Kasutamine:

    Digitaalõiguste kaitse (DRM)
    Kirjastus on väljastanud selle e-raamatu krüpteeritud kujul, mis tähendab, et selle lugemiseks peate installeerima spetsiaalse tarkvara. Samuti peate looma endale  Adobe ID Rohkem infot siin. E-raamatut saab lugeda 1 kasutaja ning alla laadida kuni 6'de seadmesse (kõik autoriseeritud sama Adobe ID-ga).

    Vajalik tarkvara
    Mobiilsetes seadmetes (telefon või tahvelarvuti) lugemiseks peate installeerima selle tasuta rakenduse: PocketBook Reader (iOS / Android)

    PC või Mac seadmes lugemiseks peate installima Adobe Digital Editionsi (Seeon tasuta rakendus spetsiaalselt e-raamatute lugemiseks. Seda ei tohi segamini ajada Adober Reader'iga, mis tõenäoliselt on juba teie arvutisse installeeritud )

    Seda e-raamatut ei saa lugeda Amazon Kindle's. 

Updated new edition of Ralph Kimball's groundbreaking book on dimensional modeling for data warehousing and business intelligence!

The first edition of Ralph Kimball's The Data Warehouse Toolkit introduced the industry to dimensional modeling, and now his books are considered the most authoritative guides in this space. This new third edition is a complete library of updated dimensional modeling techniques, the most comprehensive collection ever. It covers new and enhanced star schema dimensional modeling patterns, adds two new chapters on ETL techniques, includes new and expanded business matrices for 12 case studies, and more.

  • Authored by Ralph Kimball and Margy Ross, known worldwide as educators, consultants, and influential thought leaders in data warehousing and business intelligence
  • Begins with fundamental design recommendations and progresses through increasingly complex scenarios
  • Presents unique modeling techniques for business applications such as inventory management, procurement, invoicing, accounting, customer relationship management, big data analytics, and more
  • Draws real-world case studies from a variety of industries, including retail sales, financial services, telecommunications, education, health care, insurance, e-commerce, and more

Design dimensional databases that are easy to understand and provide fast query response with The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition.

Introduction xxvii
1 Data Warehousing, Business Intelligence, and Dimensional Modeling Primer 1(36)
Different Worlds of Data Capture and Data Analysis
2(1)
Goals of Data Warehousing and Business Intelligence
3(4)
Publishing Metaphor for DW/BI Managers
5(2)
Dimensional Modeling Introduction
7(11)
Star Schemas Versus OLAP Cubes
8(2)
Fact Tables for Measurements
10(3)
Dimension Tables for Descriptive Context
13(3)
Facts and Dimensions Joined in a Star Schema
16(2)
Kimball's DW/BI Architecture
18(8)
Operational Source Systems
18(1)
Extract, Transformation, and Load System
19(2)
Presentation Area to Support Business Intelligence
21(1)
Business Intelligence Applications
22(1)
Restaurant Metaphor for the Kimball Architecture
23(3)
Alternative DW/BI Architectures
26(4)
Independent Data Mart Architecture
26(2)
Hub-and-Spoke Corporate, Information Factory Inmon Architecture
28(1)
Hybrid Hub-and-Spoke and Kimball Architecture
29(1)
Dimensional Modeling Myths
30(2)
Myth 1: Dimensional Models are Only for Summary Data
30(1)
Myth 2: Dimensional Models are Departmental, Not Enterprise
31(1)
Myth 3: Dimensional Models are Not Scalable
31(1)
Myth 4: Dimensional Models are Only for Predictable Usage
31(1)
Myth 5: Dimensional Models Can't Be Integrated
32(1)
More Reasons to Think Dimensionally
32(2)
Agile Considerations
34(1)
Summary
35(2)
2 Kimball Dimensional Modeling Techniques Overview 37(32)
Fundamental Concepts
37(4)
Gather Business Requirements and Data Realities
37(1)
Collaborative Dimensional Modeling Workshops
38(1)
Four-Step Dimensional Design Process
38(1)
Business Processes
39(1)
Grain
39(1)
Dimensions for Descriptive Context
40(1)
Facts for Measurements
40(1)
Star Schemas and OLAP Cubes
40(1)
Graceful Extensions to Dimensional Models
41(1)
Basic Fact Table Techniques
41(5)
Fact Table Structure
41(1)
Additive, Semi-Additive, Non-Additive Facts
42(1)
Nulls in Fact Tables
42(1)
Conformed Facts
42(1)
Transaction Fact Tables
43(1)
Periodic Snapshot Fact Tables
43(1)
Accumulating Snapshot Fact Tables
44(1)
Factless Fact Tables
44(1)
Aggregate Fact Tables or OLAP Cubes
45(1)
Consolidated Fact Tables
45(1)
Basic Dimension Table Techniques
46(4)
Dimension Table Structure
46(1)
Dimension Surrogate Keys
46(1)
Natural, Durable, and Supernatural Keys
46(1)
Drilling Down
47(1)
Degenerate Dimensions
47(1)
Denormalized Flattened Dimensions
47(1)
Multiple Hierarchies in Dimensions
48(1)
Flags and indicators a Textual Attributes
48(1)
Null Attributes in Dimensions
48(1)
Calendar Date Dimensions
48(1)
Role-Playing Dimensions
49(1)
Junk Dimensions
49(1)
Snowflaked Dimensions
50(1)
Outrigger Dimensions
50(1)
Integration via Conformed Dimensions
50(3)
Conformed Dimensions
51(1)
Shrunken Dimensions
51(1)
Drilling Across
51(1)
Value Chain
52(1)
Enterprise Data Warehouse Bus Architecture
52(1)
Enterprise Data Warehouse Bus Matrix
52(1)
Detailed Implementation Bus Matrix
53(1)
Opportunity/Stakeholder Matrix
53(1)
Dealing with Slowly Changing Dimension Attributes
53(3)
Type 0: Retain Original
54(1)
Type 1: Overwrite
54(1)
Type 2: Add New Row
54(1)
Type 3: Add New Attribute
55(1)
Type 4: Add Mini-Dimension
55(1)
Type 5: Add Mini-Dimension and Type 1 Outrigger
55(1)
Type 6: Add Type 1 Attributes to Type 2 Dimension
56(1)
Type 7: Dual Type 1 and Type 2 Dimensions
56(1)
Dealing with Dimension Hierarchies
56(2)
Fixed Depth Positional Hierarchies
56(1)
Slightly Ragged/Variable Depth Hierarchies
57(1)
Ragged/Variable Depth Hierarchies with Hierarchy Bridge Tables
57(1)
Ragged/Variable Depth Hierarchies with Pathstring Attributes
57(1)
Advanced Fact Table Techniques
58(4)
Fact Table Surrogate Keys
58(1)
Centipede Fact Tables
58(1)
Numeric Values as Attributes or Facts
59(1)
Lag/Duration Facts
59(1)
Header/Line Fact Tables
59(1)
Allocated Facts
60(1)
Profit and Loss Fact Tables Using' Allocations
60(1)
Multiple Currency Facts
60(1)
Multiple Units of Measure Facts
61(1)
Year-to-Date Facts
61(1)
Multipass SQL to Avoid Fact-to-Fact Table Joins
61(1)
Timespan Tracking in Fact Tables
62(1)
Late Arriving Facts
62(1)
Advanced Dimension Techniques
62(5)
Dimension-to-Dimension Table Joins
62(1)
Multivalued Dimensions and Bridge Tables
63(1)
Time Varying Multivalued Bridge Tables
63(1)
Behavior Tag Time Series
63(1)
Behavior Study Groups
64(1)
Aggregated Facts as Dimension Attributes
64(1)
Dynamic Value Bands
64(1)
Text Comments Dimension
65(1)
Multiple Time Zones
65(1)
Measure Type Dimensions
65(1)
Step Dimensions
65(1)
Hot Swappable Dimensions
66(1)
Abstract Generic Dimensions
66(1)
Audit Dimensions
66(1)
Late Arriving Dimensions
67(1)
Special Purpose Schemas
67(2)
Supertype and Subtype Schemas for Heterogeneous Products
67(1)
Real-Time Fact Tables
68(1)
Error Event Schemas
68(1)
3 Retail Sales 69(42)
Four-Step Dimensional Design Process
70(2)
Step 1: Select the Business Process
70(1)
Step 2: Declare the Grain
71(1)
Step 3: Identify the Dimensions
72(1)
Step 4: Identify the Facts
72(1)
Retail Case Study
72(7)
Step 1: Select the Business Process
74(1)
Step 2: Declare the Grain
74(2)
Step 3: Identify the Dimensions
76(1)
Step 4: Identify the Facts
76(3)
Dimension Table Details
79(15)
Date Dimension
79(4)
Product Dimension
83(4)
Store Dimension
87(2)
Promotion Dimension
89(3)
Other Retail Sales Dimensions
92(1)
Degenerate Dimensions for Transaction Numbers
93(1)
Retail Schema in Action
94(1)
Retail Schema Extensibility
95(2)
Factless Fact Tables
97(1)
Dimension and Fact Table Keys
98(6)
Dimension Table Surrogate Keys
98(2)
Dimension Natural and Durable Supernatural Keys
100(1)
Degenerate Dimension Surrogate Keys
101(1)
Date Dimension Smart Keys
101(1)
Fact Table Surrogate Keys
102(2)
Resisting Normalization Urges
104(5)
Snowflake Schemas with Normalized Dimensions
104(2)
Outriggers
106(2)
Centipede Fact Tables with Too Many Dimensions
108(1)
Summary
109(2)
4 Inventory 111(30)
Value Chain Introduction
111(1)
Inventory Models
112(7)
Inventory Periodic Snapshot
113(3)
Inventory Transactions
116(2)
Inventory Accumulating Snapshot
118(1)
Fact Table Types
119(3)
Transaction Fact Tables
120(1)
Periodic Snapshot Fact Tables
120(1)
Accumulating Snapshot Fact Tables
121(1)
Complementary Fact Table Types
122(1)
Value Chain Integration
122(1)
Enterprise Data Warehouse Bus Architecture
123(7)
Understanding the Bus Architecture
124(1)
Enterprise Data Warehouse Bus Matrix
125(5)
Conformed Dimensions
130(8)
Drilling Across Fact Tables
130(1)
Identical Conformed Dimensions
131(1)
Shrunken Rollup Conformed Dimension with Attribute Subset
132(1)
Shrunken Conformed Dimension with Row Subset
132(2)
Shrunken Conformed Dimensions on the Bus Matrix
134(1)
Limited Conformity
135(1)
Importance of Data Governance and Stewardship
135(2)
Conformed Dimensions and the Agile Movement
137(1)
Conformed Facts
138(1)
Summary
139(2)
5 Procurement 141(26)
Procurement Case Study
141(1)
Procurement Transactions and Bus Matrix
142(5)
Single Versus Multiple Transaction Fact Tables
143(4)
Complementary Procurement Snapshot
147(1)
Slowly Changing Dimension Basics
147(12)
Type 0: Retain Original
148(1)
Type 1: Overwrite
149(1)
Type 2: Add New Row
150(4)
Type 3: Add New Attribute
154(2)
Type 4: Add Mini-Dimension
156(3)
Hybrid Slowly Changing Dimension Techniques,
159(5)
Type 5: Mini-Dimension and Type 1 Outrigger
160(1)
Type 6: Add Type 1 Attributes to Type 2 Dimension
160(2)
Type 7: Dual Type 1 and Type 2 Dimensions
162(2)
Slowly Changing Dimension Recap
164(1)
Summary
165(2)
6 Order Management 167(34)
Order Management Bus Matrix
168(1)
Order Transactions
168(19)
Fact Normalization
169(1)
Dimension Role Playing
170(2)
Product Dimension Revisited
172(2)
Customer Dimension
174(3)
Deal Dimension
177(1)
Degenerate Dimension for Order Number
178(1)
Junk Dimensions
179(2)
Header/Line Pattern to Avoid
181(1)
Multiple Currencies
182(2)
Transaction Facts at Different Granularity
184(2)
Another Header/Line Pattern to Avoid
186(1)
Invoice Transactions
187(7)
Service Level Performance as Facts, Dimensions, or Both
188(1)
Profit and Loss Facts
189(3)
Audit Dimension
192(2)
Accumulating Snapshot for Order Fulfillment Pipeline
194(5)
Lag Calculations
196(1)
Multiple Units of Measure
197(1)
Beyond the Rearview Mirror
198(1)
Summary
199(2)
7 Accounting 201(28)
Accounting Case Study and Bus Matrix
202(1)
General Ledger Data
203(7)
General Ledger Periodic Snapshot
203(1)
Chart of Accounts
203(1)
Period Close
204(2)
Year-to-Date Facts
206(1)
Multiple Currencies Revisited
206(1)
General Ledger Journal Transactions
206(2)
Multiple Fiscal Accounting Calendars
208(1)
Drilling Down Through a Multilevel Hierarchy
209(1)
Financial Statements
209(1)
Budgeting Process
210(4)
Dimension Attribute Hierarchies
214(10)
Fixed Depth Positional Hierarchies
214(1)
Slightly Ragged Variable Depth Hierarchies
214(1)
Ragged Variable Depth Hierarchies
215(4)
Shared Ownership in a Ragged Hierarchy
219(1)
Time Varying Ragged Hierarchies
220(1)
Modifying Ragged Hierarchies
220(1)
Alternative Ragged Hierarchy Modeling Approaches
221(2)
Advantages of the Bridge Table Approach for Ragged Hierarchies
223(1)
Consolidated Fact Tables
224(2)
Role of OLAP and Packaged Analytic Solutions
226(1)
Summary
227(2)
8 Customer Relationship Management 229(34)
CRM Overview
230(3)
Operational and Analytic CRM
231(2)
Customer Dimension Attributes
233(12)
Name and Address Parsing
233(3)
International Name and Address Considerations
236(2)
Customer-Centric Dates
238(1)
Aggregated Facts as Dimension Attributes
239(1)
Segmentation Attributes and Scores
240(3)
Counts with Type 2 Dimension Changes
243(1)
Outrigger for Low Cardinality Attribute Set
243(1)
Customer Hierarchy Considerations
244(1)
Bridge Tables for Multivalued Dimensions
245(4)
Bridge Table for Sparse Attributes
247(1)
Bridge Table for Multiple Customer, Contacts
248(1)
Complex Customer Behavior
249(7)
Behavior Study Groups for Cohorts
249(2)
Step Dimension for Sequential Behavior
251(1)
Timespan Fact Tables
252(2)
Tagging Fact Tables with Satisfaction Indicators
254(1)
Tagging Fact Table, with Abnormal Scenario Indicators
255(1)
Customer Data-Integration Approaches
256(4)
Master Data Management Creating a Single Customer Dimension
256(2)
Partial Conformity of Multiple Customer Dimensions
258(1)
Avoiding Fact-to-Fact Table Joins
259(1)
Low Latency Reality Check
260(1)
Summary
261(2)
9 Human Resources Management 263(18)
Employee Profile Tracking
263(4)
Precise Effective and Expiration Timespans
265(1)
Dimension Change Reason Tracking
266(1)
Profile Changes as Type 2 Attributes or Fact Events
267(1)
Headcount Periodic Snapshot
267(1)
Bus Matrix for HR Processes
268(2)
Packaged Analytic Solutions and Data Models
270(1)
Recursive Employee Hierarchies
271(3)
Change Tracking on Embedded Manager Key
272(1)
Drilling Up and Down Management Hierarchies
273(1)
Multivalued Skill Keyword Attributes
274(3)
Skill Keyword Bridge
275(1)
Skill Keyword Text String
276(1)
Survey Questionnaire Data
277(2)
Text Comments
278(1)
Summary
279(2)
10 Financial Services 281(16)
Banking Case Study and Bus Matrix
282(1)
Dimension Triage to Avoid Too Few Dimensions
283(10)
Household Dimension
286(1)
Multivalued Dimensions and Weighting Factors
287(2)
Mini-Dimensions Revisited
289(1)
Adding a Mini-Dimension to a Bridge Table
290(1)
Dynamic Value Banding of Facts
291(2)
Supertype and Subtype Schemas for Heterogeneous Products
293(3)
Supertype and Subtype Products with Common Facts,
295(1)
Hot Swappable Dimensions
296(1)
Summary
296(1)
11 Telecommunications 297(14)
Telecommunications Case Study and Bus Matrix
297(2)
General Design Review Considerations
299(5)
Balance Business Requirements and Source Realities
300(1)
Focus on Business Processes
300(1)
Granularity
300(1)
Single Granularity for Facts
301(1)
Dimension Granularity and Hierarchies
301(1)
Date Dimension
302(1)
Degenerate Dimensions
303(1)
Surrogate Keys
303(1)
Dimension Decodes and Descriptions
303(1)
Conformity Commitment
304(1)
Design Review Guidelines
304(2)
Draft Design Exercise Discussion
306(3)
Remodeling Existing Data Structures
309(1)
Geographic Location Dimension
310(1)
Summary
310(1)
12 Transportation 311(14)
Airline Case Study and Bus Matrix
311(6)
Multiple Fact Table Granularities
312(3)
Linking Segments into Trips
315(1)
Related Fact Tables
316(1)
Extensions to Other Industries
317(1)
Cargo Shipper
317(1)
Travel Services
317(1)
Combining Correlated Dimensions
318(3)
Class of Service
319(1)
Origin and Destination
320(1)
More Date and Time Considerations
321(3)
Country-Specific Calendars as Outriggers
321(2)
Date and Time in Multiple Time Zones
323(1)
Localization Recap
324(1)
Summary
324(1)
13 Education 325(14)
University Case Study and Bus Matrix
325(1)
Accumulating Snapshot Fact Tables
326(3)
Applicant Pipeline
326(3)
Research Grant Proposal Pipeline
329(1)
Factless Fact Tables
329(7)
Admissions Events
330(1)
Course Registrations
330(4)
Facility Utilization
334(1)
Student Attendance
335(1)
More Educational Analytic Opportunities
336(1)
Summary
336(3)
14 Healthcare 339(14)
Healthcare Case Study and Bus Matrix
339(3)
Claims Billing and Payments
342(6)
Date Dimension Role Playing
345(1)
Multivalued Diagnoses
345(2)
Supertypes and Subtypes for Charges
347(1)
Electronic Medical Records
348(3)
Measure Type Dimension for Sparse Facts
349(1)
Freeform Text Comments
350(1)
Images
350(1)
Facility/Equipment Inventory Utilization
351(1)
Dealing with Retroactive Changes
351(1)
Summary
352(1)
15 Electronic Commerce 353(22)
Clickstream Source Data
353(4)
Clickstream Data Challenges
354(3)
Clickstream Dimensional Models
357(11)
Page Dimension
358(1)
Event Dimension
359(1)
Session Dimension
359(1)
Referral Dimension
360(1)
Clickstream Session Fact Table
361(2)
Clickstream Page Event Fact Table
363(3)
Step Dimension
366(1)
Aggregate Clickstream Fact Tables
366(1)
Google Analytics
367(1)
Integrating Clickstream into Web Retailer's Bus Matrix
368(2)
Profitability Across Channels Including Web
370(3)
Summary
373(2)
16 Insurance 375(28)
Insurance Case Study
376(3)
Insurance Value Chain
377(1)
Draft Bus Matrix
378(1)
Policy Transactions
379(6)
Dimension Role Playing
380(1)
Slowly Changing Dimensions
380(1)
Mini-Dimensions for Large or Rapidly Changing Dimensions
381(1)
Multivalued Dimension Attributes
382(1)
Numeric Attributes as Facts or Dimensions
382(1)
Degenerate Dimension
383(1)
Low Cardinality Dimension Tables
383(1)
Audit Dimension
383(1)
Policy Transaction Fact Table
383(1)
Heterogeneous Supertype and Subtype Products
384(1)
Complementary Policy Accumulating Snapshot
384(1)
Premium Periodic Snapshot
385(3)
Conformed Dimensions
386(1)
Conformed Facts
386(1)
Pay-in-Advance Facts
386(1)
Heterogeneous Supertypes and Subtypes Revisited
387(1)
Multivalued Dimensions Revisited
388(1)
More Insurance Case Study Background
388(2)
Updated Insurance Bus Matrix
389(1)
Detailed Implementation Bus Matrix
390(1)
Claim Transactions
390(2)
Transaction Versus Profile Junk Dimensions
392(1)
Claim Accumulating Snapshot
392(3)
Accumulating Snapshot for Complex Workflows
393(1)
Timespan Accumulating Snapshot
394(1)
Periodic Instead of Accumulating Snapshot
395(1)
Policy/Claim Consolidated Periodic Snapshot
395(1)
Factless Accident Events
396(1)
Common Dimensional Modeling Mistakes to Avoid
397(4)
Mistake 10: Place Text Attributes in a Fact Table
397(1)
Mistake 9: Limit Verbose Descriptors to Save Space
398(1)
Mistake 8: Split Hierarchies into Multiple Dimensions
398(1)
Mistake 7: Ignore the Need to Track Dimension Changes
398(1)
Mistake 6: Solve All Performance Problems with More Hardware
399(1)
Mistake 5: Use Operational Keys to Join Dimensions and Facts
399(1)
Mistake 4: Neglect to Declare and Comply with the Fact Grain
399(1)
Mistake 3: Use a Report to Design the Dimensional Model
400(1)
Mistake 2: Expect Users to Query Normalized Atomic Data
400(1)
Mistake 1: Fail to Conform Facts and Dimensions
400(1)
Summary
401(2)
17 Kimball DW/BI Lifecycle Overview 403(26)
Lifecycle Roadmap
404(2)
Roadmap Mile Markers
405(1)
Lifecycle Launch Activities
406(10)
Program/Project Planning and Management
406(4)
Business Requirements Definition
410(6)
Lifecycle Technology Track
416(4)
Technical Architecture Design
416(2)
Product Selection and Installation
418(2)
Lifecycle Data Track
420(2)
Dimensional Modeling
420(1)
Physical Design
420(2)
ETL Design and Development
422(1)
Lifecycle BI Applications Track
422(2)
BI Application Specification
423(1)
BI Application Development
423(1)
Lifecycle Wrap-up Activities
424(2)
Deployment
424(1)
Maintenance and Growth
425(1)
Common Pitfalls to Avoid
426(1)
Summary
427(2)
18 Dimensional Modeling Process and Tasks 429(14)
Modeling Process Overview
429(2)
Get Organized
431(3)
Identify Participants, Especially Business Representatives
431(1)
Review the Business Requirements
432(1)
Leverage a Modeling Tool
432(1)
Leverage a Data Profiling Tool
433(1)
Leverage or Establish Naming Conventions
433(1)
Coordinate Calendars and Facilities
433(1)
Design the Dimensional Model
434(7)
Reach Consensus on High-Level Bubble Chart
435(1)
Develop the Detailed Dimensional Model
436(3)
Review and Validate the Model
439(2)
Finalize the Design Documentation
441(1)
Summary
441(2)
19 ETL Subsystems and Techniques 443(54)
Round Up the Requirements
444(5)
Business Needs
444(1)
Compliance
445(1)
Data Quality
445(1)
Security
446(1)
Data Integration
446(1)
Data Latency
447(1)
Archiving and Lineage
447(1)
BI Delivery interfaces
448(1)
Available Skills
448(1)
Legacy Licenses
449(1)
The 34 Subsystems of ETL
449(1)
Extracting: Getting Data into the Data Warehouse
450(5)
Subsystem 1: Data Profiling
450(1)
Subsystem 2: Change Data Capture System
451(2)
Subsystem 3: Extract System
453(2)
Cleaning and Conforming Data
455(8)
Improving Data Quality Culture and Processes
455(1)
Subsystem 4: Data Cleansing System
456(2)
Subsystem 5: Error Event Schema
458(2)
Subsystem 6: Audit Dimension Assembler
460(1)
Subsystem 7: Deduplication System
460(1)
Subsystem 8: Conforming System
461(2)
Delivering: Prepare for Presentation
463(20)
Subsystem 9: Slowly Changing Dimension Manager
464(5)
Subsystem 10: Surrogate Key Generator
469(1)
Subsystem 11: Hierarchy Manager
470(1)
Subsystem 12: Special Dimensions Manager
470(3)
Subsystem 13: Fact Table Builders
473(2)
Subsystem 14: Surrogate Key Pipeline
475(2)
Subsystem 15: Multivalued Dimension Bridge Table Builder
477(1)
Subsystem 16: Late Arriving Data Handler
478(1)
Subsystem 17: Dimension Manager System
479(1)
Subsystem 18: Fact Provider System
480(1)
Subsystem 19: Aggregate Builder
481(1)
Subsystem 20: OLAP Cube Builder
481(1)
Subsystem 21: Data Propagation Manager
482(1)
Managing the ETL Environment
483(13)
Subsystem 22: Job Scheduler
483(2)
Subsystem 23: Backup System
485(1)
Subsystem 24: Recovery and Restart System
486(2)
Subsystem 25: Version Control System 8
488(1)
Subsystem 26: Version Migration System
488(1)
Subsystem 27: Workflow Monitor
489(1)
Subsystem 28: Sorting System
490(1)
Subsystem 29: Lineage and Dependency Analyzer
490(1)
Subsystem 30: Problem Escalation System
491(1)
Subsystem 31: Parallelizing/Pipelining System
492(1)
Subsystem 32: Security System
492(1)
Subsystem 33: Compliance Manager
493(2)
Subsystem 34: Metadata Repository Manager
495(1)
Summary
496(1)
20 ETL System Design and Development Process and Tasks 497(30)
ETL Process Overview
497(1)
Develop the ETL Plan
498(5)
Step 1: Draw the High-Level Plan
498(1)
Step 2: Choose an ETL Tool
499(1)
Step 3: Develop Default Strategies
500(1)
Step 4: Drill Down by Target Table
500(2)
Develop the ETL Specification Document
502(1)
Develop One-Time Historic Load Processing
503(9)
Step 5: Populate Dimension Tables with Historic Data
503(5)
Step 6: Perform the Fact Table Historic Load
508(4)
Develop Incremental ETL Processing
512(8)
Step 7: Dimension Table Incremental Processing
512(3)
Step 8: Fact Table Incremental Processing
515(4)
Step 9: Aggregate Table and OLAP Loads
519(1)
Step 10: ETL System Operation and Automation
519(1)
Real-Time Implications
520(6)
Real-Time Triage
521(1)
Real-Time Architecture Trade-Offs
522(2)
Real-Time Partitions in the Presentation Server
524(2)
Summary
526(1)
21 Big Data Analytics 527(16)
Big Data Overview
527(4)
Extended RDBMS Architecture
529(1)
MapReduce/Hadoop Architecture
530(1)
Comparison of Big Data Architectures
530(1)
Recommended Best Practices for Big Data
531(11)
Management Best Practices for Big Data
531(2)
Architecture Best Practices for Big Data
533(5)
Data Modeling Best Practices for Big Data
538(3)
Data Governance Best Practices for Big Data
541(1)
Summary
542(1)
Index 543
RALPH KIMBALL, PhD, has been a leading visionary in the data warehouse and business intelligence industry since 1982. The Data Warehouse Toolkit book series have been bestsellers since 1996.

MARGY ROSS is President of the Kimball Group and the coauthor of five Toolkit books with Ralph Kimball. She has focused exclusively on data warehousing and business intelligence for more than 30 years.