
E-book: Data Warehousing for Biomedical Informatics [Taylor & Francis e-book]

Richard E. Biehl (Data-Oriented Quality Solutions, Orlando, Florida, USA)
  • Format: 656 pages, 34 Tables, black and white; 141 Illustrations, black and white
  • Publication date: 19-Nov-2015
  • Publisher: Apple Academic Press Inc.
  • ISBN-13: 9780429091001
  • Taylor & Francis e-book
  • Price: 207,73 €*
  • * price that provides access for an unlimited number of simultaneous users for an unlimited time
  • Regular price: 296,75 €
  • You save 30%
Data Warehousing for Biomedical Informatics is a step-by-step how-to guide for designing and building an enterprise-wide data warehouse across a biomedical or healthcare institution, using a four-iteration lifecycle and standardized design pattern. It enables you to quickly implement a fully scalable generic data architecture that supports your organization's clinical, operational, administrative, financial, and research data. By following the guidelines in this book, you will be able to successfully progress through the Alpha, Beta, and Gamma versions, plus fully implement your first production release in about a year.

The Alpha version allows you to implement just enough of the basic design pattern to illustrate its core capabilities while loading a small sampling of limited data for demonstration purposes. This provides an easy way for everyone involved to visualize the new warehouse paradigm by actually examining a core subset of the working system. You can finish the Alpha version, also referred to as the proof-of-concept, in as little as 3-4 weeks.



The Beta version, which can be completed in about 2-3 months, adds required functionality and much more data. It allows you to get the full warehouse up and running quickly, in order to facilitate longer-term planning, user and support team training, and setup of the operational environment. The Gamma version, which is a fully functional system, though still lacking data, can be implemented in about 3-4 months. About one year after starting, you will be ready to launch Release 1.0 as a complete and secure data warehouse.
Preface xvii
Author xxi
1 Biomedical Data Warehousing
1(16)
Nature of Biomedical Data
1(3)
Nature of Warehoused Data
4(3)
Business Requirements
7(1)
Functional Requirements
8(3)
Data Queries, Reports, and Marts
9(1)
Data Issues and Hypotheses
10(1)
Performance and Control Data
11(1)
Never-Finished Warehouse
11(1)
Organizational Readiness
12(1)
Implementation Strategy
13(4)
Warehouse Project
14(3)
Section I Alpha Version
2 Dimensional Data Modeling
17(22)
Evolution of Data Warehouses
17(5)
Relational Normalization
18(1)
Dimensional Design
19(3)
The Star Schema
22(7)
Dimensions and Subdimensions
24(2)
Dimensional Granularity
26(3)
Transposing Dimensional Schema
29(3)
Anticipating Dimensions
32(1)
Affinity Analysis
33(6)
Dimensions as Supertypes
34(3)
Reverse Engineering
37(2)
3 Understanding Source Data
39(36)
Implicit versus Explicit Data
40(2)
Semantic Layers
42(11)
BFO Continuants
44(3)
BFO Occurrents
47(6)
BFO Temporal Regions
48(2)
BFO Spatiotemporal Regions
50(1)
BFO Processual Entities
51(1)
BFO Process Aggregate
52(1)
BFO Process Boundary
52(1)
BFO Processual Context
52(1)
Information Artifacts
53(4)
IAO Material Information Bearer
54(1)
IAO Information Content Entity
55(1)
IAO Information Carrier
56(1)
Biomedical Context
57(7)
OBI Processes
58(3)
OBI Process Aggregates
61(2)
OBI Processual Contexts
63(1)
Clinical Picture
64(3)
Ontological Levels
67(1)
Epistemological Levels
68(3)
Observations
68(2)
Hypotheses
70(1)
Conclusions
71(4)
4 Biomedical Warehouse
75(42)
Biomedical Star
76(2)
Biomedical Facts
78(1)
Master Dimensions
79(7)
Organization Dimension
80(1)
Study Dimension
81(1)
Caregiver Dimension
82(1)
Interaction Dimension
83(1)
Subject Dimension
84(2)
Object Instantiation
85(1)
Reference Dimensions
86(10)
Diagnosis Dimension
86(1)
Pathology Dimension
87(3)
Procedure Dimension
90(1)
Treatment Dimension
91(1)
Material Dimension
92(1)
Facility Dimension
93(1)
Accounting Dimension
94(1)
-Omics Dimension
95(1)
Almanac Dimensions
96(9)
Geopolitics Dimension
96(3)
Calendar Dimension
99(1)
Clock Dimension
100(2)
Environment Dimension
102(1)
Structure Dimension
102(1)
System Dimension
103(1)
Unit of Measure Dimension
103(2)
Analysis Dimensions
105(1)
Annotation Dimension
105(1)
Quality Dimension
106(1)
Control Dimensions
106(3)
Metadata Dimension
107(2)
Data Feed Dimension
109(1)
Data State Dimension
109(1)
Operation Dimension
109(1)
Requirements Alignment
109(8)
5 Star Dimension Design Pattern
117(32)
Structure of a Dimension
117(1)
Master Data: Definition Tables
118(3)
Slowly Changing Dimensions
121(8)
SCD Example
125(4)
Source Keys: Context and Reference Tables
129(5)
Fact Participation: Group and Bridge Tables
134(3)
Interconnections: Hierarchy Tables
137(5)
Natural Hierarchies
140(2)
Connecting to Facts
142(1)
Dimension Navigation
143(6)
6 Loading Alpha Version
149(48)
Throw-Away Code
149(1)
Selecting and Preparing Sources
150(6)
Generating Surrogate Keys
156(3)
Simple Dimensions and Facts
159(13)
First Source: Patient Master Data
159(3)
Alternative G-B-H Processing
162(2)
Second Source: Patient Address Facts
164(6)
Quick Fact Queries
170(2)
Recap of Simple ETLs
172(3)
Complicated Dimensions and Facts
175(13)
Third Source: Basic Lab Results
175(13)
Finalizing Alpha Structures
188(1)
V&V of Alpha Version
189(8)
Metadata-Fact Counts
189(1)
Dimension-Subdimension Counts
190(1)
Dimension-Fact Counts
190(1)
Recreating Sources
191(6)
Section II Beta Version
7 Completing the Design
197(36)
Unit of Measure
198(10)
UOM Scale
199(1)
UOM Class
199(1)
UOM Unit
200(1)
UOM Measure
201(1)
UOM Language
201(1)
UOM Value
202(6)
Metadata Mappings
208(16)
Metadata Contexts
208(1)
Metadata Reference
209(1)
Metadata Definition
209(2)
Targeting Metadata
211(1)
Property Metadata
212(3)
Implicit UOM Metadata
213(1)
Superseding Metadata
213(1)
Codeset Translation Metadata
214(1)
Fiat Hierarchy Metadata
215(1)
Metadata Examples
215(9)
First Alpha Source: Patient Master Data
216(2)
Second Alpha Source: Patient Address Facts
218(4)
Third Alpha Source: Basic Lab Results
222(2)
Control Dimensions
224(2)
Data State
224(1)
Operation
225(1)
Data Feed
226(1)
Reinitializing the Warehouse
226(7)
Empty Warehouse
226(1)
System Rows
227(1)
Data States
228(1)
Units of Measure
229(1)
Configuration Data
229(3)
Context Entries
230(1)
Default Dimensionality
231(1)
Metadata Loading
232(1)
8 Data Sourcing
233(34)
Source Mapping Challenges
233(24)
Coverage and Seamlessness
234(4)
Functional Normalization
238(5)
Qualities of Facts
243(11)
Lab Result Facts
244(3)
Allergy Assessment Facts
247(3)
Financial Charge Facts
250(4)
Fact Superseding
254(3)
Dimensionalizing Facts
257(2)
Sourcing Your Data
259(8)
Selecting Columns
260(3)
Selecting Rows
263(1)
Selecting Time
264(3)
9 Generalizing ETL Workflows
267(42)
Standardizing Source Data
267(8)
Dataset Controls
268(4)
Transaction Controls
272(3)
Source Data Values
275(1)
Source Data Intake Jobs
275(2)
SDI Design Pattern
277(5)
Source Data Consolidation
282(1)
External versus Internal Sourcing
283(1)
Single Point of Function
284(3)
Single Point of Failure
286(1)
ETL "Pipes"
287(4)
Reference Pipe
288(1)
Definition Pipe
289(1)
Fact Pipe
290(1)
Checkpoint, Restart, and Bulk Loading
290(1)
Metadata Transformation
291(8)
Codeset Translations
292(2)
General versus Functional Transformations
294(2)
Resolving "-All-" Entries
296(1)
Early versus Late Binding
297(2)
Data Control Pipe
299(7)
ETL Job
300(1)
Data Control Layers
300(4)
Data Control Points
304(1)
Assigning Master IDs to ETL Layers
305(1)
Wide versus Deep Data
306(3)
10 ETL Reference Pipe
309(34)
Metadata Transformation
312(5)
Codeset Translation
314(2)
Reference Processing
316(1)
Reference Composite
317(4)
Reference Staging
319(1)
Beta Limitations
320(1)
Resolve References
321(6)
Unknown or Broken Keys
323(1)
Alias Entry Collisions
324(3)
Unresolved References
327(5)
Distinct Composites
328(1)
Surrogate Assignments
329(1)
Transactional Redistribution
330(1)
Alias Propagation
331(1)
Reference Entries
332(1)
New References
332(1)
Alias Entries
332(2)
Bridges and Groups
334(1)
Hierarchy Entries
334(1)
New Self-Hierarchies
335(1)
Fiat Hierarchies
335(1)
Natural Hierarchies
336(7)
Terminus Resolution
339(1)
Fiat Hierarchy Cascade
340(3)
11 ETL Definition Pipe
343(34)
Processing Complexities
343(9)
Metadata Transformation
347(1)
Deep and Wide Staging
348(4)
Example Master Loads
352(4)
Insert New Definitions
356(3)
New Orphans
359(1)
Orphan Auto-Adoption
360(1)
Definition Change Processing
361(6)
Slowly Changing Dimensions
361(3)
Multiple Simultaneous Transactions
364(3)
Building SCD Transaction Sets
367(5)
Staging Existing Definitions
367(1)
Assigning Deep Row Numbers
368(1)
Distribute Deep Non-SCD Updates
369(1)
Assigning Relative Wide Row Numbers
370(2)
Applying Transactions to Dimensions
372(3)
Obtain New Dimension IDs
373(1)
Auto-Adopt Orphan Definitions
374(1)
Insert New Definitions
374(1)
Update Existing Definitions
375(1)
Performance Concerns
375(2)
12 ETL Fact Pipe
377(30)
Metadata Transformation
378(5)
Generating Sourced Facts
378(2)
Generating Factless Facts
380(2)
Staged Facts
382(1)
Bridges and Groups
383(8)
Bridge Staging
384(2)
Group Staging
386(1)
Group Lookups
387(2)
New Group Surrogates
389(1)
Distribute Surrogates to Group IDs
390(1)
Distribute Group IDs to Bridges
390(1)
Insert New Groups
390(1)
Insert New Bridges
391(1)
Build Facts
391(3)
Bridge Pivot
392(2)
Value Alignment
394(1)
Finalize Dimensions
394(6)
Unit of Measure
395(3)
Implicit UOM
396(1)
Assigning the Implicit UOM
397(1)
Calendar and Clock
398(1)
Organization
399(1)
Optional Dimensions
400(1)
Set Control Dimensions
400(1)
Data State
400(1)
Datafeed
401(1)
Insert Fact Values
401(1)
Superseding Facts
402(5)
Metadata Review
403(4)
13 Finalizing Beta
407(16)
Audit Trail Facts
407(3)
Datafeed Dimension
410(1)
Verification and Validation
411(8)
Structural Verification
412(13)
Context Tables
412(1)
Reference Tables
413(2)
Definition Tables
415(1)
Bridge Tables
416(1)
Group Tables
416(1)
Fact Table
417(1)
Metadata
418(1)
Preparing for Gamma
419(4)
Section III Gamma Version
14 Finalizing ETL Workflows
423(28)
Alternatively Sourced Keys
425(6)
Sourcing Compound Natural Keys
425(2)
Sourcing Warehouse Surrogates
427(1)
Alternatives in the Reference Pipe
428(3)
Sourced Metadata
431(1)
Standard Data Editing
432(5)
Value Trimming and Cleanup
433(1)
Timestamp Handling
433(1)
Data Type Checking
433(3)
Range Checking
436(1)
Value-Level UOM
437(4)
Undetermined Dimensionality
441(4)
ETL Transactions
445(1)
Target States
446(1)
Superseded Facts
447(3)
Continuous Functional Evolution
450(1)
15 Establishing Data Controls
451(30)
Finalizing Warehouse Design
451(11)
Database Statistics
451(1)
Application Layer Issues
452(1)
Indexing and Partitioning
453(1)
Outrigger Tables
454(3)
Fact Tables
457(5)
Redaction Control Settings
462(3)
Data Monitoring
465(8)
Pipeline Counts
466(1)
System Rows
466(1)
Unexpected and Undesired Values
467(2)
Orphan Tracking
469(4)
Surrogate Merges
473(2)
Security Controls
475(1)
Implementing Dataset Controls
476(3)
Warehouse Support Team
479(2)
16 Building out the Data
481(30)
Minimize Data Seams
481(5)
Shifting toward Metrics
486(10)
Metrics Process Design
487(1)
New Dimension and Fact Tables
488(4)
Metric Fact Table
490(1)
Control Fact Table
491(1)
Metric Subdimension
492(3)
Control Subdimension
495(1)
Display Subdimension
496(1)
Populating Metric Values
496(2)
Populating Control Values
498(2)
Populating Displays
500(11)
Metric Aggregation
502(5)
Metric—Control Fact Views
507(4)
17 Delivering Data
511(54)
Warehousing Use Cases
511(7)
User Analysts
512(1)
HIPAA Controller
512(1)
Data Owners
513(2)
Data Governance
514(1)
Institutional Review Board
515(1)
Support Teams
515(2)
Data Administration
516(1)
Auditor
516(1)
Data Sources
517(24)
Standards Bodies
517(1)
Privacy-Oriented Usage Profiles
518(2)
Metadata Browsing
520(4)
Cohort Identification
524(4)
Fact Count Queries
528(1)
Timeline Generation
529(10)
Business Intelligence
539(2)
Alternative Data Views
541(7)
Flattened Groups
541(6)
Flattened Facts
547(1)
External Data Marts
548(17)
i2b2
548(2)
Provider Dimension
550(3)
Patient Dimension
553(1)
Visit Dimension
553(4)
Concept Dimension
557(3)
Fact Table
560(5)
18 Finalizing Gamma
565(12)
Business Requirements
565(1)
Technical Challenges
566(4)
Configuration Management
566(2)
Job Scheduling
568(1)
Database Timing
569(1)
Functional Challenges
570(4)
Patient Merge/Unmerge
570(2)
HIPAA Redaction
572(1)
IRB Integration
573(1)
Going Live
574(3)
Section IV Release 1.0*
19 Knowledge Synthesis
577(28)
Fact Counts
577(8)
Fact Counts by Metadata
578(1)
Ranked Dimensioned Facts
579(5)
Correlation Analysis
584(1)
Derivative Data
585(3)
Inter-event Timings
586(1)
Census Data
587(1)
Timeline Analysis
588(5)
Nonpatient Timelines
590(3)
Statistical Analyses
593(1)
Descriptive Statistics
593(1)
Statistical Process Control
594(1)
Semantic Annotation
595(10)
20 Data Governance
605(12)
Organizing for Governance
605(5)
Data Governance Group
606(1)
Data Owners
607(1)
Data Administration
608(1)
Warehouse Support Team
609(1)
Governance Opportunities
610(7)
Immediate Decisions
610(1)
Near-Term Control
610(1)
Medium-Term Growth
611(3)
Release Management
612(1)
Source Application Feedback
612(1)
Expanded Data Sharing
613(1)
Long-Term Evolution
614(3)
Process Maturity
615(1)
Translational Medicine
615(2)
Index 617
Richard E. Biehl is an information technology consultant with 37 years of experience, specializing in logical and physical data architectures, quality management, and strategic planning for the application of information technology. His research interests include semantic interoperability in biomedical data and the integration of chaos and complexity theories into the systems engineering of healthcare. Dr. Biehl holds a PhD in applied management and decision science and an MS in educational change and technology innovation from Walden University, Minneapolis, Minnesota. He is certified by the American Society for Quality (ASQ), Milwaukee, Wisconsin, as a Six Sigma Black Belt (CSSBB) and a Software Quality Engineer (CSQE). Dr. Biehl is a visiting instructor at the University of Central Florida (UCF), Orlando, Florida, in the College of Engineering and Computer Science (CECS), teaching quality and systems engineering in the Industrial Engineering and Management Systems (IEMS) Department.