1 Introduction to Information Quality
1.2 Why Information Quality Is Relevant
1.2.1 Private Initiatives
1.3 Introduction to the Concept of Information Quality
1.4 Information Quality and Information Classifications
1.5 Information Quality and Types of Information Systems
1.6 Main Research Issues and Application Domains
1.6.1 Research Issues in Information Quality
1.6.2 Application Domains in Information Quality
1.6.3 Research Areas Related to Information Quality
1.7 Standardization Efforts in Information Quality
2 Data Quality Dimensions
2.2 A Classification Framework for Data and Information Quality Dimensions
2.3.1 Structural Accuracy Dimensions
2.3.2 Time-Related Accuracy Dimensions
2.4.1 Completeness of Relational Data
2.4.2 Completeness of Web Data
2.5 Accessibility Cluster
2.6.1 Integrity Constraints
2.7 Approaches to the Definition of Data Quality Dimensions
2.7.1 Theoretical Approach
2.7.4 A Comparative Analysis of the Dimension Definitions
2.7.5 Trade-Offs Between Dimensions
2.8 Schema Quality Dimensions
2.8.2 Completeness Cluster
2.8.4 Readability Cluster
3 Information Quality Dimensions for Maps and Texts
3.2 From Data Quality Dimensions to Information Quality Dimensions
3.3 Information Quality in Maps
3.3.1 Conceptual Structure of Maps and Quality Dimensions of Maps
3.3.2 Levels of Abstraction and Quality of Maps
3.4 Information Quality in Semistructured Texts
3.4.2 Readability Cluster
3.4.3 Consistency Cluster
3.4.4 Other Issues Investigated in the Area of Text Comprehension
3.4.5 Accessibility Cluster
3.4.6 Text Quality in Administrative Documents
3.5 Information Quality in Law Texts
3.5.3 Readability Cluster
3.5.4 Accessibility Cluster
3.5.5 Consistency Cluster
3.5.6 Global Quality Index
4 Data Quality Issues in Linked Open Data
4.2 Semantic Web Standards and Linked Data
4.2.1 The Web and the Rationale for Linked Data
4.2.2 Semantic Web Standards
4.3 Quality Dimensions in Linked Open Data
4.3.2 Completeness Cluster
4.3.4 Readability Cluster
4.3.5 Accessibility Cluster
4.3.6 Consistency Cluster
4.4 Interrelationships Between Dimensions
5.2 Image Quality Models and Dimensions
5.3 Image Quality Assessment Approaches
5.3.1 Subjective Approaches to Assessment
5.3.2 Objective Approaches
5.4 Quality Assessment and Image Production Workflow
5.5 Quality Assessment in High-Quality Image Archives
5.6 Video Quality Assessment
6 Models for Information Quality
6.2 Extensions of Structured Data Models
6.2.2 Logical Models for Data Description
6.2.3 The Polygen Model for Data Manipulation
6.3 Extensions of Semistructured Data Models
6.4 Management Information System Models
6.4.1 Models for Process Description: The IP-MAP Model
6.4.2 Extensions of IP-MAP
7 Activities for Information Quality
7.2 Information Quality Activities: Generalities
7.3.1 Models and Assumptions
7.4 Error Localization and Correction
7.4.1 Localize and Correct Inconsistencies
7.4.3 Discovering Outliers
8.2 Historical Perspective
8.3 Object Identification for Different Data Types
8.4 The High-Level Process for Object Identification
8.5 Details on the Steps for Object Identification
8.5.2 Search Space Reduction
8.5.3 Distance-Based Comparison Functions
8.6 Probabilistic Techniques
8.6.1 The Fellegi and Sunter Theory and Extensions
8.6.2 A Cost-Based Probabilistic Technique
8.7.1 Sorted Neighborhood Method and Extensions
8.7.2 The Priority Queue Algorithm
8.7.3 A Technique for Complex Structured Data: Delphi
8.7.4 XML Duplicate Detection: DogmatiX
8.7.5 Other Empirical Methods
8.8 Knowledge-Based Techniques
8.8.2 A Rule-Based Approach: Intelliclean
8.8.3 Learning Methods for Decision Rules: Atlas
8.9.1 Qualities and Related Metrics
8.9.2 Search Space Reduction Methods
8.9.3 Comparison Functions
9 Recent Advances in Object Identification
9.2.1 Qualities for Reduction
9.2.2 Qualities for the Comparison and Decision Step
9.2.3 General Analyses and Recommendations
9.2.4 Hints on Frameworks for OID Techniques Evaluation
9.4 Search Space Reduction
9.4.1 Introduction to Techniques for Search Space Reduction
9.4.2 Indexing Techniques
9.4.3 Learnable, Adaptive, and Context-Based Reduction Techniques
9.5 Comparison and Decision
9.5.1 Extensions of the Fellegi and Sunter Probabilistic Model
9.5.2 Knowledge in the Comparison Function
9.5.3 Contextual Knowledge in Decision
9.5.4 Other Types of Knowledge in Decision
9.5.5 Incremental Techniques
9.5.6 Multiple Decision Models
9.5.7 Object Identification at Query Time
9.5.8 OID Evolutive Maintenance
9.6 Domain-Specific Object Identification Techniques
9.7 Object Identification Techniques for Maps and Images
9.7.1 Map Matching: Location-Based Matching
9.7.2 Map Matching: Location- and Feature-Based Matching
9.7.3 Map and Orthoimage Matching
9.7.4 Digital Gazetteer Data Matching
9.8 Privacy Preserving Object Identification
9.8.1 Privacy Requirements
9.8.2 Matching Techniques
9.8.3 Analysis and Evaluation
10 Data Quality Issues in Data Integration Systems
10.2 Generalities on Data Integration Systems
10.3 Techniques for Quality-Driven Query Processing
10.3.1 The QP-alg: Quality-Driven Query Planning
10.3.2 DaQuinCIS Query Processing
10.3.3 Fusionplex Query Processing
10.3.4 Comparison of Quality-Driven Query Processing Techniques
10.4 Instance-Level Conflict Resolution
10.4.1 Classification of Instance-Level Conflicts
10.4.2 Overview of Techniques
10.4.3 Comparison of Instance-Level Conflict Resolution Techniques
10.5 Inconsistencies in Data Integration: A Theoretical Perspective
10.5.1 A Formal Framework for Data Integration
10.5.2 The Problem of Inconsistency
11 Information Quality in Use
11.2 A Historical Perspective on Information Quality in Business Processes and Decision Making
11.3 Models of Utility and Objective vs. Contextual Metrics
11.4 Cost-Benefit Classifications for Data Quality
11.4.1 Cost Classifications
11.4.2 Benefits Classification
11.5 Methodologies for Cost-Benefit Management of Information Quality
11.6 How to Relate Contextual Quality Metrics with Utility
11.7 Information Quality and Decision Making
11.7.1 Relationships Between Information Quality and Decision Making
11.7.2 Information Quality Usage in the Decision Process
11.7.3 Decision Making and Information Overload
11.7.4 Value-Driven Decision Making
12 Methodologies for Information Quality Assessment and Improvement
12.2 Basics on Information Quality Methodologies
12.2.1 Inputs and Outputs
12.2.2 Classification of Methodologies
12.2.3 Comparison Among Information-Driven and Process-Driven Strategies
12.2.4 Basic Common Phases Among Methodologies
12.3 Comparison of Methodologies
12.3.3 Strategies and Techniques
12.3.4 Comparison of Methodologies: Summary
12.4 Detailed Comparative Analysis of Three General-Purpose Methodologies
12.4.1 The TDQM Methodology
12.4.3 The Istat Methodology
12.5 Assessment Methodologies
12.6.1 Reconstruct the State of Data
12.6.2 Reconstruct Business Processes
12.6.3 Reconstruct Macroprocesses and Rules
12.6.4 Check Problems with Users
12.6.5 Measure Data Quality
12.6.6 Set New Target IQ Levels
12.6.7 Choose Improvement Activities
12.6.8 Choose Techniques for Data Activities
12.6.9 Find Improvement Processes
12.6.10 Choose the Optimal Improvement Process
12.7 A Case Study in the e-Government Area
12.7.1 Reconstruct the State of Data
12.7.2 Reconstruct Business Processes
12.7.3 Reconstruct Macroprocesses and Rules
12.7.4 Check Problems with Users
12.7.5 Measure Data Quality
12.7.6 Set New Target Data Quality Levels
12.7.7 Choose Improvement Activities
12.7.8 Choose Techniques for Data Activities
12.7.9 Find Improvement Processes
12.7.10 Choose the Optimal Improvement Process
12.8 Extension of CDQM to Heterogeneous Information Types
13 Information Quality in Healthcare
13.2 Definitions and Scopes
13.3 Inherent Challenges of Healthcare
13.3.1 Multiple Uses, Users, and Applications
13.4 Health Information Quality Dimensions, Methodologies, and Initiatives
13.5 The Relevance of Information Quality in the Healthcare Domain
13.5.1 Health Information Quality and Its Consequences on Healthcare
14 Quality of Web Data and Quality of Big Data: Open Problems
14.2 Two Relevant Paradigms for Web Data Quality: Trustworthiness and Provenance
14.3 Web Object Identification
14.3.1 Object Identification and Time Variability
14.3.2 Object Identification and Quality
14.4 Quality of Big Data: A Classification of Big Data Sources
14.5 Source-Specific Quality Issues in Sensor Data
14.5.1 Information Quality in Sensors and Sensor Networks
14.5.2 Techniques for Data Cleaning in Sensors and Sensor Networks
14.6 Domain-Specific Quality Issues: Official Statistics
14.6.1 On the Quality of Big Data for Official Statistics
Erratum to: Data and Information Quality: Dimensions, Principles and Techniques
References
Index