Data Profiling [Hardcover]

Lukasz Golab, Felix Naumann, Ziawasch Abedjan, Thorsten Papenbrock

Format: Hardback, 154 pages, height x width: 235x191 mm, weight: 333 g
Series: Synthesis Lectures on Data Management
Publication date: 08-Nov-2018
Publisher: Morgan & Claypool Publishers
ISBN-10: 1681734486
ISBN-13: 9781681734484

Other books on this subject:

Information technology: general issues - (Currently in store: 1 title)
Databases - (Currently in store: 1 title)
Data mining - (Currently in store: 1 title)
Data capture & analysis

Hardcover
Price: 128,35 €
Delivery from the publisher takes approximately 2-4 weeks
Free delivery

Permalink: https://www.kriso.ee/db/9781681734484.html

Data profiling refers to the activity of collecting data about data, i.e., metadata. Most IT professionals and researchers who work with data have engaged in data profiling, at least informally, to understand and explore an unfamiliar dataset or to determine whether a new dataset is appropriate for a particular task at hand. Data profiling results are also important in a variety of other situations, including query optimization, data integration, and data cleaning. Simple metadata are statistics, such as the number of rows and columns, schema and datatype information, the number of distinct values, statistical value distributions, and the number of null or empty values in each column. More complex types of metadata are statements about multiple columns and their correlation, such as candidate keys, functional dependencies, and other types of dependencies.

This book provides a classification of the various types of profilable metadata, discusses popular data profiling tasks, and surveys state-of-the-art profiling algorithms. While most of the book focuses on tasks and algorithms for relational data profiling, we also briefly discuss systems and techniques for profiling non-relational data such as graphs and text. We conclude with a discussion of data profiling challenges and directions for future work in this area.

Preface
Acknowledgments
1 Discovering Metadata
  1.1 Motivation and Overview
  1.2 Data Profiling and Data Mining
  1.3 Use Cases
  1.4 Organization of This Book
2 Data Profiling Tasks
  2.1 Single-Column Analysis
  2.2 Dependency Discovery
  2.3 Relaxed Dependencies
3 Single-Column Analysis
  3.1 Cardinalities
  3.2 Value Distributions
  3.3 Data Types, Patterns, and Domains
  3.4 Data Completeness
  3.5 Approximate Statistics
  3.6 Summary and Discussion
4 Dependency Discovery
  4.1 Dependency Definitions
    4.1.1 Functional Dependencies
    4.1.2 Unique Column Combinations
    4.1.3 Inclusion Dependencies
  4.2 Search Space and Data Structures
    4.2.1 Lattices and Search Space Sizes
    4.2.2 Position List Indexes and Search Space Validation
    4.2.3 Search Complexity
    4.2.4 Null Semantics
  4.3 Discovering Unique Column Combinations
    4.3.1 Gordian
    4.3.2 HCA
    4.3.3 Ducc
    4.3.4 HyUCC
    4.3.5 Swan
  4.4 Discovering Functional Dependencies
    4.4.1 Tane
    4.4.2 Fun
    4.4.3 FD_Mine
    4.4.4 Dfd
    4.4.5 Dep-Miner
    4.4.6 FastFDs
    4.4.7 Fdep
    4.4.8 HyFD
  4.5 Discovering Inclusion Dependencies
    4.5.1 SQL-Based IND Validation
    4.5.2 B&B
    4.5.3 DeMarchi
    4.5.4 Binder
    4.5.5 Spider
    4.5.6 S-IndD
    4.5.7 Sindy
    4.5.8 Mind
    4.5.9 Find2
    4.5.10 ZigZag
    4.5.11 Mind2
5 Relaxed and Other Dependencies
  5.1 Relaxing the Extent of a Dependency
    5.1.1 Partial Dependencies
    5.1.2 Conditional Dependencies
  5.2 Relaxing Attribute Comparisons
    5.2.1 Metric and Matching Dependencies
    5.2.2 Order and Sequential Dependencies
  5.3 Approximating the Dependency Discovery
  5.4 Generalizing Functional Dependencies
    5.4.1 Denial Constraints
    5.4.2 Multivalued Dependencies
6 Use Cases
  6.1 Data Exploration
  6.2 Schema Engineering
  6.3 Data Cleaning
  6.4 Query Optimization
  6.5 Data Integration
7 Profiling Non-Relational Data
  7.1 XML
  7.2 RDF
  7.3 Time Series
  7.4 Graphs
  7.5 Text
8 Data Profiling Tools
  8.1 Research Prototypes
  8.2 Commercial Tools
9 Data Profiling Challenges
  9.1 Functional Challenges
    9.1.1 Profiling Dynamic Data
    9.1.2 Interactive Profiling
    9.1.3 Profiling for Integration
    9.1.4 Interpreting Profiling Results
  9.2 Non-Functional Challenges
    9.2.1 Efficiency and Scalability
    9.2.2 Profiling on New Architectures
    9.2.3 Benchmarking Profiling Methods
10 Conclusions
Bibliography
Authors' Biographies
