E-book: Advanced Data Management: For SQL, NoSQL, Cloud and Distributed Databases

  • Format: 374 pages
  • Series: De Gruyter Textbook
  • Publication date: 29-Oct-2015
  • Publisher: De Gruyter
  • Language: English
  • ISBN-13: 9783110433074
  • Format: EPUB+DRM
  • Price: 40,33 €*
  • * The price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

Advanced data management has always been at the core of efficient database and information systems. Recent trends like big data and cloud computing have intensified the need for sophisticated and flexible data storage and processing solutions. This book provides comprehensive coverage of the principles of data management developed over the last decades, with a focus on data structures and query languages. It treats a wealth of different data models and surveys the foundations of structuring, processing, storing and querying data according to these models.

Starting off with the topic of database design, it further discusses weaknesses of the relational data model, and then proceeds to convey the basics of graph data, tree-structured XML data, key-value pairs and nested, semi-structured JSON data, columnar and record-oriented data as well as object-oriented data. The final chapters round the book off with an analysis of fragmentation, replication and consistency strategies for data management in distributed databases as well as recommendations for handling polyglot persistence in multi-model databases and multi-database architectures.

While primarily geared towards students of Master-level courses in Computer Science and related areas, this book may also be of benefit to practitioners looking for a reference book on data modeling and query processing. It provides both theoretical depth and a concise treatment of open source technologies currently on the market.

Table of Contents

Preface
Overview
List of Figures
List of Tables
Part I Introduction
  1 Background
    1.1 Database Properties
    1.2 Database Components
    1.3 Database Design
      1.3.1 Entity-Relationship Model
      1.3.2 Unified Modeling Language
    1.4 Bibliographic Notes
  2 Relational Database Management Systems
    2.1 Relational Data Model
      2.1.1 Database and Relation Schemas
      2.1.2 Mapping ER Models to Schemas
    2.2 Normalization
    2.3 Referential Integrity
    2.4 Relational Query Languages
    2.5 Concurrency Management
      2.5.1 Transactions
      2.5.2 Concurrency Control
    2.6 Bibliographic Notes
Part II NOSQL And Non-Relational Databases
  3 New Requirements, "Not only SQL" and the Cloud
    3.1 Weaknesses of the Relational Data Model
      3.1.1 Inadequate Representation of Data
      3.1.2 Semantic Overloading
      3.1.3 Weak Support for Recursion
      3.1.4 Homogeneity
    3.2 Weaknesses of RDBMSs
    3.3 New Data Management Challenges
    3.4 Bibliographic Notes
  4 Graph Databases
    4.1 Graphs and Graph Structures
      4.1.1 A Glimpse on Graph Theory
      4.1.2 Graph Traversal and Graph Problems
    4.2 Graph Data Structures
      4.2.1 Edge List
      4.2.2 Adjacency Matrix
      4.2.3 Incidence Matrix
      4.2.4 Adjacency List
      4.2.5 Incidence List
    4.3 The Property Graph Model
    4.4 Storing Property Graphs in Relational Tables
    4.5 Advanced Graph Models
    4.6 Implementations and Systems
      4.6.1 Apache TinkerPop
      4.6.2 Neo4J
      4.6.3 HyperGraphDB
    4.7 Bibliographic Notes
  5 XML Databases
    5.1 XML Background
      5.1.1 XML Documents
      5.1.2 Document Type Definition (DTD)
      5.1.3 XML Schema Definition (XSD)
      5.1.4 XML Parsers
      5.1.5 Tree Model of XML Documents
      5.1.6 Numbering Schemes
    5.2 XML Query Languages
      5.2.1 XPath
      5.2.2 XQuery
      5.2.3 XSLT
    5.3 Storing XML in Relational Databases
      5.3.1 SQL/XML
      5.3.2 Schema-Based Mapping
      5.3.3 Schemaless Mapping
    5.4 Native XML Storage
      5.4.1 XML Indexes
      5.4.2 Storage Management
      5.4.3 XML Concurrency Control
    5.5 Implementations and Systems
      5.5.1 eXistDB
      5.5.2 BaseX
    5.6 Bibliographic Notes
  6 Key-value Stores and Document Databases
    6.1 Key-Value Storage
      6.1.1 Map-Reduce
    6.2 Document Databases
      6.2.1 Java Script Object Notation
      6.2.2 JSON Schema
      6.2.3 Representational State Transfer
    6.3 Implementations and Systems
      6.3.1 Apache Hadoop MapReduce
      6.3.2 Apache Pig
      6.3.3 Apache Hive
      6.3.4 Apache Sqoop
      6.3.5 Riak
      6.3.6 Redis
      6.3.7 MongoDB
      6.3.8 CouchDB
      6.3.9 Couchbase
    6.4 Bibliographic Notes
  7 Column Stores
    7.1 Column-Wise Storage
      7.1.1 Column Compression
      7.1.2 Null Suppression
    7.2 Column striping
    7.3 Implementations and Systems
      7.3.1 MonetDB
      7.3.2 Apache Parquet
    7.4 Bibliographic Notes
  8 Extensible Record Stores
    8.1 Logical Data Model
    8.2 Physical storage
      8.2.1 Memtables and immutable sorted data files
      8.2.2 File format
      8.2.3 Redo logging
      8.2.4 Compaction
      8.2.5 Bloom filters
    8.3 Implementations and Systems
      8.3.1 Apache Cassandra
      8.3.2 Apache HBase
      8.3.3 Hypertable
      8.3.4 Apache Accumulo
    8.4 Bibliographic Notes
  9 Object Databases
    9.1 Object Orientation
      9.1.1 Object Identifiers
      9.1.2 Normalization for Objects
      9.1.3 Referential Integrity for Objects
      9.1.4 Object-Oriented Standards and Persistence Patterns
    9.2 Object-Relational Mapping
      9.2.1 Mapping Collection Attributes to Relations
      9.2.2 Mapping Reference Attributes to Relations
      9.2.3 Mapping Class Hierarchies to Relations
      9.2.4 Two-Level Storage
    9.3 Object Mapping APIs
      9.3.1 Java Persistence API (JPA)
      9.3.2 Apache Java Data Objects (JDO)
    9.4 Object-Relational Databases
    9.5 Object Databases
      9.5.1 Object Persistence
      9.5.2 Single-Level Storage
      9.5.3 Reference Management
      9.5.4 Pointer Swizzling
    9.6 Implementations and Systems
      9.6.1 DataNucleus
      9.6.2 ZooDB
    9.7 Bibliographic Notes
Part III Distributed Data Management
  10 Distributed Database Systems
    10.1 Scaling horizontally
    10.2 Distribution Transparency
    10.3 Failures in Distributed Systems
    10.4 Epidemic Protocols and Gossip Communication
      10.4.1 Hash Trees
      10.4.2 Death Certificates
    10.5 Bibliographic Notes
  11 Data Fragmentation
    11.1 Properties and Types of Fragmentation
    11.2 Fragmentation Approaches
      11.2.1 Fragmentation for Relational Tables
      11.2.2 XML Fragmentation
      11.2.3 Graph Partitioning
      11.2.4 Sharding for Key-Based Stores
      11.2.5 Object Fragmentation
    11.3 Data Allocation
      11.3.1 Cost-based allocation
      11.3.2 Consistent Hashing
    11.4 Bibliographic Notes
  12 Replication And Synchronization
    12.1 Replication Models
      12.1.1 Master-Slave Replication
      12.1.2 Multi-Master Replication
      12.1.3 Replication Factor and the Data Replication Problem
      12.1.4 Hinted Handoff and Read Repair
    12.2 Distributed Concurrency Control
      12.2.1 Two-Phase Commit
      12.2.2 Paxos Algorithm
      12.2.3 Multiversion Concurrency Control
    12.3 Ordering of Events and Vector Clocks
      12.3.1 Scalar Clocks
      12.3.2 Concurrency and Clock Properties
      12.3.3 Vector Clocks
      12.3.4 Version Vectors
      12.3.5 Optimizations of Vector Clocks
    12.4 Bibliographic Notes
  13 Consistency
    13.1 Strong Consistency
      13.1.1 Write and Read Quorums
      13.1.2 Snapshot Isolation
    13.2 Weak Consistency
      13.2.1 Data-Centric Consistency Models
      13.2.2 Client-Centric Consistency Models
    13.3 Consistency Trade-offs
    13.4 Bibliographic Notes
Part IV Conclusion
  14 Further Database Technologies
    14.1 Linked Data and RDF Data Management
    14.2 Data Stream Management
    14.3 Array Databases
    14.4 Geographic Information Systems
    14.5 In-Memory Databases
    14.6 NewSQL Databases
    14.7 Bibliographic Notes
  15 Concluding Remarks
    15.1 Database Reengineering
    15.2 Database Requirements
    15.3 Polyglot Database Architectures
      15.3.1 Polyglot Persistence
      15.3.2 Lambda Architecture
      15.3.3 Multi-Model Databases
    15.4 Implementations and Systems
      15.4.1 Apache Drill
      15.4.2 Apache Druid
      15.4.3 OrientDB
      15.4.4 ArangoDB
    15.5 Bibliographic Notes
Bibliography
Index

Lena Wiese, University of Göttingen, Germany.