Preface |
|
xxi | |
Introduction |
|
xxv | |
Part I: Introduction |
|
1 | (78) |
|
Chapter 1 Different Databases for Different Requirements |
|
|
3 | (36) |
|
Relational Database Design |
|
|
4 | (2) |
|
|
5 | (1) |
|
Early Database Management Systems |
|
|
6 | (13) |
|
Flat File Data Management Systems |
|
|
7 | (5) |
|
Organization of Flat File Data Management Systems |
|
|
7 | (2) |
|
|
9 | (1) |
|
Limitations of Flat File Data Management Systems |
|
|
9 | (3) |
|
Hierarchical Data Model Systems |
|
|
12 | (2) |
|
Organization of Hierarchical Data Management Systems |
|
|
12 | (2) |
|
Limitations of Hierarchical Data Management Systems |
|
|
14 | (1) |
|
Network Data Management Systems |
|
|
14 | (3) |
|
Organization of Network Data Management Systems |
|
|
15 | (2) |
|
Limitations of Network Data Management Systems |
|
|
17 | (1) |
|
Summary of Early Database Management Systems |
|
|
17 | (2) |
|
The Relational Database Revolution |
|
|
19 | (10) |
|
Relational Database Management Systems |
|
|
19 | (10) |
|
Organization of Relational Database Management Systems |
|
|
20 | (6) |
|
Organization of Applications Using Relational Database Management Systems |
|
|
26 | (1) |
|
Limitations of Relational Databases |
|
|
27 | (2) |
|
Motivations for Not Just/No SQL (NoSQL) Databases |
|
|
29 | (5) |
|
|
29 | (2) |
|
|
31 | (1) |
|
|
31 | (1) |
|
|
32 | (2) |
|
|
34 | (1) |
|
|
35 | (1) |
|
|
36 | (1) |
|
|
37 | (1) |
|
|
37 | (2) |
|
Chapter 2 Variety of NoSQL Databases |
|
|
39 | (40) |
|
Data Management with Distributed Databases |
|
|
41 | (13) |
|
|
41 | (1) |
|
Maintain Data Consistency |
|
|
42 | (2) |
|
|
44 | (5) |
|
Consistency of Database Transactions |
|
|
47 | (1) |
|
Availability and Consistency in Distributed Databases |
|
|
48 | (1) |
|
Balancing Response Times, Consistency, and Durability |
|
|
49 | (2) |
|
Consistency, Availability, and Partitioning: The CAP Theorem |
|
|
51 | (3) |
|
|
54 | (5) |
|
ACID: Atomicity, Consistency, Isolation, and Durability |
|
|
54 | (2) |
|
BASE: Basically Available, Soft State, Eventually Consistent |
|
|
56 | (1) |
|
Types of Eventual Consistency |
|
|
57 | (2) |
|
|
57 | (1) |
|
Read-Your-Writes Consistency |
|
|
57 | (1) |
|
|
58 | (1) |
|
Monotonic Read Consistency |
|
|
58 | (1) |
|
Monotonic Write Consistency |
|
|
58 | (1) |
|
Four Types of NoSQL Databases |
|
|
59 | (16) |
|
|
60 | (6) |
|
|
60 | (4) |
|
|
64 | (1) |
|
Differences Between Key-Value and Relational Databases |
|
|
65 | (1) |
|
|
66 | (3) |
|
|
66 | (1) |
|
|
67 | (1) |
|
Differences Between Document and Relational Databases |
|
|
68 | (1) |
|
|
69 | (2) |
|
Columns and Column Families |
|
|
69 | (1) |
|
Differences Between Column Family and Relational Databases |
|
|
70 | (1) |
|
|
71 | (11) |
|
|
72 | (1) |
|
Differences Between Graph and Relational Databases |
|
|
73 | (2) |
|
|
75 | (1) |
|
|
76 | (1) |
|
|
77 | (1) |
|
|
77 | (2) |
Part II: Key-Value Databases |
|
79 | (100) |
|
Chapter 3 Introduction to Key-Value Databases |
|
|
81 | (36) |
|
From Arrays to Key-Value Databases |
|
|
82 | (9) |
|
Arrays: Key Value Stores with Training Wheels |
|
|
82 | (2) |
|
Associative Arrays: Taking Off the Training Wheels |
|
|
84 | (1) |
|
Caches: Adding Gears to the Bike |
|
|
85 | (4) |
|
In-Memory and On-Disk Key-Value Database: From Bikes to Motorized Vehicles |
|
|
89 | (2) |
|
Essential Features of Key-Value Databases |
|
|
91 | (12) |
|
Simplicity: Who Needs Complicated Data Models Anyway? |
|
|
91 | (2) |
|
Speed: There Is No Such Thing as Too Fast |
|
|
93 | (2) |
|
Scalability: Keeping Up with the Rush |
|
|
95 | (8) |
|
Scaling with Master-Slave Replication |
|
|
95 | (3) |
|
Scaling with Masterless Replication |
|
|
98 | (5) |
|
Keys: More Than Meaningless Identifiers |
|
|
103 | (7) |
|
|
103 | (2) |
|
Using Keys to Locate Values |
|
|
105 | (5) |
|
Hash Functions: From Keys to Locations |
|
|
106 | (1) |
|
Keys Help Avoid Write Problems |
|
|
107 | (3) |
|
Values: Storing Just About Any Data You Want |
|
|
110 | (4) |
|
Values Do Not Require Strong Typing |
|
|
110 | (2) |
|
Limitations on Searching for Values |
|
|
112 | (2) |
|
|
114 | (1) |
|
|
115 | (1) |
|
|
116 | (1) |
|
|
116 | (1) |
|
Chapter 4 Key-Value Database Terminology |
|
|
117 | (26) |
|
Key-Value Database Data Modeling Terms |
|
|
118 | (13) |
|
|
121 | (2) |
|
|
123 | (1) |
|
|
124 | (2) |
|
|
126 | (3) |
|
|
129 | (1) |
|
|
129 | (2) |
|
Key-Value Architecture Terms |
|
|
131 | (6) |
|
|
131 | (2) |
|
|
133 | (2) |
|
|
135 | (2) |
|
Key-Value Implementation Terms |
|
|
137 | (4) |
|
|
137 | (1) |
|
|
138 | (1) |
|
|
139 | (2) |
|
|
141 | (1) |
|
|
141 | (1) |
|
|
142 | (1) |
|
Chapter 5 Designing for Key-Value Databases |
|
|
143 | (36) |
|
Key Design and Partitioning |
|
|
144 | (7) |
|
Keys Should Follow a Naming Convention |
|
|
145 | (1) |
|
Well-Designed Keys Save Code |
|
|
145 | (2) |
|
Dealing with Ranges of Values |
|
|
147 | (2) |
|
Keys Must Take into Account Implementation Limitations |
|
|
149 | (1) |
|
How Keys Are Used in Partitioning |
|
|
150 | (1) |
|
Designing Structured Values |
|
|
151 | (8) |
|
Structured Data Types Help Reduce Latency |
|
|
152 | (3) |
|
Large Values Can Lead to Inefficient Read and Write Operations |
|
|
155 | (4) |
|
Limitations of Key-Value Databases |
|
|
159 | (3) |
|
Look Up Values by Key Only |
|
|
160 | (1) |
|
Key-Value Databases Do Not Support Range Queries |
|
|
161 | (1) |
|
No Standard Query Language Comparable to SQL for Relational Databases |
|
|
161 | (1) |
|
Design Patterns for Key-Value Databases |
|
|
162 | (11) |
|
|
163 | (2) |
|
|
165 | (1) |
|
|
166 | (3) |
|
|
169 | (1) |
|
|
170 | (1) |
|
|
171 | (2) |
|
|
173 | (1) |
|
Case Study: Key-Value Databases for Mobile Application Configuration |
|
|
174 | (3) |
|
|
177 | (1) |
|
|
178 | (1) |
Part III: Document Databases |
|
179 | (96) |
|
Chapter 6 Introduction to Document Databases |
|
|
181 | (32) |
|
|
182 | (17) |
|
Documents Are Not So Simple After All |
|
|
182 | (5) |
|
Documents and Key-Value Pairs |
|
|
187 | (1) |
|
Managing Multiple Documents in Collections |
|
|
188 | (13) |
|
Getting Started with Collections |
|
|
188 | (3) |
|
Tips on Designing Collections |
|
|
191 | (8) |
|
Avoid Explicit Schema Definitions |
|
|
199 | (2) |
|
Basic Operations on Document Databases |
|
|
201 | (9) |
|
Inserting Documents into a Collection |
|
|
202 | (2) |
|
Deleting Documents from a Collection |
|
|
204 | (2) |
|
Updating Documents in a Collection |
|
|
206 | (2) |
|
Retrieving Documents from a Collection |
|
|
208 | (2) |
|
|
210 | (1) |
|
|
210 | (1) |
|
|
211 | (2) |
|
Chapter 7 Document Database Terminology |
|
|
213 | (26) |
|
Document and Collection Terms |
|
|
214 | (10) |
|
|
215 | (2) |
|
Documents: Ordered Sets of Key-Value Pairs |
|
|
215 | (1) |
|
|
216 | (1) |
|
|
217 | (1) |
|
|
218 | (2) |
|
|
220 | (3) |
|
Schemaless Means More Flexibility |
|
|
221 | (1) |
|
Schemaless Means More Responsibility |
|
|
222 | (1) |
|
|
223 | (1) |
|
|
224 | (8) |
|
|
225 | (2) |
|
Horizontal Partitioning or Sharding |
|
|
227 | (5) |
|
Separating Data with Shard Keys |
|
|
229 | (1) |
|
Distributing Data with a Partitioning Algorithm |
|
|
230 | (2) |
|
Data Modeling and Query Processing |
|
|
232 | (5) |
|
|
233 | (2) |
|
|
235 | (1) |
|
|
235 | (2) |
|
|
237 | (1) |
|
|
237 | (1) |
|
|
238 | (1) |
|
Chapter 8 Designing for Document Databases |
|
|
239 | (36) |
|
Normalization, Denormalization, and the Search for Proper Balance |
|
|
241 | (14) |
|
|
242 | (1) |
|
|
243 | (1) |
|
|
243 | (2) |
|
Executing Joins: The Heavy Lifting of Relational Databases |
|
|
245 | (3) |
|
|
247 | (1) |
|
What Would a Document Database Modeler Do? |
|
|
248 | (7) |
|
The Joy of Denormalization |
|
|
249 | (2) |
|
Avoid Overusing Denormalization |
|
|
251 | (2) |
|
Just Say No to Joins, Sometimes |
|
|
253 | (2) |
|
Planning for Mutable Documents |
|
|
255 | (3) |
|
Avoid Moving Oversized Documents |
|
|
258 | (1) |
|
The Goldilocks Zone of Indexes |
|
|
258 | (3) |
|
|
259 | (1) |
|
|
260 | (1) |
|
Modeling Common Relations |
|
|
261 | (6) |
|
One-to-Many Relations in Document Databases |
|
|
262 | (1) |
|
Many-to-Many Relations in Document Databases |
|
|
263 | (2) |
|
Modeling Hierarchies in Document Databases |
|
|
265 | (4) |
|
Parent or Child References |
|
|
265 | (1) |
|
|
266 | (1) |
|
|
267 | (2) |
|
Case Study: Customer Manifests |
|
|
269 | (4) |
|
|
271 | (1) |
|
|
271 | (1) |
|
Separate Collections by Type? |
|
|
272 | (1) |
|
|
273 | (1) |
|
|
273 | (2) |
Part IV: Column Family Databases |
|
275 | (86) |
|
Chapter 9 Introduction to Column Family Databases |
|
|
277 | (30) |
|
In the Beginning, There Was Google BigTable |
|
|
279 | (7) |
|
Utilizing Dynamic Control over Columns |
|
|
280 | (1) |
|
Indexing by Row, Column Name, and Time Stamp |
|
|
281 | (1) |
|
Controlling Location of Data |
|
|
282 | (1) |
|
Reading and Writing Atomic Rows |
|
|
283 | (1) |
|
Maintaining Rows in Sorted Order |
|
|
284 | (2) |
|
Differences and Similarities to Key-Value and Document Databases |
|
|
286 | (7) |
|
Column Family Database Features |
|
|
286 | (1) |
|
Column Family Database Similarities to and Differences from Document Databases |
|
|
287 | (2) |
|
Column Family Database Versus Relational Databases |
|
|
289 | (4) |
|
Avoiding Multirow Transactions |
|
|
290 | (1) |
|
|
291 | (2) |
|
Architectures Used in Column Family Databases |
|
|
293 | (10) |
|
HBase Architecture: Variety of Nodes |
|
|
293 | (2) |
|
Cassandra Architecture: Peer-to-Peer |
|
|
295 | (1) |
|
Getting the Word Around: Gossip Protocol |
|
|
296 | (3) |
|
Thermodynamics and Distributed Database: Why We Need Anti-Entropy |
|
|
299 | (1) |
|
Hold This for Me: Hinted Handoff |
|
|
300 | (3) |
|
When to Use Column Family Databases |
|
|
303 | (1) |
|
|
304 | (1) |
|
|
304 | (1) |
|
|
305 | (2) |
|
Chapter 10 Column Family Database Terminology |
|
|
307 | (22) |
|
Basic Components of Column Family Databases |
|
|
308 | (5) |
|
|
309 | (1) |
|
|
309 | (1) |
|
|
310 | (2) |
|
|
312 | (1) |
|
Structures and Processes: Implementing Column Family Databases |
|
|
313 | (9) |
|
Internal Structures and Configuration Parameters of Column Family Databases |
|
|
313 | (1) |
|
Old Friends: Clusters and Partitions |
|
|
314 | (3) |
|
|
314 | (2) |
|
|
316 | (1) |
|
Taking a Look Under the Hood: More Column Family Database Components |
|
|
317 | (5) |
|
|
317 | (2) |
|
|
319 | (2) |
|
|
321 | (1) |
|
|
322 | (4) |
|
|
322 | (1) |
|
|
323 | (1) |
|
|
324 | (1) |
|
|
325 | (1) |
|
|
326 | (1) |
|
|
327 | (1) |
|
|
327 | (2) |
|
Chapter 11 Designing for Column Family Databases |
|
|
329 | (32) |
|
Guidelines for Designing Tables |
|
|
332 | (8) |
|
Denormalize Instead of Join |
|
|
333 | (1) |
|
Make Use of Valueless Columns |
|
|
334 | (1) |
|
Use Both Column Names and Column Values to Store Data |
|
|
334 | (1) |
|
Model an Entity with a Single Row |
|
|
335 | (2) |
|
Avoid Hotspotting in Row Keys |
|
|
337 | (1) |
|
Keep an Appropriate Number of Column Value Versions |
|
|
338 | (1) |
|
Avoid Complex Data Structures in Column Values |
|
|
339 | (1) |
|
|
340 | (8) |
|
When to Use Secondary Indexes Managed by the Column Family Database System |
|
|
341 | (4) |
|
When to Create and Manage Secondary Indexes Using Tables |
|
|
345 | (3) |
|
Tools for Working with Big Data |
|
|
348 | (8) |
|
Extracting, Transforming, and Loading Big Data |
|
|
350 | (1) |
|
|
351 | (4) |
|
Describing and Predicting with Statistics |
|
|
351 | (2) |
|
Finding Patterns with Machine Learning |
|
|
353 | (1) |
|
Tools for Analyzing Big Data |
|
|
354 | (1) |
|
Tools for Monitoring Big Data |
|
|
355 | (1) |
|
|
356 | (1) |
|
Case Study: Customer Data Analysis |
|
|
357 | (2) |
|
|
357 | (2) |
|
|
359 | (1) |
|
|
360 | (1) |
Part V: Graph Databases |
|
361 | (64) |
|
Chapter 12 Introduction to Graph Databases |
|
|
363 | (16) |
|
|
363 | (2) |
|
Graphs and Network Modeling |
|
|
365 | (7) |
|
Modeling Geographic Locations |
|
|
365 | (1) |
|
Modeling Infectious Diseases |
|
|
366 | (3) |
|
Modeling Abstract and Concrete Entities |
|
|
369 | (1) |
|
|
370 | (2) |
|
Advantages of Graph Databases |
|
|
372 | (4) |
|
Query Faster by Avoiding Joins |
|
|
372 | (3) |
|
|
375 | (1) |
|
Multiple Relations Between Entities |
|
|
375 | (1) |
|
|
376 | (1) |
|
|
376 | (1) |
|
|
377 | (2) |
|
Chapter 13 Graph Database Terminology |
|
|
379 | (20) |
|
|
380 | (5) |
|
|
380 | (1) |
|
|
381 | (2) |
|
|
383 | (1) |
|
|
384 | (1) |
|
|
385 | (3) |
|
|
385 | (1) |
|
|
386 | (1) |
|
|
387 | (1) |
|
Properties of Graphs and Nodes |
|
|
388 | (4) |
|
|
388 | (1) |
|
|
389 | (1) |
|
|
390 | (1) |
|
|
390 | (1) |
|
|
391 | (1) |
|
|
392 | (4) |
|
Undirected and Directed Graphs |
|
|
392 | (1) |
|
|
393 | (1) |
|
|
394 | (1) |
|
|
395 | (1) |
|
|
395 | (1) |
|
|
396 | (1) |
|
|
397 | (1) |
|
|
397 | (2) |
|
Chapter 14 Designing for Graph Databases |
|
|
399 | (26) |
|
Getting Started with Graph Design |
|
|
400 | (8) |
|
Designing a Social Network Graph Database |
|
|
401 | (4) |
|
Queries Drive Design (Again) |
|
|
405 | (3) |
|
|
408 | (7) |
|
Cypher: Declarative Querying |
|
|
408 | (2) |
|
Gremlin: Query by Graph Traversal |
|
|
410 | (5) |
|
|
410 | (2) |
|
Traversing a Graph with Depth-First and Breadth-First Searches |
|
|
412 | (3) |
|
Tips and Traps of Graph Database Design |
|
|
415 | (5) |
|
Use Indexes to Improve Retrieval Time |
|
|
415 | (1) |
|
Use Appropriate Types of Edges |
|
|
416 | (1) |
|
Watch for Cycles When Traversing Graphs |
|
|
417 | (1) |
|
Consider the Scalability of Your Graph Database |
|
|
418 | (2) |
|
|
420 | (1) |
|
Case Study: Optimizing Transportation Routes |
|
|
420 | (3) |
|
|
420 | (1) |
|
Designing a Graph Analysis Solution |
|
|
421 | (2) |
|
|
423 | (1) |
|
|
423 | (2) |
Part VI: Choosing a Database for Your Application |
|
425 | (16) |
|
Chapter 15 Guidelines for Selecting a Database |
|
|
427 | (14) |
|
Choosing a NoSQL Database |
|
|
428 | (6) |
|
Criteria for Selecting Key-Value Databases |
|
|
429 | (1) |
|
Use Cases and Criteria for Selecting Document Databases |
|
|
430 | (1) |
|
Use Cases and Criteria for Selecting Column Family Databases |
|
|
431 | (2) |
|
Use Cases and Criteria for Selecting Graph Databases |
|
|
433 | (1) |
|
Using NoSQL and Relational Databases Together |
|
|
434 | (2) |
|
|
436 | (1) |
|
|
436 | (1) |
|
|
437 | (4) |
Part VII: Appendices |
|
441 | (40) |
|
Appendix A Answers to Chapter Review Questions |
|
|
443 | (34) |
|
Appendix B List of NoSQL Databases |
|
|
477 | (4) |
Glossary |
|
481 | (10) |
Index |
|
491 | |