Preface |
|
ix | |
Foreword |
|
xiii | |
|
|
1 | (14) |
|
|
2 | (1) |
|
What Are Graph Analytics and Algorithms? |
|
|
3 | (3) |
|
Graph Processing, Databases, Queries, and Algorithms |
|
|
6 | (2) |
|
|
7 | (1) |
|
Why Should We Care About Graph Algorithms? |
|
|
8 | (4) |
|
Graph Analytics Use Cases |
|
|
12 | (1) |
|
|
13 | (2) |
|
2 Graph Theory and Concepts |
|
|
15 | (14) |
|
|
15 | (1) |
|
Graph Types and Structures |
|
|
16 | (2) |
|
Random, Small-World, Scale-Free Structures |
|
|
17 | (1) |
|
|
18 | (9) |
|
Connected Versus Disconnected Graphs |
|
|
19 | (1) |
|
Unweighted Graphs Versus Weighted Graphs |
|
|
19 | (2) |
|
Undirected Graphs Versus Directed Graphs |
|
|
21 | (1) |
|
Acyclic Graphs Versus Cyclic Graphs |
|
|
22 | (1) |
|
Sparse Graphs Versus Dense Graphs |
|
|
23 | (1) |
|
Monopartite, Bipartite, and k-Partite Graphs |
|
|
24 | (3) |
|
Types of Graph Algorithms |
|
|
27 | (1) |
|
|
27 | (1) |
|
|
27 | (1) |
|
|
27 | (1) |
|
|
28 | (1) |
|
3 Graph Platforms and Processing |
|
|
29 | (10) |
|
Graph Platform and Processing Considerations |
|
|
29 | (2) |
|
|
29 | (1) |
|
Processing Considerations |
|
|
30 | (1) |
|
|
31 | (6) |
|
|
31 | (1) |
|
|
32 | (3) |
|
|
35 | (2) |
|
|
37 | (2) |
|
4 Pathfinding and Graph Search Algorithms |
|
|
39 | (38) |
|
Example Data: The Transport Graph |
|
|
41 | (4) |
|
Importing the Data into Apache Spark |
|
|
44 | (1) |
|
Importing the Data into Neo4j |
|
|
44 | (1) |
|
|
45 | (3) |
|
Breadth First Search with Apache Spark |
|
|
46 | (2) |
|
|
48 | (1) |
|
|
49 | (11) |
|
When Should I Use Shortest Path? |
|
|
50 | (1) |
|
|
51 | (2) |
|
Shortest Path (Weighted) with Neo4j |
|
|
53 | (1) |
|
Shortest Path (Weighted) with Apache Spark |
|
|
54 | (2) |
|
Shortest Path Variation: A |
|
|
56 | (2) |
|
Shortest Path Variation: Yen's k-Shortest Paths |
|
|
58 | (2) |
|
|
60 | (5) |
|
A Closer Look at All Pairs Shortest Path |
|
|
60 | (2) |
|
When Should I Use All Pairs Shortest Path? |
|
|
62 | (1) |
|
All Pairs Shortest Path with Apache Spark |
|
|
62 | (1) |
|
All Pairs Shortest Path with Neo4j |
|
|
63 | (2) |
|
Single Source Shortest Path |
|
|
65 | (5) |
|
When Should I Use Single Source Shortest Path? |
|
|
67 | (1) |
|
Single Source Shortest Path with Apache Spark |
|
|
67 | (2) |
|
Single Source Shortest Path with Neo4j |
|
|
69 | (1) |
|
|
70 | (4) |
|
When Should I Use Minimum Spanning Tree? |
|
|
71 | (1) |
|
Minimum Spanning Tree with Neo4j |
|
|
72 | (2) |
|
|
74 | (2) |
|
When Should I Use Random Walk? |
|
|
74 | (1) |
|
|
74 | (2) |
|
|
76 | (1) |
|
|
77 | (32) |
|
Example Graph Data: The Social Graph |
|
|
79 | (2) |
|
Importing the Data into Apache Spark |
|
|
80 | (1) |
|
Importing the Data into Neo4j |
|
|
81 | (1) |
|
|
81 | (3) |
|
|
81 | (1) |
|
When Should I Use Degree Centrality? |
|
|
82 | (1) |
|
Degree Centrality with Apache Spark |
|
|
83 | (1) |
|
|
84 | (8) |
|
When Should I Use Closeness Centrality? |
|
|
85 | (1) |
|
Closeness Centrality with Apache Spark |
|
|
86 | (2) |
|
Closeness Centrality with Neo4j |
|
|
88 | (1) |
|
Closeness Centrality Variation: Wasserman and Faust |
|
|
89 | (2) |
|
Closeness Centrality Variation: Harmonic Centrality |
|
|
91 | (1) |
|
|
92 | (7) |
|
When Should I Use Betweenness Centrality? |
|
|
94 | (1) |
|
Betweenness Centrality with Neo4j |
|
|
95 | (3) |
|
Betweenness Centrality Variation: Randomized-Approximate Brandes |
|
|
98 | (1) |
|
|
99 | (9) |
|
|
99 | (1) |
|
|
100 | (2) |
|
Iteration, Random Surfers, and Rank Sinks |
|
|
102 | (1) |
|
When Should I Use PageRank? |
|
|
103 | (1) |
|
PageRank with Apache Spark |
|
|
103 | (2) |
|
|
105 | (2) |
|
PageRank Variation: Personalized PageRank |
|
|
107 | (1) |
|
|
108 | (1) |
|
6 Community Detection Algorithms |
|
|
109 | (36) |
|
Example Graph Data: The Software Dependency Graph |
|
|
112 | (2) |
|
Importing the Data into Apache Spark |
|
|
114 | (1) |
|
Importing the Data into Neo4j |
|
|
114 | (1) |
|
Triangle Count and Clustering Coefficient |
|
|
114 | (5) |
|
Local Clustering Coefficient |
|
|
115 | (1) |
|
Global Clustering Coefficient |
|
|
116 | (1) |
|
When Should I Use Triangle Count and Clustering Coefficient? |
|
|
116 | (1) |
|
Triangle Count with Apache Spark |
|
|
117 | (1) |
|
|
117 | (1) |
|
Local Clustering Coefficient with Neo4j |
|
|
118 | (1) |
|
Strongly Connected Components |
|
|
119 | (5) |
|
When Should I Use Strongly Connected Components? |
|
|
120 | (1) |
|
Strongly Connected Components with Apache Spark |
|
|
120 | (1) |
|
Strongly Connected Components with Neo4j |
|
|
121 | (3) |
|
|
124 | (3) |
|
When Should I Use Connected Components? |
|
|
124 | (1) |
|
Connected Components with Apache Spark |
|
|
125 | (1) |
|
Connected Components with Neo4j |
|
|
126 | (1) |
|
|
127 | (6) |
|
Semi-Supervised Learning and Seed Labels |
|
|
129 | (1) |
|
When Should I Use Label Propagation? |
|
|
129 | (1) |
|
Label Propagation with Apache Spark |
|
|
130 | (1) |
|
Label Propagation with Neo4j |
|
|
131 | (2) |
|
|
133 | (9) |
|
When Should I Use Louvain? |
|
|
137 | (1) |
|
|
138 | (4) |
|
|
142 | (1) |
|
|
143 | (2) |
|
7 Graph Algorithms in Practice |
|
|
145 | (38) |
|
Analyzing Yelp Data with Neo4j |
|
|
145 | (22) |
|
|
146 | (1) |
|
|
147 | (1) |
|
|
147 | (1) |
|
A Quick Overview of the Yelp Data |
|
|
148 | (4) |
|
|
152 | (5) |
|
Travel Business Consulting |
|
|
157 | (5) |
|
Finding Similar Categories |
|
|
162 | (5) |
|
Analyzing Airline Flight Data with Apache Spark |
|
|
167 | (16) |
|
|
168 | (1) |
|
|
168 | (2) |
|
|
170 | (2) |
|
|
172 | (2) |
|
Interconnected Airports by Airline |
|
|
174 | (7) |
|
|
181 | (2) |
|
8 Using Graph Algorithms to Enhance Machine Learning |
|
|
183 | (42) |
|
Machine Learning and the Importance of Context |
|
|
183 | (2) |
|
Graphs, Context, and Accuracy |
|
|
184 | (1) |
|
Connected Feature Extraction and Selection |
|
|
185 | (5) |
|
|
187 | (1) |
|
|
188 | (2) |
|
Graphs and Machine Learning in Practice: Link Prediction |
|
|
190 | (33) |
|
|
191 | (1) |
|
Importing the Data into Neo4j |
|
|
192 | (1) |
|
|
193 | (1) |
|
Creating Balanced Training and Testing Datasets |
|
|
194 | (5) |
|
How We Predict Missing Links |
|
|
199 | (2) |
|
Creating a Machine Learning Pipeline |
|
|
201 | (1) |
|
Predicting Links: Basic Graph Features |
|
|
202 | (11) |
|
Predicting Links: Triangles and the Clustering Coefficient |
|
|
213 | (4) |
|
Predicting Links: Community Detection |
|
|
217 | (6) |
|
|
223 | (1) |
|
|
223 | (2) |
A Additional Information and Resources |
|
225 | (4) |
Index |
|
229 | |