| Foreword | xiii | |
| Preface | xv | |
| Part I Getting Started with Presto | | |
| | 3 | (16) |
| The Problems with Big Data | 3 | (1) |
| | 4 | (3) |
| Designed for Performance and Scale | 5 | (1) |
| | 6 | (1) |
| Separation of Data Storage and Query Compute Resources | 7 | (1) |
| | 7 | (5) |
| One SQL Analytics Access Point | 7 | (1) |
| Access Point to Data Warehouse and Source Systems | 8 | (1) |
| Provide SQL-Based Access to Anything | 9 | (1) |
| | 10 | (1) |
| Semantic Layer for a Virtual Data Warehouse | 10 | (1) |
| | 11 | (1) |
| | 11 | (1) |
| Better Insights Due to Faster Response Times | 11 | (1) |
| Big Data, Machine Learning, and Artificial Intelligence | 12 | (1) |
| | 12 | (1) |
| | 12 | (4) |
| | 12 | (1) |
| | 13 | (1) |
| | 13 | (1) |
| Source Code, License, and Version | 14 | (1) |
| | 14 | (1) |
| | 15 | (1) |
| | 15 | (1) |
| | 16 | (1) |
| A Brief History of Presto | 16 | (1) |
| | 17 | (2) |
| 2 Installing and Configuring Presto | 19 | (6) |
| Trying Presto with the Docker Container | 19 | (1) |
| Installing from Archive File | 20 | (3) |
| | 20 | (1) |
| | 21 | (1) |
| | 21 | (1) |
| | 22 | (1) |
| | 23 | (1) |
| | 24 | (1) |
| | 24 | (1) |
| | 25 | (18) |
| Presto Command-Line Interface | 25 | (5) |
| | 25 | (3) |
| | 28 | (1) |
| | 28 | (1) |
| | 28 | (1) |
| | 29 | (1) |
| | 30 | (1) |
| | 30 | (1) |
| | 30 | (5) |
| Downloading and Registering the Driver | 32 | (1) |
| Establishing a Connection to Presto | 32 | (3) |
| | 35 | (1) |
| | 35 | (1) |
| | 35 | (1) |
| | 36 | (4) |
| | 37 | (1) |
| | 37 | (3) |
| | 40 | (3) |
| Part II Diving Deeper into Presto | | |
| | 43 | (30) |
| Coordinator and Workers in a Cluster | 43 | (2) |
| | 45 | (1) |
| | 46 | (1) |
| | 46 | (1) |
| Connector-Based Architecture | 47 | (1) |
| Catalogs, Schemas, and Tables | 48 | (1) |
| | 48 | (5) |
| | 53 | (4) |
| | 54 | (1) |
| | 54 | (3) |
| | 57 | (3) |
| | 57 | (1) |
| | 58 | (1) |
| | 58 | (1) |
| | 59 | (1) |
| | 60 | (2) |
| Lateral Join Decorrelation | 60 | (1) |
| Semi-Join (IN) Decorrelation | 61 | (1) |
| | 62 | (8) |
| | 62 | (2) |
| | 64 | (1) |
| | 65 | (1) |
| | 66 | (1) |
| Table Statistics for Partitioned Tables | 67 | (1) |
| | 68 | (1) |
| Broadcast Versus Distributed Joins | 68 | (2) |
| Working with Table Statistics | 70 | (2) |
| | 70 | (1) |
| Gathering Statistics When Writing to Disk | 71 | (1) |
| | 71 | (1) |
| Displaying Table Statistics | 72 | (1) |
| | 72 | (1) |
| 5 Production-Ready Deployment | 73 | (12) |
| | 73 | (1) |
| | 73 | (2) |
| | 75 | (1) |
| | 76 | (1) |
| | 77 | (1) |
| | 77 | (2) |
| | 79 | (1) |
| | 80 | (2) |
| Installation Directory Structure | 81 | (1) |
| | 82 | (1) |
| | 82 | (1) |
| Installation in the Cloud | 82 | (1) |
| Cluster Sizing Considerations | 83 | (1) |
| | 84 | (1) |
| | 85 | (24) |
| | 86 | (1) |
| RDBMS Connector Example PostgreSQL | 87 | (5) |
| | 88 | (2) |
| Parallelism and Concurrency | 90 | (1) |
| | 90 | (2) |
| | 92 | (1) |
| Presto TPC-H and TPC-DS Connectors | 92 | (1) |
| Hive Connector for Distributed Storage Data Sources | 93 | (11) |
| | 94 | (1) |
| | 95 | (1) |
| | 96 | (1) |
| Managed and External Tables | 97 | (1) |
| | 98 | (2) |
| | 100 | (2) |
| File Formats and Compression | 102 | (1) |
| | 103 | (1) |
| Non-Relational Data Sources | 104 | (1) |
| | 104 | (2) |
| | 106 | (1) |
| | 107 | (1) |
| | 107 | (1) |
| | 108 | (1) |
| 7 Advanced Connector Examples | 109 | (22) |
| Connecting to HBase with Phoenix | 109 | (1) |
| Key-Value Store Connector Example: Accumulo | 110 | (7) |
| Using the Presto Accumulo Connector | 113 | (2) |
| Predicate Pushdown in Accumulo | 115 | (2) |
| Apache Cassandra Connector | 117 | (1) |
| Streaming System Connector Example: Kafka | 118 | (2) |
| Document Store Connector Example: Elasticsearch | 120 | (2) |
| | 120 | (1) |
| | 121 | (1) |
| | 121 | (1) |
| | 122 | (1) |
| | 122 | (1) |
| Query Federation in Presto | 122 | (7) |
| Extract, Transform, Load and Federated Queries | 129 | (1) |
| | 129 | (2) |
| | 131 | (38) |
| | 132 | (2) |
| | 134 | (2) |
| | 136 | (1) |
| | 137 | (1) |
| | 138 | (1) |
| | 139 | (6) |
| Table and Column Properties | 141 | (1) |
| Copying an Existing Table | 142 | (1) |
| Creating a New Table from Query Results | 143 | (1) |
| | 144 | (1) |
| | 144 | (1) |
| Table Limitations from Connectors | 144 | (1) |
| | 145 | (1) |
| Session Information and Configuration | 146 | (1) |
| | 147 | (8) |
| | 149 | (1) |
| | 150 | (4) |
| | 154 | (1) |
| | 155 | (2) |
| | 157 | (1) |
| GROUP BY and HAVING Clauses | 158 | (1) |
| ORDER BY and LIMIT Clauses | 159 | (1) |
| | 160 | (1) |
| UNION, INTERSECT, and EXCEPT Clauses | 161 | (1) |
| | 162 | (2) |
| | 164 | (1) |
| | 165 | (2) |
| | 165 | (1) |
| | 166 | (1) |
| | 166 | (1) |
| Deleting Data from a Table | 167 | (1) |
| | 167 | (2) |
| | 169 | (30) |
| Functions and Operators Introduction | 169 | (1) |
| Scalar Functions and Operators | 170 | (1) |
| | 171 | (1) |
| | 172 | (1) |
| Range Selection with the BETWEEN Statement | 173 | (1) |
| Value Detection with IS (NOT) NULL | 174 | (1) |
| Mathematical Functions and Operators | 174 | (1) |
| | 175 | (1) |
| Constant and Random Functions | 176 | (1) |
| String Functions and Operators | 176 | (1) |
| | 177 | (1) |
| | 178 | (1) |
| | 179 | (3) |
| Unnesting Complex Data Types | 182 | (1) |
| | 183 | (1) |
| Date and Time Functions and Operators | 184 | (2) |
| | 186 | (1) |
| | 187 | (3) |
| | 187 | (2) |
| Approximate Aggregate Functions | 189 | (1) |
| | 190 | (2) |
| | 192 | (1) |
| | 193 | (1) |
| | 194 | (2) |
| | 196 | (3) |
| Part III Presto in Real-World Uses | | |
| | 199 | (30) |
| | 200 | (3) |
| Password and LDAP Authentication | 201 | (2) |
| | 203 | (6) |
| | 204 | (3) |
| | 207 | (2) |
| | 209 | (8) |
| Encrypting Presto Client-to-Coordinator Communication | 211 | (3) |
| Creating Java Keystores and Java Truststores | 214 | (2) |
| Encrypting Communication Within the Presto Cluster | 216 | (1) |
| Certificate Authority Versus Self-Signed Certificates | 217 | (2) |
| Certificate Authentication | 219 | (3) |
| | 222 | (2) |
| | 222 | (1) |
| Kerberos Client Authentication | 222 | (1) |
| Cluster Internal Kerberos | 223 | (1) |
| Data Source Access and Configuration for Security | 224 | (1) |
| Kerberos Authentication with the Hive Connector | 225 | (2) |
| Hive Metastore Thrift Service Authentication | 226 | (1) |
| | 227 | (1) |
| | 227 | (1) |
| | 227 | (2) |
| 11 Integrating Presto with Other Tools | 229 | (10) |
| Queries, Visualizations, and More with Apache Superset | 229 | (1) |
| Performance Improvements with RubiX | 230 | (1) |
| Workflows with Apache Airflow | 231 | (1) |
| Embedded Presto Example: Amazon Athena | 231 | (4) |
| Starburst Enterprise Presto | 235 | (1) |
| Other Integration Examples | 235 | (1) |
| | 236 | (1) |
| | 236 | (3) |
| | 239 | (28) |
| Monitoring with the Presto Web UI | 239 | (12) |
| | 240 | (1) |
| | 241 | (3) |
| | 244 | (7) |
| Tuning Presto SQL Queries | 251 | (3) |
| | 254 | (4) |
| | 258 | (1) |
| | 258 | (1) |
| Scheduling Splits per Task and per Node | 259 | (1) |
| | 259 | (1) |
| | 259 | (1) |
| | 260 | (1) |
| | 260 | (1) |
| Tuning Java Virtual Machine | 260 | (2) |
| | 262 | (4) |
| Resource Group Definition | 264 | (1) |
| | 265 | (1) |
| Selector Rules Definition | 265 | (1) |
| | 266 | (1) |
| | 267 | (6) |
| Deployment and Runtime Platforms | 267 | (1) |
| | 268 | (2) |
| Hadoop/Hive Migration Use Case | 270 | (1) |
| | 270 | (1) |
| | 271 | (1) |
| | 272 | (1) |
| | 273 | (2) |
| Index | 275 | |