Foreword xi
Preface xiii
Acknowledgments xiv
About This Book xv
About The Author xix
About The Cover Illustration xx
Part 1 Getting Started With Kafka Streams 1

1 Welcome to Kafka Streams 3
1.1 The big data movement, and how it changed the programming landscape 4
    Important concepts from MapReduce 5
    Batch processing is not enough 8
1.2 Introducing stream processing 8
    When to use stream processing, and when not to use it 9
1.3 Handling a purchase transaction 10
    Weighing the stream-processing option 10
    Deconstructing the requirements into a graph 11
1.4 Changing perspective on a purchase transaction 12
1.5 Kafka Streams as a graph of processing nodes 15
1.6 Applying Kafka Streams to the purchase transaction flow 16
    The first processor: masking credit card numbers 17
    The second processor: purchase patterns 18
    The third processor: customer rewards 19
    The fourth processor: writing purchase records 20
|
|
2.2 Using Kafka to handle data 23
    ZMart's original data platform 23
    A Kafka sales transaction data hub 24
    Kafka is a message broker 26
    Partitions group data by key 29
    Writing a custom partitioner 30
    Specifying a custom partitioner 31
    Determining the correct number of partitions 32
    ZooKeeper: leaders, followers, and replication 33
    Controller responsibilities 35
2.4 Sending messages with producers 40
    Specifying partitions and timestamps 42
2.5 Reading messages with consumers 44
    Finer-grained consumer assignment 48
2.6 Installing and running Kafka 49
    Kafka local configuration 49
    Sending your first message 52
Part 2 Kafka Streams Development 55

3 Developing Kafka Streams 57
3.1 The Streams Processor API 58
3.2 Hello World for Kafka Streams 58
    Creating the topology for the Yelling App 59
    Kafka Streams configuration 63
3.3 Working with customer data 65
3.4 Interactive development 74
    Writing records outside of Kafka 81
|
|
4.2 Applying stateful operations to Kafka Streams 86
    The transformValues processor 87
    Stateful customer rewards 88
    Initializing the value transformer 90
    Mapping the Purchase object to a RewardAccumulator using state 90
    Updating the rewards processor 94
4.3 Using state stores for lookups and previously seen data 96
    Failure recovery and fault tolerance 97
    Using state stores in Kafka Streams 98
    Additional key/value store suppliers 99
    StateStore fault tolerance 99
    Configuring changelog topics 99
4.4 Joining streams for added insight 100
    Generating keys containing customer IDs to perform joins 103
4.5 Timestamps in Kafka Streams 110
    Provided TimestampExtractor implementations 112
    WallclockTimestampExtractor 113
    Custom TimestampExtractor 114
    Specifying a TimestampExtractor 115
|
|
5.1 The relationship between streams and tables 118
    Updates to records or the changelog 119
    Event streams vs. update streams 122
5.2 Record updates and KTable configuration 123
    Setting cache buffering size 124
    Setting the commit interval 125
5.3 Aggregations and windowing operations 126
    Aggregating share volume by industry 127
    Joining KStreams and KTables 139
|
|
6.1 The trade-offs of higher-level abstractions vs. more control 146
6.2 Working with sources, processors, and sinks to create a topology 146
6.3 Digging deeper into the Processor API with a stock analysis processor 152
    The stock-performance processor application 153
6.4 The co-group processor 159
    Building the co-grouping processor 161
6.5 Integrating the Processor API and the Kafka Streams API 170

Part 3 Administering Kafka Streams 173

7 Monitoring and performance 175
7.1 Basic Kafka monitoring 176
    Measuring consumer and producer performance 176
    Checking for consumer lag 178
    Intercepting the producer and consumer 179
    How to hook into the collected metrics 185
    More Kafka Streams debugging techniques 191
    Viewing a representation of the application 191
    Getting notification on various states of the application 192
    Uncaught exception handler 198
|
8 Testing a Kafka Streams application 199
    Testing a state store in the topology 204
    Testing processors and transformers 205
    Building an integration test 209

Part 4 Advanced Concepts With Kafka Streams 215

9 Advanced applications with Kafka Streams 217
9.1 Integrating Kafka with other data sources 218
    Using Kafka Connect to integrate data 219
9.2 Kicking your database to the curb 226
    How interactive queries work 228
    Distributing state stores 229
    Setting up and discovering a distributed state store 230
    Coding interactive queries 232
    Installing and running KSQL 240

Appendix A Additional Configuration Information 245
Appendix B Exactly Once Semantics 251
Index 253