Preface |

xiii | |
Foreword |

xix | |
Acknowledgements |

xxi | |
List of acronyms |

xxii | |
Part I Fundamentals |
|
1 | (74) |
|
|
3 | (30) |
|
|
3 | (1) |
|
1.2 Towards continuous data processing: the requirements |
|
|
3 | (3) |
|
1.3 Stream processing foundations |
|
|
6 | (16) |
|
1.3.1 Data management technologies |
|
|
8 | (5) |
|
1.3.2 Parallel and distributed systems |
|
|
13 | (3) |
|
1.3.3 Signal processing, statistics, and data mining |
|
|
16 | (2) |
|
1.3.4 Optimization theory |
|
|
18 | (4) |
|
1.4 Stream processing - tying it all together |
|
|
22 | (2) |
|
|
24 | (9) |
|
2 Introduction to stream processing |
|
|
33 | (42) |
|
|
33 | (1) |
|
2.2 Stream Processing Applications |
|
|
33 | (7) |
|
2.2.1 Network monitoring for cybersecurity |
|
|
34 | (2) |
|
2.2.2 Transportation grid monitoring and optimization |
|
|
36 | (2) |
|
2.2.3 Healthcare and patient monitoring |
|
|
38 | (2) |
|
|
40 | (1) |
|
2.3 Information flow processing technologies |
|
|
40 | (5) |
|
|
41 | (1) |
|
|
42 | (1) |
|
2.3.3 Publish-subscribe systems |
|
|
42 | (1) |
|
2.3.4 Complex event processing systems |
|
|
43 | (1) |
|
2.3.5 ETL and SCADA systems |
|
|
44 | (1) |
|
2.4 Stream Processing Systems |
|
|
45 | (23) |
|
|
45 | (4) |
|
|
49 | (4) |
|
2.4.3 System architecture |
|
|
53 | (3) |
|
|
56 | (10) |
|
|
66 | (2) |
|
|
68 | (1) |
|
|
69 | (1) |
|
|
70 | (5) |
Part II Application development |
|
75 | (126) |
|
3 Application development - the basics |
|
|
77 | (29) |
|
|
77 | (1) |
|
3.2 Characteristics of SPAs |
|
|
77 | (3) |
|
3.3 Stream processing languages |
|
|
80 | (6) |
|
3.3.1 Features of stream processing languages |
|
|
80 | (3) |
|
3.3.2 Approaches to stream processing language design |
|
|
83 | (3) |
|
|
86 | (6) |
|
|
86 | (1) |
|
3.4.2 A "Hello World" application in SPL |
|
|
87 | (5) |
|
3.5 Common stream processing operators |
|
|
92 | (9) |
|
3.5.1 Stream relational operators |
|
|
92 | (4) |
|
|
96 | (1) |
|
3.5.3 Edge adapter operators |
|
|
97 | (4) |
|
|
101 | (1) |
|
3.7 Programming exercises |
|
|
101 | (2) |
|
|
103 | (3) |
|
4 Application development - data flow programming |
|
|
106 | (42) |
|
|
106 | (1) |
|
|
106 | (22) |
|
|
108 | (4) |
|
4.2.2 Dynamic composition |
|
|
112 | (10) |
|
|
122 | (6) |
|
|
128 | (16) |
|
|
128 | (3) |
|
4.3.2 Selectivity and arity |
|
|
131 | (1) |
|
|
132 | (2) |
|
4.3.4 Output assignments and output functions |
|
|
134 | (2) |
|
|
136 | (2) |
|
|
138 | (6) |
|
|
144 | (1) |
|
4.5 Programming exercises |
|
|
144 | (3) |
|
|
147 | (1) |
|
5 Large-scale development - modularity, extensibility, and distribution |
|
|
148 | (30) |
|
|
148 | (1) |
|
5.2 Modularity and extensibility |
|
|
148 | (16) |
|
|
149 | (2) |
|
|
151 | (2) |
|
5.2.3 Primitive operators |
|
|
153 | (8) |
|
5.2.4 Composite and custom operators |
|
|
161 | (3) |
|
5.3 Distributed programming |
|
|
164 | (8) |
|
5.3.1 Logical versus physical flow graphs |
|
|
164 | (2) |
|
|
166 | (4) |
|
|
170 | (2) |
|
|
172 | (1) |
|
5.5 Programming exercises |
|
|
173 | (3) |
|
|
176 | (2) |
|
6 Visualization and debugging |
|
|
178 | (23) |
|
|
178 | (1) |
|
|
178 | (10) |
|
6.2.1 Topology visualization |
|
|
179 | (5) |
|
6.2.2 Metrics visualization |
|
|
184 | (1) |
|
6.2.3 Status visualization |
|
|
185 | (1) |
|
|
186 | (2) |
|
|
188 | (11) |
|
|
189 | (5) |
|
6.3.2 User-defined operator debugging |
|
|
194 | (1) |
|
6.3.3 Deployment debugging |
|
|
194 | (1) |
|
6.3.4 Performance debugging |
|
|
195 | (4) |
|
|
199 | (1) |
|
|
200 | (1) |
Part III System architecture |
|
201 | (72) |
|
7 Architecture of a stream processing system |
|
|
203 | (15) |
|
|
203 | (1) |
|
7.2 Architectural building blocks |
|
|
203 | (4) |
|
7.2.1 Computational environment |
|
|
204 | (1) |
|
|
204 | (2) |
|
|
206 | (1) |
|
7.3 Architecture overview |
|
|
207 | (8) |
|
|
207 | (1) |
|
7.3.2 Resource management |
|
|
208 | (1) |
|
|
209 | (1) |
|
|
210 | (1) |
|
|
211 | (1) |
|
|
212 | (1) |
|
7.3.7 Logging and error reporting |
|
|
213 | (1) |
|
7.3.8 Security and access control |
|
|
213 | (1) |
|
|
214 | (1) |
|
|
214 | (1) |
|
7.4 Interaction with the system architecture |
|
|
215 | (1) |
|
|
215 | (1) |
|
|
215 | (3) |
|
8 InfoSphere Streams architecture |
|
|
218 | (55) |
|
|
218 | (1) |
|
8.2 Background and history |
|
|
218 | (1) |
|
|
219 | (1) |
|
|
220 | (12) |
|
|
222 | (1) |
|
8.4.2 Instance components |
|
|
223 | (4) |
|
|
227 | (2) |
|
|
229 | (3) |
|
|
232 | (36) |
|
|
232 | (4) |
|
8.5.2 Resource management and monitoring |
|
|
236 | (3) |
|
|
239 | (2) |
|
|
241 | (6) |
|
|
247 | (1) |
|
8.5.6 Logging, tracing, and error reporting |
|
|
248 | (3) |
|
8.5.7 Security and access control |
|
|
251 | (5) |
|
8.5.8 Application development support |
|
|
256 | (3) |
|
|
259 | (5) |
|
|
264 | (3) |
|
|
267 | (1) |
|
|
268 | (2) |
|
|
270 | (3) |
Part IV Application design and analytics |
|
273 | (166) |
|
9 Design principles and patterns for stream processing applications |
|
|
275 | (67) |
|
|
275 | (1) |
|
9.2 Functional design patterns and principles |
|
|
275 | (35) |
|
|
275 | (12) |
|
|
287 | (14) |
|
|
301 | (9) |
|
9.3 Non-functional principles and design patterns |
|
|
310 | (29) |
|
9.3.1 Application design and composition |
|
|
310 | (4) |
|
|
314 | (11) |
|
9.3.3 Performance optimization |
|
|
325 | (8) |
|
|
333 | (6) |
|
|
339 | (1) |
|
|
339 | (3) |
|
10 Stream analytics: data pre-processing and transformation |
|
|
342 | (46) |
|
|
342 | (1) |
|
|
342 | (2) |
|
|
344 | (1) |
|
10.4 Descriptive statistics |
|
|
345 | (8) |
|
10.4.1 Illustrative technique: BasicCounting |
|
|
348 | (5) |
|
|
353 | (1) |
|
|
353 | (5) |
|
10.5.1 Illustrative technique: reservoir sampling |
|
|
356 | (1) |
|
|
357 | (1) |
|
|
358 | (5) |
|
10.6.1 Illustrative technique: Count-Min sketch |
|
|
360 | (3) |
|
|
363 | (1) |
|
|
363 | (7) |
|
10.7.1 Illustrative techniques: binary clipping and moment preserving quantization |
|
|
366 | (3) |
|
|
369 | (1) |
|
10.8 Dimensionality reduction |
|
|
370 | (5) |
|
10.8.1 Illustrative technique: SPIRIT |
|
|
373 | (2) |
|
|
375 | (1) |
|
|
375 | (8) |
|
10.9.1 Illustrative technique: the Haar transform |
|
|
379 | (4) |
|
|
383 | (1) |
|
|
383 | (1) |
|
|
383 | (5) |
|
11 Stream analytics: modeling and evaluation |
|
|
388 | (51) |
|
|
388 | (1) |
|
11.2 Offline modeling and online evaluation |
|
|
389 | (5) |
|
11.3 Data stream classification |
|
|
394 | (9) |
|
11.3.1 Illustrative technique: VFDT |
|
|
398 | (4) |
|
|
402 | (1) |
|
11.4 Data stream clustering |
|
|
403 | (11) |
|
11.4.1 Illustrative technique: CluStream microclustering |
|
|
409 | (4) |
|
|
413 | (1) |
|
11.5 Data stream regression |
|
|
414 | (6) |
|
11.5.1 Illustrative technique: linear regression with SGD |
|
|
417 | (2) |
|
|
419 | (1) |
|
11.6 Data stream frequent pattern mining |
|
|
420 | (7) |
|
11.6.1 Illustrative technique: lossy counting |
|
|
425 | (1) |
|
|
426 | (1) |
|
|
427 | (6) |
|
11.7.1 Illustrative technique: micro-clustering-based anomaly detection |
|
|
432 | (1) |
|
|
432 | (1) |
|
|
433 | (1) |
|
|
433 | (6) |
Part V Case studies |
|
439 | (46) |
|
|
441 | (44) |
|
|
441 | (1) |
|
12.2 The Operations Monitoring application |
|
|
442 | (12) |
|
|
442 | (1) |
|
|
443 | (2) |
|
|
445 | (6) |
|
|
451 | (2) |
|
|
453 | (1) |
|
12.3 The Patient Monitoring application |
|
|
454 | (13) |
|
|
454 | (1) |
|
|
455 | (1) |
|
|
456 | (7) |
|
|
463 | (4) |
|
12.4 The Semiconductor Process Control application |
|
|
467 | (15) |
|
|
467 | (2) |
|
|
469 | (3) |
|
|
472 | (7) |
|
|
479 | (2) |
|
|
481 | (1) |
|
|
482 | (1) |
|
|
482 | (3) |
Part VI Closing notes |
|
485 | (15) |
|
|
487 | (13) |
|
|
487 | (1) |
|
13.2 Challenges and open problems |
|
|
488 | (8) |
|
13.2.1 Software engineering |
|
|
488 | (3) |
|
|
491 | (2) |
|
13.2.3 Scaling up and distributed computing |
|
|
493 | (2) |
|
|
495 | (1) |
|
13.3 Where do we go from here? |
|
|
496 | (1) |
|
|
497 | (3) |
Keywords and identifiers index |
|
500 | (4) |
Index |
|
504 | |