About the editors |
|
xv | |
About the contributors |
|
xvii | |
Foreword |
|
xxv | |
Foreword |
|
xxvii | |
Preface |
|
xxix | |
Acknowledgements |
|
xxxi | |
Introduction |
|
xxxiii | |
|
1 Big data analytics for security intelligence |
|
|
1 | (20) |
|
|
|
|
|
1.1 Introduction to big data analytics |
|
|
1 | (1) |
|
1.2 Big data: huge potentials for information security |
|
|
2 | (3) |
|
1.3 Big data challenges for cybersecurity |
|
|
5 | (1) |
|
1.4 Related work on decision engine techniques |
|
|
5 | (2) |
|
1.5 Big network anomaly detection |
|
|
7 | (1) |
|
1.6 Big data for large-scale security monitoring |
|
|
7 | (3) |
|
1.7 Mechanisms to prevent attacks |
|
|
10 | (2) |
|
1.8 Big data analytics for intrusion detection system |
|
|
12 | (3) |
|
|
12 | (1) |
|
|
12 | (3) |
|
|
15 | (6) |
|
|
15 | (1) |
|
|
15 | (1) |
|
|
16 | (5) |
|
2 Zero attraction data selective adaptive filtering algorithm for big data applications |
|
|
21 | (16) |
|
|
|
|
21 | (2) |
|
|
23 | (1) |
|
2.3 Proposed data preprocessing framework |
|
|
24 | (5) |
|
2.3.1 Proposed update rule |
|
|
26 | (1) |
|
2.3.2 Selection of thresholds |
|
|
27 | (1) |
|
|
28 | (1) |
|
|
29 | (4) |
|
|
33 | (4) |
|
|
33 | (4) |
|
3 Secure routing in software defined networking and Internet of Things for big data |
|
|
37 | (36) |
|
|
|
|
|
37 | (3) |
|
|
40 | (1) |
|
3.3 Intersection of big data and IoT |
|
|
41 | (1) |
|
|
42 | (3) |
|
3.4.1 Taxonomy of big data analytics |
|
|
42 | (1) |
|
3.4.2 Architecture of IoT big data |
|
|
43 | (2) |
|
3.5 Security and privacy challenges of big data |
|
|
45 | (1) |
|
3.6 Routing protocols in IoT |
|
|
46 | (1) |
|
3.7 Security challenges and existing solutions in IoT routing |
|
|
47 | (2) |
|
3.7.1 Selective forwarding attacks |
|
|
47 | (1) |
|
|
47 | (1) |
|
3.7.3 HELLO flood and acknowledgment spoofing attacks |
|
|
48 | (1) |
|
|
48 | (1) |
|
|
48 | (1) |
|
|
48 | (1) |
|
3.7.7 Denial-of-service (DoS) attacks |
|
|
49 | (1) |
|
3.8 The arrival of SDN into big data and IoT |
|
|
49 | (1) |
|
|
50 | (4) |
|
|
54 | (4) |
|
3.11 Attacks on SDN and existing solutions |
|
|
58 | (6) |
|
3.11.1 Conflicting flow rules |
|
|
58 | (3) |
|
|
61 | (1) |
|
|
62 | (1) |
|
3.11.4 Information disclosure |
|
|
62 | (1) |
|
3.11.5 Denial-of-service (DoS) attacks |
|
|
63 | (1) |
|
3.11.6 Exploiting vulnerabilities in OpenFlow switches |
|
|
63 | (1) |
|
3.11.7 Exploiting vulnerabilities in SDN controllers |
|
|
63 | (1) |
|
3.12 Can SDN be applied to IoT? |
|
|
64 | (1) |
|
|
65 | (8) |
|
|
66 | (7) |
|
4 Efficient ciphertext-policy attribute-based signcryption for secure big data storage in cloud |
|
|
73 | (30) |
|
|
|
|
|
74 | (2) |
|
|
75 | (1) |
|
|
76 | (1) |
|
|
76 | (4) |
|
|
78 | (2) |
|
|
80 | (2) |
|
4.3.1 System architecture of ECP-ABSC |
|
|
80 | (1) |
|
4.3.2 Formal definition of ECP-ABSC |
|
|
81 | (1) |
|
|
82 | (1) |
|
4.4 Construction of ECP-ABSC scheme |
|
|
82 | (5) |
|
|
82 | (1) |
|
|
83 | (1) |
|
|
84 | (1) |
|
|
85 | (2) |
|
|
87 | (7) |
|
4.6 Performance evaluation |
|
|
94 | (5) |
|
|
99 | (4) |
|
|
99 | (4) |
|
5 Privacy-preserving techniques in big data |
|
|
103 | (24) |
|
|
|
|
103 | (2) |
|
5.2 Big data privacy in data generation phase |
|
|
105 | (2) |
|
|
105 | (1) |
|
|
106 | (1) |
|
5.3 Big data privacy in data storage phase |
|
|
107 | (2) |
|
5.3.1 Attribute-based encryption |
|
|
107 | (1) |
|
5.3.2 Identity-based encryption |
|
|
107 | (1) |
|
5.3.3 Homomorphic encryption |
|
|
108 | (1) |
|
5.3.4 Storage path encryption |
|
|
108 | (1) |
|
5.3.5 Usage of hybrid clouds |
|
|
109 | (1) |
|
5.4 Big data privacy in data processing phase |
|
|
109 | (2) |
|
5.4.1 Protect data from unauthorized disclosure |
|
|
109 | (2) |
|
5.4.2 Extract significant information without trampling privacy |
|
|
111 | (1) |
|
5.5 Traditional privacy-preserving techniques and its scalability in big data |
|
|
111 | (4) |
|
|
112 | (3) |
|
|
115 | (1) |
|
5.6 Recent privacy preserving techniques in big data |
|
|
115 | (6) |
|
|
115 | (2) |
|
5.6.2 Differential privacy |
|
|
117 | (2) |
|
5.6.3 Hiding a needle in a haystack: privacy-preserving a priori algorithm in MapReduce framework |
|
|
119 | (2) |
|
5.7 Privacy-preserving solutions in resource constrained devices |
|
|
121 | (1) |
|
|
122 | (5) |
|
|
123 | (4) |
|
6 Big data and behaviour analytics |
|
|
127 | (18) |
|
|
|
|
6.1 Introduction about big data and behaviour analytics |
|
|
128 | (2) |
|
|
130 | (3) |
|
|
133 | (1) |
|
6.4 Importance and benefits of big data and behaviour analytics |
|
|
133 | (1) |
|
6.4.1 Importance of big data analytics |
|
|
133 | (1) |
|
6.5 Existing algorithms, tools available for data analytics and behaviour analytics |
|
|
134 | (2) |
|
|
135 | (1) |
|
|
135 | (1) |
|
|
135 | (1) |
|
6.5.4 Konstanz Information Miner |
|
|
135 | (1) |
|
|
135 | (1) |
|
|
136 | (1) |
|
|
136 | (1) |
|
6.6 Open issues and challenges with big data analytics and behaviour analytics |
|
|
136 | (2) |
|
6.6.1 Challenges with big data analytics |
|
|
136 | (1) |
|
6.6.2 Issues with big data analytics (BDA) |
|
|
137 | (1) |
|
6.7 Opportunities for future researchers |
|
|
138 | (1) |
|
6.8 A taxonomy for analytics and its related terms |
|
|
139 | (1) |
|
|
139 | (6) |
|
|
140 | (2) |
|
|
142 | (3) |
|
7 Analyzing events for traffic prediction on IoT data streams in a smart city scenario |
|
|
145 | (24) |
|
|
|
|
146 | (2) |
|
|
148 | (1) |
|
7.3 Research preliminaries |
|
|
148 | (7) |
|
7.3.1 Dataset description |
|
|
148 | (2) |
|
|
150 | (1) |
|
7.3.3 Complex event processing |
|
|
151 | (1) |
|
7.3.4 Clustering approaches |
|
|
151 | (1) |
|
|
152 | (1) |
|
|
153 | (2) |
|
|
155 | (5) |
|
7.4.1 Statistical approach to optimize the number of retrainings |
|
|
159 | (1) |
|
7.5 Experimental results and discussion |
|
|
160 | (4) |
|
|
164 | (5) |
|
|
165 | (1) |
|
|
165 | (4) |
|
8 Gender-based classification on e-commerce big data |
|
|
169 | (28) |
|
|
Venkata Lakshmi Narayana Somayajulu Durvasula |
|
|
|
|
170 | (4) |
|
8.1.1 E-Commerce and big data |
|
|
171 | (3) |
|
8.2 Gender prediction methodology |
|
|
174 | (20) |
|
8.2.1 Gender prediction based on gender value |
|
|
174 | (11) |
|
8.2.2 Classification using random forest |
|
|
185 | (3) |
|
8.2.3 Classification using gradient-boosted trees (GBTs) |
|
|
188 | (2) |
|
8.2.4 Experimental results with state-of-the-art classifiers |
|
|
190 | (4) |
|
|
194 | (3) |
|
|
195 | (2) |
|
9 On recommender systems with big data |
|
|
197 | (32) |
|
|
|
|
|
198 | (2) |
|
9.1.1 Big data and recommender systems |
|
|
199 | (1) |
|
9.2 Recommender systems challenges |
|
|
200 | (4) |
|
9.2.1 Big-data-specific challenges in RS |
|
|
202 | (2) |
|
9.3 Techniques and approaches for recommender systems |
|
|
204 | (14) |
|
9.3.1 Early recommender systems |
|
|
205 | (7) |
|
9.3.2 Big-data recommender systems |
|
|
212 | (5) |
|
9.3.3 X-aware recommender systems |
|
|
217 | (1) |
|
9.4 Leveraging big data analytics on recommender systems |
|
|
218 | (2) |
|
|
218 | (1) |
|
|
219 | (1) |
|
9.4.3 Banking and finance |
|
|
219 | (1) |
|
|
220 | (1) |
|
|
220 | (1) |
|
9.6 Popular datasets for recommender systems |
|
|
221 | (2) |
|
|
223 | (6) |
|
|
223 | (6) |
|
10 Analytics in e-commerce at scale |
|
|
229 | (12) |
|
|
|
|
229 | (1) |
|
|
230 | (2) |
|
10.2.1 Business and system metrics |
|
|
230 | (2) |
|
|
232 | (1) |
|
|
232 | (2) |
|
|
233 | (1) |
|
|
233 | (1) |
|
|
234 | (1) |
|
|
234 | (1) |
|
|
234 | (5) |
|
|
235 | (1) |
|
10.4.2 Data preprocessing |
|
|
236 | (1) |
|
10.4.3 Batch data processing |
|
|
237 | (1) |
|
10.4.4 Streaming processing |
|
|
238 | (1) |
|
10.4.5 Report visualization |
|
|
238 | (1) |
|
|
239 | (1) |
|
|
239 | (1) |
|
|
239 | (2) |
|
11 Big data regression via parallelized radial basis function neural network in Apache Spark |
|
|
241 | (10) |
|
|
|
|
241 | (1) |
|
|
242 | (1) |
|
|
242 | (1) |
|
|
242 | (1) |
|
11.5 Proposed methodology |
|
|
243 | (3) |
|
11.5.1 Introduction to K-means++ |
|
|
243 | (1) |
|
11.5.2 Introduction to K-means |
|
|
243 | (1) |
|
11.5.3 Introduction to parallel bisecting K-means |
|
|
244 | (1) |
|
11.5.4 PRBFNN: the proposed approach |
|
|
244 | (2) |
|
|
246 | (1) |
|
|
246 | (2) |
|
11.8 Results and discussion |
|
|
248 | (1) |
|
11.9 Conclusion and future directions |
|
|
248 | (3) |
|
|
249 | (2) |
|
12 Visual sentiment analysis of bank customer complaints using parallel self-organizing maps |
|
|
251 | (22) |
|
|
|
|
|
|
|
251 | (2) |
|
|
253 | (1) |
|
|
254 | (1) |
|
|
254 | (1) |
|
12.5 Description of the techniques used |
|
|
255 | (1) |
|
12.5.1 Self-organizing feature maps |
|
|
255 | (1) |
|
12.5.2 Compute Unified Device Architecture |
|
|
255 | (1) |
|
|
256 | (2) |
|
12.6.1 Text preprocessing |
|
|
257 | (1) |
|
12.6.2 Implementation of CUDASOM |
|
|
257 | (1) |
|
12.6.3 Segmentation of customer complaints using SOM |
|
|
258 | (1) |
|
|
258 | (2) |
|
|
259 | (1) |
|
12.7.2 Preprocessing steps |
|
|
259 | (1) |
|
|
260 | (1) |
|
12.8 Results and discussion |
|
|
260 | (8) |
|
12.8.1 Segmentation of customer complaints using CUDASOM |
|
|
260 | (5) |
|
12.8.2 Performance of CUDASOM |
|
|
265 | (3) |
|
12.9 Conclusions and future directions |
|
|
268 | (5) |
|
|
268 | (1) |
|
|
269 | (4) |
|
13 Wavelet neural network for big data analytics in banking via GPU |
|
|
273 | (12) |
|
|
|
|
273 | (1) |
|
|
274 | (3) |
|
|
277 | (1) |
|
13.4 Proposed methodology |
|
|
277 | (1) |
|
|
278 | (1) |
|
13.5.1 Datasets description |
|
|
278 | (1) |
|
13.5.2 Experimental procedure |
|
|
279 | (1) |
|
13.6 Results and discussion |
|
|
279 | (3) |
|
13.7 Conclusion and future work |
|
|
282 | (3) |
|
|
282 | (3) |
|
14 Stock market movement prediction using evolving spiking neural networks |
|
|
285 | (28) |
|
|
|
|
|
|
|
|
286 | (1) |
|
|
287 | (1) |
|
|
288 | (1) |
|
14.4 The proposed SI-eSNN model for stock trend prediction based on stock indicators |
|
|
289 | (5) |
|
14.4.1 Overall architecture |
|
|
289 | (2) |
|
|
291 | (1) |
|
|
292 | (1) |
|
14.4.4 Learning in the output neurons |
|
|
292 | (1) |
|
14.4.5 Algorithm for eSNN training |
|
|
293 | (1) |
|
14.4.6 Testing (recall of the model on new data) |
|
|
294 | (1) |
|
14.5 The proposed CUDA-eSNN model: a parallel eSNN model for GPU machines |
|
|
294 | (1) |
|
14.6 Dataset description and experiments with the SI-eSNN and the CUDA-eSNN models |
|
|
295 | (2) |
|
14.7 Sliding window (SW)-eSNN for incremental learning and stock movement prediction |
|
|
297 | (8) |
|
14.8 Gaussian receptive fields influence |
|
|
305 | (3) |
|
14.9 Conclusion and future directions |
|
|
308 | (5) |
|
|
309 | (4) |
|
15 Parallel hierarchical clustering of big text corpora |
|
|
313 | (30) |
|
|
|
313 | (4) |
|
15.2 Parallel hierarchical clustering algorithms |
|
|
317 | (6) |
|
15.2.1 Agglomerative clustering |
|
|
318 | (1) |
|
15.2.2 Graph-based clustering |
|
|
318 | (1) |
|
15.2.3 Partitional clustering algorithms |
|
|
319 | (1) |
|
15.2.4 Parallel clustering on SIMD/MIMD machines |
|
|
320 | (1) |
|
15.2.5 Density-based clustering algorithms |
|
|
321 | (1) |
|
15.2.6 Transform-based clustering |
|
|
321 | (1) |
|
15.2.7 Grid-based clustering |
|
|
322 | (1) |
|
15.2.8 Evolutionary clustering |
|
|
322 | (1) |
|
15.2.9 Spectral clustering |
|
|
322 | (1) |
|
15.2.10 Latent model-based clustering |
|
|
323 | (1) |
|
15.3 Parallel document clustering algorithms |
|
|
323 | (2) |
|
15.4 Parallel hierarchical algorithms for big text clustering |
|
|
325 | (12) |
|
15.4.1 Parallel hierarchical cut clustering |
|
|
326 | (2) |
|
15.4.2 Parallel hierarchical latent semantic analysis |
|
|
328 | (3) |
|
15.4.3 Parallel hierarchical modularity-based spectral clustering |
|
|
331 | (3) |
|
15.4.4 Parallel hierarchical latent Dirichlet allocation |
|
|
334 | (1) |
|
15.4.5 PHCUT vs. PHLSA vs. PHMS vs. PHLDA |
|
|
335 | (2) |
|
15.4.6 Research challenges addressed |
|
|
337 | (1) |
|
15.5 Open research challenges |
|
|
337 | (1) |
|
|
338 | (5) |
|
|
339 | (4) |
|
16 Contract-driven financial reporting: building automated analytics pipelines with algorithmic contracts, Big Data and Distributed Ledger technology |
|
|
343 | (24) |
|
|
|
|
|
343 | (3) |
|
16.2 The ACTUS methodology |
|
|
346 | (2) |
|
16.3 The mathematics of ACTUS |
|
|
348 | (6) |
|
16.3.1 Contract terms, contract algorithms and cash flow streams |
|
|
348 | (2) |
|
16.3.2 Description of cash flow streams |
|
|
350 | (1) |
|
16.3.3 Standard analytics as linear operators |
|
|
351 | (3) |
|
16.4 ACTUS in action: proof of concept with a bond portfolio |
|
|
354 | (5) |
|
16.5 Scalable financial analytics |
|
|
359 | (5) |
|
16.6 Towards future automated reporting |
|
|
364 | (3) |
|
|
367 | (1) |
Acknowledgements |
|
367 | (1) |
References |
|
367 | (4) |
Overall conclusions |
|
371 | (2) |
|
|
Index |
|
373 | |