Big Data Analytics with Applications in Insider Threat Detection [Hardback]

, , (The University of Texas at Dallas, USA), (University of Texas at Dallas, Richardson, Texas, USA)
  • Format: Hardback, 544 pages, height x width: 254x178 mm, weight: 1200 g, 50 Illustrations, black and white
  • Publication date: 01-Dec-2017
  • Publisher: Auerbach Publishers Inc.
  • ISBN-10: 1498705472
  • ISBN-13: 9781498705479

Today's malware mutates randomly to avoid detection, but reactively adaptive malware is more intelligent: it learns and adapts to new computer defenses on the fly. It turns the same algorithms that antivirus software uses to detect viruses against those defenses in order to go undetected. This book details the tools, the types of malware they detect, their implementation in a cloud computing framework, and their applications in insider threat detection.

Preface
Acknowledgments
Permissions
Authors
Chapter 1 Introduction
1.1 Overview
1.2 Supporting Technologies
1.3 Stream Data Analytics
1.4 Applications of Stream Data Analytics for Insider Threat Detection
1.5 Experimental BDMA and BDSP Systems
1.6 Next Steps in BDMA and BDSP
1.7 Organization of This Book
1.8 Next Steps
Part I Supporting Technologies for BDMA and BDSP
Introduction to Part I
Chapter 2 Data Security and Privacy
2.1 Overview
2.2 Security Policies
2.2.1 Access Control Policies
2.2.1.1 Authorization-Based Access Control Policies
2.2.1.2 Role-Based Access Control
2.2.1.3 Usage Control
2.2.1.4 Attribute-Based Access Control
2.2.2 Administration Policies
2.2.3 Identification and Authentication
2.2.4 Auditing: A Database System
2.2.5 Views for Security
2.3 Policy Enforcement and Related Issues
2.3.1 SQL Extensions for Security
2.3.2 Query Modification
2.3.3 Discretionary Security and Database Functions
2.4 Data Privacy
2.5 Summary and Directions
References
Chapter 3 Data Mining Techniques
3.1 Introduction
3.2 Overview of Data Mining Tasks and Techniques
3.3 Artificial Neural Networks
3.4 Support Vector Machines
3.5 Markov Model
3.6 Association Rule Mining (ARM)
3.7 Multiclass Problem
3.8 Image Mining
3.8.1 Overview
3.8.2 Feature Selection
3.8.3 Automatic Image Annotation
3.8.4 Image Classification
3.9 Summary
References
Chapter 4 Data Mining for Security Applications
4.1 Overview
4.2 Data Mining for Cyber Security
4.2.1 Cyber Security Threats
4.2.1.1 Cyber Terrorism, Insider Threats, and External Attacks
4.2.1.2 Malicious Intrusions
4.2.1.3 Credit Card Fraud and Identity Theft
4.2.1.4 Attacks on Critical Infrastructures
4.2.2 Data Mining for Cyber Security
4.3 Data Mining Tools
4.4 Summary and Directions
References
Chapter 5 Cloud Computing and Semantic Web Technologies
5.1 Introduction
5.2 Cloud Computing
5.2.1 Overview
5.2.2 Preliminaries
5.2.2.1 Cloud Deployment Models
5.2.2.2 Service Models
5.2.3 Virtualization
5.2.4 Cloud Storage and Data Management
5.2.5 Cloud Computing Tools
5.2.5.1 Apache Hadoop
5.2.5.2 MapReduce
5.2.5.3 CouchDB
5.2.5.4 HBase
5.2.5.5 MongoDB
5.2.5.6 Hive
5.2.5.7 Apache Cassandra
5.3 Semantic Web
5.3.1 XML
5.3.2 RDF
5.3.3 SPARQL
5.3.4 OWL
5.3.5 Description Logics
5.3.6 Inferencing
5.3.7 SWRL
5.4 Semantic Web and Security
5.4.1 XML Security
5.4.2 RDF Security
5.4.3 Security and Ontologies
5.4.4 Secure Query and Rules Processing
5.5 Cloud Computing Frameworks Based on Semantic Web Technologies
5.5.1 RDF Integration
5.5.2 Provenance Integration
5.6 Summary and Directions
References
Chapter 6 Data Mining and Insider Threat Detection
6.1 Introduction
6.2 Insider Threat Detection
6.3 The Challenges, Related Work, and Our Approach
6.4 Data Mining for Insider Threat Detection
6.4.1 Our Solution Architecture
6.4.2 Feature Extraction and Compact Representation
6.4.2.1 Vector Representation of the Content
6.4.2.2 Subspace Clustering
6.4.3 RDF Repository Architecture
6.4.4 Data Storage
6.4.4.1 File Organization
6.4.5 Answering Queries Using Hadoop MapReduce
6.4.6 Data Mining Applications
6.5 Comprehensive Framework
6.6 Summary and Directions
References
Chapter 7 Big Data Management and Analytics Technologies
7.1 Introduction
7.2 Infrastructure Tools to Host BDMA Systems
7.3 BDMA Systems and Tools
7.3.1 Apache Hive
7.3.2 Google BigQuery
7.3.3 NoSQL Database
7.3.4 Google BigTable
7.3.5 Apache HBase
7.3.6 MongoDB
7.3.7 Apache Cassandra
7.3.8 Apache CouchDB
7.3.9 Oracle NoSQL Database
7.3.10 Weka
7.3.11 Apache Mahout
7.4 Cloud Platforms
7.4.1 Amazon Web Services' DynamoDB
7.4.2 Microsoft Azure's Cosmos DB
7.4.3 IBM's Cloud-Based Big Data Solutions
7.4.4 Google's Cloud-Based Big Data Solutions
7.5 Summary and Directions
References
Conclusion to Part I
Part II Stream Data Analytics
Introduction to Part II
Chapter 8 Challenges for Stream Data Classification
8.1 Introduction
8.2 Challenges
8.3 Infinite Length and Concept Drift
8.4 Concept Evolution
8.5 Limited Labeled Data
8.6 Experiments
8.7 Our Contributions
8.8 Summary and Directions
References
Chapter 9 Survey of Stream Data Classification
9.1 Introduction
9.2 Approach to Data Stream Classification
9.3 Single-Model Classification
9.4 Ensemble Classification and Baseline Approach
9.5 Novel Class Detection
9.5.1 Novelty Detection
9.5.2 Outlier Detection
9.5.3 Baseline Approach
9.6 Data Stream Classification with Limited Labeled Data
9.6.1 Semisupervised Clustering
9.6.2 Baseline Approach
9.7 Summary and Directions
References
Chapter 10 A Multi-Partition, Multi-Chunk Ensemble for Classifying Concept-Drifting Data Streams
10.1 Introduction
10.2 Ensemble Development
10.2.1 Multiple Partitions of Multiple Chunks
10.2.1.1 An Ensemble Built on MPC
10.2.1.2 MPC Ensemble Updating Algorithm
10.2.2 Error Reduction Using MPC Training
10.2.2.1 Time Complexity of MPC
10.3 Experiments
10.3.1 Datasets and Experimental Setup
10.3.1.1 Real (Botnet) Dataset
10.3.1.2 Baseline Methods
10.3.2 Performance Study
10.4 Summary and Directions
References
Chapter 11 Classification and Novel Class Detection in Concept-Drifting Data Streams
11.1 Introduction
11.2 ECSMiner
11.2.1 Overview
11.2.2 High-Level Algorithm
11.2.3 Nearest Neighborhood Rule
11.2.4 Novel Class and Its Properties
11.2.5 Base Learners
11.2.6 Creating Decision Boundary during Training
11.3 Classification with Novel Class Detection
11.3.1 High-Level Algorithm
11.3.2 Classification
11.3.3 Novel Class Detection
11.3.4 Analysis and Discussion
11.3.4.1 Justification of the Novel Class Detection Algorithm
11.3.4.2 Deviation between Approximate and Exact q-NSC Computation
11.3.4.3 Time and Space Complexity
11.4 Experiments
11.4.1 Datasets
11.4.1.1 Synthetic Data with Only Concept Drift (SynC)
11.4.1.2 Synthetic Data with Concept Drift and Novel Class (SynCN)
11.4.1.3 Real Data: KDDCup 99 Network Intrusion Detection (KDD)
11.4.1.4 Real Data: Forest Covers Dataset from UCI Repository (Forest)
11.4.2 Experimental Setup
11.4.3 Baseline Approach
11.4.4 Performance Study
11.4.4.1 Evaluation Approach
11.4.4.2 Results
11.5 Summary and Directions
References
Chapter 12 Data Stream Classification with Limited Labeled Training Data
12.1 Introduction
12.2 Description of ReaSC
12.3 Training with Limited Labeled Data
12.3.1 Problem Description
12.3.2 Unsupervised K-Means Clustering
12.3.3 K-Means Clustering with Cluster-Impurity Minimization
12.3.4 Optimizing the Objective Function with Expectation Maximization (E-M)
12.3.5 Storing the Classification Model
12.4 Ensemble Classification
12.4.1 Classification Overview
12.4.2 Ensemble Refinement
12.4.3 Ensemble Update
12.4.4 Time Complexity
12.5 Experiments
12.5.1 Dataset
12.5.2 Experimental Setup
12.5.3 Comparison with Baseline Methods
12.5.4 Running Times, Scalability, and Memory Requirement
12.5.5 Sensitivity to Parameters
12.6 Summary and Directions
References
Chapter 13 Directions in Data Stream Classification
13.1 Introduction
13.2 Discussion of the Approaches
13.2.1 MPC Ensemble Approach
13.2.2 Classification and Novel Class Detection in Data Streams (ECSMiner)
13.2.3 Classification with Scarcely Labeled Data (ReaSC)
13.3 Extensions
13.4 Summary and Directions
References
Conclusion to Part II
Part III Stream Data Analytics for Insider Threat Detection
Introduction to Part III
Chapter 14 Insider Threat Detection as a Stream Mining Problem
14.1 Introduction
14.2 Sequence Stream Data
14.3 Big Data Issues
14.4 Contributions
14.5 Summary and Directions
References
Chapter 15 Survey of Insider Threat and Stream Mining
15.1 Introduction
15.2 Insider Threat Detection
15.3 Stream Mining
15.4 Big Data Techniques for Scalability
15.5 Summary and Directions
References
Chapter 16 Ensemble-Based Insider Threat Detection
16.1 Introduction
16.2 Ensemble Learning
16.3 Ensemble for Unsupervised Learning
16.4 Ensemble for Supervised Learning
16.5 Summary and Directions
References
Chapter 17 Details of Learning Classes
17.1 Introduction
17.2 Supervised Learning
17.3 Unsupervised Learning
17.3.1 GBAD-MDL
17.3.2 GBAD-P
17.3.3 GBAD-MPS
17.4 Summary and Directions
References
Chapter 18 Experiments and Results for Nonsequence Data
18.1 Introduction
18.2 Dataset
18.3 Experimental Setup
18.3.1 Supervised Learning
18.3.2 Unsupervised Learning
18.4 Results
18.4.1 Supervised Learning
18.4.2 Unsupervised Learning
18.5 Summary and Directions
References
Chapter 19 Insider Threat Detection for Sequence Data
19.1 Introduction
19.2 Classifying Sequence Data
19.3 Unsupervised Stream-Based Sequence Learning (USSL)
19.3.1 Construct the LZW Dictionary by Selecting the Patterns in the Data Stream
19.3.2 Constructing the Quantized Dictionary
19.4 Anomaly Detection
19.5 Complexity Analysis
19.6 Summary and Directions
References
Chapter 20 Experiments and Results for Sequence Data
20.1 Introduction
20.2 Dataset
20.3 Concept Drift in the Training Set
20.4 Results
20.4.1 Choice of Ensemble Size
20.5 Summary and Directions
References
Chapter 21 Scalability Using Big Data Technologies
21.1 Introduction
21.2 Hadoop MapReduce Platform
21.3 Scalable LZW and QD Construction Using MapReduce Job
21.3.1 2MRJ Approach
21.3.2 1MRJ Approach
21.4 Experimental Setup and Results
21.4.1 Hadoop Cluster
21.4.2 Big Dataset for Insider Threat Detection
21.4.3 Results for Big Data Set Related to Insider Threat Detection
21.4.3.1 On OD Dataset
21.4.3.2 On DBD Dataset
21.5 Summary and Directions
References
Chapter 22 Stream Mining and Big Data for Insider Threat Detection
22.1 Introduction
22.2 Discussion
22.3 Future Work
22.3.1 Incorporate User Feedback
22.3.2 Collusion Attack
22.3.3 Additional Experiments
22.3.4 Anomaly Detection in Social Network and Author Attribution
22.3.5 Stream Mining as a Big Data Mining Problem
22.4 Summary and Directions
References
Conclusion to Part III
Part IV Experimental BDMA and BDSP Systems
Introduction to Part IV
Chapter 23 Cloud Query Processing System for Big Data Management
23.1 Introduction
23.2 Our Approach
23.3 Related Work
23.4 Architecture
23.5 MapReduce Framework
23.5.1 Overview
23.5.2 Input Files Selection
23.5.3 Cost Estimation for Query Processing
23.5.4 Query Plan Generation
23.5.5 Breaking Ties by Summary Statistics
23.5.6 MapReduce Join Execution
23.6 Results
23.6.1 Experimental Setup
23.6.2 Evaluation
23.7 Security Extensions
23.7.1 Access Control Model
23.7.2 Access Token Assignment
23.7.3 Conflicts
23.8 Summary and Directions
References
Chapter 24 Big Data Analytics for Multipurpose Social Media Applications
24.1 Introduction
24.2 Our Premise
24.3 Modules of InXite
24.3.1 Overview
24.3.2 Information Engine
24.3.2.1 Entity Extraction
24.3.2.2 Information Integration
24.3.3 Person of Interest Analysis
24.3.3.1 InXite Person of Interest Profile Generation and Analysis
24.3.3.2 InXite POI Threat Analysis
24.3.3.3 InXite Psychosocial Analysis
24.3.3.4 Other Features
24.3.4 InXite Threat Detection and Prediction
24.3.5 Application of SNOD
24.3.5.1 SNOD++
24.3.5.2 Benefits of SNOD++
24.3.6 Expert Systems Support
24.3.7 Cloud-Design of InXite to Handle Big Data
24.3.8 Implementation
24.4 Other Applications
24.5 Related Work
24.6 Summary and Directions
References
Chapter 25 Big Data Management and Cloud for Assured Information Sharing
25.1 Introduction
25.2 Design Philosophy
25.3 System Design
25.3.1 Design of CAISS
25.3.2 Design of CAISS++
25.3.2.1 Limitations of CAISS
25.3.3 Formal Policy Analysis
25.3.4 Implementation Approach
25.4 Related Work
25.4.1 Our Related Research
25.4.2 Overall Related Research
25.4.3 Commercial Developments
25.5 Extensions for Big Data-Based Social Media Applications
25.6 Summary and Directions
References
Chapter 26 Big Data Management for Secure Information Integration
26.1 Introduction
26.2 Integrating Blackbook with Amazon S3
26.3 Experiments
26.4 Summary and Directions
References
Chapter 27 Big Data Analytics for Malware Detection
27.1 Introduction
27.2 Malware Detection
27.2.1 Malware Detection as a Data Stream Classification Problem
27.2.2 Cloud Computing for Malware Detection
27.2.3 Our Contributions
27.3 Related Work
27.4 Design and Implementation of the System
27.4.1 Ensemble Construction and Updating
27.4.2 Error Reduction Analysis
27.4.3 Empirical Error Reduction and Time Complexity
27.4.4 Hadoop/MapReduce Framework
27.5 Malicious Code Detection
27.5.1 Overview
27.5.2 Nondistributed Feature Extraction and Selection
27.5.3 Distributed Feature Extraction and Selection
27.6 Experiments
27.6.1 Datasets
27.6.2 Baseline Methods
27.7 Discussion
27.8 Summary and Directions
References
Chapter 28 A Semantic Web-Based Inference Controller for Provenance Big Data
28.1 Introduction
28.2 Architecture for the Inference Controller
28.3 Semantic Web Technologies and Provenance
28.3.1 Semantic Web-Based Models
28.3.2 Graphical Models and Rewriting
28.4 Inference Control through Query Modification
28.4.1 Our Approach
28.4.2 Domains and Provenance
28.4.3 Inference Controller with Two Users
28.4.4 SPARQL Query Modification
28.5 Implementing the Inference Controller
28.5.1 Our Approach
28.5.2 Implementation of a Medical Domain
28.5.3 Generating and Populating the Knowledge Base
28.5.4 Background Generator Module
28.6 Big Data Management and Inference Control
28.7 Summary and Directions
References
Conclusion to Part IV
Part V Next Steps for BDMA and BDSP
Introduction to Part V
Chapter 29 Confidentiality, Privacy, and Trust for Big Data Systems
29.1 Introduction
29.2 Trust, Privacy, and Confidentiality
29.2.1 Current Successes and Potential Failures
29.2.2 Motivation for a Framework
29.3 CPT Framework
29.3.1 The Role of the Server
29.3.2 CPT Process
29.3.3 Advanced CPT
29.3.4 Trust, Privacy, and Confidentiality Inference Engines
29.4 Our Approach to Confidentiality Management
29.5 Privacy for Social Media Systems
29.6 Trust for Social Networks
29.7 Integrated System
29.8 CPT within the Context of Big Data and Social Networks
29.9 Summary and Directions
References
Chapter 30 Unified Framework for Secure Big Data Management and Analytics
30.1 Overview
30.2 Integrity Management and Data Provenance for Big Data Systems
30.2.1 Need for Integrity
30.2.2 Aspects of Integrity
30.2.3 Inferencing, Data Quality, and Data Provenance
30.2.4 Integrity Management, Cloud Services and Big Data
30.2.5 Integrity for Big Data
30.3 Design of Our Framework
30.4 The Global Big Data Security and Privacy Controller
30.5 Summary and Directions
References
Chapter 31 Big Data, Security, and the Internet of Things
31.1 Introduction
31.2 Use Cases
31.3 Layered Framework for Secure IoT
31.4 Protecting the Data
31.5 Scalable Analytics for IoT Security Applications
31.6 Summary and Directions
References
Chapter 32 Big Data Analytics for Malware Detection in Smartphones
32.1 Introduction
32.2 Our Approach
32.2.1 Challenges
32.2.2 Behavioral Feature Extraction and Analysis
32.2.2.1 Graph-Based Behavior Analysis
32.2.2.2 Sequence-Based Behavior Analysis
32.2.2.3 Evolving Data Stream Classification
32.2.3 Reverse Engineering Methods
32.2.4 Risk-Based Framework
32.2.5 Application to Smartphones
32.2.5.1 Data Gathering
32.2.5.2 Malware Detection
32.2.5.3 Data Reverse Engineering of Smartphone Applications
32.3 Our Experimental Activities
32.3.1 Covert Channel Attack in Mobile Apps
32.3.2 Detecting Location Spoofing in Mobile Apps
32.3.3 Large Scale, Automated Detection of SSL/TLS Man-in-the-Middle Vulnerabilities in Android Apps
32.4 Infrastructure Development
32.4.1 Virtual Laboratory Development
32.4.1.1 Laboratory Setup
32.4.1.2 Programming Projects to Support the Virtual Lab
32.4.1.3 An Intelligent Fuzzer for the Automatic Android GUI Application Testing
32.4.1.4 Problem Statement
32.4.1.5 Understanding the Interface
32.4.1.6 Generating Input Events
32.4.1.7 Mitigating Data Leakage in Mobile Apps Using a Transactional Approach
32.4.1.8 Technical Challenges
32.4.1.9 Experimental System
32.4.1.10 Policy Engine
32.4.2 Curriculum Development
32.4.2.1 Extensions to Existing Courses
32.4.2.2 New Capstone Course on Secure Mobile Computing
32.5 Summary and Directions
References
Chapter 33 Toward a Case Study in Healthcare for Big Data Analytics and Security
33.1 Introduction
33.2 Motivation
33.2.1 The Problem
33.2.2 Air Quality Data
33.2.3 Need for Such a Case Study
33.3 Methodologies
33.4 The Framework Design
33.4.1 Storing and Retrieving Multiple Types of Scientific Data
33.4.1.1 The Problem and Challenges
33.4.1.2 Current Systems and Their Limitations
33.4.1.3 The Future System
33.4.2 Privacy and Security Aware Data Management for Scientific Data
33.4.2.1 The Problem and Challenges
33.4.2.2 Current Systems and Their Limitations
33.4.2.3 The Future System
33.4.3 Offline Scalable Statistical Analytics
33.4.3.1 The Problem and Challenges
33.4.3.2 Current Systems and Their Limitations
33.4.3.3 The Future System
33.4.3.4 Mixed Continuous and Discrete Domains
33.4.4 Real-Time Stream Analytics
33.4.4.1 The Problem and Challenges
33.4.5 Current Systems and Their Limitations
33.4.5.1 The Future System
33.5 Summary and Directions
References
Chapter 34 Toward an Experimental Infrastructure and Education Program for BDMA and BDSP
34.1 Introduction
34.2 Current Research and Infrastructure Activities in BDMA and BDSP
34.2.1 Big Data Analytics for Insider Threat Detection
34.2.2 Secure Data Provenance
34.2.3 Secure Cloud Computing
34.2.4 Binary Code Analysis
34.2.5 Cyber-Physical Systems Security
34.2.6 Trusted Execution Environment
34.2.7 Infrastructure Development
34.3 Education and Infrastructure Program in BDMA
34.3.1 Curriculum Development
34.3.2 Experimental Program
34.3.2.1 Geospatial Data Processing on GDELT
34.3.2.2 Coding for Political Event Data
34.3.2.3 Timely Health Indicator
34.4 Security and Privacy for Big Data
34.4.1 Our Approach
34.4.2 Curriculum Development
34.4.2.1 Extensions to Existing Courses
34.4.2.2 New Capstone Course on BDSP
34.4.3 Experimental Program
34.4.3.1 Laboratory Setup
34.4.3.2 Programming Projects to Support the Lab
34.5 Summary and Directions
References
Chapter 35 Directions for BDSP and BDMA
35.1 Introduction
35.2 Issues in BDSP
35.2.1 Introduction
35.2.2 Big Data Management and Analytics
35.2.3 Security and Privacy
35.2.4 Big Data Analytics for Security Applications
35.2.5 Community Building
35.3 Summary of Workshop Presentations
35.3.1 Keynote Presentations
35.3.1.1 Toward Privacy Aware Big Data Analytics
35.3.1.2 Formal Methods for Preserving Privacy While Loading Big Data
35.3.1.3 Authenticity of Digital Images in Social Media
35.3.1.4 Business Intelligence Meets Big Data: An Overview of Security and Privacy
35.3.1.5 Toward Risk-Aware Policy-Based Framework for BDSP
35.3.1.6 Big Data Analytics: Privacy Protection Using Semantic Web Technologies
35.3.1.7 Securing Big Data in the Cloud: Toward a More Focused and Data-Driven Approach
35.3.1.8 Privacy in a World of Mobile Devices
35.3.1.9 Access Control and Privacy Policy Challenges in Big Data
35.3.1.10 Timely Health Indicators Using Remote Sensing and Innovation for the Validity of the Environment
35.3.1.11 Additional Presentations
35.3.1.12 Final Thoughts on the Presentations
35.4 Summary of the Workshop Discussions
35.4.1 Introduction
35.4.2 Philosophy for BDSP
35.4.3 Examples of Privacy-Enhancing Techniques
35.4.4 Multiobjective Optimization Framework for Data Privacy
35.4.5 Research Challenges and Multidisciplinary Approaches
35.4.6 BDMA for Cyber Security
35.5 Summary and Directions
References
Conclusion to Part V
Chapter 36 Summary and Directions
36.1 About This Chapter
36.2 Summary of This Book
36.3 Directions for BDMA and BDSP
36.4 Where Do We Go from Here?
Appendix A: Data Management Systems: Developments and Trends
Appendix B: Database Management Systems
Index

Dr. Bhavani Thuraisingham is the Louis A. Beecherl, Jr. Distinguished Professor of Computer Science and the Executive Director of the Cyber Security Research and Education Institute (CSI) at the University of Texas at Dallas.

Dr. Kevin W. Hamlen is an Assistant Professor in CS at UTD where he directs the Software Security Lab.

Dr. Latifur R. Khan is currently an Associate Professor in CS at UTD.

Dr. Mehedy Masud is an associate professor at the College of Information Technology, United Arab Emirates University.