Muutke küpsiste eelistusi

E-raamat: Parallel Computing Architectures and APIs: IoT Big Data Stream Processing [Taylor & Francis e-raamat]

(Corporate IT Strategy Consultant, Thane (West), India)
  • Formaat: 406 pages, 19 Tables, black and white; 65 Illustrations, black and white
  • Ilmumisaeg: 02-Dec-2019
  • Kirjastus: CRC Press
  • ISBN-13: 9781351029223
Teised raamatud teemal:
  • Taylor & Francis e-raamat
  • Hind: 240,04 €*
  • * hind, mis tagab piiramatu üheaegsete kasutajate arvuga ligipääsu piiramatuks ajaks
  • Tavahind: 342,91 €
  • Säästad 30%
  • Formaat: 406 pages, 19 Tables, black and white; 65 Illustrations, black and white
  • Ilmumisaeg: 02-Dec-2019
  • Kirjastus: CRC Press
  • ISBN-13: 9781351029223
Teised raamatud teemal:
Parallel Computing Architectures and APIs: IoT Big Data Stream Processing commences from the point high-performance uniprocessors were becoming increasingly complex, expensive, and power-hungry. A basic trade-off exists between the use of one or a small number of such complex processors, at one extreme, and a moderate to very large number of simpler processors, at the other. When combined with a high-bandwidth, interprocessor communication facility leads to significant simplification of the design process. However, two major roadblocks prevent the widespread adoption of such moderately to massively parallel architectures: the interprocessor communication bottleneck, and the difficulty and high cost of algorithm/software development.

One of the most important reasons for studying parallel computing architectures is to learn how to extract the best performance from parallel systems. Specifically, you must understand its architectures so that you will be able to exploit those architectures during programming via the standardized APIs.

This book would be useful for analysts, designers and developers of high-throughput computing systems essential for big data stream processing emanating from IoT-driven cyber-physical systems (CPS).

This pragmatic book:











Devolves uniprocessors in terms of a ladder of abstractions to ascertain (say) performance characteristics at a particular level of abstraction





Explains limitations of uniprocessor high performance because of Moores Law





Introduces basics of processors, networks and distributed systems





Explains characteristics of parallel systems, parallel computing models and parallel algorithms





Explains the three primary categorical representatives of parallel computing architectures, namely, shared memory, message passing and stream processing





Introduces the three primary categorical representatives of parallel programming APIs, namely, OpenMP, MPI and CUDA





Provides an overview of Internet of Things (IoT), wireless sensor networks (WSN), sensor data processing, Big Data and stream processing





Provides introduction to 5G communications, Edge and Fog computing

Parallel Computing Architectures and APIs: IoT Big Data Stream Processing discusses stream processing that enables the gathering, processing and analysis of high-volume, heterogeneous, continuous Internet of Things (IoT) big data streams, to extract insights and actionable results in real time. Application domains requiring data stream management include military, homeland security, sensor networks, financial applications, network management, web site performance tracking, real-time credit card fraud detection, etc.
Preface xvii
Acknowledgments xxiii
Author xxv
1 Uniprocessor Computers
1(16)
1.1 Types of Computers
1(2)
1.1.1 Microcomputers
2(1)
1.1.2 Midrange Computers
2(1)
1.1.3 Mainframe Computers
2(1)
1.1.4 Supercomputers
3(1)
1.2 Computer System
3(4)
1.2.1 Hardware
3(2)
1.2.2 Software
5(1)
1.2.3 Network
6(1)
1.3 Hardware and Software Logical Equivalence
7(1)
1.4 Ladder of Abstraction
8(6)
1.4.1 Modeling-Level Architecture
9(1)
1.4.2 Algorithm-Level Architecture
9(2)
1.4.3 High-Level Architecture
11(1)
1.4.4 Assembly-Level Architecture
12(1)
1.4.5 System or Instruction Set Architecture-Level Architecture
12(1)
1.4.6 Machine or Microarchitecture-Level Architecture
12(1)
1.4.7 Control or Logic-Level Architecture
13(1)
1.4.8 Device-Level Architecture
13(1)
1.5 Application Programming Interfaces
14(1)
1.6 Summary
15(2)
2 Processor Physics and Moore's Law
17(20)
2.1 Speed of Processing and Power Problem
17(2)
2.2 Area, Delay, and Power Consumption
19(5)
2.2.1 Area Consumption
19(1)
2.2.2 Delay Consumption
20(2)
2.2.3 Power Consumption
22(2)
2.3 Area, Latency, and Power Trade-offs
24(4)
2.3.1 Area versus Delay Trade-off
24(2)
2.3.2 Delay versus Power Trade-off
26(1)
2.3.3 Area versus Delay versus Power Trade-off
27(1)
2.4 Moore's Law
28(3)
2.4.1 Leveraging Moore's Law
30(1)
2.4.1.1 Reconfigurable Computing
31(1)
2.5 Performance Wall
31(1)
2.5.1 Power Wall
32(1)
2.5.2 Memory Wall
32(1)
2.5.3 Instruction-Level Parallelism Wall
32(1)
2.6 Summary
32(5)
Section I Genesis of Parallel Computing
3 Processor Basics
37(14)
3.1 Processor
37(1)
3.2 Aspects of Processor Performance
38(4)
3.2.1 Potential for Speedup
38(2)
3.2.2 Scalability
40(1)
3.2.3 Speedup versus Communication Overhead
40(2)
3.3 Enhancing Uniprocessor Performance
42(8)
3.3.1 Improving CPU Performance
43(1)
3.3.2 Increasing Processor Clock Frequency
43(1)
3.3.3 Parallelizing Arithmetic Logic Unit (ALU) Structure
44(1)
3.3.4 Pipelining
45(1)
3.3.5 Memory Hierarchy
46(1)
3.3.5.1 Cache Memory
46(1)
3.3.6 Very Long Instruction Word (VLIW) Processors
46(1)
3.3.7 Superscalarity
47(1)
3.3.8 Instruction-Level Parallelism
47(1)
3.3.9 Multicore Architectures
48(2)
3.3.10 Multithreading
50(1)
3.4 Summary
50(1)
4 Networking Basics
51(14)
4.1 Network Principles
51(3)
4.1.1 Protocol
51(1)
4.1.2 Protocol Layers
52(1)
4.1.3 Protocol Suite
52(2)
4.1.4 Datagram
54(1)
4.2 Types of Networks
54(2)
4.2.1 Personal Area Networks
54(1)
4.2.2 Local Area Networks
55(1)
4.2.3 Metropolitan Area Networks
56(1)
4.2.4 Wide Area Networks
56(1)
4.3 Network Models
56(5)
4.3.1 OSI Reference Model
57(2)
4.3.2 TCP/IP Reference Model
59(1)
4.3.2.1 Link Layer
60(1)
4.3.2.2 Internet Layer
60(1)
4.3.2.3 Transport Layer
60(1)
4.3.2.4 Application Layer
61(1)
4.4 Interconnection Networks
61(3)
4.4.1 Ethernet
62(1)
4.4.2 Switches
63(1)
4.5 Summary
64(1)
5 Distributed Systems Basics
65(18)
5.1 Distributed Systems
65(12)
5.1.1 Distributed Computing
66(2)
5.1.1.1 System Architectural Styles
68(1)
5.1.1.2 Software Architectural Styles
69(5)
5.1.1.3 Technologies for Distributed Computing
74(3)
5.2 Distributed System Benefits
77(1)
5.3 Distributed Computation Systems
78(1)
5.4 Summary
79(4)
Section II Road to Parallel Computing
6 Parallel Systems
83(16)
6.1 Flynn's Taxonomy for Parallel Computer Architectures
83(3)
6.2 Types of Parallel Computers
86(6)
6.2.1 Shared Memory Multiprocessor Systems
86(2)
6.2.2 Distributed Memory Multicomputers
88(1)
6.2.2.1 Interconnection Network (IN)
88(4)
6.3 Characteristics of Parallel Systems
92(5)
6.3.1 Coupling, Parallelism, Concurrency, and Granularity
92(1)
6.3.2 Shared Memory Systems versus Message-Passing Systems
93(1)
6.3.3 Distributed Communication
94(1)
6.3.3.1 Blocking/Non-blocking, Synchronous/Asynchronous Primitives
94(2)
6.3.3.2 Processor Synchrony
96(1)
6.3.4 Synchronous versus Asynchronous Executions
96(1)
6.4 Summary
97(2)
7 Parallel Computing Models
99(16)
7.1 Shared Memory Models
99(5)
7.1.1 Theoretical Models
99(1)
7.1.1.1 RAM Model
99(1)
7.1.1.2 PRAM Model
99(2)
7.1.2 Practical Models
101(1)
7.1.2.1 Bulk Synchronous Parallel (BSP) Model
101(2)
7.1.2.2 LogP Model
103(1)
7.2 Interconnection Network Models
104(7)
7.2.1 Theoretical Models
105(1)
7.2.1.1 Mesh
105(1)
7.2.1.2 Mesh of Trees
106(1)
7.2.1.3 Cube
107(1)
7.2.1.4 Cube-Connected Cycles
108(1)
7.2.1.5 Butterfly
108(1)
7.2.2 Practical Models
109(1)
7.2.2.1 Cluster
109(1)
7.2.2.2 Grids
110(1)
7.3 Dataflow Model
111(2)
7.4 Summary
113(2)
8 Parallel Algorithms
115(16)
8.1 Classes of Problems Solvable through Parallelization
115(3)
8.1.1 Parallelizable Tasks
116(2)
8.2 Types of Parallelization
118(4)
8.2.1 Functional Parallelization
118(1)
8.2.2 Data Parallelization
119(1)
8.2.3 Recursive Parallelization
120(1)
8.2.4 Exploratory Parallelization
121(1)
8.2.5 Speculative Parallelization
121(1)
8.3 Granularity of Parallelization
122(1)
8.4 Assigning Computational Tasks to Processors
123(1)
8.5 Illustrating Design of a Parallel Algorithm
124(1)
8.6 Parallel Algorithms for Conventional Computations
125(3)
8.6.1 Parallel Prefix and Suffix Computations on a Linked List
125(3)
8.7 Parallel Algorithms for Unconventional Computations
128(1)
8.8 Summary
128(3)
Section III Parallel Computing Architectures
9 Parallel Computing Architecture Basics
131(10)
9.1 High-Performance Distributed Computing
131(1)
9.2 Performance Evaluation
132(3)
9.2.1 Benchmarks
133(2)
9.3 Application and Architecture
135(2)
9.3.1 Multiprocessor Architectures
136(1)
9.4 Maximum Performance Computing Approach
137(1)
9.5 Parallel Computing Basics
138(2)
9.6 Parallel Computing Paradigms
140(1)
9.7 Summary
140(1)
10 Shared Memory Architecture
141(16)
10.1 Shared Memory Paradigm
141(2)
10.1.1 Multicore Architecture
141(1)
10.1.2 Multi-Socket Multicore Architecture
142(1)
10.1.3 Asymmetric Multicore Architecture
142(1)
10.2 Cache
143(4)
10.2.1 Number of Caches
145(1)
10.2.2 Cache Sizes
145(1)
10.2.3 Mapping of Memory Blocks to Cache Blocks
145(2)
10.3 Write Policy
147(1)
10.3.1 Write-Through Policy
148(1)
10.3.2 Write-Back Policy
148(1)
10.4 Cache Coherency
148(6)
10.4.1 Snooping Protocols
150(1)
10.4.2 Directory-Based Protocols
151(3)
10.5 Memory Consistency
154(1)
10.5.1 Sequential Consistency
154(1)
10.6 Summary
155(2)
11 Message-Passing Architecture
157(10)
11.1 Message-Passing Paradigm
157(1)
11.1.1 Tightly Coupled Distributed Memory Architecture
157(1)
11.1.2 Loosely Coupled Distributed Memory Architecture
157(1)
11.2 Routing
158(3)
11.2.1 Routing Algorithms for Broadcasting and Multicasting
159(1)
11.2.2 Deadlocks and Routing Algorithms
160(1)
11.3 Switching
161(5)
11.3.1 Circuit Switching
163(1)
11.3.2 Packet Switching
163(1)
11.3.2.1 Store-and-Forward Routing
163(2)
11.3.2.2 Cut-Through Routing
165(1)
11.4 Summary
166(1)
12 Stream Processing Architecture
167(14)
12.1 Dataflow Paradigm
167(4)
12.2 Parallel Accelerators
171(1)
12.3 Stream Processors
172(5)
12.3.1 Stream Processor Architecture
173(1)
12.3.2 Execution Overview
174(2)
12.3.3 Locality and Bandwidth Hierarchy
176(1)
12.4 Summary
177(4)
Section IV Parallel Computing Programming
13 Parallel Computing Programming Basics
181(32)
13.1 Shared-Memory Programming
181(5)
13.1.1 Programming APIs
182(1)
13.1.1.1 OpenMP
182(1)
13.1.1.2 Pthreads
183(1)
13.1.2 Programming Languages
184(1)
13.1.2.1 Haskell
184(2)
13.2 Message-Passing Programming
186(7)
13.2.1 Programming APIs
187(1)
13.2.1.1 PVM
187(1)
13.2.1.2 MPI
188(1)
13.2.1.3 MapReduce
189(2)
13.2.2 Programming Languages
191(1)
13.2.2.1 Erlang
191(2)
13.3 Stream Programming
193(5)
13.3.1 Programming APIs
194(1)
13.3.1.1 CUDA
194(1)
13.3.1.2 OpenCL
194(1)
13.3.1.3 OpenACC
195(1)
13.3.2 Programming Languages
195(1)
13.3.2.1 Scala
195(3)
13.4 Summary
198(1)
Appendix 13A Functional Programming
199(3)
13.A.1 Characteristics of Functional Programming
200(1)
13.A.2 Advantages of Functional Programming
201(1)
13.A.3 Disadvantages of Functional Programming
202(1)
Appendix 13.B Hadoop MapReduce
202(11)
13.B.1 MapReduce Processing
204(1)
13.B.1.1 JobTracker
204(1)
13.B.1.2 TaskTracker
205(1)
13.B.2 MapReduce Enhancements and Extensions
206(1)
13.B.2.1 Supporting Iterative Processing
206(4)
13.B.2.2 Column Storage
210(1)
13.B.3 YARN
211(2)
14 Shared-memory Parallel Programming with OpenMP
213(10)
14.1 OpenMP
213(1)
14.2 Overview of Features
214(4)
14.3 Additional Feature Details
218(3)
14.3.1 OpenMP Directives
218(1)
14.3.1.1 Parallel Region Construct
218(1)
14.3.1.2 Work-sharing Constructs
218(1)
14.3.1.3 Directive Clauses
219(1)
14.3.2 Synchronization
219(1)
14.3.3 Runtime Library Routines
220(1)
14.3.3.1 Execution Environment Routines
220(1)
14.3.3.2 Lock Routines
220(1)
14.3.3.3 Timing routines
221(1)
14.4 Summary
221(2)
15 Message Passing Parallel Programming with MPI
223(16)
15.1 Introduction to MPI
223(1)
15.2 Basic Point-to-point Communication Routines
224(1)
15.3 Basic MPI Collective Communication Routines
225(4)
15.4 Environment Management Routines
229(2)
15.5 Point-to-point Communication Routines
231(5)
15.5.1 Blocking Message Passing Routines
231(3)
15.5.2 Non-blocking Message Passing Routines
234(2)
15.6 Collective Communication Routines
236(2)
15.6.1 Synchronization
236(1)
15.6.2 Data Movement
236(2)
15.7 Summary
238(1)
16 Stream Processing Programming with CUDA, OpenCL, and OpenACC
239(16)
16.1 CUDA
239(2)
16.2 OpenCL
241(7)
16.2.1 Synchronization functions
247(1)
16.3 OpenACC
248(3)
16.3.1 Common Directives
249(1)
16.3.2 Data Management
250(1)
16.3.3 Asynchronous Processing and Synchronization
251(1)
16.3.4 Device Management
251(1)
16.4 Summary
251(4)
Section V Internet of Things Big Data Stream Processing
17 Internet of Things (IoT) Technologies
255(32)
17.1 Introduction to IoT
256(8)
17.1.1 IoT Building Blocks
258(2)
17.1.2 IoT Architecture
260(1)
17.1.3 Widened Address Space with IPv6
261(3)
17.2 RFID (Radio Frequency Identification)
264(1)
17.3 Sensor Networks
265(11)
17.3.1 Wireless Networks
265(4)
17.3.2 Sensor
269(4)
17.3.3 WSNs
273(2)
17.3.3.1 WSN Characteristics
275(1)
17.4 Summary
276(1)
Appendix 17.A Internet of Things (IoT) in 5G Mobile Technologies
277(5)
17.A.1 5G Trends
278(1)
17.A.2 5G Requirements
279(1)
17.A.3 5G Objectives
279(1)
17.A.4 5G-enabling Technologies
280(2)
Appendix 17.B Edge Computing and Fog Computing
282(5)
17.B.1 Implementation of Edge Computing
284(1)
17.B.1.1 Data Reduction Techniques
284(1)
17B.1.2 Fog Computing
285(2)
18 Sensor Data Processing
287(26)
18.1 Sensor Data-Gathering and Data-Dissemination Mechanisms
287(6)
18.1.1 Mechanisms Based on Storage Location
288(1)
18.1.1.1 Database with Geographic Information
288(1)
18.1.2 Classification of Data-Gathering Mechanisms Based on the Direction of Diffusion
289(1)
18.1.2.1 Directed Diffusion
290(1)
18.1.2.2 SPIN
291(1)
18.1.3 Mechanisms Based on the Structure of Dissemination
291(2)
18.2 Time Windows
293(1)
18.3 Sensor Database
294(3)
18.4 Data-Fusion Mechanisms
297(2)
18.4.1 Classification of Data-fusion Mechanisms Based on Functions
298(1)
18.4.2 System Architectures of Data Fusion
298(1)
18.4.3 Trade-offs of Resources
299(1)
18.5 Data-fusion Techniques, Methods, and Algorithms
299(2)
18.5.1 Inference
300(1)
18.5.2 Estimates
300(1)
18.6 Data-fusion Architectures and Models
301(6)
18.6.1 Data-based Models
302(1)
18.6.2 Activity-based Models
303(2)
18.6.3 Role-based Models
305(2)
18.7 Summary
307(1)
Appendix 18.A Wireless Sensor Networks Anomalies
307(6)
18.A.1 Architectural Design Guidelines
310(3)
19 Big Data Computing
313(22)
19.1 Introduction to Big Data
314(6)
19.1.1 What Is Big Data?
314(1)
19.1.1.1 Data Volume
314(3)
19.1.1.2 Data Velocity
317(1)
19.1.1.3 Data Variety
317(1)
19.1.1.4 Data Veracity
318(1)
19.1.2 Common Characteristics of Big Data Computing Systems
319(1)
19.2 Tools and Techniques of Big Data
320(6)
19.2.1 Processing Approach
320(2)
19.2.2 Big Data System Architecture
322(1)
19.2.2.1 BASE (Basically Available, Soft State, Eventual Consistency)
322(1)
19.2.2.2 Functional Decomposition
323(1)
19.2.2.3 Master-Slave Replication
323(1)
19.2.3 Row Partitioning or Sharding
324(1)
19.2.4 Row Versus Column-Oriented Data Layouts
324(1)
19.2.5 NoSQL Data Management
325(1)
19.3 NoSQL Databases
326(3)
19.3.1 Column-Oriented Stores or Databases
327(1)
19.3.2 Key-Value Stores (K-V Stores) or Databases
327(1)
19.3.3 Document-Oriented Databases
328(1)
19.3.4 Graph Stores or Databases
329(1)
19.4 Aadhaar Project
329(3)
19.5 Summary
332(1)
Appendix 19.A Compute-intensive Big Compute versus Data-intensive Big Data Computing
332(3)
20 Big Data Stream Processing
335(26)
20.1 Big Data Stream Processing
335(11)
20.1.1 History of Data Stream Processing
338(2)
20.1.2 Data Stream Processing
340(1)
20.1.2.1 Data Stream Processing Systems
340(4)
20.1.2.2 DSMS
344(1)
20.1.2.3 CEPs
344(1)
20.1.2.4 Stream Processing Platforms/Engine
345(1)
20.2 Stream Processing System Implementations
346(9)
20.2.1 Academic or Research Prototype Systems
346(5)
20.2.2 Open-Source Systems
351(1)
20.2.3 Commercial Systems
352(1)
20.2.3.1 Streambase
352(1)
20.2.3.2 IBM InfoSphere Streams
353(1)
20.2.3.3 TIBCO BusinessEvents
354(1)
20.2.3.4 Oracle CEP
354(1)
20.3 Summary
355(1)
Appendix 20A Spark
356(5)
20A.1 Spark Components
357(1)
20A.2 Spark Concepts
358(2)
20A.3 Benefits of Spark
360(1)
Epilogue: Quantum Computing 361(4)
References 365(2)
Index 367
Vivek Kale has more than two decades of professional IT experience during which he has handled and consulted on various aspects of enterprise-wide information modeling, enterprise architectures, business process re-design, and, e-business architectures. He has been Group CIO of Essar Group, the steel/oil & gas major of India, as well as, Raymond Ltd., the textile & apparel major of India. He is a seasoned practitioner in enhancing business agility through digital transformation of business models, enterprise architecture and business processes, and, enhancing IT-enabled enterprise intelligence (EQ). He has authored books on Cloud Computing and Big Data Computing. He is also author of Big Data Computing: A Guide for Business and Technology Managers(CRC Press, 2016), Agile Network Businesses: Collaboration, Coordination, and Competitive Advantage (CRC Press 2017), and, Digital Transformation of Enterprise Architecture (CRC Press 2020).