Big Data 2.0 Processing Systems: A Systems Overview Second Edition 2020 [Hardback]

  • Format: Hardback, 145 pages, height x width: 235x155 mm, weight: 454 g, 19 color illustrations; 51 black-and-white illustrations
  • Publication date: 10-Jul-2020
  • Publisher: Springer Nature Switzerland AG
  • ISBN-10: 3030441865
  • ISBN-13: 9783030441869
  • Hardback
  • Price: 59,47 €*
  • * the price is final, i.e. no further discounts apply
  • Regular price: 79,29 €
  • You save 25%
  • Delivery of the book from the publisher takes approximately 3-4 weeks
  • Free delivery
  • Order lead time: 2-4 weeks
This book gives readers the big picture and a comprehensive survey of the domain of big data processing systems. For the past decade, the Hadoop framework has dominated the world of big data processing, yet recently academia and industry have started to recognize its limitations in several application domains. It is thus gradually being replaced by a collection of engines that are dedicated to specific verticals (e.g. structured data, graph data, and streaming data). The book explores this new wave of systems, which it refers to as Big Data 2.0 processing systems.

After Chapter 1 presents the general background of the big data phenomena, Chapter 2 provides an overview of general-purpose big data processing systems that allow their users to develop big data processing jobs for different application domains. In turn, Chapter 3 examines systems that have been introduced to support the SQL flavor on top of the Hadoop infrastructure and provide competitive and scalable performance in the processing of large-scale structured data. Chapter 4 discusses several systems that have been designed to tackle the problem of large-scale graph processing. Chapter 5 focuses on systems that have been designed to provide scalable solutions for processing big data streams, and on other systems that have been introduced to support the development of data pipelines between various types of big data processing jobs and systems. Next, Chapter 6 covers the emerging frameworks and systems in the domain of scalable machine learning and deep learning processing. Lastly, Chapter 7 shares conclusions and an outlook on future research challenges. This new and considerably enlarged second edition not only contains the completely new Chapter 6, but also offers refreshed content on the state of the art in all domains of big data processing in recent years.

Overall, the book offers a valuable reference guide for professionals, students, and researchers in the domain of big data processing systems. Further, its comprehensive content will hopefully encourage readers to pursue further research on the subject.

Reviews

This short book is well written and informative. As a survey book, the author succeeds in raising awareness for the topic and reinforcing the view of its depth. As a research tool, the book works as a stepping stone for the curious manager or researcher wanting a short introduction to a wide range of big data areas. An easy read on the topic. Its many references provide a solid foundation for further study. (Jean-Pierre Kuilboer, Computing Reviews, August 12, 2022)

1 Introduction
1.1 The Big Data Phenomena
1.2 Big Data and Cloud Computing
1.3 Big Data Storage Systems
1.4 Big Data Processing and Analytics Systems
1.5 Book Road Map
2 General-Purpose Big Data Processing Systems
2.1 The Big Data Star: The Hadoop Framework
2.1.1 The Original Architecture
2.1.2 Enhancements of the MapReduce Framework
2.1.3 Hadoop's Ecosystem
2.2 Spark
2.3 Flink
2.4 Hyracks/ASTERIX
3 Large-Scale Processing Systems of Structured Data
3.1 Why SQL-On-Hadoop?
3.2 Hive
3.3 Impala
3.4 IBM Big SQL
3.5 SPARK SQL
3.6 HadoopDB
3.7 Presto
3.8 Tajo
3.9 Google Big Query
3.10 Phoenix
3.11 Polybase
4 Large-Scale Graph Processing Systems
4.1 The Challenges of Big Graphs
4.2 Does Hadoop Work Well for Big Graphs?
4.3 Pregel Family of Systems
4.3.1 The Original Architecture
4.3.2 Giraph: BSP + Hadoop for Graph Processing
4.3.3 Pregel Extensions
4.4 GraphLab Family of Systems
4.4.1 GraphLab
4.4.2 PowerGraph
4.4.3 GraphChi
4.5 Spark-Based Large-Scale Graph Processing Systems
4.6 Gradoop
4.7 Other Systems
4.8 Large-Scale RDF Processing Systems
4.8.1 NoSQL-Based RDF Systems
4.8.2 Hadoop-Based RDF Systems
4.8.3 Spark-Based RDF Systems
4.8.4 Other Distributed RDF Systems
5 Large-Scale Stream Processing Systems
5.1 The Big Data Streaming Problem
5.2 Hadoop for Big Streams?!
5.3 Storm
5.4 Infosphere Streams
5.5 Other Big Stream Processing Systems
5.6 Big Data Pipelining Frameworks
5.6.1 Pig Latin
5.6.2 Tez
5.6.3 Other Pipelining Systems
6 Large-Scale Machine/Deep Learning Frameworks
6.1 Harnessing the Value of Big Data
6.2 Big Machine Learning Frameworks
6.3 Deep Learning Frameworks
7 Conclusions and Outlook
References
Sherif Sakr is the Head of the Data Systems Group at the Institute of Computer Science, University of Tartu, Estonia. His research interests lie in data and information management, particularly big data processing systems, big data analytics, data science, and big data management in cloud computing platforms. He has published more than 150 refereed research publications in international journals and conferences. Sherif is an ACM Senior Member and an IEEE Senior Member, and in 2017 he was appointed to serve as an ACM Distinguished Speaker and as an IEEE Distinguished Speaker. In addition, he serves as the Editor-in-Chief of the Springer Encyclopedia of Big Data Technologies and as a Co-Chair of the European Big Data Value Association (BDVA) TF6-Data Technology Architectures Group. In 2019, he received the best Arab scholar award from the Abdul Hameed Shoman Foundation.