
SQL on Big Data: Technology, Architecture, and Innovation 1st ed. [Paperback]

  • Format: Paperback / softback, XVII, 157 pages, height x width: 235x155 mm, weight: 2818 g, 80 illustrations (52 color, 28 black and white)
  • Publication date: 18-Nov-2016
  • Publisher: Apress
  • ISBN-10: 1484222466
  • ISBN-13: 9781484222461
Learn about the various commercial and open source products that run SQL on Big Data platforms. You will understand the architectures of the SQL engines in use and how the tools work internally in terms of execution, data movement, latency, scalability, performance, and system requirements.



This book consolidates in one place solutions to the challenges of speed, scalability, and the variety of operations required for data integration and SQL processing. After tracing how and why SQL on Big Data emerged, the book provides in-depth insight into the products, architectures, and innovations in this rapidly evolving space.





SQL on Big Data discusses in detail the innovations under way, the capabilities on the horizon, and how they address performance, scalability, and the handling of different data types. The book covers how SQL-on-Big-Data engines are permeating the OLTP, OLAP, and operational analytics spaces, as well as the rapidly evolving HTAP systems.

You will learn the details of:





  • Batch Architectures: Understand the internals of the existing Hive engine, how it is built, and how it continues to evolve to support new features and lower query latency
  • Interactive Architectures: Understand how SQL engines are architected to support low latency on large data sets
  • Streaming Architectures: Understand how SQL engines are architected to support queries on data in motion using in-memory and lock-free data structures
  • Operational Architectures: Understand how SQL engines are architected for transactional and operational systems to support transactions on Big Data platforms
  • Innovative Architectures: Explore the rapidly evolving newer SQL engines on Big Data with innovative ideas and concepts





Who This Book Is For: Business analysts, BI engineers, developers, data scientists and architects, and quality assurance professionals

Reviews

SQL on big data is a well-designed, nicely written book readable by both the technologist and the business decision maker. From a rapidly expanding field, the author has done a good job at introducing a sample of relevant technologies. The book is a good introduction to the topic. (Computing Reviews, May, 2017)

About the Author
About the Technical Reviewer
Acknowledgements
Introduction
Chapter 1 Why SQL on Big Data?
  Why SQL on Big Data?
  Why RDBMS Cannot Scale
  SQL-on-Big-Data Goals
  SQL-on-Big-Data Landscape
  Open Source Tools
  Commercial Tools
  Appliances and Analytic DB Engines
  How to Choose an SQL-on-Big-Data Solution
  Summary
Chapter 2 SQL-on-Big-Data Challenges & Solutions
  Types of SQL
  Query Workloads
  Types of Data: Structured, Semi-Structured, and Unstructured
  Semi-Structured Data
  Unstructured Data
  How to Implement SQL Engines on Big Data
  SQL Engines on Traditional Databases
  How an SQL Engine Works in an Analytic Database
  Approaches to Solving SQL on Big Data
  Approaches to Reduce Latency on SQL Queries
  Summary
Chapter 3 Batch SQL: Architecture
  Hive
  Hive Architecture Deep Dive
  How Hive Translates SQL into MR
  Analytic Functions in Hive
  ACID Support in Hive
  Performance Improvements in Hive
  CBO Optimizers
  Recommendations to Speed Up Hive
  Upcoming Features in Hive
  Summary
Chapter 4 Interactive SQL: Architecture
  Why Is Interactive SQL So Important?
  SQL Engines for Interactive Workloads
  Spark
  Spark SQL
  General Architecture Pattern
  Impala
  Impala Optimizations
  Apache Drill
  Vertica
  Jethro Data
  Others
  MPP vs. Batch: Comparisons
  Capabilities and Characteristics to Look for in the SQL Engine
  Summary
Chapter 5 SQL for Streaming, Semi-Structured, and Operational Analytics
  SQL on Semi-Structured Data
  Apache Drill: JSON
  Apache Spark: JSON
  Apache Spark: Mongo
  SQL on Streaming Data
  Apache Spark
  PipelineDB
  Apache Calcite
  SQL for Operational Analytics on Big Data Platforms
  Trafodion
  Optimizations
  Apache Phoenix with HBase
  Kudu
  Summary
Chapter 6 Innovations and the Road Ahead
  BlinkDB
  How Does It Work
  Data Sample Management
  Execution
  GPU Is the New CPU: SQL Engines Based on GPUs
  MapD (Massively Parallel Database)
  Architecture of MapD
  GPUdb
  SQream
  Apache Kylin
  Apache Lens
  Apache Tajo
  HTAP
  Advantages of HTAP
  TPC Benchmark
  Summary
Appendix
Index
Sumit Pal is a big data, visualization, and data science consultant who works with multiple clients, advising them on their data architectures and big data solutions and providing hands-on coding with Spark, Scala, Java, and Python. He is also a software architect and big data enthusiast who builds end-to-end data-driven analytic systems. He has more than 22 years of experience in the software industry in various roles spanning companies from startups to enterprises.





Sumit has worked for Microsoft (SQL Server development team), Oracle (OLAP development team), and Verizon (big data analytics team) in a career spanning 22 years. He has extensive experience building scalable systems across the stack, from the middle tier and data tier to visualization for analytics applications, using big data and NoSQL databases. Sumit has deep expertise in database internals, data warehouses, dimensional modeling, data science with Java and Python, and SQL.









Sumit holds MS and BS degrees in Computer Science.