
E-book: Professional Hadoop [Wiley Online]

  • Format: 216 pages
  • Publication date: 01-Jul-2016
  • Publisher: Wrox Press
  • ISBN-10: 1119281326
  • ISBN-13: 9781119281320
  • Price: 52.87 €*
  • * the price grants access for an unlimited number of simultaneous users for an unlimited period
The professional's one-stop guide to this open-source, Java-based big data framework

Professional Hadoop is the complete reference and resource for experienced developers looking to employ Apache Hadoop in real-world settings. Written by an expert team of certified Hadoop developers, committers, and Summit speakers, this book details every key aspect of Hadoop technology to enable optimal processing of large data sets. Designed expressly for the professional developer, this book skips over the basics of database development to get you acquainted with the framework's processes and capabilities right away. The discussion covers each key Hadoop component individually, culminating in a sample application that brings all of the pieces together to illustrate the cooperation and interplay that make Hadoop a major big data solution. Coverage includes everything from storage and security to computing and user experience, with expert guidance on integrating other software and more.

Hadoop is rapidly gaining market adoption, and more and more developers are being called upon to build big data solutions on the Hadoop framework. This book covers the process from beginning to end, providing a crash course for professionals who need to learn and apply Hadoop quickly.

  • Configure storage, user experience, and in-memory computing
  • Integrate Hadoop with other programs, including Kafka and Storm
  • Master the fundamentals of Apache Bigtop and Ignite
  • Build robust data security with expert tips and advice

Hadoop's popularity is largely due to its accessibility. Open-source and written in Java, the framework offers almost no barrier to entry for experienced database developers already familiar with the skills and requirements real-world programming entails. Professional Hadoop gives you the practical information and framework-specific skills you need quickly.
Introduction
Chapter 1 Hadoop Introduction
    Business Analytics and Big Data
    The Components of Hadoop
        The Distributed File System (HDFS)
        What Is MapReduce?
        What Is YARN?
        What Is ZooKeeper?
        What Is Hive?
    Integration with Other Systems
        The Hadoop Ecosystem
        Data Integration and Hadoop
    Summary
Chapter 2 Storage
    Basics of Hadoop HDFS
        Concept
        Architecture
        Interface
    Setting Up the HDFS Cluster in Distributed Mode
        Install
    Advanced Features of HDFS
        Snapshots
        Offline Viewer
        Tiered Storage
        Erasure Coding
    File Format
    Cloud Storage
    Summary
Chapter 3 Computation
    Basics of Hadoop MapReduce
        Concept
        Architecture
    How to Launch a MapReduce Job
        Writing a Map Task
        Writing a Reduce Task
        Writing a MapReduce Job
        Configurations
    Advanced Features of MapReduce
        Distributed Cache
        Counter
        Job History Server
    The Difference from a Spark Job
    Summary
Chapter 4 User Experience
    Apache Hive
        Hive Installation
        HiveQL
        UDF/SerDe
        Hive Tuning
    Apache Pig
        Pig Installation
        Pig Latin
        UDF
    Hue
        Features
    Apache Oozie
        Oozie Installation
        How Oozie Works
        Workflow/Coordinator
        Oozie CLI
    Summary
Chapter 5 Integration with Other Systems
    Apache Sqoop
        How It Works
    Apache Flume
        How It Works
    Apache Kafka
        How It Works
        Kafka Connect
        Stream Processing
    Apache Storm
        How It Works
        Trident
        Kafka Integration
    Summary
Chapter 6 Hadoop Security
    Securing the Hadoop Cluster
        Perimeter Security
        Authentication Using Kerberos
        Service Level Authorization in Hadoop
        Impersonation
        Securing the HTTP Channel
    Securing Data
        Data Classification
        Bringing Data to the Cluster
        Protecting Data in the Cluster
    Securing Applications
        YARN Architecture
        Application Submission in YARN
    Summary
Chapter 7 Ecosystem at Large: Hadoop with Apache Bigtop
    Basic Concepts
        Software Stacks
        Test Stacks
        Works on My Laptop
    Developing a Custom-Tailored Stack
        Apache Bigtop: The History
        Apache Bigtop: The Concept and Philosophy
        The Structure of the Project
        Meet the Build System
        Toolchain and Development Environment
        BOM Definition
    Deployment
        Bigtop Provisioner
        Master-less Puppet Deployment of a Cluster
        Configuration Management with Puppet
    Integration Validation
        iTests and Validation Applications
        Stack Integration Test Development
        Validating the Stack
        Cluster Failure Tests
        Smoke the Stack
    Putting It All Together
    Summary
Chapter 8 In-Memory Computing in Hadoop Stack
    Introduction to In-Memory Computing
    Apache Ignite: Memory First
        System Architecture of Apache Ignite
        Data Grid
        A Discourse on High Availability
        Compute Grid
        Service Grid
        Memory Management
        Persistence Store
    Legacy Hadoop Acceleration with Ignite
        Benefits of In-Memory Storage
        Memory Filesystem: HDFS Caching
        In-Memory MapReduce
    Advanced Use of Apache Ignite
        Spark and Ignite
        Sharing the State
        In-Memory SQL on Hadoop
        SQL with Ignite
        Streaming with Apache Ignite
    Summary
Glossary
Index
About the Authors

Benoy Antony is an Apache Hadoop Committer and Hadoop Architect at eBay.

Konstantin Boudnik is co-founder and CEO of Memcore.io, and is one of the early developers of Hadoop and a co-author of Apache Bigtop.

Cheryl Adams is a Senior Cloud Data & Infrastructure Architect in the healthcare data realm.

Branky Shao is a software engineer at eBay, and a contributor to the Cascading project.

Cazen Lee is a Software Architect at Samsung SDS.

Kai Sasaki is a Software Engineer at Treasure Data Inc.

Visit us at wrox.com where you have access to free code samples, Programmer to Programmer forums, and discussions on the latest happenings in the industry from around the world.