E-book: Microsoft Big Data Solutions

3.62/5 (13 ratings from Goodreads)

Adam Jorgensen (Pragmatic Works), James Rowland-Jones, Brian Mitchell, Christopher Price (Bayer Diagnostics Europe), Dan Clark, John Welch

Format: EPUB+DRM
Publication date: 24-Feb-2014
Publisher: John Wiley & Sons Inc
Language: eng
ISBN-13: 9781118729557

Other books on the subject:

Databases

Format - EPUB+DRM
Price: 37,04 €*
* the price is final, i.e. no further discounts apply
This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage: Digital rights management (DRM)

The publisher has issued this e-book in encrypted form, which means you need to install special software to read it. You also need to create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

Required software

To read on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you need to install Adobe Digital Editions (a free application made specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

Tap the power of Big Data with Microsoft technologies

Big Data is here, and Microsoft's new Big Data platform is a valuable tool to help your company get the very most out of it. This timely book shows you how to use HDInsight along with HortonWorks Data Platform for Windows to store, manage, analyze, and share Big Data throughout the enterprise. Focusing primarily on Microsoft and HortonWorks technologies but also covering open source tools, Microsoft Big Data Solutions explains best practices, covers on-premises and cloud-based solutions, and features valuable case studies.

Best of all, it helps you integrate these new solutions with technologies you already know, such as SQL Server and Hadoop.

Walks you through how to integrate Big Data solutions in your company using Microsoft's HDInsight Server, HortonWorks Data Platform for Windows, and open source tools

Explores both on-premises and cloud-based solutions

Shows how to store, manage, analyze, and share Big Data through the enterprise

Covers topics such as Microsoft's approach to Big Data, installing and configuring HortonWorks Data Platform for Windows, integrating Big Data with SQL Server, visualizing data with Microsoft and HortonWorks BI tools, and more

Helps you build and execute a Big Data plan

Includes contributions from the Microsoft and HortonWorks Big Data product teams

If you need a detailed roadmap for designing and implementing a fully deployed Big Data solution, you'll want Microsoft Big Data Solutions.

Introduction

Part I What Is Big Data?

Chapter 1 Industry Needs and Solutions

What's So Big About Big Data?

A Brief History of Hadoop

Google

Nutch

What Is Hadoop?

Derivative Works and Distributions

Hadoop Distributions

Core Hadoop Ecosystem

Important Apache Projects for Hadoop

The Future for Hadoop

Summary

Chapter 2 Microsoft's Approach to Big Data

A Story of "Better Together"

Competition in the Ecosystem

SQL on Hadoop Today

Hortonworks and Stinger

Cloudera and Impala

Microsoft's Contribution to SQL in Hadoop

Deploying Hadoop

Deployment Factors

Deployment Topologies

Deployment Scorecard

Summary

Part II Setting Up for Big Data with Microsoft

Chapter 3 Configuring Your First Big Data Environment

Getting Started

Getting the Install

Running the Installation

On-Premise Installation: Single-Node Installation

HDInsight Service: Installing in the Cloud

Windows Azure Storage Explorer Options

Validating Your New Cluster

Logging into HDInsight Service

Verify HDP Functionality in the Logs

Common Post-Setup Tasks

Loading Your First Files

Verifying Hive and Pig

Summary

Part III Storing and Managing Big Data

Chapter 4 HDFS, Hive, HBase, and HCatalog

Exploring the Hadoop Distributed File System

Explaining the HDFS Architecture

Interacting with HDFS

Exploring Hive: The Hadoop Data Warehouse Platform

Designing, Building, and Loading Tables

Querying Data

Configuring the Hive ODBC Driver

Exploring HCatalog: HDFS Table and Metadata Management

Exploring HBase: An HDFS Column-Oriented Database

Columnar Databases

Defining and Populating an HBase Table

Using Query Operations

Summary

Chapter 5 Storing and Managing Data in HDFS

Understanding the Fundamentals of HDFS

HDFS Architecture

NameNodes and DataNodes

Data Replication

Using Common Commands to Interact with HDFS

Interfaces for Working with HDFS

File Manipulation Commands

Administrative Functions in HDFS

Moving and Organizing Data in HDFS

Moving Data in HDFS

Implementing Data Structures for Easier Management

Rebalancing Data

Summary

Chapter 6 Adding Structure with Hive

Understanding Hive's Purpose and Role

Providing Structure for Unstructured Data

Enabling Data Access and Transformation

Differentiating Hive from Traditional RDBMS Systems

Working with Hive

Creating and Querying Basic Tables

Creating Databases

Creating Tables

Adding and Deleting Data

Querying a Table

Using Advanced Data Structures with Hive

Setting Up Partitioned Tables

Loading Partitioned Tables

Using Views

Creating Indexes for Tables

Summary

Chapter 7 Expanding Your Capability with HBase and HCatalog

Using HBase

Creating HBase Tables

Loading Data into an HBase Table

Performing a Fast Lookup

Loading and Querying HBase

Managing Data with HCatalog

Working with HCatalog and Hive

Defining Data Structures

Creating Indexes

Creating Partitions

Integrating HCatalog with Pig and Hive

Using HBase or Hive as a Data Warehouse

Summary

Part IV Working with Your Big Data

Chapter 8 Effective Big Data ETL with SSIS, Pig, and Sqoop

Combining Big Data and SQL Server Tools for Better Solutions

Why Move the Data?

Transferring Data Between Hadoop and SQL Server

Working with SSIS and Hive

Connecting to Hive

Configuring Your Packages

Loading Data into Hadoop

Getting the Best Performance from SSIS

Transferring Data with Sqoop

Copying Data from SQL Server

Copying Data to SQL Server

Using Pig for Data Movement

Transforming Data with Pig

Using Pig and SSIS Together

Choosing the Right Tool

Use Cases for SSIS

Use Cases for Pig

Use Cases for Sqoop

Summary

Chapter 9 Data Research and Advanced Data Cleansing with Pig and Hive

Getting to Know Pig

When to Use Pig

Taking Advantage of Built-in Functions

Executing User-defined Functions

Using UDFs

Building Your Own UDFs for Pig

Using Hive

Data Analysis with Hive

Types of Hive Functions

Extending Hive with Map-reduce Scripts

Creating a Custom Map-reduce Script

Creating Your Own UDFs for Hive

Summary

Part V Big Data and SQL Server Together

Chapter 10 Data Warehouses and Hadoop Integration

State of the Union

Challenges Faced by Traditional Data Warehouse Architectures

Technical Constraints

Business Challenges

Hadoop's Impact on the Data Warehouse Market

Keep Everything

Code First (Schema Later)

Model the Value

Throw Compute at the Problem

Introducing Parallel Data Warehouse (PDW)

What Is PDW?

Why Is PDW Important?

How PDW Works

Project Polybase

Polybase Architecture

Business Use Cases for Polybase Today

Speculating on the Future for Polybase

Summary

Chapter 11 Visualizing Big Data with Microsoft BI

An Ecosystem of Tools

Excel

PowerPivot

Power View

Power Map

Reporting Services

Self-service Big Data with PowerPivot

Setting Up the ODBC Driver

Loading Data

Updating the Model

Adding Measures

Creating Pivot Tables

Rapid Big Data Exploration with Power View

Spatial Exploration with Power Map

Summary

Chapter 12 Big Data Analytics

Data Science, Data Mining, and Predictive Analytics

Data Mining

Predictive Analytics

Introduction to Mahout

Building a Recommendation Engine

Getting Started

Running a User-to-user Recommendation Job

Running an Item-to-item Recommendation Job

Summary

Chapter 13 Big Data and the Cloud

Defining the Cloud

Exploring Big Data Cloud Providers

Amazon

Microsoft

Setting Up a Big Data Sandbox in the Cloud

Getting Started with Amazon EMR

Getting Started with HDInsight

Storing Your Data in the Cloud

Storing Data

Uploading Your Data

Exploring Big Data Storage Tools

Integrating Cloud Data

Other Cloud Data Sources

Summary

Chapter 14 Big Data in the Real World

Common Industry Analytics

Telco

Energy

Retail

Data Services

IT/Hosting Optimization

Marketing Social Sentiment

Operational Analytics

Failing Fast

A New Ecosystem of Technologies

User Audiences

Summary

Part VI Moving Your Big Data Forward

Chapter 15 Building and Executing Your Big Data Plan

Gaining Sponsor and Stakeholder Buy-In

Problem Definition

Scope Management

Stakeholder Expectations

Defining the Criteria for Success

Identifying Technical Challenges

Environmental Challenges

Challenges in Skillset

Identifying Operational Challenges

Planning for Setup/Configuration

Planning for Ongoing Maintenance

Going Forward

The Hand Off to Operations

After Deployment

Summary

Chapter 16 Operational Big Data Management

Hybrid Big Data Environments: Cloud and On-Premise Solutions Working Together

Ongoing Data Integration with Cloud and On-Premise Solutions

Integration Thoughts for Big Data

Backups and High Availability in Your Big Data Environment

High Availability

Disaster Recovery

Big Data Solution Governance

Creating Operational Analytics

System Center Operations Manager for HDP

Installing the Ambari SCOM Management Pack

Monitoring with the Ambari SCOM Management Pack

Summary

Index

Adam Jorgensen is the President of Pragmatic Works and the Executive Vice President of PASS. He has extensive experience with data warehousing, analytics, and NoSQL architectures.

James Rowland-Jones is a principal consultant for The Big Bang Data Company. He specializes in big data warehouse solutions that leverage SQL Server Parallel Data Warehouse and Hadoop ecosystems.

John Welch is Vice President of Software Development at Pragmatic Works, where he leads the development of a suite of BI and data products for SQL Server and related technologies.

Dan Clark is a senior BI consultant for Pragmatic Works. Dan has published several books and numerous articles on .NET programming and BI development.

Christopher Price is a senior consultant with Microsoft. His focus is on ETL, data integration, data quality, MDM, SSAS, SharePoint, and all things big data.

Brian Mitchell is the lead architect of the Microsoft Big Data Center of Expertise. He focuses exclusively on DW/BI solutions.

Permalink: https://www.kriso.ee/db/97811187295576e.html
