E-book: Microsoft Big Data Solutions

  • Format: EPUB+DRM
  • Publication date: 24-Feb-2014
  • Publisher: John Wiley & Sons Inc
  • Language: English
  • ISBN-13: 9781118729557
  • Price: 37,04 €*
  • * The price is final; no further discounts apply.
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on mobile devices (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

Tap the power of Big Data with Microsoft technologies

Big Data is here, and Microsoft's new Big Data platform is a valuable tool to help your company get the very most out of it. This timely book shows you how to use HDInsight along with HortonWorks Data Platform for Windows to store, manage, analyze, and share Big Data throughout the enterprise. Focusing primarily on Microsoft and HortonWorks technologies but also covering open source tools, Microsoft Big Data Solutions explains best practices, covers on-premises and cloud-based solutions, and features valuable case studies.

Best of all, it helps you integrate these new solutions with technologies you already know, such as SQL Server and Hadoop.





  • Walks you through how to integrate Big Data solutions in your company using Microsoft's HDInsight Server, HortonWorks Data Platform for Windows, and open source tools
  • Explores both on-premises and cloud-based solutions
  • Shows how to store, manage, analyze, and share Big Data through the enterprise
  • Covers topics such as Microsoft's approach to Big Data, installing and configuring HortonWorks Data Platform for Windows, integrating Big Data with SQL Server, visualizing data with Microsoft and HortonWorks BI tools, and more
  • Helps you build and execute a Big Data plan
  • Includes contributions from the Microsoft and HortonWorks Big Data product teams

If you need a detailed roadmap for designing and implementing a fully deployed Big Data solution, you'll want Microsoft Big Data Solutions.

Table of contents

Introduction

Part I What Is Big Data?

Chapter 1 Industry Needs and Solutions
  • What's So Big About Big Data?
  • A Brief History of Hadoop
  • Google
  • Nutch
  • What Is Hadoop?
  • Derivative Works and Distributions
  • Hadoop Distributions
  • Core Hadoop Ecosystem
  • Important Apache Projects for Hadoop
  • The Future for Hadoop
  • Summary

Chapter 2 Microsoft's Approach to Big Data
  • A Story of "Better Together"
  • Competition in the Ecosystem
  • SQL on Hadoop Today
  • Hortonworks and Stinger
  • Cloudera and Impala
  • Microsoft's Contribution to SQL in Hadoop
  • Deploying Hadoop
  • Deployment Factors
  • Deployment Topologies
  • Deployment Scorecard
  • Summary

Part II Setting Up for Big Data with Microsoft

Chapter 3 Configuring Your First Big Data Environment
  • Getting Started
  • Getting the Install
  • Running the Installation
  • On-Premise Installation: Single-Node Installation
  • HDInsight Service: Installing in the Cloud
  • Windows Azure Storage Explorer Options
  • Validating Your New Cluster
  • Logging into HDInsight Service
  • Verify HDP Functionality in the Logs
  • Common Post-Setup Tasks
  • Loading Your First Files
  • Verifying Hive and Pig
  • Summary

Part III Storing and Managing Big Data

Chapter 4 HDFS, Hive, HBase, and HCatalog
  • Exploring the Hadoop Distributed File System
  • Explaining the HDFS Architecture
  • Interacting with HDFS
  • Exploring Hive: The Hadoop Data Warehouse Platform
  • Designing, Building, and Loading Tables
  • Querying Data
  • Configuring the Hive ODBC Driver
  • Exploring HCatalog: HDFS Table and Metadata Management
  • Exploring HBase: An HDFS Column-Oriented Database
  • Columnar Databases
  • Defining and Populating an HBase Table
  • Using Query Operations
  • Summary

Chapter 5 Storing and Managing Data in HDFS
  • Understanding the Fundamentals of HDFS
  • HDFS Architecture
  • NameNodes and DataNodes
  • Data Replication
  • Using Common Commands to Interact with HDFS
  • Interfaces for Working with HDFS
  • File Manipulation Commands
  • Administrative Functions in HDFS
  • Moving and Organizing Data in HDFS
  • Moving Data in HDFS
  • Implementing Data Structures for Easier Management
  • Rebalancing Data
  • Summary

Chapter 6 Adding Structure with Hive
  • Understanding Hive's Purpose and Role
  • Providing Structure for Unstructured Data
  • Enabling Data Access and Transformation
  • Differentiating Hive from Traditional RDBMS Systems
  • Working with Hive
  • Creating and Querying Basic Tables
  • Creating Databases
  • Creating Tables
  • Adding and Deleting Data
  • Querying a Table
  • Using Advanced Data Structures with Hive
  • Setting Up Partitioned Tables
  • Loading Partitioned Tables
  • Using Views
  • Creating Indexes for Tables
  • Summary

Chapter 7 Expanding Your Capability with HBase and HCatalog
  • Using HBase
  • Creating HBase Tables
  • Loading Data into an HBase Table
  • Performing a Fast Lookup
  • Loading and Querying HBase
  • Managing Data with HCatalog
  • Working with HCatalog and Hive
  • Defining Data Structures
  • Creating Indexes
  • Creating Partitions
  • Integrating HCatalog with Pig and Hive
  • Using HBase or Hive as a Data Warehouse
  • Summary

Part IV Working with Your Big Data

Chapter 8 Effective Big Data ETL with SSIS, Pig, and Sqoop
  • Combining Big Data and SQL Server Tools for Better Solutions
  • Why Move the Data?
  • Transferring Data Between Hadoop and SQL Server
  • Working with SSIS and Hive
  • Connecting to Hive
  • Configuring Your Packages
  • Loading Data into Hadoop
  • Getting the Best Performance from SSIS
  • Transferring Data with Sqoop
  • Copying Data from SQL Server
  • Copying Data to SQL Server
  • Using Pig for Data Movement
  • Transforming Data with Pig
  • Using Pig and SSIS Together
  • Choosing the Right Tool
  • Use Cases for SSIS
  • Use Cases for Pig
  • Use Cases for Sqoop
  • Summary

Chapter 9 Data Research and Advanced Data Cleansing with Pig and Hive
  • Getting to Know Pig
  • When to Use Pig
  • Taking Advantage of Built-in Functions
  • Executing User-defined Functions
  • Using UDFs
  • Building Your Own UDFs for Pig
  • Using Hive
  • Data Analysis with Hive
  • Types of Hive Functions
  • Extending Hive with Map-reduce Scripts
  • Creating a Custom Map-reduce Script
  • Creating Your Own UDFs for Hive
  • Summary

Part V Big Data and SQL Server Together

Chapter 10 Data Warehouses and Hadoop Integration
  • State of the Union
  • Challenges Faced by Traditional Data Warehouse Architectures
  • Technical Constraints
  • Business Challenges
  • Hadoop's Impact on the Data Warehouse Market
  • Keep Everything
  • Code First (Schema Later)
  • Model the Value
  • Throw Compute at the Problem
  • Introducing Parallel Data Warehouse (PDW)
  • What Is PDW?
  • Why Is PDW Important?
  • How PDW Works
  • Project Polybase
  • Polybase Architecture
  • Business Use Cases for Polybase Today
  • Speculating on the Future for Polybase
  • Summary

Chapter 11 Visualizing Big Data with Microsoft BI
  • An Ecosystem of Tools
  • Excel
  • PowerPivot
  • Power View
  • Power Map
  • Reporting Services
  • Self-service Big Data with PowerPivot
  • Setting Up the ODBC Driver
  • Loading Data
  • Updating the Model
  • Adding Measures
  • Creating Pivot Tables
  • Rapid Big Data Exploration with Power View
  • Spatial Exploration with Power Map
  • Summary

Chapter 12 Big Data Analytics
  • Data Science, Data Mining, and Predictive Analytics
  • Data Mining
  • Predictive Analytics
  • Introduction to Mahout
  • Building a Recommendation Engine
  • Getting Started
  • Running a User-to-user Recommendation Job
  • Running an Item-to-item Recommendation Job
  • Summary

Chapter 13 Big Data and the Cloud
  • Defining the Cloud
  • Exploring Big Data Cloud Providers
  • Amazon
  • Microsoft
  • Setting Up a Big Data Sandbox in the Cloud
  • Getting Started with Amazon EMR
  • Getting Started with HDInsight
  • Storing Your Data in the Cloud
  • Storing Data
  • Uploading Your Data
  • Exploring Big Data Storage Tools
  • Integrating Cloud Data
  • Other Cloud Data Sources
  • Summary

Chapter 14 Big Data in the Real World
  • Common Industry Analytics
  • Telco
  • Energy
  • Retail
  • Data Services
  • IT/Hosting Optimization
  • Marketing Social Sentiment
  • Operational Analytics
  • Failing Fast
  • A New Ecosystem of Technologies
  • User Audiences
  • Summary

Part VI Moving Your Big Data Forward

Chapter 15 Building and Executing Your Big Data Plan
  • Gaining Sponsor and Stakeholder Buy-In
  • Problem Definition
  • Scope Management
  • Stakeholder Expectations
  • Defining the Criteria for Success
  • Identifying Technical Challenges
  • Environmental Challenges
  • Challenges in Skillset
  • Identifying Operational Challenges
  • Planning for Setup/Configuration
  • Planning for Ongoing Maintenance
  • Going Forward
  • The Hand Off to Operations
  • After Deployment
  • Summary

Chapter 16 Operational Big Data Management
  • Hybrid Big Data Environments: Cloud and On-Premise Solutions Working Together
  • Ongoing Data Integration with Cloud and On-Premise Solutions
  • Integration Thoughts for Big Data
  • Backups and High Availability Your Big Data Environment
  • High Availability
  • Disaster Recovery
  • Big Data Solution Governance
  • Creating Operational Analytics
  • System Center Operations Manager for HDP
  • Installing the Ambari SCOM Management Pack
  • Monitoring with the Ambari SCOM Management Pack
  • Summary

Index

Adam Jorgensen is the President of Pragmatic Works and the Executive Vice President of PASS. He has extensive experience with data warehousing, analytics, and NoSQL architectures.

James Rowland-Jones is a principal consultant for The Big Bang Data Company. He specializes in big data warehouse solutions that leverage SQL Server Parallel Data Warehouse and Hadoop ecosystems.

John Welch is Vice President of Software Development at Pragmatic Works, where he leads the development of a suite of BI and data products for SQL Server and related technologies.

Dan Clark is a senior BI consultant for Pragmatic Works. Dan has published several books and numerous articles on .NET programming and BI development.

Christopher Price is a senior consultant with Microsoft. His focus is on ETL, data integration, data quality, MDM, SSAS, SharePoint, and all things big data.

Brian Mitchell is the lead architect of the Microsoft Big Data Center of Expertise. He focuses exclusively on DW/BI solutions.