Preparing Data for Analysis with JMP [Paperback]

  • Format: Paperback / softback, 216 pages, height x width x thickness: 235x191x12 mm, weight: 381 g
  • Publication date: 01-May-2017
  • Publisher: SAS Institute
  • ISBN-10: 1629604186
  • ISBN-13: 9781629604183
Access and clean up data easily using JMP®! Data acquisition and preparation commonly consume approximately 75% of the effort and time of total data analysis. JMP provides many visual, intuitive, and even innovative data-preparation capabilities that enable you to make the most of your organization's data.

Preparing Data for Analysis with JMP® is organized within a framework of statistical investigations and model-building and illustrates the new data-handling features in JMP, such as the Query Builder. Useful to students and programmers with little or no JMP experience, or those looking to learn the new data-management features and techniques, it uses a practical approach to getting started with plenty of examples. Using step-by-step demonstrations and screenshots, this book walks you through the most commonly used data-management techniques and includes many tips on how to avoid common problems.

With this book, you will learn how to:
  • Manage database operations using the JMP Query Builder
  • Get data into JMP from other formats, such as Excel, CSV, SAS, HTML, JSON, and the web
  • Identify and avoid problems with the help of JMP’s visual and automated data-exploration tools
  • Consolidate data from multiple sources with Query Builder for tables
  • Deal with common issues and repairs that include the following tasks:
      • reshaping tables (stack/unstack)
      • managing missing data with techniques such as imputation and Principal Components Analysis
      • cleaning and correcting dirty data
      • computing new variables
      • transforming variables for modelling
      • reconciling time and date
  • Subset and filter your data
  • Save data tables for exchange with other platforms

An introduction to using JMP to manage data for analysis. The book is organized within a framework of statistical investigations and model-building, where data acquisition and preparation commonly consume approximately 75% of the effort and time, and it illustrates the new data-handling features in JMP, such as Query Builder.
About This Book
About The Author
Chapter 1 Data Management in the Analytics Process
  Introduction
  A Continuous Process
  Asking Questions That Data Can Help to Answer
  Sourcing Relevant Data
  Reproducibility
  Combining and Reconciling Multiple Sources
  Identifying and Addressing Data Issues
  Data Requirements Shaped by Modeling Strategies
  Plan of the Book
  Conclusion
  References
Chapter 2 Data Management Foundations
  Introduction
  Matching Form to Function
  JMP Data Tables
  Data Types and Modeling Types
  Data Types
  Modeling Types
  Basics of Relational Databases
  Conclusion
  References
Chapter 3 Sources of Data and Their Challenges
  Introduction
  Internal Data in Flat Files
  Relational Databases
  External Data on the World Wide Web
  User-Facing Query Interfaces
  Tabular Data Pages
  Evolving WWW Data Standards
  Ethical and Legal Considerations
  Conclusion
  References
Chapter 4 Single Files
  Introduction
  Review of JMP File Types
  Common Formats Other than JMP
  MS Excel
  Text Files
  SAS Files
  Other Data File Formats
  Conclusion
  References
Chapter 5 Database Queries
  Introduction
  Sample Databases in This Chapter
  Connecting to a Database
  Extracting Data from One Table in a Database
  Import an Entire Table
  Import a Subset of a Table
  Querying a Database from JMP
  Query Builder
  An Illustrative Scenario: Bicycle Parts
  Designing a Query with Query Builder
  Query Builder for SAS Server Data
  Conclusion
  References
Chapter 6 Importing Data from Websites
  Introduction
  Variety of Web Formats
  Internet Open
  Common Issues to Anticipate
  Conclusion
  References
Chapter 7 Reshaping a Data Table
  Introduction
  What Shape Is a Data Table?
  Wide versus Long Format
  Reasons for Wide and Long Formats
  Stacking Wide Data
  Unstacking Narrow Data
  Additional Examples
  Stacking Wide Data
  Scripting for Reproducibility
  Splitting Long Data
  Transposing Rows and Columns
  Reshaping the WDI Data
  Conclusion
  References
Chapter 8 Joining, Subsetting, and Filtering
  Introduction
  Combining Data from Multiple Tables with Join
  Saving Memory with a Virtual Join
  Why and How to Select a Subset
  A Brief Detour: Creating a New Column from an Existing Column
  Row Filters: Global and Local
  Global Filter
  Local Filter
  A More Durable Subset
  Combining Rows with Concatenate
  Query Builder for Tables
  Back to the Movies
  Olympic Medals and Development Indicators
  Conclusion
  References
Chapter 9 Data Exploration: Visual and Automated Tools to Detect Problems
  Introduction
  Common Issues to Anticipate
  On the Hunt for Dirty Data
  Distribution
  Columns Viewer
  Multivariate (Correlations and Scatterplot Matrix)
  More Tools within the Multivariate Platform
  Principal Components
  Outlier Analysis
  Item Reliability
  Explore Outliers
  Quantile Range Outliers
  Robust Fit Outliers
  Multivariate Robust Outliers
  Multivariate k-Nearest Neighbors Outliers
  Explore Missing
  Conclusion
  References
Chapter 10 Missing Data Strategies
  Introduction
  Much Ado about Nothing?
  Four Basic Approaches
  Working with Complete Cases
  Analysis with Sampling Weights
  Imputation-based Methods
  Recode
  Informative Missing
  Multivariate Normal Imputation
  Multivariate SVD Imputation
  Special Considerations for Time Series
  Conclusion and a Note of Caution
  References
Chapter 11 Data Preparation for Analysis
  Introduction
  Common Issues and Appropriate Strategies
  Distribution of Observations
  Noisy Data
  Skewness or Outliers
  Scale Differences among Model Variables
  Too Many Levels of a Categorical Variable
  High Dimensionality: Abundance of Columns
  Correlated or Redundant Variables
  Missing or Sparse Observations across Columns
  A PCA Example
  Abundance of Rows
  Partitioning into Training, Validation, and Test Sets
  Aggregating Rows with Summary Tables
  Oversampling Rare Events
  Date and Time-Related Issues
  Formatting Dates and Times
  Some Date Functions: Extracting Parts
  Aggregation
  Row Functions Especially Useful in Time-Ordered Data
  Elapsed Time and Date Arithmetic
  Conclusion
  References
Chapter 12 Exporting Work to Other Platforms
  Introduction
  Why Export or Exchange Data?
  Fit the Method to the Purpose
  Save As
  Export to a Database
  Export to a SAS Library
  Exporting Reports
  Interactive Graphics
  Static Images: Graphics Formats, PowerPoint, and Word
  Conclusion
  References
Index