Cody's Data Cleaning Techniques Using SAS, Second Edition 2nd Revised ed. [Paperback]

4.19/5 (51 ratings from Goodreads)

Ron Cody

Format: Paperback / softback, 268 pages, height x width x thickness: 235x191x14 mm, weight: 467 g, 1, black & white illustrations
Publication date: 01-May-2008
Publisher: SAS Publishing
ISBN-10: 1599946599
ISBN-13: 9781599946597

Other books on this subject:

Mathematical & statistical software - (Currently in store: 1 title)

Permalink: https://www.kriso.ee/db/9781599946597.html

Keywords:

Electronic data processing - Data preparation

Now a retired professor from the Robert Wood Johnson Medical School, Cody is a private consultant and national instructor for SAS, and the author or coauthor of numerous books on SAS. He offers novice and experienced SAS programmers a practical guide to detecting and correcting data errors while learning to apply DATA step programming techniques and SAS procedures. The material has been updated to cover the many new functions in SAS, and includes a new chapter on integrity constraints and audit trails, several macros to make data cleaning tasks easier, and a short description of a SAS product called DataFlux for performing advanced data cleaning techniques such as address standardization and fuzzy matching. Annotation ©2008 Book News, Inc., Portland, OR (booknews.com)

Thoroughly updated for SAS 9, this second edition addresses a task that nearly every SAS programmer needs to do: making sure that data errors are located and corrected. Written in Ron Cody's signature informal, tutorial style, this book develops and demonstrates data cleaning programs and macros that you can use as written or modify for your own special data cleaning needs. Each topic is developed through specific examples, and every program and macro is explained in detail.

You'll learn how to:

- find and correct errors in character and numeric values
- develop programming techniques related to dates and missing values
- use SQL approaches to data cleaning
- develop techniques for correcting your data errors
- use integrity constraints and audit trails to prevent errors from being added to a clean data set

Novice and experienced SAS users will discover ways to detect and correct data errors while learning how to apply DATA step programming techniques and SAS procedures.

SAS Products and Releases:

- Base SAS: 9.2, 9.1.3, 9.1.2, 9.1, 9.0
- SAS/STAT: 9.2, 9.1.3, 9.1.2, 9.1, 9.0

Operating Systems: All

List of Programs

Preface

Acknowledgments

Checking Values of Character Variables

- Introduction
- Using PROC FREQ to List Values
- Description of the Raw Data File Patients.txt
- Using a DATA Step to Check for Invalid Values
- Describing the VERIFY, TRIM, MISSING, and NOTDIGIT Functions
- Using PROC PRINT with a WHERE Statement to List Invalid Values
- Using Formats to Check for Invalid Values
- Using Informats to Remove Invalid Values

Checking Values of Numeric Variables

- Introduction
- Using PROC MEANS, PROC TABULATE, and PROC UNIVARIATE to Look for Outliers
- Using an ODS SELECT Statement to List Extreme Values
- Using PROC UNIVARIATE Options to List More Extreme Observations
- Using PROC UNIVARIATE to Look for Highest and Lowest Values by Percentage
- Using PROC RANK to Look for Highest and Lowest Values by Percentage
- Presenting a Program to List the Highest and Lowest Ten Values
- Presenting a Macro to List the Highest and Lowest "n" Values
- Using PROC PRINT with a WHERE Statement to List Invalid Data Values
- Using a DATA Step to Check for Out-of-Range Values
- Identifying Invalid Values versus Missing Values
- Listing Invalid (Character) Values in the Error Report
- Creating a Macro for Range Checking
- Checking Ranges for Several Variables
- Using Formats to Check for Invalid Values
- Using Informats to Filter Invalid Values
- Checking a Range Using an Algorithm Based on Standard Deviation
- Detecting Outliers Based on a Trimmed Mean and Standard Deviation
- Presenting a Macro Based on Trimmed Statistics
- Using the TRIM Option of PROC UNIVARIATE and ODS to Compute Trimmed Statistics
- Checking a Range Based on the Interquartile Range

Checking for Missing Values

- Introduction
- Inspecting the SAS Log
- Using PROC MEANS and PROC FREQ to Count Missing Values
- Using DATA Step Approaches to Identify and Count Missing Values
- Searching for a Specific Numeric Value
- Creating a Macro to Search for Specific Numeric Values

Working with Dates

- Introduction
- Checking Ranges for Dates (Using a DATA Step)
- Checking Ranges for Dates (Using PROC PRINT)
- Checking for Invalid Dates
- Working with Dates in Nonstandard Form
- Creating a SAS Date When the Day of the Month Is Missing
- Suspending Error Checking for Known Invalid Dates

Looking for Duplicates and "n" Observations per Subject

- Introduction
- Eliminating Duplicates by Using PROC SORT
- Detecting Duplicates by Using DATA Step Approaches
- Using PROC FREQ to Detect Duplicate IDs
- Selecting Patients with Duplicate Observations by Using a Macro List and SQL
- Identifying Subjects with "n" Observations Each (DATA Step Approach)
- Identifying Subjects with "n" Observations Each (Using PROC FREQ)

Working with Multiple Files

- Introduction
- Checking for an ID in Each of Two Files
- Checking for an ID in Each of "n" Files
- A Macro for ID Checking
- More Complicated Multi-File Rules
- Checking That the Dates Are in the Proper Order

Double Entry and Verification (PROC COMPARE)

- Introduction
- Conducting a Simple Comparison of Two Data Sets
- Using PROC COMPARE with Two Data Sets That Have an Unequal Number of Observations
- Comparing Two Data Sets When Some Variables Are Not in Both Data Sets

Some PROC SQL Solutions to Data Cleaning

- Introduction
- A Quick Review of PROC SQL
- Checking for Invalid Character Values
- Checking for Outliers
- Checking a Range Using an Algorithm Based on the Standard Deviation
- Checking for Missing Values
- Range Checking for Dates
- Checking for Duplicates
- Identifying Subjects with "n" Observations Each
- Checking for an ID in Each of Two Files
- More Complicated Multi-File Rules

Correcting Errors

- Introduction
- Hardcoding Corrections
- Describing Named Input
- Reviewing the UPDATE Statement

Creating Integrity Constraints and Audit Trails

- Introducing SAS Integrity Constraints
- Demonstrating General Integrity Constraints
- Deleting an Integrity Constraint Using PROC DATASETS
- Creating an Audit Trail Data Set
- Demonstrating an Integrity Constraint Involving More than One Variable
- Demonstrating a Referential Constraint
- Attempting to Delete a Primary Key When a Foreign Key Still Exists
- Attempting to Add a Name to the Child Data Set
- Demonstrating the Cascade Feature of a Referential Constraint
- Demonstrating the Set Null Feature of a Referential Constraint
- Demonstrating How to Delete a Referential Constraint

DataFlux and dfPower Studio

- Introduction
- Examples

Appendix: Listing of Raw Data Files and SAS Programs

- Programs and Raw Data Files Used in This Book
- Description of the Raw Data File Patients.txt
- Layout for the Data File Patients.txt
- Listing of Raw Data File Patients.txt
- Program to Create the SAS Data Set Patients
- Listing of Raw Data File Patients2.txt
- Program to Create the SAS Data Set Patients2
- Program to Create the SAS Data Set AE (Adverse Events)
- Program to Create the SAS Data Set LAB_Test
- Listings of the Data Cleaning Macros Used in This Book

Index

