
E-book: Applied Regression and ANOVA Using SAS

Patricia F. Moodie, Dallas E. Johnson (Kansas State University, Manhattan, Kansas, USA)
  • Extent: 428 pages
  • Publication date: 07-Jun-2022
  • Publisher: Chapman & Hall/CRC
  • Language: English
  • ISBN-13: 9781439869529
  • Format: PDF+DRM
  • Price: 59.79 €*
  • * The price is final, i.e., no further discounts apply.
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital Rights Management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You also need to create an Adobe ID (more information here). The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you need to install Adobe Digital Editions. (This is a free application designed specifically for reading e-books. It should not be confused with Adobe Reader, which is most likely already installed on your computer.)

    This e-book cannot be read on an Amazon Kindle.

Applied Regression and ANOVA Using SAS® has been written specifically for non-statisticians and applied statisticians who are primarily interested in what their data are revealing. Interpretation of results is key throughout this intermediate-level applied statistics book. The authors introduce each method by discussing its characteristic features, reasons for its use, and its underlying assumptions. They then guide readers in applying each method by suggesting a step-by-step approach while providing annotated SAS programs to implement these steps.

Those unfamiliar with SAS software will find this book helpful as SAS programming basics are covered in the first chapter. Subsequent chapters give programming details on a need-to-know basis. Experienced as well as entry-level SAS users will find the book useful in applying linear regression and ANOVA methods, as explanations of SAS statements and options chosen for specific methods are provided.
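
To give a flavour of the annotated programs just described, here is a minimal sketch of a simple linear regression fitted with the REG procedure. It is an illustration under assumed inputs, not a program from the book; the data set BP and the variables AGE and SBP are hypothetical stand-ins.

    /* Hypothetical example data: age and systolic blood pressure */
    data bp;
       input age sbp;              /* read one subject per line */
       datalines;
    35 118
    42 125
    58 140
    63 152
    ;
    run;

    ods graphics on;               /* enable high-resolution plots */
    proc reg data=bp;
       model sbp = age / clb;      /* regress SBP on AGE; CLB requests
                                      confidence limits for the
                                      regression coefficients */
    run;
    quit;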

Features:

  • Statistical concepts presented in words, without matrix algebra and calculus
  • Numerous SAS programs, including examples that require minimal programming effort to produce high-resolution, publication-ready graphics
  • Practical advice on interpreting results in light of relatively recent views on threshold p-values, multiple testing, simultaneous confidence intervals, confounding adjustment, bootstrapping, and predictor variable selection (a brief sketch of one such adjustment follows this list)
  • Suggestions of alternative approaches when a method's ideal inference conditions are unreasonable for one's data
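
As a hedged illustration of the multiple-testing and simultaneous-confidence-interval topics above, the sketch below shows Tukey-Kramer adjusted pairwise comparisons of group means with the GLM procedure. It is not a program from the book; the data set TASKS and the variables GROUP and SCORE are hypothetical names assumed for the example.

    /* Assumed to exist: data set TASKS with a qualitative GROUP
       variable and a continuous SCORE response (hypothetical names) */
    proc glm data=tasks;
       class group;                /* declare GROUP as qualitative */
       model score = group;        /* one-way fixed effects ANOVA  */
       lsmeans group / pdiff adjust=tukey cl;
                                   /* Tukey-Kramer adjusted p-values and
                                      simultaneous confidence intervals
                                      for all pairwise mean differences */
    run;
    quit;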

This book is invaluable for non-statisticians and applied statisticians who analyze and interpret real-world data. It could be used in a graduate-level course for non-statistical disciplines as well as in an applied undergraduate course in statistics or biostatistics.

Reviews

"... A must for someone that wants to work with theaforementioned models using SAS and wants a step-by-step guide on how and when toimplement those models. Each chapter is organized in a very similar manner. Itprovides theminimum amount of theory in a non-technical way at first, including when to use a specificmodel, what should be checked as assumptions and what to do when assumptions are not met."

David Manteigas, ISCB News, May 2024

Preface xix
Author Biography xxi
1 Review of Some Basic Statistical Ideas
1(28)
1.1 Introducing Regression Analysis
1(1)
1.2 Classification of Variables
2(2)
1.2.1 Quantitative vs. Qualitative Variables
2(1)
1.2.2 Scale of Measurement Classification
2(1)
1.2.3 Overview of Variable Classification and the Methods to be Presented in this Book
3(1)
1.3 Probability Distributions
4(1)
1.3.1 The Normal Distribution
4(1)
1.4 Statistical Inference
5(1)
1.5 Missing Data
6(1)
1.6 Estimating Population Parameters Using Sample Data
7(1)
1.7 Basic Steps of Hypothesis Testing
8(3)
1.8 Why an Observed p-Value Should Be Interpreted with Caution
11(1)
1.9 A World Beyond 0.05?
12(1)
1.10 Caveats Regarding a Confidence Interval Estimate for an Effect Size
13(2)
1.11 Type I and Type II Errors, Power, and Robustness
15(1)
1.11.1 Type I and Type II errors
15(1)
1.12 Basic Types of Research Studies
16(3)
1.12.1 Observational Studies
16(1)
1.12.2 Controlled Randomized Trials
17(1)
1.12.3 Quasi-experiments
18(1)
1.13 Why Not Always Conduct a Controlled Randomized Trial?
19(1)
1.14 Importance of Screening for Data Entry Errors
19(1)
1.15 Example 1.1: Environmental Impact Study of a Mine Site
19(1)
1.16 Some SAS Basics: Key Commonalities that Apply to all SAS Programs
20(2)
1.16.1 Temporary and Permanent SAS Data Sets
22(1)
1.17 Chapter Summary
22(7)
Appendix
24(1)
1.A Program 1.1
24(1)
1.A.1 Explanation of SAS Statements in Program 1.1
24(2)
1.B The SAS Log
26(1)
1.C Listing of Data in the SAS Data File MINESET
27(2)
2 Introduction to Simple Linear Regression
29(18)
2.1 Characteristic Features
29(1)
2.2 Why Use Simple Linear Regression?
29(1)
2.3 Example 2.1: Systolic Blood Pressure and Age
30(1)
2.4 Rationale Underlying a Simple Linear Regression Model
30(1)
2.5 An Equation for the Simple Linear Regression Model
31(1)
2.6 An Alternative Form of the Simple Linear Regression Model
32(1)
2.7 How Simple Linear Regression is Used with Sample Data
32(2)
2.7.1 Properties of Least Squares Estimators
34(1)
2.8 Prediction of a Y Value for an Individual having a Particular X Value
34(1)
2.9 Estimating the Mean Response in the Sampled Population Subgroup Having a Particular X Value
35(1)
2.10 Assessing Accuracy with Prediction Intervals and Confidence Intervals
35(1)
2.11 Ideal Inference Conditions for Simple Linear Regression
36(2)
2.12 Potential Consequences of Violations of Ideal Inference Conditions
38(1)
2.12.1 How Concerned Should We Be About Violations of Ideal Inference Conditions?
38(1)
2.13 What Researchers Need to Know about the Variance of Y Given X
39(1)
2.13.1 An Equation for Mean Square Error, A Model-Dependent Estimator of the Variance of Y given X
40(1)
2.14 Fixed vs. Random X in Simple Linear Regression
40(1)
2.15 Chapter Summary
41(6)
Appendix
42(1)
2.A Program 2.1
42(1)
2.A.1 Explanation of SAS Statements in Program 2.1
43(2)
2.A.2 A Message in SAS Log for Program 2.1
45(2)
3 Model Checking in Simple Linear Regression
47(42)
3.1 General Introduction
47(1)
3.2 Regression Outliers
47(2)
3.3 Influential Cases
49(1)
3.4 Residuals as Diagnostic Tools for Model Checking
50(1)
3.5 Obtaining Residuals from the REG Procedure
50(1)
3.5.1 Introduction
50(1)
3.5.2 Some Results from the REG Procedure for the Mine Site Example
50(1)
3.6 Raw (Unscaled) Residuals
51(2)
3.6.1 Example of a Raw Residual
51(1)
3.6.2 Estimated Standard Error (Standard Deviation) of a Raw Residual
52(1)
3.6.3 Difficulties Associated with Using Raw Residuals as a Diagnostic Tool
52(1)
3.7 Internally Studentized Residuals
53(1)
3.7.1 Advantages of Internally Studentized Residuals in Evaluating Ideal Inference Conditions
54(1)
3.8 Studentized Deleted Residuals
54(2)
3.8.1 Using a Studentized Deleted Residual to Identify a Regression Outlier
55(1)
3.8.2 Example
55(1)
3.8.3 Advantages of Studentized Deleted Residuals in Regression Outlier Detection
56(1)
3.8.4 Caveat Regarding Studentized Deleted Residuals in Outlier Detection
56(1)
3.9 Graphical Evaluation of Ideal Inference Conditions in Simple Linear Regression
56(8)
3.9.1 Graphical Evaluation of Independence of Errors
56(1)
3.9.2 Graphical Evaluation of Linearity and Equality of Variances
57(2)
3.9.3 Graphical Evaluation of Normality
59(5)
3.10 Normality Tests and Gauging the Impact of Non-Normality
64(1)
3.11 Homogeneity of Variance Tests and Gauging the Impact of Unequal Variances in Regression
64(1)
3.12 Screening for Outliers to Detect Possible Recording Errors in Simple Linear Regression
65(1)
3.13 Overview of a Step-by-Step Approach for Checking Ideal Inference Conditions in Simple Linear Regression
66(2)
3.14 Example of Evaluating Ideal Inference Conditions
68(4)
3.15 Checking for Influential Cases
72(1)
3.16 DFBETA: Influence on an Estimated Regression Coefficient
73(1)
3.16.1 General Equation for DFBETA
73(1)
3.16.2 Equation for DFBETA for β̂1 in Simple Linear Regression
73(1)
3.16.3 Interpretation of DFBETA
74(1)
3.17 DFFITS: Influence of a Case on its Own Predicted Value
74(1)
3.17.1 Equation for DFFITS
74(1)
3.17.2 Interpretation of DFFITS
75(1)
3.18 Cook's Distance: Influence of a Case on All Predicted Values in the Sample
75(1)
3.18.1 Equation for Cook's D
75(1)
3.18.2 Interpretation of Cook's Distance
76(1)
3.19 Caveats Regarding Cut-Off Values for Measures of Influence
76(1)
3.20 Detecting Multiple Influential Cases That Occur in a Cluster
76(1)
3.21 Checking for Potentially Influential Cases in Example 1.1
77(2)
3.21.1 DFBETAS Results for Example 1.1
77(1)
3.21.2 DFFITS Results for Example 1.1
78(1)
3.21.3 Cook's D Results for Example 1.1
79(1)
3.22 Discussion and Conclusion Regarding Influence in Example 1.1
79(1)
3.23 Summary of Model Checking for Example 1.1
80(1)
3.24 Chapter Summary
80(9)
Appendix
82(1)
3.A Program 3.1A
82(1)
3.A.1 Explanation of SAS Statements in Program 3.1A Not Encountered in Previous Programs
82(1)
3.B Program 3.1B
83(1)
3.B.1 Explanation of SAS statements in Program 3.1B
84(1)
3.C Program 3.1C
85(2)
3.C.1 Explanation of Program 3.1C
87(2)
4 Interpreting a Simple Linear Regression Analysis
89(16)
4.1 Introduction
89(1)
4.2 A Basic Question to Ask in Simple Linear Regression
89(1)
4.3 The Model F-Test
90(1)
4.3.1 The Test Statistic for the Model F-Test
90(1)
4.3.2 Mine Site Example
90(1)
4.4 The t-Test of β1 = 0 vs. β1 ≠ 0 in the Sampled Population
91(1)
4.4.1 Example
92(1)
4.5 Possible Interpretations of a Large p-Value from a Model F-Test or Equivalent t-Test of β1 = 0 vs. β1 ≠ 0 in Simple Linear Regression
92(1)
4.6 Possible Interpretations of a Small p-Value from a Model F-Test or Equivalent t-Test of β1 = 0 vs. β1 ≠ 0 in Simple Linear Regression
93(1)
4.7 Evaluating the Extent of the Usefulness of X for Explaining Y in a Simple Linear Regression Model
94(1)
4.8 A Confidence Interval Estimate for β1, the Regression Coefficient for X in Simple Linear Regression
94(2)
4.8.1 Interpreting a 95% Confidence Interval Estimate for β1
95(1)
4.8.2 Example
95(1)
4.9 R2, the Coefficient of Determination
96(1)
4.9.1 An Equation for R2
96(1)
4.9.2 Some Issues Interpreting R2
97(1)
4.9.3 Example
97(1)
4.10 Root Mean Square Error
97(1)
4.11 Coefficient of Variation for the Model
98(1)
4.12 Estimating a Confidence Interval for the Subpopulation Mean of Y Given a Specified X Value
98(1)
4.12.1 Example
98(1)
4.12.2 Equation for Estimating the Endpoints of a 95% Conventional Confidence Interval for the Mean of Y for a Given X Value
99(1)
4.13 Estimating a Prediction Interval for an Individual Value of Y at a Particular X Value
99(2)
4.13.1 Example
100(1)
4.13.2 Equation for Estimating the Endpoints of a 95% Conventional Prediction Interval for an Individual Y Given X
100(1)
4.14 Concluding Comments
101(1)
4.15 Chapter Summary
102(3)
Appendix
103(1)
4.A Program 4.1
103(2)
5 Introduction to Multiple Linear Regression
105(12)
5.1 Characteristic Features of a Multiple Linear Regression Model
105(1)
5.2 Why Use a Multiple Linear Regression Model?
105(1)
5.3 Example 5.1
106(1)
5.4 Equation for a First Order Multiple Linear Regression Model
107(1)
5.5 Alternate Equation for a First Order Multiple Linear Regression Model
107(1)
5.6 How Multiple Linear Regression is Used with Sample Data
107(2)
5.7 Estimation of a Y Value for an Individual from the Sampled Population
109(1)
5.7.1 Equation for Estimating a Y Value for an Individual from the Sampled Population
109(1)
5.7.2 Example
109(1)
5.8 Using Multiple Linear Regression to Estimate the Mean Y Value in a Population Subgroup with Particular Values for the Explanatory Variables
110(1)
5.9 Assessing Accuracy of Predicted Values
110(1)
5.10 Ideal Inference Conditions for a Multiple Linear Regression Model
110(1)
5.11 Measurement Error in Explanatory Variables in Multiple Linear Regression
111(1)
5.12 Collinearity -- Why Worry?
112(1)
5.13 Why the Estimation of Variance of Y Given the X Variables is Critical in Multiple Linear Regression
113(1)
5.14 Chapter Summary
114(3)
Appendix
115(1)
5.A Program 5.1
115(1)
5.B Example 5.1 Data Values
116(1)
6 Before Interpreting a Multiple Linear Regression Analysis
117(20)
6.1 Introduction
117(1)
6.2 Evaluating Collinearity
117(3)
6.2.1 A Method for Diagnosing Collinearity
118(1)
6.2.2 Interpreting Collinearity Diagnostics from the Method Proposed by Belsley (1991)
118(1)
6.2.3 Variance Inflation Factor: Another Measure of Collinearity
119(1)
6.2.4 Further Comments Regarding Collinearity Diagnosis
120(1)
6.3 A Ten-Step Approach for Checking Ideal Inference Conditions in Multiple Linear Regression
120(3)
6.4 Example -- Evaluating Ideal Inference Conditions
123(4)
6.5 Identifying Influential Cases
127(1)
6.6 Summary of Results Regarding Potentially Influential Cases based on Cook's D, DFBETAS, and DFFITS Values Output from Program 6.1C
128(3)
6.6.1 DFBETAS
129(1)
6.6.2 DFFITS
129(2)
6.6.3 Potentially Influential Cases
131(1)
6.7 Chapter Summary
131(6)
Appendix
133(1)
6.A Program 6.1A
133(1)
6.B Program 6.1B
134(1)
6.C Program 6.1C
135(2)
7 Interpreting an Additive Multiple Linear Regression Model
137(20)
7.1 Introduction
137(1)
7.2 Height at Age 18 Example
137(1)
7.3 The Model F-Test
138(2)
7.3.1 Model F-Test and Some Other Results Generated by the Model Statement of the REG Procedure for the Height at Age 18 Example
138(1)
7.3.2 Interpretation When the Model F-Test Statistic for a Multiple Linear Regression Model Has a Small p-Value
139(1)
7.3.3 Possible Interpretations When the Model F-Test Statistic for an Additive Multiple Linear Regression Model Has a Large p-Value
140(1)
7.4 R2 in Multiple Linear Regression
140(1)
7.4.1 Equation
140(1)
7.4.2 Issues Regarding R2
141(1)
7.4.3 Example
141(1)
7.5 Adjusted R2
141(1)
7.5.1 Equation for Adjusted R2
141(1)
7.5.2 Issues Regarding Adjusted R2
142(1)
7.5.3 Example
142(1)
7.6 Root Mean Square Error
142(1)
7.7 Coefficient of Variation for the Model
142(1)
7.8 Partial t-Tests in a Multiple Linear Regression Model
143(2)
7.8.1 Equation of the Test Statistic for a Partial t-Test of a Regression Coefficient
143(1)
7.8.2 Partial t-Test Examples
144(1)
7.9 Partial t-Tests and the Issue of Multiple Testing
145(1)
7.10 The Model F-Test, Partial t-Tests, and Collinearity
145(1)
7.11 Why Estimate a Confidence Interval for βj
145(2)
7.11.1 Example
146(1)
7.11.2 Equation for a Conventional Confidence Interval Estimate for βj
146(1)
7.12 Estimating a Conventional Confidence Interval for the Mean Y Value in a Population Subgroup with a Particular Combination of the Predictor Variables
147(1)
7.12.1 Example
147(1)
7.13 Estimating a Prediction Interval for an Individual Y Value Given Specified Values for the Predictor Variables
148(1)
7.14 Further Evaluation of Influence in the Height at Age 18 Example
149(1)
7.15 Extrapolation in Multiple Regression
150(3)
7.15.1 Delivery Time Data Example
151(1)
7.15.2 Cases Sorted by Hat Values from Program 7.2
151(2)
7.16 Chapter Summary
153(4)
Appendix
154(1)
7.A Program 7.1
154(1)
7.B Program 7.2
154(1)
7.B.1 Explanation of Program 7.2
155(2)
8 Modelling a Two-Way Interaction Between Continuous Predictors in Multiple Linear Regression
157(24)
8.1 Introduction to a Two-Way Interaction
157(1)
8.2 A Two-Way Interaction Model in Multiple Linear Regression
158(2)
8.3 Investigating a Two-Way Interaction Using Sample Data
160(1)
8.4 Example 8.1: Physical Endurance as a Linear Function of Age, Exercise, and their Interaction
161(1)
8.5 Facilitating Interpretation in Two-Way Interaction Regression via Centring
161(2)
8.5.1 Example of Centring to Facilitate Interpretation
163(1)
8.6 A Suggested Approach for Applying a Multiple Linear Regression Two-Way Interaction Model with Two Continuous Predictors
163(2)
8.7 An Example Using the Suggested Approach for Applying an Interaction Model with Two Continuous Predictors
165(1)
8.8 Results from Application of a Multiple Regression Interaction Analysis with Centred AGE and Uncentred EXER in the Physical Endurance Example
166(1)
8.9 Interpretation of Regression Coefficient Estimates for the Interaction Model Fitted to Example 8.1
167(1)
8.10 Visualization of the Interaction Effect Observed in Example 8.1
168(2)
8.11 Predicting Endurance on a Treadmill
170(1)
8.12 Chapter Summary
171(10)
Appendix
173(1)
8.A Summary of Evaluation of Ideal Inference Conditions for the Physical Endurance Example with a Two-Way Interaction Model
173(3)
8.B Summary of Influence Diagnostics for the Physical Endurance Example with a Two-Way Interaction Model
176(1)
8.C Program 8.1
177(1)
8.D Program 8.2
178(2)
8.D.1 Explanation of SAS Statements in Program 8.2
180(1)
9 Evaluating a Two-Way Interaction Between a Qualitative and a Continuous Predictor in Multiple Linear Regression
181(20)
9.1 Introduction
181(1)
9.2 How to Include a Qualitative Variable in a Multiple Linear Regression Model
181(1)
9.3 Full Rank Reference Cell Coding
182(1)
9.4 Example 9.1: Mussel Weight
182(1)
9.5 Example Using a Nine-Step Approach for Applying a Multiple Linear Regression with a Two-Way Interaction between a Continuous and Qualitative Predictor
183(1)
9.6 Evaluating Influence in Multiple Regression Models which Involve Qualitative Variables
184(1)
9.7 Summary of Influence Evaluation in the Multiple Regression Interaction Analysis for the Mussel Example
185(1)
9.8 Results from Program 9.2: Two-Way Interaction Regression Analysis between a Qualitative Predictor (Reference Cell Coding) and a Centred Continuous Predictor
186(1)
9.9 Interpreting Results of the Model F-Test and Parameter Estimates Reported in Section 9.8
187(4)
9.9.1 Introduction
187(1)
9.9.2 Interpretation of Model F-Test
187(1)
9.9.3 Interpretation of the Estimated Regression Coefficient for AGECILOC
187(2)
9.9.4 Interpretation of the Estimated Regression Coefficient for AGEC
189(1)
9.9.5 Interpretation of the Estimated Regression Coefficient for ILOC
189(1)
9.9.6 Interpretation of the Sample Model Intercept
190(1)
9.10 Chapter Summary
191(10)
Appendix
192(1)
9.A Evaluation of Ideal Inference Conditions for the Mussel Example
192(2)
9.B Program 9.1: Getting to Know the Mussel Data
194(1)
9.B.1 Explanation of Statements in Program 9.1
195(1)
9.C Program 9.2: Two-Way Interaction Regression Analysis between a Qualitative Predictor (Reference Cell Coding) and a Centred Continuous Predictor
196(2)
9.C.1 Explanation of Selected SAS Statements in Program 9.2
198(1)
9.D Program 9.3
199(2)
10 Subset Selection of Predictor Variables in Multiple Linear Regression
201(26)
10.1 Introduction
201(1)
10.2 Under-Fitting, Over-Fitting, and the Bias-Variance Trade-Off
202(1)
10.3 Overview of Traditional Model Selection Algorithms
202(2)
10.3.1 Forward Selection
202(1)
10.3.2 Backward Elimination
203(1)
10.3.3 Stepwise Selection
203(1)
10.3.4 All Subsets Selection Algorithm
203(1)
10.3.5 The Impact of Collinearity on Traditional Sequential Selection Algorithms
203(1)
10.4 Fit Criteria Used in Model Selection Methods
204(4)
10.4.1 Adjusted R2
204(1)
10.4.2 Akaike's Information Criterion
205(1)
10.4.3 The Corrected Akaike's Information Criterion
205(1)
10.4.4 Average Square Error
206(1)
10.4.5 Mallows' Cp Statistic
206(1)
10.4.6 Mean Square Error
206(1)
10.4.7 The PRESS Statistic
207(1)
10.4.8 Schwarz's Bayesian Information Criterion
207(1)
10.4.9 Significance Level (p-Value) Criterion
207(1)
10.5 Post-Selection Inference Issues
208(1)
10.6 Predicting Percentage Body Fat: Naval Academy Example
208(3)
10.7 Model Selection and the REG Procedure
211(2)
10.8 Model Selection and the GLMSELECT Procedure
213(8)
10.8.1 Example of Subset Selection in GLMSELECT
214(1)
10.8.2 Model Information: Program 10.2
214(1)
10.8.3 Model Building Summary Results: Program 10.2
215(1)
10.8.4 Comments Regarding Model Building Summary Results
216(1)
10.8.5 Selected Model Results: Program 10.2
217(1)
10.8.6 Comments Regarding Standard Errors of the Partial Regression Coefficients for the Selected Model
218(1)
10.8.7 Average Square Error Plot: Program 10.2
218(1)
10.8.8 Criteria Panel Plot: Program 10.2
218(3)
10.9 Other Features Available in GLMSELECT
221(1)
10.10 Chapter Summary
221(6)
Appendix
223(1)
10.A Program 10.1
223(2)
10.B Program 10.2
225(2)
11 Evaluating Equality of Group Means with a One-Way Analysis of Variance
227(26)
11.1 Characteristic Features of a One-Way Analysis of Variance Model
227(1)
11.2 Fixed Effects vs. Random Effects in One-Way ANOVA
227(2)
11.3 Why Use a One-Way Fixed Effects Analysis of Variance?
229(1)
11.4 Task Study Example
229(1)
11.5 The Means Model for a One-Way Fixed Effects ANOVA
230(1)
11.6 Basic Concepts Underlying a One-Way Fixed Effects ANOVA
230(2)
11.7 Ideal Inference Conditions for One-Way Fixed Effects ANOVA
232(1)
11.8 Potential Consequences of Violating Ideal Inference Conditions for One-Way Fixed Effects ANOVA Test of Equality of Group Means
232(2)
11.9 Overview of General Approaches for Evaluating Ideal Inference Conditions for a One-Way Fixed Effects ANOVA
234(1)
11.10 Suggestions for Alternative Approaches When Ideal Inference Conditions are Not Reasonable
234(1)
11.11 Testing Equality of Group Variances
235(1)
11.12 Some Earlier Tests for Equality of Variances
235(1)
11.12.1 Bartlett's Test
235(1)
11.12.2 Hartley's Fmax Test
236(1)
11.13 ANOVA-Based Tests for Equality of Variances
236(2)
11.13.1 Levene's Approach for Testing Equality of Variances
236(1)
11.13.2 Brown's and Forsythe's Tests of Variances
237(1)
11.13.3 O'Brien's Test
237(1)
11.14 Simulation Studies on ANOVA-Based Equality of Variances Tests
238(1)
11.15 Additional Comments about ANOVA-Based Equality of Variances Tests
239(1)
11.16 Overview of a Step-by-Step Approach for Checking Ideal Inference Conditions
239(1)
11.17 Nine-Step Approach for Evaluating Ideal Inference Conditions: Task Study Example
240(3)
11.18 One-Way ANOVA Model F-Test of Means
243(2)
11.18.1 Task Study Example
243(1)
11.18.2 What is a Linear Contrast?
244(1)
11.18.3 Comments on Model F-Test in One-Way Fixed Effects ANOVA
244(1)
11.19 Testing Equality of Population Means When Population Variances Are Unequal
245(1)
11.19.1 Welch's Test
245(1)
11.19.2 Fitting Unequal Variance ANOVA Models
246(1)
11.19.3 Other Suggestions
246(1)
11.20 Chapter Summary
246(7)
Appendix
248(1)
11.A Program 11.1: Data Screening
248(1)
11.B Program 11.2
249(1)
11.C Explanation of Statements in Program 11.2
250(3)
12 Multiple Testing and Simultaneous Confidence Intervals
253(34)
12.1 Multiple Testing and the Multiplicity Problem
253(1)
12.2 Measures of Error Rates
253(1)
12.3 Overview of Multiple Testing Procedures
254(2)
12.4 The Least Significant Difference: A Multiplicity-Unadjusted Procedure
256(2)
12.4.1 Introduction
256(1)
12.4.2 Ideal Inference Conditions
256(1)
12.4.3 Fisher's LSD
257(1)
12.4.4 SAS and the LSD Procedure
257(1)
12.5 Examples of Familywise Error Rate Controlling Procedures
258(1)
12.6 The Bonferroni Method
259(3)
12.6.1 Introduction
259(1)
12.6.2 Example 12.1: An Application of the Bonferroni Method
260(1)
12.6.3 Why the Bonferroni Method Can Be Conservative
261(1)
12.6.4 The Bonferroni Method and All Possible Pairwise Comparisons of Group Means
261(1)
12.7 The Tukey-Kramer Method for All Pairwise Comparisons
262(2)
12.7.1 Introduction
262(1)
12.7.2 SAS and the Tukey-Kramer Method
263(1)
12.8 A Simulation-Based Method for All Pairwise Comparisons
264(1)
12.8.1 Introduction
264(1)
12.8.2 SAS and Simulation-Based Adjusted p-Value Estimates
264(1)
12.8.3 Task Study Example of Simulation-Based Adjusted p-Values Estimates
265(1)
12.9 Dunnett's Method for "Treatment" vs. "Control" Comparisons
265(2)
12.9.1 Introduction
265(1)
12.9.2 SAS and Dunnett's Test
266(1)
12.10 Scheffe's Method for "Data Snooping"
267(2)
12.10.1 Introduction
267(1)
12.10.2 Application of Scheffe's Method
268(1)
12.10.3 SAS and Scheffe's method
268(1)
12.11 Ordinary Confidence Intervals and the Multiplicity Issue
269(1)
12.11.1 Introduction
269(1)
12.11.2 Why Does the Overall Confidence Level Decrease as the Number of Ordinary Confidence Intervals Increases?
269(1)
12.12 Controlling Familywise Error Rate for Confidence Intervals
270(3)
12.12.1 Introduction
270(1)
12.12.2 Task Study Example
271(1)
12.12.3 SAS and Controlling Familywise Error Rate for Confidence Intervals
272(1)
12.13 Confidence Bands for Simple Linear Regression
273(5)
12.13.1 The Working-Hotelling Method
274(1)
12.13.2 SAS and Working-Hotelling's Confidence Band
274(1)
12.13.3 A Discrete Simulation-Based Method
275(3)
12.14 Single Step vs. Sequential Multiplicity-Adjusted Procedures
278(1)
12.15 The Holm-Bonferroni Sequential Procedure
278(3)
12.15.1 Introduction
278(1)
12.15.2 How to Compute Holm-Bonferroni Multiplicity-Adjusted p-Values
279(1)
12.15.3 SAS and the Holm-Bonferroni Method
280(1)
12.15.4 Familywise Adjusted p-Values for Partial t-Tests from Program 12.2
281(1)
12.16 Adjusting for Multiplicity Using Resampling Methods
281(1)
12.17 Benjamini-Hochberg's False Discovery Rate Method
281(3)
12.17.1 Introduction
281(1)
12.17.2 Benjamini-Hochberg's Adjusted p-Values
282(1)
12.17.3 SAS and Benjamini-Hochberg's FDR-Controlling Method
283(1)
12.17.4 Interpreting Results from an FDR-Controlling Procedure
284(1)
12.18 Recent Advances in FDR-Controlling Procedures
284(1)
12.19 Chapter Summary
285(2)
13 Analysis of Covariance: Adjusting Group Means for Nuisance Variables Using Regression
287(64)
13.1 Introduction
287(1)
13.2 Characteristic Features of a One-Way Analysis of Covariance Linear Model
287(1)
13.3 Why Apply an Analysis of Covariance?
288(1)
13.4 Example 13.1: Exercise Programs and Heart Rate
288(1)
13.5 General Equation for a One-Way Analysis of Covariance with a Single Continuous Covariate
289(1)
13.6 Two Critical Decisions in an Analysis of Covariance
290(4)
13.7 Implementing a One-Way ANCOVA: A Step-by-Step Approach
294(3)
13.8 An Example of Implementing an Analysis of Covariance
297(11)
13.9 What If an Equal Slopes Analysis Had Been Applied to the Exercise Program Data?
308(2)
13.9.1 What Is Going On?
309(1)
13.10 What If a One-Way ANOVA Had Been Applied to the Exercise Program Data?
310(2)
13.11 Example 13.2: Effect of Study Methods on Exam Scores Adjusted for Pretest Scores
312(1)
13.12 A Step-by-Step Analysis of Covariance for Example 13.2
312(9)
13.13 References for Other Approaches for Covariate Adjustment
321(1)
13.14 Chapter Summary
321(30)
Appendix
323(1)
13.A Details of a Nine-Step Evaluation of Ideal Inference Conditions for Simple Linear Regressions in Exercise Programs A and B in Example 13.1
323(6)
13.B Details of Evaluation of Ideal Inference Conditions for Simple Linear Regressions of Postscore vs. Prescore in Example 13.2
329(6)
13.C Program 13.1
335(1)
13.D Program 13.2
336(2)
13.E Program 13.3A
338(1)
13.F Program 13.3B
339(1)
13.G Program 13.4
340(2)
13.H Program 13.5
342(1)
13.I Program 13.6
343(1)
13.J Program 13.7
344(1)
13.K Summary of SAS Programs for Example 13.2
345(1)
13.L Program 13.8
345(1)
13.M Program 13.9
346(2)
13.N Program 13.10A
348(1)
13.O Program 13.10B
349(1)
13.P Program 13.11
349(1)
13.Q Program 13.12
350(1)
14 Alternative Approaches If Ideal Inference Conditions Are Not Satisfied
351(38)
14.1 Introduction
351(1)
14.2 When Random Sampling Is Not an Option
351(1)
14.3 When Errors Are Not All Independent
352(2)
14.3.1 Introduction
352(1)
14.3.2 Fitting a Model for Correlated Data
353(1)
14.4 Transformations
354(5)
14.4.1 Advantages and Disadvantages of a Data Transformation Approach
354(1)
14.4.2 Power Transformations
355(1)
14.4.3 Applying Power Transformations in SAS
356(1)
14.4.4 Log Transformations
357(2)
14.5 Alternative Approaches When Linearity Is Not Satisfied
359(8)
14.5.1 Introduction
359(1)
14.5.2 Adding One or More Predictor Variable(s) to the Model to Achieve Linearity
359(1)
14.5.3 Transformations and the Linearity Assumption
359(5)
14.5.4 Overview of Rank Regression
364(1)
14.5.5 Polynomial Regression and Nonlinearity
365(2)
14.5.6 Overview of Fitting a Nonlinear Model
367(1)
14.6 Alternative Approaches When Variances Are Not All Equal
367(3)
14.6.1 Introduction
367(1)
14.6.2 Transforming Data to Achieve Equality of Variances
368(1)
14.6.3 Overview of Weighted Least Squares for Unequal Variances
369(1)
14.7 Alternative Approaches If Model Errors Are Not Normal
370(1)
14.7.1 Introduction
370(1)
14.7.2 Reducing Skewness with Power Transformations
370(1)
14.8 Robust Statistics
371(1)
14.9 Bootstrapping
371(2)
14.9.1 Case Resampling
372(1)
14.9.2 The Bootstrap Estimate of Standard Error
372(1)
14.9.3 The Bootstrap Percentile Confidence Interval
373(1)
14.10 When Is n Too Small and What Is an Adequate Number of Bootstrap Samples?
373(4)
14.10.1 Example of a Bootstrapping Application
374(3)
14.11 Alternative Approaches If Harmful Collinearity Is Detected
377(1)
14.12 Chapter Summary
378(11)
Appendix
380(1)
14.A Program 14.A
380(1)
14.B Program 14.B
380(2)
14.C Program 14.C
382(1)
14.D Program 14.D
383(1)
14.E Program 14.E
384(1)
14.E.1 Explanation of SAS Statements in Program 14.E
385(4)
References 389(12)
Index 401
Patricia F. Moodie is a Research Scholar in the Department of Mathematics and Statistics at the University of Winnipeg, Manitoba, Canada. Prior to that she was Head of Biostatistics in the Computer Department for Health Sciences in the College of Medicine, University of Manitoba, an adjunct lecturer in Biometry in the Department of Social and Preventive Medicine at the University of Manitoba, and a biostatistician in the Epidemiology and Biostatistics Department at the Manitoba Cancer Treatment and Research Foundation. Her statistical consulting and collaboration for over three decades as well as her substantive background in the biomedical sciences have made her appreciate the challenges in analyzing and interpreting real-life data. She received a BSc (Hons) in Biology at Memorial University of Newfoundland, an MSc in Zoology at the University of Alberta, and an MS in Biostatistics at the University of Illinois at Chicago. She has been an enthusiastic SAS user since 1980.

Dallas E. Johnson, Professor Emeritus in the Department of Statistics, Kansas State University, has published extensively in the areas of linear models, multiplicative interaction models, experimental design, and messy data analysis. He is the author of Applied Multivariate Methods for Data Analysts and co-author with George A. Milliken of the following books: Analysis of Messy Data, Vol. I - Designed Experiments, Vol. II - Nonreplicated Experiments, Vol. III - Analysis of Covariance, and Vol. I - Designed Experiments, 2nd Edition. An active presenter of short courses, and a statistical consultant for over 50 years, he was the recipient of ASA's award for Excellence in Statistical Consulting in 2010. He received his B.S. degree in Mathematics Education from Kearney State College, an M.A.T. degree in Mathematics from Colorado State University, an M.S. degree in Mathematics from Western Michigan University, and a Ph.D. degree in Statistics from Colorado State University. He has been a SAS user and mentor since 1976.