Multiple Testing Procedures with Applications to Genomics 2008 ed. [Hardback]

Sandrine Dudoit, Mark J. van der Laan

Format: Hardback, XXXIII, 590 pages, weight: 1100 g
Series: Springer Series in Statistics
Publication date: 19-Dec-2007
Publisher: Springer-Verlag New York Inc.
ISBN-10: 0387493166
ISBN-13: 9780387493169

Other books on the subject:

Genetics (non-medical) - (Currently in store: 3 titles)

Hardback
Price: 187,67 €*
* the price is final, i.e. no further discounts apply
Regular price: 220,79 €
You save 15%
Delivery of the book from the publisher takes approximately 2-4 weeks
Free delivery

Permalink: https://www.kriso.ee/db/9780387493169.html

This book establishes the theoretical foundations of a general methodology for multiple hypothesis testing and discusses its software implementation in R and SAS. The methods are applied to a range of testing problems in biomedical and genomic research, including the identification of differentially expressed and co-expressed genes in high-throughput gene expression experiments, such as microarray experiments; tests of association between gene expression measures and biological annotation metadata (e.g., Gene Ontology); sequence analysis; and the genetic mapping of complex traits using single nucleotide polymorphisms. The book is aimed at both statisticians interested in multiple testing theory and applied scientists encountering high-dimensional testing problems in their subject matter area. Specifically, the book proposes resampling-based single-step and stepwise multiple testing procedures for controlling a broad class of Type I error rates, defined as tail probabilities and expected values for arbitrary functions of the numbers of Type I errors and rejected hypotheses (e.g., false discovery rate). Unlike existing approaches, the procedures are based on a test statistics joint null distribution and provide Type I error control in testing problems involving general data generating distributions (with arbitrary dependence structures among variables), null hypotheses, and test statistics. The multiple testing results are reported in terms of rejection regions, parameter confidence regions, and adjusted p-values.
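
As a rough illustration of the adjusted p-values mentioned above, the following minimal Python sketch computes single-step maxT adjusted p-values from joint draws of the test statistics under a null distribution. It is not the book's R or SAS code: the function name, the `null_stats` input, and the toy numbers are assumptions made for illustration, and the sketch takes the null draws as given rather than constructing them by resampling.

```python
from typing import List, Sequence


def single_step_maxt_adjusted_pvalues(
    observed: Sequence[float],
    null_stats: Sequence[Sequence[float]],
) -> List[float]:
    """Single-step maxT adjusted p-values (two-sided, via absolute values).

    observed   : one observed test statistic per hypothesis.
    null_stats : B rows, each a joint draw of the M test statistics from an
                 estimated null distribution (hypothetical input; how the
                 draws are produced is outside this sketch).
    """
    if not null_stats:
        raise ValueError("need at least one null draw")
    if any(len(row) != len(observed) for row in null_stats):
        raise ValueError("each null draw must have one value per hypothesis")

    # Maximum absolute statistic within each joint null draw.
    null_maxima = [max(abs(z) for z in row) for row in null_stats]

    # Adjusted p-value: share of draws whose maximum reaches |t_m|.
    n_draws = len(null_maxima)
    return [
        sum(1 for mx in null_maxima if mx >= abs(t)) / n_draws
        for t in observed
    ]


# Toy example with made-up numbers: 3 hypotheses, 4 null draws.
observed = [3.0, 1.0, -2.2]
null_stats = [
    [0.5, -1.2, 0.3],
    [2.5, 0.1, -0.4],
    [-0.7, 0.9, 1.1],
    [0.2, -0.3, -3.1],
]
adjusted = single_step_maxt_adjusted_pvalues(observed, null_stats)
print(adjusted)
```

For the toy input this prints `[0.25, 1.0, 0.5]`. Because each row of `null_stats` is a joint draw, taking the row-wise maximum carries the dependence among the test statistics into the adjusted p-values, which is the point of working with a joint rather than a marginal null distribution.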

This book provides a detailed account of the theoretical foundations of proposed multiple testing methods and illustrates their application to a range of testing problems in genomics.

Reviews

From the reviews:

"This book summarizes the recent work of Sandrine Dudoit and Mark van der Laan on multiple testing. It proposes a general framework for multiple testing procedures (MTPs) and introduces new concepts ... The authors also provide code for reproducing the results of some of the applications. ... if one is looking for a detailed summary of the latest developments in multiple testing regarding MTPs or in the application of MTPs to biomedical and genomic data, then this book is an excellent reference." (Holger Schwender, Statistical Papers, Vol. 50, 2009)

"In the last decade a growing amount of statistical research has been devoted to multiple testing. This book summarizes the recent work on this area. ... very useful for the applied researcher who would like to understand how to apply multiple testing. ... a good reference for statisticians interested in a general treatment of multiple testing." (Avner Bar-Hen, Mathematical Reviews, Issue 2009 j)

Preface

List of Figures

List of Tables

1 Multiple Hypothesis Testing

1.1 Introduction

1.1.1 Motivation

1.1.2 Bibliography for proposed multiple testing methodology

1.1.3 Overview of applications to biomedical and genomic research

1.1.4 Road map

1.2 Multiple hypothesis testing framework

1.2.1 Overview

1.2.2 Data generating distribution

1.2.3 Parameters

1.2.4 Null and alternative hypotheses

1.2.5 Test statistics

1.2.6 Multiple testing procedures

1.2.7 Rejection regions

1.2.8 Errors in multiple hypothesis testing: Type I, Type II, and Type III errors

1.2.9 Type I error rates

1.2.10 Power

1.2.11 Type I error rates and power: Comparisons and examples

1.2.12 Unadjusted and adjusted p-values

1.2.13 Stepwise multiple testing procedures

2 Test Statistics Null Distribution

2.1 Introduction

2.1.1 Motivation

2.1.2 Outline

2.2 Type I error control and choice of a test statistics null distribution

2.2.1 Type I error control

2.2.2 Sketch of proposed approach to Type I error control

2.2.3 Characterization of test statistics null distribution in terms of null domination conditions

2.2.4 Contrast with other approaches

2.3 Null shift and scale-transformed test statistics null distribution

2.3.1 Explicit construction for the test statistics null distribution

2.3.2 Bootstrap estimation of the test statistics null distribution

2.4 Null quantile-transformed test statistics null distribution

2.4.1 Explicit construction for the test statistics null distribution

2.4.2 Bootstrap estimation of the test statistics null distribution

2.4.3 Comparison of null shift and scale-transformed and null quantile-transformed null distributions

2.5 Null distribution for transformations of the test statistics

2.5.1 Null distribution for transformed test statistics

2.5.2 Example: Absolute value transformation

2.5.3 Example: Null shift and scale and null quantile transformations

2.5.4 Bootstrap estimation of the null distribution for transformed test statistics

2.6 Testing single-parameter null hypotheses based on t-statistics

2.6.1 Set-up and assumptions

2.6.2 Test statistics null distribution

2.6.3 Estimation of the test statistics null distribution

2.6.4 Example: Tests for means

2.6.5 Example: Tests for correlation coefficients

2.6.6 Example: Tests for regression coefficients

2.7 Testing multiple-parameter null hypotheses based on F-statistics

2.7.1 Set-up and assumptions

2.7.2 Test statistics null distribution

2.7.3 Estimation of the test statistics null distribution

2.8 Weak and strong Type I error control and subset pivotality

2.8.1 Weak and strong control of a Type I error rate

2.8.2 Subset pivotality

2.9 Test statistics null distributions based on bootstrap and permutation data generating distributions

2.9.1 The two-sample test of means problem

2.9.2 Distribution of the test statistics under two different data generating distributions

2.9.3 Bootstrap and permutation test statistics null distributions

3 Overview of Multiple Testing Procedures

3.1 Introduction

3.1.1 Set-up

3.1.2 Type I error control and choice of a test statistics null distribution

3.1.3 Marginal multiple testing procedures

3.1.4 Joint multiple testing procedures

3.2 Multiple testing procedures for controlling the number of Type I errors: FWER

3.2.1 Controlling the number of Type I errors

3.2.2 FWER-controlling single-step procedures

3.2.3 FWER-controlling step-down procedures

3.2.4 FWER-controlling step-up procedures

3.3 Multiple testing procedures for controlling the number of Type I errors: gFWER

3.3.1 gFWER-controlling single-step and step-down Lehmann and Romano procedures

3.3.2 gFWER-controlling single-step common-cut-off and common-quantile procedures

3.3.3 gFWER-controlling augmentation multiple testing procedures

3.3.4 gFWER-controlling resampling-based empirical Bayes procedures

3.3.5 Other gFWER-controlling procedures

3.3.6 Comparison of gFWER-controlling procedures

3.4 Multiple testing procedures for controlling the proportion of Type I errors among the rejected hypotheses: FDR

3.4.1 Controlling the number vs. the proportion of Type I errors

3.4.2 FDR-controlling step-up Benjamini and Hochberg procedure

3.4.3 FDR-controlling step-up Benjamini and Yekutieli procedure

3.4.4 FDR-controlling resampling-based empirical Bayes procedures

3.4.5 Other FDR-controlling procedures

3.5 Multiple testing procedures for controlling the proportion of Type I errors among the rejected hypotheses: TPPFP

3.5.1 Controlling the expected value vs. tail probabilities for the proportion of Type I errors

3.5.2 TPPFP-controlling step-down Lehmann and Romano procedures

3.5.3 TPPFP-controlling augmentation multiple testing procedures

3.5.4 TPPFP-controlling resampling-based empirical Bayes procedures

3.5.5 Comparison of TPPFP-controlling procedures

4 Single-Step Multiple Testing Procedures for Controlling General Type I Error Rates, Θ(F_Vn)

4.1 Introduction

4.1.1 Motivation

4.1.2 Outline

4.2 Θ(F_Vn)-controlling single-step procedures

4.2.1 Single-step common-quantile procedure

4.2.2 Single-step common-cut-off procedure

4.2.3 Asymptotic control of Type I error rate and test statistics null distribution

4.2.4 Common-cut-off vs. common-quantile procedures

4.3 Adjusted p-values for Θ(F_Vn)-controlling single-step procedures

4.3.1 General Type I error rates, Θ(F_Vn)

4.3.2 Per-comparison error rate, PCER

4.3.3 Generalized family-wise error rate, gFWER

4.4 Θ(F_Vn)-controlling bootstrap-based single-step procedures

4.4.1 Asymptotic control of Type I error rate for single-step procedures based on consistent estimator of test statistics null distribution

4.4.2 Bootstrap-based single-step procedures

4.5 Θ(F_Vn)-controlling two-sided single-step procedures

4.5.1 Symmetric two-sided single-step common-quantile procedure

4.5.2 Symmetric two-sided single-step common-cut-off procedure

4.5.3 Asymptotic control of Type I error rate and test statistics null distribution

4.5.4 Bootstrap-based symmetric two-sided single-step procedures

4.6 Multiple hypothesis testing and confidence regions

4.6.1 Confidence regions for general Type I error rates, Θ(F_Vn)

4.6.2 Equivalence between Θ-specific single-step multiple testing procedures and confidence regions

4.6.3 Bootstrap-based confidence regions for general Type I error rates, Θ(F_Vn)

4.7 Optimal multiple testing procedures

5 Step-Down Multiple Testing Procedures for Controlling the Family-Wise Error Rate

5.1 Introduction

5.1.1 Motivation

5.1.2 Outline

5.2 FWER-controlling step-down common-cut-off procedure based on maxima of test statistics

5.2.1 Step-down maxT procedure

5.2.2 Asymptotic control of the FWER

5.2.3 Test statistics null distribution

5.2.4 Adjusted p-values

5.3 FWER-controlling step-down common-quantile procedure based on minima of unadjusted p-values

5.3.1 Step-down minP procedure

5.3.2 Asymptotic control of the FWER

5.3.3 Test statistics null distribution

5.3.4 Adjusted p-values

5.3.5 Comparison of joint step-down minP procedure to marginal step-down procedures

5.4 FWER-controlling step-up common-cut-off and common-quantile procedures

5.4.1 Candidate step-up maxT and minP procedures

5.4.2 Comparison of joint stepwise minP procedures to marginal stepwise Holm and Hochberg procedures

5.5 FWER-controlling bootstrap-based step-down procedures

5.5.1 Asymptotic control of FWER for step-down procedures based on consistent estimator of test statistics null distribution

5.5.2 Bootstrap-based step-down procedures

6 Augmentation Multiple Testing Procedures for Controlling Generalized Tail Probability Error Rates

6.1 Introduction

6.1.1 Motivation

6.1.2 Outline

6.1.3 Type I error rates

6.1.4 Augmentation multiple testing procedures

6.2 Augmentation multiple testing procedures for controlling the generalized family-wise error rate, gFWER(k) = Pr(Vn > k)

6.2.1 gFWER-controlling augmentation multiple testing procedures

6.2.2 Finite sample and asymptotic control of the gFWER

6.2.3 Adjusted p-values for gFWER-controlling augmentation multiple testing procedures

6.3 Augmentation multiple testing procedures for controlling the tail probability for the proportion of false positives, TPPFP(q) = Pr(Vn/Rn > q)

6.3.1 TPPFP-controlling augmentation multiple testing procedures

6.3.2 Finite sample and asymptotic control of the TPPFP

6.3.3 Adjusted p-values for TPPFP-controlling augmentation multiple testing procedures

6.4 TPPFP-based multiple testing procedures for controlling the false discovery rate, FDR = E[Vn/Rn]

6.4.1 FDR-controlling TPPFP-based multiple testing procedures

6.4.2 Adjusted p-values for FDR-controlling TPPFP-based multiple testing procedures

6.5 General results on augmentation multiple testing procedures

6.5.1 Augmentation multiple testing procedures for controlling the generalized tail probability error rate, gTP(q, g) = Pr(g(Vn, Rn) > q)

6.5.2 Adjusted p-values for general augmentation multiple testing procedures

6.5.3 gFWER-controlling augmentation multiple testing procedures

6.5.4 TPPFP-controlling augmentation multiple testing procedures

6.5.5 gTPPFP-controlling augmentation multiple testing procedures

6.6 gTP-based multiple testing procedures for controlling the generalized expected value, gEV(g) = E[g(Vn, Rn)]

6.6.1 gEV-controlling gTP-based multiple testing procedures

6.6.2 Adjusted p-values for gEV-controlling gTP-based multiple testing procedures

6.7 Initial FWER- and gFWER-controlling multiple testing procedures

6.8 Discussion

7 Resampling-Based Empirical Bayes Multiple Testing Procedures for Controlling Generalized Tail Probability Error Rates

7.1 Introduction

7.1.1 Motivation

7.1.2 Outline

7.2 gTP-controlling resampling-based empirical Bayes procedures

7.2.1 Notation

7.2.2 gTP control and optimal test statistic cut-offs

7.2.3 Overview of gTP-controlling resampling-based empirical Bayes procedures

7.2.4 Working model for distributions of null test statistics and guessed sets of true null hypotheses

7.2.5 gTP-controlling resampling-based empirical Bayes procedures

7.3 Adjusted p-values for gTP-controlling resampling-based empirical Bayes procedures

7.3.1 Adjusted p-values for common-cut-off procedure

7.3.2 Adjusted p-values for common-quantile procedure

7.4 Finite sample rationale for gTP control by resampling-based empirical Bayes procedures

7.4.1 Procedures based on constant guessed set of true null hypotheses and observed test statistics

7.4.2 Procedures based on constant guessed set of true null hypotheses and null test statistics

7.4.3 Procedures based on random guessed sets of true null hypotheses and null test statistics

7.5 Formal asymptotic gTP control results for resampling-based empirical Bayes procedures

7.5.1 Asymptotic control of gTP by resampling-based empirical Bayes Procedure 7.1

7.5.2 Assumptions for Theorem 7.2

7.5.3 Proof of Theorem 7.2

7.6 gTP-controlling resampling-based weighted empirical Bayes procedures

7.7 FDR-controlling empirical Bayes procedures

7.7.1 FDR-controlling empirical Bayes q-value-based procedures

7.7.2 Equivalence between empirical Bayes q-value-based procedure and frequentist step-up Benjamini and Hochberg procedure

7.8 Discussion

Color Plates

8 Simulation Studies: Assessment of Test Statistics Null Distributions

8.1 Introduction

8.1.1 Motivation

8.1.2 Outline

8.2 Bootstrap-based multiple testing procedures

8.2.1 Null shift and scale-transformed test statistics null distribution

8.2.2 Bootstrap estimation of the null shift and scale-transformed test statistics null distribution

8.2.3 Bootstrap-based single-step maxT procedure

8.3 Simulation Study 1: Tests for regression coefficients in linear models with dependent covariates and error terms

8.3.1 Simulation model

8.3.2 Multiple testing procedures

8.3.3 Simulation study design

8.3.4 Simulation study results

8.4 Simulation Study 2: Tests for correlation coefficients

8.4.1 Simulation model

8.4.2 Multiple testing procedures

8.4.3 Simulation study design

8.4.4 Simulation study results

9 Identification of Differentially Expressed and Co-Expressed Genes in High-Throughput Gene Expression Experiments

9.1 Introduction

9.2 Apolipoprotein AI experiment of Callow et al. (2000)

9.2.1 Apo AI dataset

9.2.2 Multiple testing procedures

9.2.3 Software implementation using the Bioconductor R package multtest

9.2.4 Results

9.3 Cancer microRNA study of Lu et al. (2005)

9.3.1 Cancer miRNA dataset

9.3.2 Multiple testing procedures

9.3.3 Results

10 Multiple Tests of Association with Biological Annotation Metadata

10.1 Introduction

10.1.1 Motivation

10.1.2 Contrast with other approaches

10.1.3 Outline

10.2 Statistical framework for multiple tests of association with biological annotation metadata

10.2.1 Gene-annotation profiles

10.2.2 Gene-parameter profiles

10.2.3 Association measures for gene-annotation and gene-parameter profiles

10.2.4 Multiple hypothesis testing

10.3 The Gene Ontology

10.3.1 Overview of the Gene Ontology

10.3.2 Overview of R and Bioconductor software for GO annotation metadata analysis

10.3.3 The annotation metadata package GO

10.3.4 Affymetrix chip-specific annotation metadata packages: The hgu95av2 package

10.3.5 Assembling a GO gene-annotation matrix

10.4 Tests of association between GO annotation and differential gene expression in ALL

10.4.1 Acute lymphoblastic leukemia study of Chiaretti et al. (2004)

10.4.2 Multiple hypothesis testing framework

10.4.3 Results

10.5 Discussion

11 HIV-1 Sequence Variation and Viral Replication Capacity

11.1 Introduction

11.2 HIV-1 dataset of Segal et al. (2004)

11.2.1 HIV-1 sequence variation and viral replication capacity

11.2.2 HIV-1 dataset

11.3 Multiple testing procedures

11.3.1 Multiple testing analysis, Part I

11.3.2 Multiple testing analysis, Part II

11.4 Software implementation in SAS

11.5 Results

11.5.1 Multiple testing analysis, Part I

11.5.2 Multiple testing analysis, Part II

11.5.3 Biological interpretation

11.6 Discussion

12 Genetic Mapping of Complex Human Traits Using Single Nucleotide Polymorphisms: The ObeLinks Project

12.1 Introduction

12.1.1 Motivation

12.1.2 Outline

12.2 The ObeLinks Project

12.2.1 ObeLinks dataset

12.2.2 Galois lattices

12.3 Multiple testing procedures

12.4 Results

12.4.1 Body mass index

12.4.2 Glucose metabolism

12.5 Discussion

13 Software Implementation

13.1 R package multtest

13.1.1 Introduction

13.1.2 Overview

13.1.3 MTP function for resampling-based multiple testing procedures

13.1.4 Numerical and graphical summaries of a multiple testing procedure

13.1.5 Software design

13.2 SAS macros

A Summary of Multiple Testing Procedures

B Miscellaneous Mathematical and Statistical Results

B.1 Probability inequalities

B.2 Convergence results

B.3 Properties of floor and ceiling functions

C SAS Code

References

Author Index

Subject Index
