Preface    xi
Acknowledgments    xii

Part I Introduction    1 (88)

1    3 (11)
1.1 Machines that learn - some recent history    3 (4)
1.2 Twenty canonical questions    7 (2)
1.3    9 (2)
1.4 A comment about example datasets    11 (1)
1.5    12 (1)
1.6    13 (1)

2 The landscape of learning machines    14 (27)
2.1    14 (1)
2.2 Types of data for learning machines    15 (2)
2.3 Will that be supervised or unsupervised?    17 (1)
2.4 An unsupervised example    18 (2)
2.5 More lack of supervision - where are the parents?    20 (1)
2.6 Engines, complex and primitive    20 (2)
2.7 Model richness means what, exactly?    22 (3)
2.8 Membership or probability of membership?    25 (2)
2.9 A taxonomy of machines?    27 (3)
2.10 A note of caution - one of many    30 (1)
2.11 Highlights from the theory    30 (6)
2.12    36 (5)

3    41 (16)
3.1    41 (1)
3.2    41 (1)
3.3    42 (1)
3.4    43 (2)
3.5 Bayes classifiers - regular and naive    45 (2)
3.6    47 (1)
3.7    48 (2)
3.8 Support vector machines    50 (3)
3.9    53 (1)
3.10    54 (1)
3.11 Evolutionary and genetic algorithms    55 (1)
3.12    56 (1)

4 Three examples and several machines    57 (32)
4.1    57 (1)
4.2 Simulated cholesterol data    58 (3)
4.3    61 (1)
4.4    62 (1)
4.5 Biomedical means unbalanced    63 (1)
4.6 Measures of machine performance    64 (2)
4.7 Linear analysis of cholesterol data    66 (1)
4.8 Nonlinear analysis of cholesterol data    67 (3)
4.9 Analysis of the lupus data    70 (5)
4.10 Analysis of the stroke data    75 (4)
4.11 Further analysis of the lupus and stroke data    79 (8)
4.12    87 (2)

Part II A machine toolkit    89 (66)

5    91 (27)
5.1    91 (1)
5.2 Inside and around the model    92 (1)
5.3 Interpreting the coefficients    93 (1)
5.4 Using logistic regression as a decision rule    94 (1)
5.5 Logistic regression applied to the cholesterol data    94 (4)
5.6    98 (3)
5.7 Another cautionary note    101 (1)
5.8 Probability estimates and decision rules    102 (1)
5.9 Evaluating the goodness-of-fit of a logistic regression model    103 (3)
5.10 Calibrating a logistic regression    106 (5)
5.11    111 (2)
5.12 Logistic regression and reference models    113 (2)
5.13    115 (3)

6    118 (19)
6.1    118 (1)
6.2    118 (2)
6.3    120 (1)
6.4 Selecting features, making splits    120 (1)
6.5 Good split, bad split    121 (3)
6.6 Finding good features for making splits    124 (1)
6.7    125 (2)
6.8 Stopping and pruning rules    127 (1)
6.9 Using functions of the features    128 (1)
6.10    129 (3)
6.11 Variable importance - growing on trees?    132 (2)
6.12 Permuting for importance    134 (1)
6.13 The continuing mystery of trees    135 (2)

7 Random Forests - trees everywhere    137 (18)
7.1 Random Forests in less than five minutes    137 (1)
7.2 Random treks through the data    138 (1)
7.3 Random treks through the features    139 (1)
7.4 Walking through the forest    140 (1)
7.5 Weighted and unweighted voting    140 (2)
7.6 Finding subsets in the data using proximities    142 (2)
7.7 Applying Random Forests to the Stroke data    144 (7)
7.8 Random Forests in the universe of machines    151 (2)
7.9    153 (2)

Part III Analysis fundamentals    155 (90)

8    157 (14)
8.1    157 (1)
8.2 Understanding correlations    158 (1)
8.3 Hazards of correlations    159 (4)
8.4 Correlations big and small    163 (5)
8.5    168 (3)

9 More than two variables    171 (27)
9.1    171 (1)
9.2 Tiny problems, large consequences    172 (2)
9.3 Mathematics to the rescue?    174 (2)
9.4 Good models need not be unique    176 (3)
9.5 Contexts and coefficients    179 (2)
9.6 Interpreting and testing coefficients in models    181 (5)
9.7 Merging models, pooling lists, ranking features    186 (4)
9.8    190 (8)

10    198 (17)
10.1    198 (1)
10.2    198 (3)
10.3 When the bootstrap works    201 (1)
10.4 When the bootstrap doesn't work    202 (1)
10.5 Resampling from a single group in different ways    203 (1)
10.6 Resampling from groups with unequal sizes    204 (2)
10.7 Resampling from small datasets    206 (1)
10.8    207 (3)
10.9 Still more on permutation methods    210 (4)
10.10    214 (1)

11 Error analysis and model validation    215 (30)
11.1    215 (2)
11.2 Errors? What errors?    217 (1)
11.3 Unbalanced data, unbalanced errors    218 (1)
11.4 Error analysis for a single machine    219 (3)
11.5 Cross-validation error estimation    222 (2)
11.6 Cross-validation or cross-training?    224 (2)
11.7 The leave-one-out method    226 (1)
11.8 The out-of-bag method    227 (1)
11.9 Intervals for error estimates for a single machine    228 (2)
11.10 Tossing random coins into the abyss    230 (2)
11.11 Error estimates for unbalanced data    232 (1)
11.12 Confidence intervals for comparing error values    233 (3)
11.13 Other measures of machine accuracy    236 (2)
11.14 Benchmarking and winning the lottery    238 (1)
11.15 Error analysis for predicting continuous outcomes    239 (1)
11.16    240 (5)

Part IV Machine strategies    245 (18)

12 Ensemble methods - let's take a vote    247 (8)
12.1    247 (1)
12.2 Weak correlation with outcome can be good enough    247 (3)
12.3    250 (4)
12.4    254 (1)

13 Summary and conclusions    255 (8)
13.1    255 (2)
13.2    257 (2)
13.3 Binary decision or probability estimate?    259 (1)
13.4 Survival machines? Risk machines?    259 (1)
13.5 And where are we going?    260 (3)

Appendix    263 (8)
References    271 (10)
Index    281