
Machine Learning from Weak Supervision: An Empirical Risk Minimization Approach [Hardcover]

Fundamental theory and practical algorithms of weakly supervised classification, emphasizing an approach based on empirical risk minimization.
Standard machine learning techniques require large amounts of labeled data to work well. When we apply machine learning to problems in the physical world, however, it is often prohibitively difficult to collect labeled data at such scale. In this book, Masashi Sugiyama, Han Bao, Takashi Ishida, Nan Lu, Tomoya Sakai, and Gang Niu present theory and algorithms for weakly supervised learning, a paradigm of machine learning from weakly labeled data. Emphasizing an approach based on empirical risk minimization and drawing on state-of-the-art research in weakly supervised learning, the book provides both the fundamentals of the field and the advanced mathematical theories underlying them. It can be used as a reference for practitioners and researchers and in the classroom.

The book first mathematically formulates classification problems, defines common notation, and reviews algorithms for supervised binary and multi-class classification. It then explores problems of binary weakly supervised classification, including positive-unlabeled (PU) classification, positive-negative-unlabeled (PNU) classification, and unlabeled-unlabeled (UU) classification. Turning to multi-class classification, it discusses complementary-label (CL) classification and partial-label (PL) classification. Finally, the book addresses more advanced issues, including a family of correction methods to improve the generalization performance of weakly supervised learning and the problem of class-prior estimation.
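To make the empirical-risk-minimization viewpoint concrete, here is a minimal sketch of the risk rewriting behind PU classification (Chapter 4) and its non-negative correction (Chapter 11), in notation assumed for this illustration: \(\pi_+\) is the positive class prior, \(\ell\) a surrogate loss, \(f\) the classifier, and \(p_+\) and \(p\) the positive and unlabeled marginal densities. Because \(p(x) = \pi_+ p_+(x) + (1 - \pi_+)\, p_-(x)\), the classification risk can be expressed without any negative data:

\[
R(f) = \pi_+\,\mathbb{E}_{x \sim p_+}\bigl[\ell(f(x), +1)\bigr]
     + \mathbb{E}_{x \sim p}\bigl[\ell(f(x), -1)\bigr]
     - \pi_+\,\mathbb{E}_{x \sim p_+}\bigl[\ell(f(x), -1)\bigr].
\]

Replacing the expectations with sample averages over the positive set \(P\) and the unlabeled set \(U\) yields an unbiased risk estimator computable from PU data alone. The non-negative correction clamps the implicit negative-class term at zero, counteracting the overfitting of unbiased objectives that the book examines in Chapter 11:

\[
\widetilde{R}(f) = \pi_+\,\widehat{\mathbb{E}}_{P}\bigl[\ell(f(x), +1)\bigr]
 + \max\Bigl\{0,\; \widehat{\mathbb{E}}_{U}\bigl[\ell(f(x), -1)\bigr] - \pi_+\,\widehat{\mathbb{E}}_{P}\bigl[\ell(f(x), -1)\bigr]\Bigr\}.
\]

This is a sketch of the general idea from the PU-learning literature, not the book's exact statement.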
Contents

Preface
I MACHINE LEARNING FROM WEAK SUPERVISION
1 Introduction
1.1 Machine Learning
1.1.1 Supervised Learning
1.1.2 Unsupervised Learning
1.1.3 Reinforcement Learning
1.2 Elements of Classification
1.2.1 Classifiers
1.2.2 Learning Criteria
1.2.3 Optimization Algorithms
1.3 Aspects of Machine Learning
1.3.1 Logical Learning, Biologically Inspired Learning, and Statistical Learning
1.3.2 Frequentist Learning and Bayesian Learning
1.3.3 Generative Classification and Discriminative Classification
1.3.4 Induction, Deduction, and Transduction
1.4 Improving Data Collection and Weakly Supervised Learning
1.5 Organization of This Book
1.5.1 Weakly Supervised Learning for Binary Classification
1.5.2 Weakly Supervised Learning for Multi-Class Classification
1.5.3 Advanced Topics and Perspectives
2 Formulation and Notation
2.1 Binary Classification
2.1.1 Formulation
2.1.2 Classification Models
2.1.2.1 Linear-in-input model
2.1.2.2 Linear-in-parameter model
2.1.2.3 Kernel model
2.1.2.4 Neural network model
2.1.3 Surrogate Losses
2.1.4 Training Samples
2.1.5 Regularization
2.2 Multi-Class Classification
2.2.1 Formulation
2.2.2 Surrogate Losses
2.2.3 Training Samples
3 Supervised Classification
3.1 Positive-Negative (PN) Classification
3.1.1 Formulation
3.1.1.1 One-sample case
3.1.1.2 Two-sample case
3.1.1.3 Comparison
3.1.2 Theoretical Analysis
3.1.2.1 Targets of convergence
3.1.2.2 Measures of convergence
3.1.2.3 Rademacher complexity
3.1.2.4 Rademacher complexity bounds
3.1.2.5 Estimation error bounds
3.2 Multi-Class Classification
3.2.1 Formulation
3.2.2 Theoretical Analysis
3.2.2.1 Estimation error bounds
3.2.2.2 Classification calibration
II WEAKLY SUPERVISED LEARNING FOR BINARY CLASSIFICATION
4 Positive-Unlabeled (PU) Classification
4.1 Introduction
4.2 Formulation
4.3 Unbiased Risk Estimation from PU Data
4.3.1 General Approach
4.3.2 Cost-Sensitive Approach
4.3.3 Convex Approach
4.4 Theoretical Analysis
4.4.1 PU Classification
4.4.2 NU Classification
4.4.3 Comparisons with PN Classification
4.4.3.1 Finite-sample comparisons
4.4.3.2 Asymptotic comparisons
5 Positive-Negative-Unlabeled (PNU) Classification
5.1 Introduction
5.2 Formulation
5.3 Manifold-Based Semi-Supervised Classification
5.3.1 Laplacian Regularization
5.3.2 Implementation
5.4 Information-Theoretic Semi-Supervised Classification
5.4.1 Squared-Loss Mutual Information Regularization
5.4.2 Implementation
5.5 PU+PN Classification
5.5.1 PNU and PU+NU Risk Estimators
5.5.2 PNU vs. PU+NU Classification
5.5.3 Theoretical Analysis
5.5.3.1 Estimation error bounds
5.5.3.2 Variance reduction
5.6 Experiments
5.6.1 Datasets
5.6.2 PNU Risk for Validation
5.6.3 Comparison with Other Methods
5.7 Extensions
5.7.1 Multi-Class Extension
5.7.2 AUC Maximization
5.7.3 Matrix Imputation
6 Positive-Confidence (Pconf) Classification
6.1 Introduction
6.2 Related Works
6.3 Problem Formulation
6.4 Empirical Risk Minimization (ERM) Framework
6.5 Theoretical Analysis
6.6 Implementation
6.7 Experiments
6.7.1 Synthetic Experiments with Linear Models
6.7.2 Benchmark Experiments with Neural Network Models
7 Pairwise-Constraint Classification
7.1 Introduction
7.2 Formulation
7.2.1 One-Sample Case
7.2.2 Two-Sample Case
7.2.3 Comparison of Sampling Schemes
7.2.4 Pairwise Constraints as Pointwise Data
7.3 Similar-Unlabeled (SU) Classification
7.3.1 Classification Risk Estimation
7.3.2 Minimum-Variance Risk Estimation
7.3.3 Convex Formulation
7.3.4 Class-Priors in SU Classification
7.4 Similar-Dissimilar (SD) and Dissimilar-Unlabeled (DU) Classification
7.4.1 Classification Risk Estimation
7.4.2 Interpretation of SD Risk
7.5 Similar-Dissimilar-Unlabeled (SDU) Classification
7.6 Theoretical Analysis
7.6.1 Derivation of Estimation Error Bounds
7.6.2 Comparison of Estimation Error Bounds
7.7 Experiments
7.7.1 Setup
7.7.2 Illustration of SU Classification
7.7.3 Comparison of SU Classification with Other Methods
7.7.4 Comparison of SDU Classification with Other Methods
7.8 Ongoing Research
8 Unlabeled-Unlabeled (UU) Classification
8.1 Introduction
8.2 Problem Formulation
8.2.1 Data Generation Process
8.2.2 Performance Measures
8.2.3 Relation to Classification with Noisy Labels
8.3 Risk Estimation from UU Data
8.3.1 Risk Estimation from One Set of U Data
8.3.2 Risk Estimation from Two Sets of U Data
8.3.2.1 Risk estimation
8.3.2.2 Simplification
8.3.2.3 Special cases
8.3.3 Theoretical Analysis
8.3.4 Experiments
8.3.4.1 Setup
8.3.4.2 Benchmark experiments with neural network models
8.3.4.3 Comparison with other methods
8.4 Generative Approach
8.4.1 Analysis of Bayes-Optimal Classifier
8.4.2 KDE-Based Algorithm
8.4.3 LSDD-Based Algorithm
8.4.4 DSDD-Based Algorithm
8.4.5 Experiments
III WEAKLY SUPERVISED LEARNING FOR MULTI-CLASS CLASSIFICATION
9 Complementary-Label Classification
9.1 Introduction
9.2 Risk Estimation from CL Data
9.2.1 Formulation
9.2.2 Risk Estimation
9.2.3 Case-Study for Symmetric Losses
9.2.4 Relation to Classification with Noisy Labels
9.3 Theoretical Analysis
9.4 Incorporation of Ordinary-Labels
9.5 Experiments
9.5.1 Experiments with CL
9.5.2 Experiments with CL and OL
9.6 Incorporation of Multi-Complementary-Labels
9.6.1 Formulation
9.6.2 Comparison with Multiple Single CLs
9.6.3 Unbiased Risk Estimator
9.6.4 Estimation Error Bound
10 Partial-Label Classification
10.1 Introduction
10.2 Formulation and Assumptions
10.2.1 Formulation
10.2.2 Data Generation Assumption
10.3 Risk Estimation
10.4 Experiments
10.5 Proper Partial-Label (PPL) Classification
10.5.1 Data Generation Assumption
10.5.2 Risk Estimation
10.5.3 Theoretical Analysis
IV ADVANCED TOPICS AND PERSPECTIVES
11 Non-Negative Correction for Weakly Supervised Classification
11.1 Introduction
11.2 Overfitting of Unbiased Learning Objectives
11.2.1 Binary Classification
11.2.2 Multi-Class Classification
11.3 Numerical Illustration
11.4 Non-Negative Correction
11.4.1 NNPU Classification
11.4.2 NNPNU Classification
11.4.3 NNUU Classification
11.4.4 NNCL Classification
11.4.5 CCUU Classification
11.5 Theoretical Analyses
11.5.1 Bias and Consistency
11.5.2 Estimation Error
11.6 Experiments
11.6.1 Comparison of PN, uPU, and nnPU Classification
11.6.2 Comparison of uCL and nnCL Classification
11.6.3 Comparison of uUU and ccUU Classification
12 Class-Prior Estimation
12.1 Introduction
12.2 Full Distribution Matching
12.3 Mixture Proportion Estimation
12.3.1 Estimation Goal and Optimization Goal
12.3.2 Redefinition of Optimization Goal
12.3.3 Irreducibility Assumption
12.3.4 Anchor Set/Point Assumption
12.3.5 Remarks
12.4 Partial Distribution Matching
12.4.1 Formulation
12.4.2 Differentiable Divergences
12.4.3 Non-Differentiable Divergences
12.4.4 Empirical f-Divergence Estimation
12.5 Penalized L1-Distance Minimization
12.5.1 Penalized L1-Distance
12.5.2 Practical Implementation
12.5.3 Theoretical Analysis
12.5.3.1 Realizability assumption
12.5.3.2 Summary of main results
12.5.3.3 Proofs of main results
12.5.3.4 On the convergence rate of np
12.6 Class-Prior Estimation with Regrouping
12.6.1 Motivation
12.6.2 Practical Implementation
12.6.3 Theoretical Justification
12.6.3.1 A formal definition of regrouping
12.6.3.2 Bias reduction
12.6.3.3 Convergence analysis
12.6.3.4 Computationally efficient identification of A*
12.6.3.5 Approximation of p'_P with a surrogate
12.7 Class-Prior Estimation from Pairwise Data
13 Conclusions and Prospects
Notes
Bibliography
Index