Preface  xix
|
Chapter 1 Deep Neural Networks  1 (44)
1.1 Three Types of Neural Networks  1 (16)
1.1.1 Multilayer Feedforward Neural Networks  1 (1)
1.1.1.1 Architecture of Feedforward Neural Networks  1 (2)
1.1.1.2 Loss Function and Training Algorithms  3 (2)
1.1.2 Convolutional Neural Network  5 (1)
…  5 (2)
1.1.2.2 Nonlinearity (ReLU)  7 (1)
…  7 (1)
1.1.2.4 Fully Connected Layers  7 (1)
1.1.3 Recurrent Neural Networks  7 (1)
…  8 (4)
1.1.3.2 Gated Recurrent Units  12 (1)
1.1.3.3 Long Short-Term Memory (LSTM)  13 (1)
1.1.3.4 Applications of RNN to Modeling and Forecasting of Dynamic Systems  14 (1)
1.1.3.5 Recurrent State Space Models with Autonomous Adjusted Intervention Variable  15 (2)
1.2 Dynamic Approach to Deep Learning  17 (6)
1.2.1 Differential Equations for Neural Networks  17 (1)
1.2.2 Ordinary Differential Equations for ResNets  17 (1)
1.2.3 Ordinary Differential Equations for Reversible Neural Networks  18 (1)
1.2.3.1 Stability of Dynamic Systems  18 (1)
1.2.3.2 Second Method of Lyapunov  19 (2)
1.2.3.3 Lyapunov Exponent  21 (1)
1.2.3.4 Reversible ResNet  21 (1)
1.2.3.5 Residual Generative Adversarial Networks  22 (1)
1.2.3.6 Normalizing Flows  23 (1)
1.3 Optimal Control for Deep Learning  23 (7)
1.3.1 Mathematical Formulation of Optimal Control  23 (1)
1.3.2 Pontryagin's Maximum Principle  24 (1)
1.3.3 Optimal Control Approach to Parameter Estimation  25 (1)
1.3.4 Learning Nonlinear State Space Models  26 (1)
1.3.4.1 Joint Estimation of Parameters and Controls  26 (2)
1.3.4.2 Multiple Samples and Parameter Estimation  28 (1)
1.3.4.3 Optimal Control Problem  29 (1)
…  30 (1)
Appendix 1A Brief Introduction of Tensor Calculus  30 (7)
…  30 (5)
…  35 (2)
Appendix 1B Calculate Gradient of Cross-Entropy Loss Function  37 (1)
Appendix 1C Optimal Control and Pontryagin's Maximum Principle  38 (7)
…  38 (1)
1C2 Pontryagin's Maximum Principle  38 (1)
1C3 Calculus of Variations  39 (2)
1C4 Proof of Pontryagin's Maximum Principle  41 (2)
…  43 (2)
|
Chapter 2 Gaussian Processes and Learning Dynamics for Wide Neural Networks  45 (18)
…  45 (1)
2.2 Linear Models for Learning in Neural Networks  45 (3)
2.2.1 Notation and Mathematical Formulation of Dynamics of Parameter Estimation Process  45 (2)
2.2.2 Linearized Neural Networks  47 (1)
…  48 (4)
…  48 (1)
2.3.2 Gaussian Process Models  49 (2)
2.3.3 Gaussian Processes for Regression  51 (1)
2.3.3.1 Prediction with Noise-Free Observations  51 (1)
2.3.3.2 Prediction with Noisy Observations  51 (1)
2.4 Wide Neural Network as a Gaussian Process  52 (3)
2.4.1 Gaussian Process for Single-Layer Neural Networks  52 (1)
2.4.2 Gaussian Process for Multilayer Neural Networks  53 (2)
Appendix 2A Recursive Formula for NTK Calculation  55 (6)
Appendix 2B Analytic Formula for Parameter Estimation in the Linearized Neural Networks  61 (2)
…  61 (2)
|
Chapter 3 Deep Generative Models  63 (46)
3.1 Variational Inference  63 (13)
…  63 (1)
3.1.2 Variational Inference as Optimization  63 (1)
3.1.3 Variational Bound and Variational Objective  64 (1)
3.1.4 Mean-Field Variational Inference  65 (1)
3.1.4.1 A General Framework  65 (1)
3.1.4.2 Bayesian Mixture of Gaussians  65 (4)
3.1.4.3 Mean-Field Variational Inference with Exponential Family  69 (3)
3.1.5 Stochastic Variational Inference  72 (1)
3.1.5.1 Natural Gradient Descent  72 (2)
3.1.5.2 Revisit Variational Distribution for Exponential Family  74 (2)
3.2 Variational Autoencoder  76 (8)
…  76 (1)
3.2.2 Deep Latent Variable Models and Intractability of Likelihood Function  76 (1)
3.2.3 Approximate Techniques and Recognition Model  77 (1)
…  78 (1)
3.2.5 Optimization of the ELBO and Stochastic Gradient Method  79 (1)
3.2.6 Reparameterization Trick  79 (1)
3.2.7 Gradient of Expectation and Gradient of ELBO  80 (1)
3.2.8 Bernoulli Generative Model  80 (1)
3.2.9 Factorized Gaussian Encoder  81 (1)
3.2.10 Full Gaussian Encoder  82 (1)
3.2.11 Algorithms for Computing ELBO  82 (1)
3.2.12 Improve the Lower Bound  83 (1)
3.2.12.1 Importance Weighted Autoencoder  83 (1)
3.2.12.2 Connection between ELBO and KL Distance  83 (1)
3.3 Other Types of Variational Autoencoder  84 (13)
3.3.1 Convolutional Variational Autoencoder  84 (1)
…  84 (1)
…  85 (1)
…  85 (1)
3.3.2 Graph Convolutional Variational Autoencoder  85 (1)
3.3.2.1 Notation and Basic Concepts for Graph Autoencoder  86 (1)
3.3.2.2 Spectral-Based Convolutional Graph Neural Networks  86 (5)
3.3.2.3 Graph Convolutional Encoder  91 (1)
3.3.2.4 Graph Convolutional Decoder  92 (1)
…  92 (1)
3.3.2.6 A Typical Approach to Variational Graph Autoencoders  92 (2)
3.3.2.7 Directed Graph Variational Autoencoder  94 (2)
3.3.2.8 Graph VAE for Clustering  96 (1)
…  97 (1)
…  97 (1)
Appendix 3B Derivation of Algorithms for Variational Graph Autoencoders  97 (5)
3B1 Evidence Lower Bound  97 (1)
3B2 The Reparameterization Trick  98 (1)
3B3 Stochastic Gradient Variational Bayes (SGVB) Estimator  99 (1)
3B4 Neural Network Implementation  100 (2)
Appendix 3C Matrix Normal Distribution  102 (7)
3C1 Notations and Definitions  102 (2)
3C2 Properties of Matrix Normal Distribution  104 (2)
…  106 (3)
|
Chapter 4 Generative Adversarial Networks  109 (42)
…  109 (1)
4.2 Generative Adversarial Networks  109 (8)
4.2.1 Framework and Architecture of GAN  109 (1)
…  110 (1)
…  111 (1)
…  112 (1)
…  113 (1)
4.2.5.1 Different Distances  113 (1)
4.2.5.2 The Kantorovich-Rubinstein Duality  114 (2)
…  116 (1)
…  117 (21)
…  117 (1)
…  117 (1)
…  118 (1)
4.3.2 Adversarial Autoencoder and Bidirectional GAN  119 (1)
4.3.2.1 Adversarial Autoencoder (AAE)  119 (1)
4.3.2.2 Bidirectional GAN  119 (1)
4.3.2.3 Anomaly Detection by BiGAN  120 (1)
4.3.3 Graph Representation in GAN  121 (1)
4.3.3.1 Adversarially Regularized Graph Autoencoder  121 (5)
4.3.3.2 Cycle-Consistent Adversarial Networks  126 (1)
4.3.3.3 Conditional Variational Autoencoder and Conditional Generative Adversarial Networks  127 (4)
4.3.3.4 Integrated Conditional Graph Variational Adversarial Networks  131 (3)
4.3.4 Deep Convolutional Generative Adversarial Network  134 (1)
4.3.4.1 Architecture of DCGAN  134 (1)
4.3.4.2 Generator Network  135 (1)
4.3.4.3 Discriminator Network  136 (1)
…  136 (2)
4.4 Generative Implicit Networks for Causal Inference with Measured and Unmeasured Confounders  138 (13)
4.4.1 Generative Implicit Models  138 (1)
…  139 (1)
…  139 (1)
4.4.2.2 Loss Function for the Generative Implicit Models  140 (1)
4.4.3 Divergence Minimization  141 (4)
4.4.4 Lower Bound of the f-Divergence  145 (1)
4.4.4.1 Tighten Lower Bound of the f-Divergence  145 (1)
4.4.5 Representation for the Variational Function  146 (1)
4.4.6 Single-Step Gradient Method for Variational Divergence Minimization (VDM)  147 (1)
4.4.7 Random Vector Functional Link Network for Pearson χ² Divergence  147 (2)
…  149 (1)
…  149 (2)
|
Chapter 5 Deep Learning for Causal Inference  151 (58)
5.1 Functional Additive Models for Causal Inference  151 (11)
5.1.1 Correlation, Causation, and Do-Calculus  151 (1)
5.1.2 The Rules of Do-Calculus  152 (3)
5.1.3 Structural Equation Models and Additive Noise Models for Two Variables or Two Sets of Variables  155 (2)
5.1.4 VAE and ANMs for Causal Analysis  157 (1)
5.1.4.1 Evidence Lower Bound (ELBO) for ANM  157 (1)
5.1.4.2 Computation of the ELBO  158 (2)
5.1.5 Classifier Two-Sample Test for Causation  160 (1)
5.1.5.1 Procedures of the VCTEST (Figure 5.5)  161 (1)
5.2 Learning Structural Causal Models with Graph Neural Networks  162 (13)
5.2.1 A General Framework for Formulation of Causal Inference into Continuous Optimization  162 (1)
5.2.1.1 Score Function and New Acyclic Constraint  162 (2)
5.2.2 Parameter Estimation and Optimization  164 (1)
5.2.2.1 Transform the Equality-Constrained Optimization Problem into an Unconstrained Optimization Problem  164 (2)
5.2.2.2 Compact Representation for the Hessian Approximation Ek and Limited-Memory BFGS  166 (1)
5.2.3 VAE for Learning Structural Models and DAG among Observed Variables  167 (1)
5.2.3.1 Linear Structural Equation Model and Graph Neural Network Model  167 (1)
5.2.3.2 ELBO for Learning the Generative Model  167 (1)
5.2.3.3 Computation of ELBO  168 (2)
5.2.3.4 Optimization Formulation for Learning DAG  170 (2)
5.2.4 Loss Function and Acyclicity Constraint  172 (1)
5.2.4.1 OLS Loss Function  172 (1)
5.2.4.2 A New Characterization of Acyclicity  173 (2)
5.3 Latent Causal Structure  175 (4)
5.3.1 Latent Space and Latent Representation  175 (1)
5.3.2 Mapping Observed Variables to the Latent Space  175 (1)
…  176 (1)
5.3.2.2 Encoder and Decoder for Latent Causal Graph  176 (1)
5.3.3 ELBO for the Log-Likelihood log pθ(Y|X)  177 (1)
5.3.4 Computation of ELBO  178 (1)
…  178 (1)
…  178 (1)
5.3.4.3 Learning Latent Causal Graph  179 (1)
5.3.5 Optimization for Learning the Latent DAG  179 (1)
5.4 Causal Mediation Analysis  179 (4)
5.4.1 Basics of Mediation Analysis  180 (1)
5.4.1.1 Univariate Mediation Model  180 (1)
5.4.1.2 Multivariate Mediation Analysis  180 (1)
5.4.1.3 Cascade Unobserved Mediator Model  181 (1)
5.4.1.4 Unobserved Multivariate Mediation Model  181 (1)
5.4.2 VAE for Cascade Unobserved Mediator Model  181 (1)
5.4.2.1 ELBO for Cascade Mediator Model  181 (1)
5.4.2.2 Encoder and Decoder  182 (1)
…  183 (1)
…  183 (4)
5.5.1 Deep Latent Variable Models for Causal Inference under Unobserved Confounders  183 (1)
5.5.2 Treatment Effect Formulation for Causal Inference with Unobserved Confounder  184 (1)
…  184 (1)
…  185 (1)
…  185 (2)
5.6 Instrumental Variable Models  187 (7)
5.6.1 Simple Linear IV Regression and Mendelian Randomization  187 (2)
5.6.1.1 Two-Stage Least Squares Method  189 (1)
5.6.1.2 Assumptions of IV  190 (1)
5.6.2 IV and Deep Latent Variable Models  190 (1)
…  190 (2)
…  192 (1)
…  192 (1)
…  193 (1)
Appendix 5A Derive Evidence Lower Bound (ELBO) for ANM  194 (1)
Appendix 5B Approximation of Evidence Lower Bound (ELBO) for ANM  195 (1)
Appendix 5C Computation of KL Distance  195 (1)
Appendix 5D BFGS and Limited-Memory BFGS Updating Algorithm  196 (5)
Appendix 5E Nonsmooth Optimization Analysis  201 (1)
Appendix 5F Computation of ELBO for Learning SEMs  202 (7)
5F1 Evidence Lower Bound  202 (1)
5F2 The Reparameterization Trick  203 (1)
5F3 Stochastic Gradient Variational Bayes (SGVB) Estimator  203 (1)
5F4 Neural Network Implementation  204 (3)
…  207 (2)
|
Chapter 6 Causal Inference in Time Series  209 (38)
…  209 (1)
6.2 Four Concepts of Causality for Multiple Time Series  209 (2)
…  209 (1)
…  210 (1)
6.2.3 Intervention Causality  210 (1)
6.2.4 Structural Causality  211 (1)
6.3 Statistical Methods for Granger Causality Inference in Time Series  211 (25)
6.3.1 Bivariate Granger Causality Test  211 (1)
6.3.1.1 Bivariate Linear Granger Causality Test  211 (1)
6.3.1.2 Bivariate Nonlinear Causality Test  212 (2)
6.3.2 Multivariate Granger Causality Test  214 (1)
6.3.2.1 Multivariate Linear Granger Causality Test  214 (2)
6.3.3 Nonstationary Time Series Granger Causal Analysis  216 (1)
…  216 (10)
6.3.3.2 Multivariate Nonlinear Causality Test for Nonstationary Time Series  226 (4)
6.3.4 Granger Causal Networks  230 (1)
…  230 (1)
6.3.4.2 Architecture of Granger Causal Networks  230 (1)
6.3.4.3 Component-Wise Multilayer Perceptron (cMLP) for Inferring Granger Causal Networks  231 (1)
6.3.4.4 Component-Wise Recurrent Neural Networks (cRNNs) for Inferring Granger Causal Networks  232 (1)
6.3.4.5 Statistical Recurrent Units for Inferring Granger Causal Networks  233 (3)
6.4 Nonlinear Structural Equation Models for Causal Inference on Multivariate Time Series  236 (2)
…  238 (1)
Appendix 6A Test Statistic Tnng Asymptotically Follows a Normal Distribution  238 (2)
Appendix 6B HSIC-Based Tests for Independence between Two Stationary Multivariate Time Series  240 (7)
6B1 Reproducing Kernel Hilbert Space  240 (3)
…  243 (1)
6B3 Cross-Covariance Operator  244 (1)
6B4 The Hilbert-Schmidt Independence Criterion  245 (1)
…  246 (1)
|
Chapter 7 Deep Learning for Counterfactual Inference and Treatment Effect Estimation  247 (46)
…  247 (9)
7.1.1 Potential Outcome Framework and Counterfactual Causal Inference  247 (1)
7.1.2 Assumptions and Average Treatment Effect  248 (3)
7.1.3 Traditional Methods without Unobserved Confounders  251 (1)
7.1.3.1 Regression Adjustment  251 (1)
7.1.3.2 Propensity Score Methods  251 (1)
7.1.3.3 Doubly Robust Estimation (DRE) and G-Methods  252 (3)
7.1.3.4 Targeted Maximum Likelihood Estimator (TMLE)  255 (1)
7.2 Combine Deep Learning with Classical Treatment Effect Estimation Methods  256 (2)
7.2.1 Adaptive Learning for Treatment Effect Estimation  256 (1)
7.2.1.1 Problem Formulation  256 (1)
7.2.2 Architecture of Neural Networks  256 (1)
7.2.3 Targeted Regularization  257 (1)
7.3 Counterfactual Variational Autoencoder  258 (3)
…  258 (1)
7.3.2 Variational Autoencoders  259 (1)
…  259 (1)
…  259 (1)
7.3.3 Architecture of CFVAE  259 (1)
…  260 (1)
…  260 (1)
…  260 (1)
7.3.4.3 Computation of the KL Distance  260 (1)
7.3.4.4 Calculation of ELBO  261 (1)
7.4 Variational Autoencoder for Survival Analysis  261 (8)
…  261 (1)
7.4.2 Notations and Problem Formulation  262 (1)
7.4.3 Classical Survival Analysis Theory  262 (1)
7.4.4 Potential Outcome (Survival Time) and Censoring Time Distributions  263 (1)
7.4.5 VAE Causal Survival Analysis  264 (1)
7.4.5.1 Deep Latent Model  264 (1)
…  264 (1)
…  265 (1)
…  265 (1)
7.4.5.5 Computation of the KL Distance  265 (1)
7.4.5.6 Calculation of ELBO  266 (1)
…  266 (1)
7.4.6 VAE-Cox Model for Survival Analysis  267 (1)
…  267 (1)
7.4.6.2 Likelihood Estimation for the Cox Model  267 (1)
7.4.6.3 A Censored-Data Likelihood  268 (1)
7.4.6.4 Objective Function for VAE-Cox Model  269 (1)
7.5 Time Series Causal Survival Analysis  269 (3)
…  269 (1)
7.5.2 Multi-State Survival Models  269 (1)
7.5.2.1 Notations and Basic Concepts  269 (1)
7.5.3 Multi-State Survival Models  270 (1)
7.5.3.1 Transition Probabilities, the Kolmogorov Forward Equations and Likelihood Function  270 (1)
7.5.3.2 Likelihood Function with Interval Censoring  271 (1)
7.5.3.3 Neural Ordinary Differential Equations (NODE) for Multi-State Survival Models  271 (1)
7.6 Neural Ordinary Differential Equation Approach to Treatment Effect Estimation and Intervention Analysis  272 (6)
…  272 (1)
7.6.2 Latent NODE for Irregularly-Sampled Time Series  273 (1)
7.6.3 Augmented Counterfactual ODE for Effect Estimation of Time Series Interventions with Confounders  274 (1)
7.6.3.1 Potential Outcome Framework for Estimation of Effect of Time Series Interventions  275 (1)
7.6.3.2 Augmented Counterfactual Ordinary Differential Equations  275 (3)
7.7 Generative Adversarial Networks for Counterfactual and Treatment Effect Estimation  278 (9)
7.7.1 A General GAN Model for Estimation of ITE with Discrete Outcome and Any Type of Treatment  279 (1)
7.7.1.1 Potential Outcome Framework  279 (1)
7.7.1.2 Conditional GAN as a General Framework for Estimation of ITE  280 (2)
7.7.2 Adversarial Variational Autoencoder-Generative Adversarial Network (AVAE-GAN) for Estimation in the Presence of Unmeasured Confounders  282 (1)
7.7.2.1 Architecture of AVAE-GAN  283 (1)
7.7.2.2 VAE with Disentangled Latent Factors  283 (4)
…  287 (1)
Appendix 7A Derive Evidence Lower Bound  287 (1)
Appendix 7B Derivation of Kolmogorov Forward Equations  287 (1)
Appendix 7C Inverse Relationship of the Kolmogorov Backward Equation  288 (1)
Appendix 7D Introduction to Pontryagin's Maximum Principle  289 (1)
Appendix 7E Algorithm for ITE Block Optimization  290 (1)
Appendix 7F Algorithms for Implementing Stochastic Gradient Descent  291 (2)
…  291 (2)
|
Chapter 8 Reinforcement Learning and Causal Inference  293 (56)
…  293 (1)
8.2 Basic Reinforcement Learning Theory  293 (15)
8.2.1 Formalization of the Problem  293 (1)
8.2.1.1 Markov Decision Process and Notation  293 (1)
8.2.1.2 State-Value Function and Policy  294 (3)
8.2.1.3 Optimal Value Functions and Policies  297 (1)
8.2.1.4 Bellman Optimality Equation  298 (2)
8.2.2 Dynamic Programming  300 (1)
8.2.2.1 Policy Evaluation  300 (3)
8.2.2.2 Value Function and Policy Improvement  303 (2)
…  305 (1)
8.2.2.4 Monte Carlo Policy Evaluation  306 (1)
8.2.2.5 Temporal-Difference Learning  307 (1)
8.2.2.6 Comparisons: Dynamic Programming, Monte Carlo Methods, and Temporal Difference Methods  308 (1)
8.3 Approximate Function and Approximate Dynamic Programming  308 (6)
…  308 (1)
8.3.2 Linear Function Approximation  309 (1)
8.3.3 Neural Network Approximation  310 (2)
8.3.4 Value-Based Methods  312 (1)
…  312 (1)
…  313 (1)
8.4 Policy Gradient Methods  314 (10)
…  314 (1)
8.4.2 Policy Approximation  314 (3)
8.4.3 REINFORCE: Monte Carlo Policy Gradient  317 (1)
8.4.4 REINFORCE with Baseline  317 (1)
8.4.5 Actor-Critic Methods  318 (1)
8.4.6 n-Step Temporal Difference (TD)  319 (1)
8.4.6.1 n-Step Prediction  319 (1)
…  320 (2)
8.4.8 Sarsa and Sarsa(λ)  322 (1)
…  323 (1)
8.4.10 Actor-Critic and Eligibility Trace  324 (1)
8.5 Causal Inference and Reinforcement Learning  324 (10)
8.5.1 Deconfounding Reinforcement Learning  325 (1)
8.5.1.1 Adjust for Measured Confounders  325 (1)
8.5.1.2 Proxy Variable Approximation to Unobserved Confounding  326 (1)
8.5.1.3 Deep Latent Model for Identifying the Proxy Variables of Confounders  326 (1)
8.5.1.4 Reward and Causal Effect Estimation  327 (1)
8.5.1.5 Variational Autoencoder for Reinforcement Learning  327 (1)
…  328 (1)
…  329 (1)
8.5.1.8 Deconfounding Causal Effect Estimation and Actor-Critic Methods  330 (1)
8.5.2 Counterfactuals and Reinforcement Learning  330 (1)
8.5.2.1 Structural Causal Model for Counterfactual Inference  330 (1)
8.5.2.2 Bidirectional Conditional GAN (BiCoGAN) for Estimation of Causal Mechanism  331 (2)
8.5.2.3 Dueling Double-Deep Q-Networks and Augmented Counterfactual Data for Reinforcement Learning  333 (1)
8.6 Reinforcement Learning for Inferring Causal Networks  334 (11)
…  334 (1)
8.6.2 Mathematical Formulation of Inferring Causal Networks Using Bidirectional Conditional GAN  334 (2)
8.6.3 Framework of Reinforcement Learning for Combinatorial Optimization  336 (1)
8.6.4 Graph Encoder and Decoder  337 (1)
8.6.4.1 Mathematical Formulation of Graph Embedding  337 (1)
…  337 (1)
8.6.4.3 Shallow Embedding Approaches  338 (2)
8.6.4.4 Attention and Transformer for Combinatorial Optimization and Construction of Directed Acyclic Graph  340 (5)
…  345 (1)
Appendix 8A Bidirectional RNN for Encoding  345 (1)
Appendix 8B Calculation of KL Divergence  345 (4)
…  347 (2)
References  349 (14)
Index  363