Preface    xv
Author Biographies    xix
Summary of Notation    xxi

Chapter 1 Overview    1
1.1 Learning Reinforcement Learning    1
1.2 What You Will Learn from This Book    2
1.3 Expected Background to Read This Book    3
1.4 Decluttering the Jargon Linked to Reinforcement Learning    4
1.5 Introduction to the Markov Decision Process (MDP) Framework    5
1.6 Real-World Problems That Fit the MDP Framework    8
1.7 The Inherent Difficulty in Solving MDPs    9
1.8 Value Function, Bellman Equations, Dynamic Programming and RL    10
1.9    12
1.9.1 Module I: Processes and Planning Algorithms    13
1.9.2 Module II: Modeling Financial Applications    13
1.9.3 Module III: Reinforcement Learning Algorithms    15
1.9.4 Module IV: Finishing Touches    16
1.9.5 Short Appendix Chapters    16

Chapter 2 Programming and Design    19
2.1    19
2.2    20
2.3 Classes and Interfaces    21
2.3.1 A Distribution Interface    22
2.3.2 A Concrete Distribution    22
2.3.3    24
2.3.4    26
2.3.5    27
2.3.6    28
2.3.7    29
2.3.8    30
2.4 Abstracting over Computation    32
2.4.1 First-Class Functions    32
2.4.1.1    34
2.4.2 Iterative Algorithms    34
2.4.2.1 Iterators and Generators    35
2.5 Key Takeaways from This Chapter    38

Module I: Processes and Planning Algorithms

Chapter 3 Markov Processes    41
3.1 The Concept of State in a Process    41
3.2 Understanding Markov Property from Stock Price Examples    41
3.3 Formal Definitions for Markov Processes    48
3.3.1    49
3.3.2    50
3.3.3 Markov Process Implementation    50
3.4 Stock Price Examples Modeled as Markov Processes    52
3.5 Finite Markov Processes    53
3.6 Simple Inventory Example    55
3.7 Stationary Distribution of a Markov Process    58
3.8 Formalism of Markov Reward Processes    60
3.9 Simple Inventory Example as a Markov Reward Process    63
3.10 Finite Markov Reward Processes    65
3.11 Simple Inventory Example as a Finite Markov Reward Process    66
3.12 Value Function of a Markov Reward Process    69
3.13 Summary of Key Learnings from This Chapter    71

Chapter 4 Markov Decision Processes    73
4.1 Simple Inventory Example: How Much to Order?    73
4.2 The Difficulty of Sequential Decisioning under Uncertainty    74
4.3 Formal Definition of a Markov Decision Process    75
4.4    77
4.5 [Markov Decision Process, Policy] := Markov Reward Process    80
4.6 Simple Inventory Example with Unlimited Capacity (Infinite State/Action Space)    81
4.7 Finite Markov Decision Processes    83
4.8 Simple Inventory Example as a Finite Markov Decision Process    85
4.9 MDP Value Function for a Fixed Policy    88
4.10 Optimal Value Function and Optimal Policies    91
4.11 Variants and Extensions of MDPs    95
4.11.1 Size of Spaces and Discrete versus Continuous    95
4.11.1.1    95
4.11.1.2    97
4.11.1.3    97
4.11.2 Partially-Observable Markov Decision Processes (POMDPs)    98
4.12 Summary of Key Learnings from This Chapter    101

Chapter 5 Dynamic Programming Algorithms    103
5.1 Planning versus Learning    103
5.2 Usage of the Term Dynamic Programming    104
5.3    105
5.4 Bellman Policy Operator and Policy Evaluation Algorithm    109
5.5    112
5.6    113
5.7 Policy Iteration Algorithm    115
5.8 Bellman Optimality Operator and Value Iteration Algorithm    118
5.9 Optimal Policy from Optimal Value Function    121
5.10 Revisiting the Simple Inventory Example    122
5.11 Generalized Policy Iteration    124
5.12 Asynchronous Dynamic Programming    127
5.13 Finite-Horizon Dynamic Programming: Backward Induction    128
5.14 Dynamic Pricing for End-of-Life/End-of-Season of a Product    133
5.15 Generalization to Non-Tabular Algorithms    137
5.16 Summary of Key Learnings from This Chapter    138

Chapter 6 Function Approximation and Approximate Dynamic Programming    139
6.1 Function Approximation    140
6.2 Linear Function Approximation    144
6.3 Neural Network Function Approximation    150
6.4 Tabular as a Form of FunctionApprox    161
6.5 Approximate Policy Evaluation    164
6.6 Approximate Value Iteration    165
6.7 Finite-Horizon Approximate Policy Evaluation    166
6.8 Finite-Horizon Approximate Value Iteration    167
6.9 Finite-Horizon Approximate Q-Value Iteration    168
6.10 How to Construct the Non-Terminal States Distribution    169
6.11 Key Takeaways from This Chapter    170

Module II: Modeling Financial Applications

Chapter 7 Utility Theory    173
7.1 Introduction to the Concept of Utility    173
7.2 A Simple Financial Example    174
7.3 The Shape of the Utility Function    175
7.4 Calculating the Risk-Premium    177
7.5 Constant Absolute Risk-Aversion (CARA)    179
7.6 A Portfolio Application of CARA    180
7.7 Constant Relative Risk-Aversion (CRRA)    181
7.8 A Portfolio Application of CRRA    182
7.9 Key Takeaways from This Chapter    183

Chapter 8 Dynamic Asset-Allocation and Consumption    185
8.1 Optimization of Personal Finance    185
8.2 Merton's Portfolio Problem and Solution    187
8.3 Developing Intuition for the Solution to Merton's Portfolio Problem    192
8.4 A Discrete-Time Asset-Allocation Example    195
8.5 Porting to Real-World    199
8.6 Key Takeaways from This Chapter    206

Chapter 9 Derivatives Pricing and Hedging    207
9.1 A Brief Introduction to Derivatives    208
9.1.1    208
9.1.2    209
9.1.3    210
9.2 Notation for the Single-Period Simple Setting    210
9.3 Portfolios, Arbitrage and Risk-Neutral Probability Measure    211
9.4 First Fundamental Theorem of Asset Pricing (1st FTAP)    212
9.5 Second Fundamental Theorem of Asset Pricing (2nd FTAP)    214
9.6 Derivatives Pricing in Single-Period Setting    215
9.6.1 Derivatives Pricing When Market Is Complete    216
9.6.2 Derivatives Pricing When Market Is Incomplete    218
9.6.3 Derivatives Pricing When Market Has Arbitrage    226
9.7 Derivatives Pricing in Multi-Period/Continuous-Time    227
9.7.1 Multi-Period Complete-Market Setting    228
9.7.2 Continuous-Time Complete-Market Setting    229
9.8 Optimal Exercise of American Options Cast as a Finite MDP    229
9.9 Generalizing to Optimal-Stopping Problems    236
9.10 Pricing/Hedging in an Incomplete Market Cast as an MDP    238
9.11 Key Takeaways from This Chapter    240

Chapter 10 Order-Book Trading Algorithms    243
10.1 Basics of Order Book and Price Impact    243
10.2 Optimal Execution of a Market Order    252
10.2.1 Simple Linear Price Impact Model with No Risk-Aversion    257
10.2.2 Paper by Bertsimas and Lo on Optimal Order Execution    262
10.2.3 Incorporating Risk-Aversion and Real-World Considerations    263
10.3 Optimal Market-Making    264
10.3.1 Avellaneda-Stoikov Continuous-Time Formulation    266
10.3.2 Solving the Avellaneda-Stoikov Formulation    267
10.3.3 Analytical Approximation to the Solution to Avellaneda-Stoikov Formulation    272
10.3.4 Real-World Market-Making    275
10.4 Key Takeaways from This Chapter    275

Module III: Reinforcement Learning Algorithms

Chapter 11 Monte-Carlo and Temporal-Difference for Prediction    279
11.1 Overview of the Reinforcement Learning Approach    279
11.2    281
11.3 Monte-Carlo (MC) Prediction    282
11.4 Temporal-Difference (TD) Prediction    288
11.5 TD versus MC    293
11.5.1 TD Learning Akin to Human Learning    293
11.5.2 Bias, Variance and Convergence    294
11.5.3 Fixed-Data Experience Replay on TD versus MC    296
11.5.4 Bootstrapping and Experiencing    302
11.6    305
11.6.1 N-Step Bootstrapping Prediction Algorithm    305
11.6.2 λ-Return Prediction Algorithm    306
11.6.3 Eligibility Traces    308
11.6.4 Implementation of the TD(λ) Prediction Algorithm    312
11.7 Key Takeaways from This Chapter    314

Chapter 12 Monte-Carlo and Temporal-Difference for Control    315
12.1 Refresher on Generalized Policy Iteration (GPI)    315
12.2 GPI with Evaluation as Monte-Carlo    316
12.3 GLIE Monte-Carlo Control    320
12.4    325
12.5    331
12.6    332
12.6.1    333
12.6.2    335
12.6.3 Importance Sampling    342
12.7 Conceptual Linkage between DP and TD Algorithms    344
12.8 Convergence of RL Algorithms    346
12.9 Key Takeaways from This Chapter    348

Chapter 13 Batch RL, Experience-Replay, DQN, LSPI, Gradient TD    349
13.1 Batch RL and Experience-Replay    350
13.2 A Generic Implementation of Experience-Replay    353
13.3 Least-Squares RL Prediction    354
13.3.1 Least-Squares Monte-Carlo (LSMC)    355
13.3.2 Least-Squares Temporal-Difference (LSTD)    355
13.3.3    358
13.3.4 Convergence of Least-Squares Prediction    359
13.4 Q-Learning with Experience-Replay    360
13.4.1 Deep Q-Networks (DQN) Algorithm    362
13.5 Least-Squares Policy Iteration (LSPI)    363
13.5.1 Saving Your Village from a Vampire    365
13.5.2 Least-Squares Control Convergence    368
13.6 RL for Optimal Exercise of American Options    369
13.6.1 LSPI for American Options Pricing    370
13.6.2 Deep Q-Learning for American Options Pricing    372
13.7 Value Function Geometry    372
13.7.1 Notation and Definitions    373
13.7.2 Bellman Policy Operator and Projection Operator    374
13.7.3 Vectors of Interest in the Φ Subspace    374
13.8 Gradient Temporal-Difference (Gradient TD)    378
13.9 Key Takeaways from This Chapter    379

Chapter 14 Policy Gradient Algorithms    381
14.1 Advantages and Disadvantages of Policy Gradient Algorithms    382
14.2 Policy Gradient Theorem    383
14.2.1 Notation and Definitions    383
14.2.2 Statement of the Policy Gradient Theorem    384
14.2.3 Proof of the Policy Gradient Theorem    385
14.3 Score Function for Canonical Policy Functions    387
14.3.1 Canonical π(s, a; θ) for Finite Action Spaces    387
14.3.2 Canonical π(s, a; θ) for Single-Dimensional Continuous Action Spaces    388
14.4 REINFORCE Algorithm (Monte-Carlo Policy Gradient)    388
14.5 Optimal Asset Allocation (Revisited)    391
14.6 Actor-Critic and Variance Reduction    395
14.7 Overcoming Bias with Compatible Function Approximation    400
14.8 Policy Gradient Methods in Practice    403
14.8.1 Natural Policy Gradient    403
14.8.2 Deterministic Policy Gradient    404
14.9 Evolutionary Strategies    406
14.10 Key Takeaways from This Chapter    408

Module IV: Finishing Touches

Chapter 15 Multi-Armed Bandits: Exploration versus Exploitation    411
15.1 Introduction to the Multi-Armed Bandit Problem    411
15.1.1 Some Examples of Explore-Exploit Dilemma    412
15.1.2 Problem Definition    412
15.1.3    413
15.1.4    413
15.2    415
15.2.1 Greedy and ε-Greedy    415
15.2.2 Optimistic Initialization    415
15.2.3 Decaying ε_t-Greedy Algorithm    416
15.3    418
15.4 Upper Confidence Bound Algorithms    418
15.4.1 Hoeffding's Inequality    421
15.4.2    421
15.4.3    422
15.5 Probability Matching    423
15.5.1    425
15.6    428
15.7    431
15.8 Information State Space MDP    434
15.9 Extending to Contextual Bandits and RL Control    435
15.10 Key Takeaways from This Chapter    437

Chapter 16 Blending Learning and Planning    439
16.1 Planning versus Learning    439
16.1.1 Planning the Solution of Prediction/Control    440
16.1.2 Learning the Solution of Prediction/Control    441
16.1.3 Advantages and Disadvantages of Planning versus Learning    441
16.1.4 Blending Planning and Learning    442
16.2 Decision-Time Planning    443
16.3 Monte-Carlo Tree-Search (MCTS)    444
16.4 Adaptive Multi-Stage Sampling    445
16.5 Summary of Key Learnings from This Chapter    449

Chapter 17 Summary and Real-World Considerations    451
17.1 Summary of Key Learnings from This Book    451
17.2 RL in the Real-World    455

Appendix A Moment Generating Function and Its Applications    459
A.1 The Moment Generating Function (MGF)    459
A.2 MGF for Linear Functions of Random Variables    460
A.3 MGF for the Normal Distribution    460
A.4 Minimizing the MGF    460
A.4.1 Minimizing the MGF When x Follows a Normal Distribution    461
A.4.2 Minimizing the MGF When x Follows a Symmetric Binary Distribution    461

Appendix B Portfolio Theory    463
B.1    463
B.2    463
B.3 Derivation of Efficient Frontier Curve    463
B.4 Global Minimum Variance Portfolio (GMVP)    464
B.5 Orthogonal Efficient Portfolios    464
B.6    465
B.7 An Example of the Efficient Frontier for 16 Assets    465
B.8 CAPM: Linearity of Covariance Vector w.r.t. Mean Returns    465
B.9 Useful Corollaries of CAPM    466
B.10 Cross-Sectional Variance    466
B.11 Efficient Set with a Risk-Free Asset    466

Appendix C Introduction to and Overview of Stochastic Calculus Basics    467
C.1    467
C.2 Brownian Motion as Scaled Random Walk    468
C.3 Continuous-Time Stochastic Processes    469
C.4 Properties of Brownian Motion Sample Traces    469
C.5    470
C.6    471
C.7    471
C.8 A Mean-Reverting Process    472

Appendix D The Hamilton-Jacobi-Bellman (HJB) Equation    475
D.1 HJB as a Continuous-Time Version of Bellman Optimality Equation    475
D.2 HJB with State Transitions as an Ito Process    476

Appendix E Black-Scholes Equation and Its Solution for Call/Put Options    477
E.1    477
E.2 Derivation of the Black-Scholes Equation    478
E.3 Solution of the Black-Scholes Equation for Call/Put Options    479

Appendix F Function Approximations as Affine Spaces    481
F.1    481
F.2    481
F.3 Linear Map of Vector Spaces    481
F.4    482
F.5    482
F.6 Function Approximations    483
F.6.1 D[R] as an Affine Space P    483
F.6.2    483
F.7 Stochastic Gradient Descent    484
F.8 SGD Update for Linear Function Approximations    485

Appendix G Conjugate Priors for Gaussian and Bernoulli Distributions    487
G.1 Conjugate Prior for Gaussian Distribution    487
G.2 Conjugate Prior for Bernoulli Distribution    488

Bibliography    489
Index    493