
Machine Learning and Big Data with kdb+/q [Hardcover]

  • Format: Hardback, 640 pages, height x width x thickness: 246x168x41 mm, weight: 1225 g
  • Series: Wiley Finance
  • Publication date: 21-Nov-2019
  • Publisher: John Wiley & Sons Inc
  • ISBN-10: 1119404754
  • ISBN-13: 9781119404750

Upgrade your programming language to more effectively handle high-frequency data

Machine Learning and Big Data with KDB+/Q offers quants, programmers and algorithmic traders a practical entry into the powerful but non-intuitive kdb+ database and q programming language. Ideally designed to handle the speed and volume of high-frequency financial data at sell- and buy-side institutions, these tools have become the de facto standard; this book provides the foundational knowledge practitioners need to work effectively with this rapidly evolving approach to analytical trading.

The discussion follows the natural progression of working strategy development to allow hands-on learning in a familiar sphere, illustrating the contrast in efficiency and capability between the q language and other programming approaches. Rather than an all-encompassing “bible”-type reference, this book is designed with a focus on real-world practicality to help you quickly get up to speed and become productive with the language.

  • Understand why kdb+/q is the ideal solution for high-frequency data
  • Delve into the “meat” of q programming to solve practical economic problems
  • Perform everyday operations including basic regressions, cointegration, volatility estimation, modelling and more
  • Learn advanced techniques from market impact and microstructure analyses to machine learning techniques including neural networks

The kdb+ database and its underlying programming language q offer unprecedented speed and capability. As trading algorithms and financial models grow ever more complex against the markets they seek to predict, they encompass an ever-larger swath of data – more variables, more metrics, more responsiveness and altogether more “moving parts.”

Traditional programming languages are increasingly failing to accommodate the growing speed and volume of data, and lack the necessary flexibility that cutting-edge financial modelling demands. Machine Learning and Big Data with KDB+/Q opens up the technology and flattens the learning curve to help you quickly adopt a more effective set of tools.   
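To give a flavour of the terseness the description refers to, here is a minimal, illustrative q sketch (the table, column names and values below are invented for this example, not taken from the book): it builds a toy trade table and computes a volume-weighted average price per symbol in a single query.

  / toy trade table: symbol, price and size columns (illustrative data only)
  trade:([] sym:`AAPL`AAPL`MSFT`MSFT; price:100.1 100.3 50.2 50.4; size:200 300 100 400)
  / volume-weighted average price per symbol, using the built-in wavg aggregator
  select vwap:size wavg price by sym from trade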

Preface xvii
About the Authors xxiii
PART ONE Language Fundamentals
Chapter 1 Fundamentals of the q Programming Language 3
1.1 The (Not So Very) First Steps in q 3
1.2 Atoms and Lists 5
1.2.1 Casting Types 11
1.3 Basic Language Constructs 14
1.3.1 Assigning, Equality and Matching 14
1.3.2 Arithmetic Operations and Right-to-Left Evaluation: Introduction to q Philosophy 17
1.4 Basic Operators 19
1.5 Difference between Strings and Symbols 31
1.5.1 Enumeration 31
1.6 Matrices and Basic Linear Algebra in q 33
1.7 Launching the Session: Additional Options 35
1.8 Summary and How-To's 38
Chapter 2 Dictionaries and Tables: The q Fundamentals 41
2.1 Dictionary 41
2.2 Table 44
2.3 The Truth about Tables 48
2.4 Keyed Tables are Dictionaries 50
2.5 From a Vector Language to an Algebraic Language 51
Chapter 3 Functions 57
3.1 Namespace 59
3.1.0.1 .quantQ. Namespace 60
3.2 The Six Adverbs 60
3.2.1 Each 60
3.2.1.1 Each 61
3.2.1.2 Each-left \: 61
3.2.1.3 Each-right /: 62
3.2.1.4 Cross Product /: \: 62
3.2.1.5 Each-both ' 63
3.2.2 Each-prior ': 66
3.2.3 Compose (') 67
3.2.4 Over and Fold 67
3.2.5 Scan 68
3.2.5.1 EMA: The Exponential Moving Average 69
3.2.6 Converge 70
3.2.6.1 Converge-repeat 70
3.2.6.2 Converge-iterate 71
3.3 Apply 72
3.3.1 @ (apply) 72
3.3.2 . (apply) 73
3.4 Protected Evaluations 75
3.5 Vector Operations 76
3.5.1 Aggregators 76
3.5.1.1 Simple Aggregators 76
3.5.1.2 Weighted Aggregators 77
3.5.2 Uniform Functions 77
3.5.2.1 Running Functions 77
3.5.2.2 Window Functions 78
3.6 Convention for User-Defined Functions 79
Chapter 4 Editors and Other Tools 81
4.1 Console 81
4.2 Jupyter Notebook 82
4.3 GUIs 84
4.3.1 qStudio 85
4.3.2 Q Insight Pad 88
4.4 IDEs: IntelliJ IDEA 90
4.5 Conclusion 92
Chapter 5 Debugging q Code 93
5.1 Introduction to Making It Wrong: Errors 93
5.1.1 Syntax Errors 94
5.1.2 Runtime Errors 94
5.1.2.1 The Type Error 95
5.1.2.2 Other Errors 98
5.2 Debugging the Code 100
5.3 Debugging Server-Side 102
PART TWO Data Operations
Chapter 6 Splayed and Partitioned Tables 107
6.1 Introduction 107
6.2 Saving a Table as a Single Binary File 108
6.3 Splayed Tables 110
6.4 Partitioned Tables 113
6.5 Conclusion 119
Chapter 7 Joins 121
7.1 Comma Operator 121
7.2 Join Functions 125
7.2.1 ij 125
7.2.2 ej 126
7.2.3 lj 126
7.2.4 pj 127
7.2.5 upsert 128
7.2.6 uj 129
7.2.7 aj 131
7.2.8 aj0 134
7.2.8.1 The Next Valid Join 135
7.2.9 asof 138
7.2.10 wj 140
7.3 Advanced Example: Running TWAP 144
Chapter 8 Parallelisation 151
8.1 Parallel Vector Operations 152
8.2 Parallelisation over Processes 155
8.3 Map-Reduce 155
8.4 Advanced Topic: Parallel File/Directory Access 158
Chapter 9 Data Cleaning and Filtering 161
9.1 Predicate Filtering 161
9.1.1 The Where Clause 161
9.1.2 Aggregation Filtering 163
9.2 Data Cleaning, Normalising and APIs 163
Chapter 10 Parse Trees 165
10.1 Definition 166
10.1.1 Evaluation 166
10.1.2 Parse Tree Creation 170
10.1.3 Read-Only Evaluation 170
10.2 Functional Queries 171
10.2.1 Functional Select 174
10.2.2 Functional Exec 178
10.2.3 Functional Update 179
10.2.4 Functional Delete 180
Chapter 11 A Few Use Cases 181
11.1 Rolling VWAP 181
11.1.1 N Tick VWAP 181
11.1.2 Time Window VWAP 182
11.2 Weighted Mid for N Levels of an Order Book 183
11.3 Consecutive Runs of a Rule 185
11.4 Real-Time Signals and Alerts 186
PART THREE Data Science
Chapter 12 Basic Overview of Statistics 191
12.1 Histogram 191
12.2 First Moments 196
12.3 Hypothesis Testing 198
12.3.1 Normal p-values 198
12.3.2 Correlation 201
12.3.2.1 Implementation 202
12.3.3 t-test: One Sample 202
12.3.3.1 Implementation 204
12.3.4 t-test: Two Samples 204
12.3.4.1 Implementation 205
12.3.5 Sign Test 206
12.3.5.1 Implementation of the Test 208
12.3.5.2 Median Test 211
12.3.6 Wilcoxon Signed-Rank Test 212
12.3.7 Rank Correlation and Somers' D 214
12.3.7.1 Implementation 216
12.3.8 Multiple Hypothesis Testing 221
12.3.8.1 Bonferroni Correction 224
12.3.8.2 Sidak's Correction 224
12.3.8.3 Holm's Method 225
12.3.8.4 Example 226
Chapter 13 Linear Regression 229
13.1 Linear Regression 230
13.2 Ordinary Least Squares 231
13.3 The Geometric Representation of Linear Regression 233
13.3.1 Moore-Penrose Pseudoinverse 235
13.3.2 Adding Intercept 237
13.4 Implementation of the OLS 240
13.5 Significance of Parameters 243
13.6 How Good is the Fit: R² 244
13.6.1 Adjusted R-squared 247
13.7 Relationship with Maximum Likelihood Estimation and AIC with Small Sample Correction 248
13.8 Estimation Suite 252
13.9 Comparing Two Nested Models: Towards a Stopping Rule 254
13.9.1 Comparing Two General Models 256
13.10 In-/Out-of-Sample Operations 257
13.11 Cross-validation 262
13.12 Conclusion 264
Chapter 14 Time Series Econometrics 265
14.1 Autoregressive and Moving Average Processes 265
14.1.1 Introduction 265
14.1.2 AR(p) Process 266
14.1.2.1 Simulation 266
14.1.2.2 Estimation of AR(p) Parameters 268
14.1.2.3 Least Square Method 268
14.1.2.4 Example 269
14.1.2.5 Maximum Likelihood Estimator 269
14.1.2.6 Yule-Walker Technique 269
14.1.3 MA(q) Process 271
14.1.3.1 Estimation of MA(q) Parameters 272
14.1.3.2 Simulation 272
14.1.3.3 Example 273
14.1.4 ARMA(p, q) Process 273
14.1.4.1 Invertibility of the ARMA(p, q) Process 274
14.1.4.2 Hannan-Rissanen Algorithm: Two-Step Regression Estimation 274
14.1.4.3 Yule-Walker Estimation 274
14.1.4.4 Maximum Likelihood Estimation 275
14.1.4.5 Simulation 275
14.1.4.6 Forecast 276
14.1.5 ARIMA(p, d, q) Process 276
14.1.6 Code 276
14.1.6.1 Simulation 277
14.1.6.2 Estimation 278
14.1.6.3 Forecast 282
14.2 Stationarity and Granger Causality 285
14.2.1 Stationarity 285
14.2.2 Test of Stationarity – Dickey-Fuller and Augmented Dickey-Fuller Tests 286
14.2.3 Granger Causality 286
14.3 Vector Autoregression 287
14.3.1 VAR(p) Process 288
14.3.1.1 Notation 288
14.3.1.2 Estimator 288
14.3.1.3 Example 289
14.3.1.4 Code 293
14.3.2 VARX(p, q) Process 297
14.3.2.1 Estimator 297
14.3.2.2 Code 298
Chapter 15 Fourier Transform 301
15.1 Complex Numbers 301
15.1.1 Properties of Complex Numbers 302
15.2 Discrete Fourier Transform 308
15.3 Addendum: Quaternions 314
15.4 Addendum: Fractals 321
Chapter 16 Eigensystem and PCA 325
16.1 Theory 325
16.2 Algorithms 327
16.2.1 QR Decomposition 328
16.2.2 QR Algorithm for Eigenvalues 330
16.2.3 Inverse Iteration 331
16.3 Implementation of Eigensystem Calculation 332
16.3.1 QR Decomposition 333
16.3.2 Inverse Iteration 337
16.4 The Data Matrix and the Principal Component Analysis 341
16.4.1 The Data Matrix 341
16.4.2 PCA: The First Principal Component 344
16.4.3 Second Principal Component 345
16.4.4 Terminology and Explained Variance 347
16.4.5 Dimensionality Reduction 349
16.4.6 PCA Regression (PCR) 350
16.5 Implementation of PCA 351
16.6 Appendix: Determinant 354
16.6.1 Theory 354
16.6.2 Techniques to Calculate a Determinant 355
16.6.3 Implementation of the Determinant 356
Chapter 17 Outlier Detection 359
17.1 Local Outlier Factor 360
Chapter 18 Simulating Asset Prices 369
18.1 Stochastic Volatility Process with Price Jumps 369
18.2 Towards the Numerical Example 371
18.2.1 Numerical Preliminaries 371
18.2.2 Implementing Stochastic Volatility Process with Jumps 374
18.3 Conclusion 378
PART FOUR Machine Learning
Chapter 19 Basic Principles of Machine Learning 381
19.1 Non-Numeric Features and Normalisation 381
19.1.1 Non-Numeric Features 381
19.1.1.1 Ordinal Features 382
19.1.1.2 Categorical Features 383
19.1.2 Normalisation 383
19.1.2.1 Normal Score 384
19.1.2.2 Range Scaling 385
19.2 Iteration: Constructing Machine Learning Algorithms 386
19.2.1 Iteration 386
19.2.2 Constructing Machine Learning Algorithms 389
Chapter 20 Linear Regression with Regularisation 391
20.1 Bias-Variance Trade-off 392
20.2 Regularisation 393
20.3 Ridge Regression 394
20.4 Implementation of the Ridge Regression 396
20.4.1 Optimisation of the Regularisation Parameter 401
20.5 Lasso Regression 403
20.6 Implementation of the Lasso Regression 405
Chapter 21 Nearest Neighbours 419
21.1 k-Nearest Neighbours Classifier 419
21.2 Prototype Clustering 423
21.3 Feature Selection: Local Nearest Neighbours Approach 429
21.3.1 Implementation 430
Chapter 22 Neural Networks 437
22.1 Theoretical Introduction 437
22.1.1 Calibration 440
22.1.1.1 Backpropagation 441
22.1.2 The Learning Rate Parameter 443
22.1.3 Initialisation 443
22.1.4 Overfitting 444
22.1.5 Dimension of the Hidden Layer(s) 444
22.2 Implementation of Neural Networks 445
22.2.1 Multivariate Encoder 445
22.2.2 Neurons 446
22.2.3 Training the Neural Network 448
22.3 Examples 451
22.3.1 Binary Classification 451
22.3.2 M-class Classification 454
22.3.3 Regression 457
22.4 Possible Suggestions 463
Chapter 23 AdaBoost with Stumps 465
23.1 Boosting 465
23.2 Decision Stumps 466
23.3 AdaBoost 467
23.4 Implementation of AdaBoost 468
23.5 Recommendation for Readers 474
Chapter 24 Trees 477
24.1 Introduction to Trees 477
24.2 Regression Trees 479
24.2.1 Cost-Complexity Pruning 481
24.3 Classification Tree 482
24.4 Miscellaneous 484
24.5 Implementation of Trees 485
Chapter 25 Forests 495
25.1 Bootstrap 495
25.2 Bagging 498
25.2.1 Out-of-Bag 499
25.3 Implementation 500
25.3.1 Prediction 503
25.3.2 Feature Selection 505
Chapter 26 Unsupervised Machine Learning: The Apriori Algorithm 509
26.1 Apriori Algorithm 510
26.2 Implementation of the Apriori Algorithm 511
Chapter 27 Processing Information 523
27.1 Information Retrieval 523
27.1.1 Corpus: Leonardo da Vinci 523
27.1.2 Frequency Counting 524
27.1.3 tf-idf 528
27.2 Information as Features 532
27.2.1 Sample: Simulated Proteins 533
27.2.2 Kernels and Metrics for Proteins 535
27.2.3 Implementation of Inner Products and Nearest Neighbours Principles 535
27.2.4 Further Topics 539
Chapter 28 Towards AI – Monte Carlo Tree Search 541
28.1 Multi-Armed Bandit Problem 541
28.1.1 Analytic Solutions 543
28.1.2 Greedy Algorithms 543
28.1.3 Confidence-Based Algorithms 544
28.1.4 Bayesian Algorithms 546
28.1.5 Online Gradient Descent Algorithms 547
28.1.6 Implementation of Some Learning Algorithms 547
28.2 Monte Carlo Tree Search 558
28.2.1 Selection Step 561
28.2.2 Expansion Step 562
28.2.3 Simulation Step 563
28.2.4 Back Propagation Step 563
28.2.5 Finishing the Algorithm 563
28.2.6 Remarks and Extensions 564
28.3 Monte Carlo Tree Search Implementation – Tic-tac-toe 565
28.3.1 Random Games 566
28.3.2 Towards the MCTS 570
28.3.3 Case Study 579
28.4 Monte Carlo Tree Search – Additional Comments 579
28.4.1 Policy and Value Networks 579
28.4.2 Reinforcement Learning 581
Chapter 29 Econophysics: The Agent-Based Computational Models 583
29.1 Agent-Based Modelling 584
29.1.1 Agent-Based Models in Society 584
29.1.2 Agent-Based Models in Finance 586
29.2 Ising Agent-Based Model for Financial Markets 587
29.2.1 Ising Model in Physics 587
29.2.2 Ising Model of Interacting Agents 587
29.2.3 Numerical Implementation 588
29.3 Conclusion 592
Chapter 30 Epilogue: Art 595
Bibliography 601
Index 607
JAN NOVOTNY is an eFX quant trader at Deutsche Bank. Previously, he worked at the Centre for Econometric Analysis on high-frequency econometric models. He holds a PhD from CERGE-EI, Charles University, Prague.

PAUL A. BILOKON is CEO and founder of Thalesians Ltd and an expert in algorithmic trading. He previously worked at Nomura, Lehman Brothers, and Morgan Stanley. Paul was educated at Christ Church College, Oxford, and Imperial College.

ARIS GALIOTOS is the global technical lead for the eFX kdb+ team at HSBC, where he helps develop a big data installation processing billions of real-time records per day. Aris holds an MSc in Financial Mathematics with Distinction from the University of Edinburgh.

FRÉDÉRIC DÉLÈZE is an independent algorithmic trader and consultant. He has designed automated trading strategies for hedge funds and developed quantitative risk models for investment banks. He holds a PhD in Finance from Hanken School of Economics, Helsinki.