Klienditugi: 7440010 (E-R 10-18)

Abi | Registreeri | Logi sisse

Machine Learning Engineering in Action [Pehme köide]

5.00/5 (2 hinnangut Goodreads-ist)

Ben Wilson

Formaat: Paperback / softback, 300 pages, kõrgus x laius x paksus: 234x186x34 mm, kaal: 960 g
Ilmumisaeg: 14-Apr-2022
Kirjastus: Manning Publications
ISBN-10: 1617298719
ISBN-13: 9781617298714

Teised raamatud teemal:

Pehme köide
Hind: 64,08 €
Raamatu kohalejõudmiseks kirjastusest kulub orienteeruvalt 2-4 nädalat
Kogus:
- - 1
  - 2
  - 3
  - 4
  - 5
  - 6
  - 7
  - 8
  - 9
  - 10
Lisa ostukorvi
Tasuta tarne
Tellimisaeg 2-4 nädalat
Lisa soovinimekirja

Formaat: Paperback / softback, 300 pages, kõrgus x laius x paksus: 234x186x34 mm, kaal: 960 g
Ilmumisaeg: 14-Apr-2022
Kirjastus: Manning Publications
ISBN-10: 1617298719
ISBN-13: 9781617298714

Teised raamatud teemal:

Püsilink: https://www.kriso.ee/db/9781617298714.html

Märksõnad:

Machine Learning Engineering in Action lays out an approach to building deployable, maintainable production machine learning systems. You will adopt software development standards that deliver better code management, and make it easier to test, scale, and even reuse your machine learning code!

You will learn how to plan and scope your project, manage cross-team logistics that avoid fatal communication failures, and design your code's architecture for improved resilience. You will even discover when not to use machine learningand the alternative approaches that might be cheaper and more effective. When you're done working through this toolbox guide, you will be able to reliably deliver cost-effective solutions for organizations big and small alike.

Following established processes and methodology maximizes the likelihood that your machine learning projects will survive and succeed for the long haul. By adopting standard, reproducible practices, your projects will be maintainable over time and easy for new team members to understand and adapt.

Arvustused

Anice view on practical data science and machine learning. Great reading fornewbies, some interesting views for seasoned practitioners. Johannes Verwijnen

Amust read for those looking to balance the planning and experimentationlifecycle. Jesśs Antonino Juįrez Guerrero

Apractical book to help engineers understand the workflow of machine learningprojects. Xiangbo Mao

Donot implement your ML model into production without reading this book! Lokesh Kumar

Preface

Acknowledgments

xiii

About This Book

About The Author

xviii

About The Cover Illustration

xix

Part 1 An Introduction To Machine Learning Engineering

1 What is a machine learning engineer?

(23)

1.1 Why ML engineering?

(3)

1.2 The core tenets of ML engineering

(16)

Planning

(2)

Scoping and research

(3)

Experimentation

(2)

Development

(3)

Deployment

(3)

Evaluation

(3)

1.3 The goals of ML engineering

(2)

2 Your data science could use some engineering

(12)

2.1 Augmenting a complex profession with processes to increase project success

(2)

2.2 A foundation of simplicity

(2)

2.3 Co-opting principles of Agile software engineering

(4)

Communication and cooperation

(2)

Embracing and expecting change

(1)

2.4 The foundation of ML engineering

(3)

3 Before you model: Planning and scoping a project

(38)

3.1 Planning: You want me to predict what?!

(18)

Basic planning for a project

(6)

That first meeting

(3)

Plan for demos-lots of demos

(2)

Experimentation by solution building: Wasting time for pride's sake

(2)

3.2 Experimental scoping: Setting expectations and boundaries

(16)

What is experimental scoping?

(1)

Experimental scoping for the ML team: Research

(2)

Experimental scoping for the ML team: Experimentation

(12)

4 Before you model: Communication and logistics of projects

(48)

4.1 Communication: Defining the problem

(22)

Understanding the problem

(14)

Setting critical discussion boundaries

(7)

4.2 Don't waste our time: Meeting with cross-functional teams

101

(7)

Experimental update meeting: Do we know what we're doing here?

102

(1)

SME review/prototype review: Can we solve this?

103

(2)

Development progress review(s): Is this thing going to work?

105

(1)

MVP review: Did you build what we asked for?

106

(1)

Preproduction review: We really hope we didn't screw this up

107

(1)

4.3 Setting limits on your experimentation

108

(8)

Set a time limit

109

(3)

Can you put this into production? Would you want to maintain it?

112

(1)

TDD vs. RDD vs. PDD vs. CDD for ML projects

113

(3)

4.4 Planning for business rules chaos

116

(4)

Embracing chaos by planning for it

117

(2)

Human-in-the-loop design

119

(1)

What's your backup plan?

119

(1)

4.5 Talking about results

120

(4)

5 Experimentation in action: Planning and researching an ML project

124

(35)

5.1 Planning experiments

126

(11)

Perform basic research and planning

126

(4)

Forget the blogs-read the API docs

130

(5)

Draw straws for an internal hackathon

135

(1)

Level the playing field

136

(1)

5.2 Performing experimental prep work

137

(22)

Performing data analysis

139

(7)

Moving from script to reusable code

146

(8)

One last note on building reusable code for experimentation

154

(5)

6 Experimentation in action: Testing and evaluating a project

159

(38)

6.1 Testing ideas

162

(35)

Setting guidelines in code

163

(9)

Running quick forecasting tests

172

(18)

Whittling down the possibilities

190

(1)

Evaluating prototypes properly

191

(2)

Making a call on the direction to go in

193

(3)

So...what's next?

196

(1)

7 Experimentation in action: Moving from prototype to MVP

197

(31)

7.1 Tuning: Automating the annoying stuff

199

(16)

Tuning options

201

(5)

Hyperopt primer

206

(2)

Using Hyperopt to tune a complex forecasting problem

208

(7)

7.2 Choosing the right tech for the platform and the team

215

(13)

Why Spark?

216

(2)

Handling tuning from the driver with SparkTrials

218

(4)

Handling tuning from the workers with a pandas_udf

222

(4)

Using new paradigms for teams: Platforms and technologies

226

(2)

8 Experimentation in action: Finalizing an MVP with MLflow and runtime optimization

228

(15)

8.1 Logging: Code, metrics, and results

229

(8)

MLflow tracking

230

(2)

Please stop printing and log your information

232

(2)

Version control, branch strategies, and working with others

234

(3)

8.2 Scalability and concurrency

237

(8)

What is concurrency?

239

(1)

What you can (and can't) run asynchronously

239

(4)

Part 2 Preparing For Production: Creating Maintainable ML

243

(156)

9 Modularity for ML: Writing testable and legible code

245

(24)

9.1 Understanding monolithic scripts and why they are bad

248

(7)

How monoliths come into being

249

(1)

Walls of text

249

(3)

Considerations for monolithic scripts

252

(3)

9.2 Debugging walls of text

255

(2)

9.3 Designing modular ML code

257

(7)

9.4 Using test-driven development for ML

264

(5)

10 Standards of coding and creating maintainable ML code

269

(31)

10.1 ML code smells

270

(3)

10.2 Naming, structure, and code architecture

273

(5)

Naming conventions and structure

273

(1)

Trying to be too clever

274

(2)

Code architecture

276

(2)

10.3 Tuple unpacking and maintainable alternatives

278

(4)

Tuple unpacking example

278

(2)

A solid alternative to tuple unpacking

280

(2)

10.4 Blind to issues: Eating exceptions and other bad practices

282

(6)

Try/catch with the precision of a shotgun

283

(2)

Exception handling with laser precision

285

(1)

Handling errors the right way

286

(2)

10.5 Use of global mutable objects

288

(4)

How mutability can burn you

288

(2)

Encapsulation to prevent mutable side effects

290

(2)

10.6 Excessively nested logic

292

(8)

11 Model measurement and why it's so important

300

(34)

11.1 Measuring model attribution

302

(14)

Measuring prediction performance

302

(10)

Clarifying correlation vs. causation

312

(4)

11.2 Leveraging A/B testing for attribution calculations

316

(18)

A/B testing 101

317

(2)

Evaluating continuous metrics

319

(6)

Using alternative displays and tests

325

(4)

Evaluating categorical metrics

329

(5)

12 Holding on to your gains by watching for drift

334

(19)

12.1 Detecting drift

335

(12)

What influences drift?

336

(11)

12.2 Responding to drift

347

(6)

What can we do about it?

348

(2)

Responding to drift

350

(3)

13 ML development hubris

353

(46)

13.1 Elegant complexity vs. overengineering

355

(9)

Lightweight scripted style (imperative)

357

(4)

An overengineered mess

361

(3)

13.2 Unintentional obfuscation: Could you read this if you didn't write it?

364

(15)

The flavors of obfuscation

365

(13)

Troublesome coding habits recap

378

(1)

13.3 Premature generalization, premature optimization, and other bad ways to show how smart you are

379

(11)

Generalization and frameworks: Avoid them until you can't

379

(3)

Optimizing too early

382

(8)

13.4 Do you really want to be the canary? Alpha testing and the dangers of the open source coal mine

390

(3)

13.5 Technology-driven development vs. solution-driven development

393

(6)

Part 3 Developing Production Machine

14 Writing production code

399

(39)

14.1 Have you met your data?

401

(11)

Make sure you have the data

403

(1)

Check your data provenance

404

(4)

Find a source of truth and align on it

408

(2)

Don't embed data cleansing into your production code

410

(2)

14.2 Monitoring your features

412

(5)

14.3 Monitoring everything else in the model life cycle

417

(4)

14.4 Keeping things as simple as possible

421

(5)

Simplicity in problem definitions

423

(1)

Simplicity in implementation

424

(2)

14.5 Wireframing ML projects

426

(6)

14.6 Avoiding cargo cult ML behavior

432

(6)

15 Quality and acceptance testing

438

(33)

15.1 Data consistency

439

(8)

Training and inference skew

440

(1)

A brief intro to feature stores

441

(1)

Process over technology

442

(3)

The dangers of a data silo

445

(2)

15.2 Fallbacks and cold starts

447

(6)

Leaning heavily on prior art

448

(2)

Cold-start woes

450

(3)

15.3 End user vs. internal use testing

453

(7)

Biased testing

456

(1)

Dogfooding

457

(2)

SME evaluation

459

(1)

15.4 Model interpretability

460

(11)

Shapley additive explanations

461

(2)

Using shag

463

(8)

16 Production infrastructure

471

(39)

16.1 Artifact management

472

(10)

MLflow's model registry

474

(2)

Interfacing with the model registry

476

(6)

16.2 Feature stores

482

(8)

What a feature store is used for

483

(2)

Using a feature store

485

(4)

Evaluating a feature store

489

(1)

16.3 Prediction serving architecture

490

(20)

Determining serving needs

493

(7)

Bulk external delivery

500

(2)

Microbatch streaming

502

(1)

Real-time server-side

503

(4)

Integrated models (edge deployment)

507

(3)

Appendix A Big O(no) and how to think about runtime performance

510

(30)

Appendix B Setting up a development environment

540

(7)

Index

547

Ben Wilson has worked as a professional data scientist for more than ten years. He currently works as a resident solutions architect at Databricks,where he focuses on machine learning production architecture with companies ranging from 5-person startups to global Fortune 100. Ben is the creator and lead developer of the Databricks Labs AutoML project, a Scala-and Python-based toolkit that simplifies machine learning feature engineering, model tuning, and pipeline-enabled modelling.

Machine Learning Engineering in Action [Pehme köide]

Arvustused

Konto & seaded

Otsing

Otsingu andmebaas

Filtreeri tulemusi

Teemad Ingliskeelsed raamatud

Vali ostukorv