Open Source MLOPs: Version Control and Automation for Machine Learning Pipelines With DVC and CML [Paperback]

  • Format: Paperback / softback, 186 pages, height x width: 235x191 mm
  • Publication date: 26-Sep-2025
  • Publisher: Packt Publishing Limited
  • ISBN-10: 1801813205
  • ISBN-13: 9781801813204
  • Price: 58,29 €
  • This book has not yet been published. Delivery takes approximately 2-4 weeks after the book's release.
Build automated machine learning pipelines using CI/CD techniques

Key Features

  • Create reproducible and automated machine learning pipelines using DVC and CML
  • Speed up your machine learning development and promote collaboration using CI/CD techniques
  • Ensure you stay ahead of the curve in the fiercely competitive machine learning market

Book Description

The process of deriving useful insights from machine learning can be arduous, though rewarding, even for data science practitioners. It's worth investing in any tools or techniques that can assist with the process.

Open Source MLOPs with DVC and CML will take you through two such techniques, which will allow you to automate your machine learning pipelines and make them eminently reproducible.

You'll begin with an introduction to Data Version Control (DVC) and learn how it can help you keep track of your machine learning artifacts using a familiar Git-like approach. This will lead you on to building end-to-end machine learning pipelines, complete with visualizations of the results. You'll then move on to Continuous Machine Learning (CML), with which you can automate the training and testing of machine learning models so they can run alongside the rest of your CI/CD pipeline, ensuring stability and reproducibility.
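
As a rough illustration of the pipeline idea, stages can be modeled as nodes that read and write files, and a valid run order follows from those file dependencies. The sketch below is not DVC's implementation: the stage names and file paths are hypothetical, and it relies on Python's standard graphlib module (Python 3.9+) purely to show the ordering logic.

```python
from graphlib import TopologicalSorter

# Hypothetical stages: each lists the files it reads (deps) and writes (outs).
STAGES = {
    "prepare": {"deps": ["data/raw.csv"], "outs": ["data/clean.csv"]},
    "train": {"deps": ["data/clean.csv"], "outs": ["model.pkl"]},
    "evaluate": {"deps": ["model.pkl", "data/clean.csv"], "outs": ["metrics.json"]},
}


def run_order(stages):
    """Return stage names in an order that respects file dependencies."""
    # Map every output file to the stage that produces it.
    producer = {out: name for name, spec in stages.items() for out in spec["outs"]}
    # A stage depends on whichever stages produce its input files.
    graph = {
        name: {producer[dep] for dep in spec["deps"] if dep in producer}
        for name, spec in stages.items()
    }
    # static_order() raises CycleError if the graph is not acyclic.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(run_order(STAGES))
```

Because each stage here depends on the one before it, only one order is possible; with independent stages, several orders would be equally valid.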

By the end of this book, you will be able to develop reproducible pipelines as directed acyclic graphs and run those pipelines effortlessly in the cloud to speed up the development of your machine learning models.

What you will learn

  • Create an S3 bucket to act as a remote repository
  • Use remote storage and a GitHub repository to create a model registry
  • Construct pipelines in YAML format in the dvc.yaml file
  • Define for loops within the DVC pipeline to reduce repetition
  • Share experiments with a coworker
  • Access and save objects using DVC's Python API
  • Run CML workloads on AWS EC2 instances including GPU-equipped machines
  • Report results such as DVC metrics and plots to a GitHub pull request
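
To give a feel for how a loop over pipeline stages reduces repetition, here is a minimal Python sketch that expands one stage template into one concrete stage per item. It only emulates the idea and is not DVC's code: the function name `expand_foreach`, the template fields, and the example commands are all hypothetical, while the `name@item` labels and the `${item}` placeholder are modeled on the conventions DVC uses for foreach stages.

```python
def expand_foreach(name, items, template):
    """Expand one stage template into one concrete stage per item."""
    expanded = {}
    for item in items:
        # Substitute the ${item} placeholder in every string field.
        expanded[f"{name}@{item}"] = {
            key: value.replace("${item}", str(item))
            for key, value in template.items()
        }
    return expanded


if __name__ == "__main__":
    # Hypothetical template: train one model per algorithm name.
    stages = expand_foreach(
        "train",
        ["linear", "forest"],
        {"cmd": "python train.py --model ${item}", "outs": "models/${item}.pkl"},
    )
    for stage_name, spec in stages.items():
        print(stage_name, spec)
```

Writing the template once and listing the items keeps the pipeline definition short, and adding a new variant becomes a one-line change.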

Who this book is for

This book is predominantly for people who want to learn how to use DVC and CML to build pipelines for the deployment of machine learning models. They are most likely to be data scientists, possibly software engineers, or students on PhD or MSc programs who are developing machine learning models. The book may also be useful for those interested in the Data Version Control aspect who are not (or not currently) developing or deploying machine learning models.

A basic knowledge of data analytics, a concern for producing analysis reproducibly, and an eagerness to learn are expected.
Table of Contents

A Brief Introduction to MLOps
First Steps with DVC
Using Remote Storage
Sharing Data with Registries
Troubleshooting Issues with DVC
Building Pipelines with DVC
Advanced Pipelines Parameterization and foreach Stages
Creating Plots with DVC
Experiment Tracking
Deploying Models with DVC
Automating Your Pipelines with GitHub Actions
Running GPU and Compute-Heavy Workloads
Train, Test, and Deployment with CML
Matthew Upson is a Data Scientist and Founder of MantisNLP, experienced in Natural Language Processing and Machine Learning / Data Engineering problems. Previously he was the Lead Data Scientist at Juro, a legal tech startup, where he used AI to make contracts faster, smarter, and more human. Prior to working at Juro, he worked as a Data Scientist in the UK Government, predominantly on Machine Learning services for Natural Language Processing. Version Control, Continuous integration, and Cloud Computing.