Simplify, streamline, and scale your data operations with data pipelines built on Apache Airflow.
Apache Airflow provides a batteries-included platform for designing, implementing, and monitoring data pipelines. Building pipelines on Airflow eliminates the need for patchwork stacks and homegrown processes, bringing security and consistency to your data operations. Now in its second edition, Data Pipelines with Apache Airflow teaches you to harness this powerful platform to simplify and automate your data pipelines, reduce operational overhead, and seamlessly integrate all the technologies in your stack.
In Data Pipelines with Apache Airflow, Second Edition you'll learn how to:
• Master the core concepts of Airflow architecture and workflow design
• Schedule data pipelines using the Dataset API and timetables, including complex irregular schedules
• Develop custom Airflow components for your specific needs
• Implement comprehensive testing strategies for your pipelines
• Apply industry best practices for building and maintaining Airflow workflows
• Deploy and operate Airflow in production environments
• Orchestrate workflows in container-native environments
• Build and deploy machine learning and generative AI models using Airflow
Data Pipelines with Apache Airflow has empowered thousands of data engineers to build more successful data platforms. This second edition has been fully revised to cover the latest features of Apache Airflow, including the Taskflow API, deferrable operators, and large language model integration. Filled with real-world scenarios and examples, this book carefully guides you from Airflow novice to expert.
About the book
Data Pipelines with Apache Airflow, Second Edition teaches you how to build and maintain effective data pipelines. You'll master every aspect of directed acyclic graphs (DAGs)—the power behind Airflow—and learn to customize them for your pipeline's specific needs. Part reference and part tutorial, each technique is illustrated with engaging hands-on examples, from training machine learning models for generative AI to optimizing delivery routes. You'll explore common Airflow usage patterns, including aggregating multiple data sources and connecting to data lakes, while discovering exciting new features such as dynamic scheduling, the Taskflow API, and Kubernetes deployments.
About the reader
For DevOps engineers, data engineers, machine learning engineers, and sysadmins with intermediate Python skills.
About the author
Julian de Ruiter is a Data + AI engineering lead at Xebia Data, with a background in computer and life sciences and a PhD in computational cancer biology. As a consultant at Xebia Data, he enjoys helping clients design and build AI solutions and platforms, as well as the teams that drive them. Through this work, he has gained extensive experience deploying and applying Apache Airflow in production across diverse environments.
Ismael Cabral is a machine learning engineer and Airflow trainer with experience spanning Europe, the US, Mexico, and South America, where he has worked with market-leading companies. He has extensive experience implementing data pipelines and deploying machine learning models in production.
Kris Geusebroek is a data-engineering consultant with extensive hands-on Apache Airflow experience across several clients. He is the maintainer of Whirl, an open-source repository for local testing with Airflow, where he actively adds new examples based on new functionality and new technologies that integrate with Airflow.
Daniel van der Ende is a Data Engineer who first started using Apache Airflow back in 2016. Since then, he has worked in many different Airflow environments, both on-premises and in the cloud. He has actively contributed to the Airflow project itself, as well as related projects such as Astronomer-Cosmos.
Bas Harenslak is a Staff Architect at Astronomer, where he helps customers develop mission-critical data pipelines at large scale using Apache Airflow and the Astro platform. With a background in software engineering and computer science, he enjoys working on software and data as if they were challenging puzzles. He favors working on open-source software, is a committer on the Apache Airflow project, and co-authored the first edition of Data Pipelines with Apache Airflow.
Reviews
This is an all-encompassing guide on Airflow, from the baby steps to hardcore DevOps stuff.
Pavel Filatov, Data Engineer, flatmap.blog
It's the perfect introduction to Airflow. Although it seems to assume a reader with very little development experience, it still touches on all the important topics and addresses most questions very efficiently.
Gregor Zurowski, Founding Engineer, ResearchHub
PART 1: GETTING STARTED
1 MEET APACHE AIRFLOW
2 ANATOMY OF AN AIRFLOW DAG
3 TIME-BASED SCHEDULING IN AIRFLOW
4 ASSET-AWARE SCHEDULING
5 TEMPLATING TASKS USING THE AIRFLOW CONTEXT
PART 2: BEYOND THE BASICS
6 DEFINING DEPENDENCIES BETWEEN TASKS
7 TRIGGERING WORKFLOWS WITH EXTERNAL INPUT
8 COMMUNICATING WITH EXTERNAL SYSTEMS
9 EXTENDING AIRFLOW WITH CUSTOM OPERATORS AND SENSORS
10 TESTING
PART 3: AIRFLOW IN PRACTICE
11 RUNNING TASKS IN CONTAINERS
12 BEST PRACTICES
13 PROJECT: FINDING THE FASTEST WAY TO GET AROUND NYC
PART 4: AIRFLOW IN PRODUCTION
14 PROJECT: KEEPING FAMILY TRADITIONS ALIVE WITH AIRFLOW AND GENERATIVE AI
15 OPERATING AIRFLOW IN PRODUCTION
16 SECURING AIRFLOW
17 AIRFLOW DEPLOYMENT OPTIONS
APPENDICES
APPENDIX A: RUNNING CODE SAMPLES
APPENDIX B: PROMETHEUS METRIC MAPPING