E-book: Observability Engineering: Achieving Production Excellence

  • Format: 320 pages
  • Publication date: 06-May-2022
  • Publisher: O'Reilly Media
  • Language: English
  • ISBN-13: 9781492076391
  • Format: EPUB+DRM
  • Price: 47,96 €*
  • * The price is final, i.e., no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

Observability is critical for building, changing, and understanding the software that powers complex modern systems. Teams that adopt observability are much better equipped to ship code swiftly and confidently, identify outliers and aberrant behaviors, and understand the experience of each and every user. This practical book explains the value of observable systems and shows you how to practice observability-driven development.

Authors Charity Majors, Liz Fong-Jones, and George Miranda from Honeycomb explain what constitutes good observability, show you how to improve upon what you're doing today, and provide practical dos and don'ts for migrating from legacy tooling, such as metrics, monitoring, and log management. You'll also learn the impact observability has on organizational culture (and vice versa).

You'll explore:

  • How the concept of observability applies to managing software at scale
  • The value of practicing observability when delivering complex cloud native applications and systems
  • The impact observability has across the entire software development lifecycle
  • How and why different functional teams use observability with service-level objectives
  • How to instrument your code to help future engineers understand the code you wrote today
  • How to produce quality code for context-aware system debugging and maintenance
  • How data-rich analytics can help you debug elusive issues

Foreword
Preface
Part I The Path to Observability
1 What Is Observability?
    The Mathematical Definition of Observability
    Applying Observability to Software Systems
    Mischaracterizations About Observability for Software
    Why Observability Matters Now
    Is This Really the Best Way?
    Why Are Metrics and Monitoring Not Enough?
    Debugging with Metrics Versus Observability
    The Role of Cardinality
    The Role of Dimensionality
    Debugging with Observability
    Observability Is for Modern Systems
    Conclusion
2 How Debugging Practices Differ Between Observability and Monitoring
    How Monitoring Data Is Used for Debugging
    Troubleshooting Behaviors When Using Dashboards
    The Limitations of Troubleshooting by Intuition
    Traditional Monitoring Is Fundamentally Reactive
    How Observability Enables Better Debugging
    Conclusion
3 Lessons from Scaling Without Observability
    An Introduction to Parse
    Scaling at Parse
    The Evolution Toward Modern Systems
    The Evolution Toward Modern Practices
    Shifting Practices at Parse
    Conclusion
4 How Observability Relates to DevOps, SRE, and Cloud Native
    Cloud Native, DevOps, and SRE in a Nutshell
    Observability: Debugging Then Versus Now
    Observability Empowers DevOps and SRE Practices
    Conclusion
Part II Fundamentals of Observability
5 Structured Events Are the Building Blocks of Observability
    Debugging with Structured Events
    The Limitations of Metrics as a Building Block
    The Limitations of Traditional Logs as a Building Block
    Unstructured Logs
    Structured Logs
    Properties of Events That Are Useful in Debugging
    Conclusion
6 Stitching Events into Traces
    Distributed Tracing and Why It Matters Now
    The Components of Tracing
    Instrumenting a Trace the Hard Way
    Adding Custom Fields into Trace Spans
    Stitching Events into Traces
    Conclusion
7 Instrumentation with OpenTelemetry
    A Brief Introduction to Instrumentation
    Open Instrumentation Standards
    Instrumentation Using Code-Based Examples
    Start with Automatic Instrumentation
    Add Custom Instrumentation
    Send Instrumentation Data to a Backend System
    Conclusion
8 Analyzing Events to Achieve Observability
    Debugging from Known Conditions
    Debugging from First Principles
    Using the Core Analysis Loop
    Automating the Brute-Force Portion of the Core Analysis Loop
    The Misleading Promise of AIOps
    Conclusion
9 How Observability and Monitoring Come Together
    Where Monitoring Fits
    Where Observability Fits
    System Versus Software Considerations
    Assessing Your Organizational Needs
    Exceptions: Infrastructure Monitoring That Can't Be Ignored
    Real-World Examples
    Conclusion
Part III Observability for Teams
10 Applying Observability Practices in Your Team
    Join a Community Group
    Start with the Biggest Pain Points
    Buy Instead of Build
    Flesh Out Your Instrumentation Iteratively
    Look for Opportunities to Leverage Existing Efforts
    Prepare for the Hardest Last Push
    Conclusion
11 Observability-Driven Development
    Test-Driven Development
    Observability in the Development Cycle
    Determining Where to Debug
    Debugging in the Time of Microservices
    How Instrumentation Drives Observability
    Shifting Observability Left
    Using Observability to Speed Up Software Delivery
    Conclusion
12 Using Service-Level Objectives for Reliability
    Traditional Monitoring Approaches Create Dangerous Alert Fatigue
    Threshold Alerting Is for Known-Unknowns Only
    User Experience Is a North Star
    What Is a Service-Level Objective?
    Reliable Alerting with SLOs
    Changing Culture Toward SLO-Based Alerts: A Case Study
    Conclusion
13 Acting on and Debugging SLO-Based Alerts
    Alerting Before Your Error Budget Is Empty
    Framing Time as a Sliding Window
    Forecasting to Create a Predictive Burn Alert
    The Lookahead Window
    The Baseline Window
    Acting on SLO Burn Alerts
    Using Observability Data for SLOs Versus Time-Series Data
    Conclusion
14 Observability and the Software Supply Chain
    Why Slack Needed Observability
    Instrumentation: Shared Client Libraries and Dimensions
    Case Studies: Operationalizing the Supply Chain
    Understanding Context Through Tooling
    Embedding Actionable Alerting
    Understanding What Changed
    Conclusion
Part IV Observability at Scale
15 Build Versus Buy and Return on Investment
    How to Analyze the ROI of Observability
    The Real Costs of Building Your Own
    The Hidden Costs of Using "Free" Software
    The Benefits of Building Your Own
    The Risks of Building Your Own
    The Real Costs of Buying Software
    The Hidden Financial Costs of Commercial Software
    The Hidden Nonfinancial Costs of Commercial Software
    The Benefits of Buying Commercial Software
    The Risks of Buying Commercial Software
    Buy Versus Build Is Not a Binary Choice
    Conclusion
16 Efficient Data Storage
    The Functional Requirements for Observability
    Time-Series Databases Are Inadequate for Observability
    Other Possible Data Stores
    Data Storage Strategies
    Case Study: The Implementation of Honeycomb's Retriever
    Partitioning Data by Time
    Storing Data by Column Within Segments
    Performing Query Workloads
    Querying for Traces
    Querying Data in Real Time
    Making It Affordable with Tiering
    Making It Fast with Parallelism
    Dealing with High Cardinality
    Scaling and Durability Strategies
    Notes on Building Your Own Efficient Data Store
    Conclusion
17 Cheap and Accurate Enough: Sampling
    Sampling to Refine Your Data Collection
    Using Different Approaches to Sampling
    Constant-Probability Sampling
    Sampling on Recent Traffic Volume
    Sampling Based on Event Content (Keys)
    Combining per Key and Historical Methods
    Choosing Dynamic Sampling Options
    When to Make a Sampling Decision for Traces
    Translating Sampling Strategies into Code
    The Base Case
    Fixed-Rate Sampling
    Recording the Sample Rate
    Consistent Sampling
    Target Rate Sampling
    Having More Than One Static Sample Rate
    Sampling by Key and Target Rate
    Sampling with Dynamic Rates on Arbitrarily Many Keys
    Putting It All Together: Head and Tail per Key Target Rate Sampling
    Conclusion
18 Telemetry Management with Pipelines
    Attributes of Telemetry Pipelines
    Routing
    Security and Compliance
    Workload Isolation
    Data Buffering
    Capacity Management
    Data Filtering and Augmentation
    Data Transformation
    Ensuring Data Quality and Consistency
    Managing a Telemetry Pipeline: Anatomy
    Challenges When Managing a Telemetry Pipeline
    Performance
    Correctness
    Availability
    Reliability
    Isolation
    Data Freshness
    Use Case: Telemetry Management at Slack
    Metrics Aggregation
    Logs and Trace Events
    Open Source Alternatives
    Managing a Telemetry Pipeline: Build Versus Buy
    Conclusion
Part V Spreading Observability Culture
19 The Business Case for Observability
    The Reactive Approach to Introducing Change
    The Return on Investment of Observability
    The Proactive Approach to Introducing Change
    Introducing Observability as a Practice
    Using the Appropriate Tools
    Instrumentation
    Data Storage and Analytics
    Rolling Out Tools to Your Teams
    Knowing When You Have Enough Observability
    Conclusion
20 Observability's Stakeholders and Allies
    Recognizing Nonengineering Observability Needs
    Creating Observability Allies in Practice
    Customer Support Teams
    Customer Success and Product Teams
    Sales and Executive Teams
    Using Observability Versus Business Intelligence Tools
    Query Execution Time
    Accuracy
    Recency
    Structure
    Time Windows
    Ephemerality
    Using Observability and BI Tools Together in Practice
    Conclusion
21 An Observability Maturity Model
    A Note About Maturity Models
    Why Observability Needs a Maturity Model
    About the Observability Maturity Model
    Capabilities Referenced in the OMM
    Respond to System Failure with Resilience
    Deliver High-Quality Code
    Manage Complexity and Technical Debt
    Release on a Predictable Cadence
    Understand User Behavior
    Using the OMM for Your Organization
    Conclusion
22 Where to Go from Here
    Observability, Then Versus Now
    Additional Resources
    Predictions for Where Observability Is Going
Index

Charity Majors is a cofounder and engineer at Honeycomb.io, a startup that blends the speed of time series with the raw power of rich events to give you interactive, iterative debugging for complex systems. She has worked at companies like Facebook, Parse, and Linden Lab, as a systems engineer and engineering manager, but always seems to end up responsible for the databases too.

Liz Fong-Jones is a developer advocate, labor and ethics organizer, and Site Reliability Engineer (SRE) with 15+ years of experience. She is an advocate at Honeycomb.io for the SRE and Observability communities, and previously was an SRE working on products ranging from the Google Cloud Load Balancer to Google Flights.

George Miranda is a former engineer turned product marketer at Honeycomb.io. He spent 15+ years building large-scale distributed systems in the finance and video game industries. He discovered his knack for storytelling and now works to shape the tools, practices, and culture that help improve the lives of people responsible for managing production systems.