
Hands-On LLM Serving and Optimization: Hosting LLMs at Scale [Paperback]

  • Format: Paperback / softback, 300 pages, height x width: 232x178 mm
  • Publication date: 30-Apr-2026
  • Publisher: O'Reilly Media
  • ISBN-13: 9798341621497
  • Paperback
  • Price: 75,98 €*
  • * the price is final, i.e. no further discounts apply
  • Regular price: 89,39 €
  • You save 15%
  • This book has not yet been published. Delivery takes approximately 3-4 weeks after the book's publication.
  • Free delivery
  • Order lead time: 2-4 weeks
Large language models (LLMs) are rapidly becoming the backbone of AI-driven applications. Without proper optimization, however, LLMs can be expensive to run, slow to serve, and prone to performance bottlenecks. As demand for real-time AI applications grows, Hands-On LLM Serving and Optimization offers a comprehensive guide to the complexities of deploying and optimizing LLMs at scale.

In this hands-on book, authors Chi Wang and Peiheng Hu take a real-world approach backed by practical examples and code, and assemble essential strategies for designing robust infrastructures that are equal to the demands of modern AI applications. Whether you're building high-performance AI systems or looking to enhance your knowledge of LLM optimization, this indispensable book will serve as a pillar of your success.

  • Learn the key principles for designing a model-serving system tailored to popular business scenarios
  • Understand the common challenges of hosting LLMs at scale while minimizing costs
  • Pick up practical techniques for optimizing LLM serving performance
  • Build a model-serving system that meets specific business requirements
  • Improve LLM serving throughput and reduce latency
  • Host LLMs in a cost-effective manner, balancing performance and resource efficiency