Distributed Machine Learning Patterns

Name: Distributed Machine Learning Patterns
Brand: Cybersoft Publishing LLc
SKU: 52377935
Price: 548 CZK
Availability: InStock
Author: Jazper Carter
ISBN: 9798904980030

A Patterns-First Manual for Architects, Engineers, and Technical Leads

Jazper Carter

Language: English

Format: Paperback

Libristo code: 52377935

Publisher: Cybersoft Publishing LLc, May 2026

Condition: New

Price: 548 Kč

Availability: In stock at the supplier; ships in 9-15 days

Returns: 30-day return period

Distributed machine learning systems fail in ways single-node systems never do. A 1024-GPU training job stalls for four hours while every worker reports healthy; gradient synchronization deadlocks leave no stack trace and no alert. A serving cluster absorbs a traffic spike, then silently doubles inference cost because the KV cache policy was tuned for a model half the size. Checkpoint corruption surfaces only after twelve hours of resumed training. These are the predictable failure modes of distributed systems, and the teams that ship reliable distributed ML design against them with patterns that hold across frameworks, clouds, and model scales.
Inside this book, readers will learn how to:

Design parallelism strategies that fit workload shape and hardware, selecting among data, tensor, pipeline, and expert axes based on architecture, memory budget, and interconnect topology.
Tune gradient synchronization and sharding, applying ZeRO, FSDP, and pipeline schedules to keep accelerator utilization high without amplifying communication overhead as cluster size grows.
Build fault-tolerant training pipelines with checkpoint strategies, elastic cluster patterns, and spot instance management that recover from mid-run hardware failures without restarting from epoch zero.
Operate inference at scale using continuous batching, paged attention, and KV cache management to maximize throughput and meet latency SLOs under variable load.
Instrument distributed jobs for observability, tracing per-rank metrics, gradient norms, and communication timings so silent failures surface before consuming days of compute budget.
Manage multi-tenant clusters securely with workload isolation, quota enforcement, and cost attribution that keep shared GPU infrastructure safe and financially accountable.
Apply LLM and foundation model patterns for distributed pre-training, RLHF infrastructure, and large-scale inference that generalize across architectures as hardware generations turn over.
Assess platform maturity using the book's maturity model to locate gaps in reliability, cost efficiency, and operational readiness across the distributed ML stack.

Frameworks rotate; the parallelism decisions, synchronization tradeoffs, and fault-tolerance designs that determine whether a distributed ML system works at scale do not. As foundation models grow larger and serving loads grow steeper, the distance between teams that reason in patterns and teams that copy configurations will only widen.
The book is organized in four parts: Foundations, covering parallelism patterns, data sharding, I/O, and orchestration; Training at Scale, addressing fault-tolerant training, checkpoint management, and spot scheduling; Serving and Operations, covering inference architecture, cost control, observability, and multi-tenant security; and Frontier Patterns, applying everything to LLMs and foundation models and closing with end-to-end case studies and a full platform synthesis.
This book is for ML architects who design distributed systems others depend on, ML engineers and data engineers who build and operate them, and technical team leads who set reliability and cost standards, with platform and SRE engineers as a strong secondary audience. Every chapter opens with a production incident scenario, teaches canonical patterns by name, and closes with a checklist the team can apply immediately. Readers finish with the vocabulary, playbook, and pattern library to ship reliable distributed ML systems with confidence.

Book information

Full title: Distributed Machine Learning Patterns

Author: Jazper Carter

Language: English

Binding: Book - Paperback

Publication date: 2026

Number of pages: 406

EAN: 9798904980030

Libristo code: 52377935

Publisher: Cybersoft Publishing LLc

Weight: 543 g

Dimensions: 152 x 229 x 21 mm

Categories

Economics, finance, business and management > Business and management > Business strategy

Computing and information technology > Databases > Data storage and analysis

Computing and information technology > Computer science > Systems analysis and design
