Advanced Concepts in Transformers for Deep Learning

Name: Advanced Concepts in Transformers for Deep Learning
Brand: Springer, Berlin
SKU: 52385167
Price: 3048 CZK
Availability: PreOrder
Author: P. Yadla
ISBN: 9783032292797

Language: English
Format: Hardcover
Publisher: Springer, Berlin (October 2026)


Release date: 25 October 2026


Advanced Concepts in Transformers for Deep Learning goes beyond explaining what the architecture does to reveal why it works and how to push it further.

Written for researchers and machine learning engineers who have outgrown introductory treatments, this book develops genuine mathematical fluency across the full transformer landscape. Readers will find rigorous derivations of attention variants, positional encodings, state space models (including Mamba), and Mixture-of-Experts routing, alongside concrete implementations that connect theory directly to practice.

The book spans efficient and sparse attention mechanisms; vision and multimodal transformers; graph and speech architectures; and NLP applications, including pre-trained language models and sequence-to-sequence systems. It also covers parameter-efficient fine-tuning, retrieval-augmented generation, multi-agent systems and tool use, hybrid Transformer-SSM designs, speculative decoding, and FlashAttention. In addition, it addresses RLHF and alignment techniques such as DPO and Constitutional AI, as well as advanced prompting frameworks including chain-of-thought and tree-of-thoughts.

Advanced optimization is treated in depth, including adaptive optimizers, learning rate scheduling, gradient clipping, and regularization strategies. These are presented alongside distributed training approaches (data, model, pipeline, tensor, and context parallelism) as well as FSDP and DeepSpeed workflows.

Inference receives equal rigor, with coverage of quantization, KV-cache optimization, continuous batching, paged-attention memory management, and disaggregated prefill-decode systems. Interpretability, robustness, and ethical alignment are treated as core design considerations throughout, rather than isolated topics.

Hands-on chapters guide readers from scratch implementations through parameter-efficient fine-tuning and Mixture-of-Experts case studies. A working knowledge of deep learning fundamentals and basic transformers is assumed. Whether designing new architectures or deploying large models at scale, this book serves as a rigorous, comprehensive reference for advanced practitioners.