Advanced Concepts in Transformers for Deep Learning goes beyond explaining what the architecture does to reveal why it works and how to push it further.
Written for researchers and machine learning engineers who have outgrown introductory treatments, this book develops genuine mathematical fluency across the full transformer landscape. Readers will find rigorous derivations of attention variants, positional encodings, state space models (including Mamba), and Mixture-of-Experts routing, alongside concrete implementations that connect theory directly to practice.
The book spans efficient and sparse attention mechanisms; vision and multimodal transformers; graph and speech architectures; and NLP applications, including pre-trained language models and sequence-to-sequence systems. It also covers parameter-efficient fine-tuning, retrieval-augmented generation, multi-agent systems and tool use, hybrid Transformer-SSM designs, speculative decoding, and FlashAttention. In addition, it addresses RLHF and alignment techniques such as DPO and Constitutional AI, as well as advanced prompting frameworks including chain-of-thought and tree-of-thoughts.
Advanced optimization is treated in depth, including adaptive optimizers, learning rate scheduling, gradient clipping, and regularization strategies. These are presented alongside distributed training approaches (data, model, pipeline, tensor, and context parallelism) as well as FSDP and DeepSpeed workflows.
Inference receives equal rigor, with coverage of quantization, KV-cache optimization, continuous batching, paged attention memory management, and disaggregated prefill-decode systems. Interpretability, robustness, and ethical alignment are treated as core design considerations throughout, rather than isolated topics.
Hands-on chapters guide readers from scratch implementations through parameter-efficient fine-tuning and Mixture-of-Experts case studies. A working knowledge of deep learning fundamentals and basic transformers is assumed. Whether designing new architectures or deploying large models at scale, this book serves as a rigorous, comprehensive reference for advanced practitioners.