Building Production LLM Systems is the comprehensive engineering guide for deploying open-source language models at enterprise scale. Written by Cedar Moon, a veteran who has shipped multiple million-dollar LLM systems, this practitioner's handbook covers everything from selecting the right model family (LLaMA, Mistral, Mixtral) to building production-ready infrastructure that rivals proprietary APIs.
Across 15 detailed chapters, you'll master quantization techniques that run 70B models on consumer GPUs, inference optimization strategies delivering 3-8× higher throughput, fine-tuning pipelines that adapt 405B-parameter models on a single card, and RAG systems ready for regulatory scrutiny. The book includes production-tested code, real infrastructure cost models, security hardening for regulated industries, and MLOps patterns for continuous improvement.
This isn't theory—it's battle-tested wisdom from real deployments across finance, healthcare, and defense. Whether you're running your first 7B model or orchestrating a thousand-GPU cluster, you'll gain the complete playbook to build LLM systems that are faster, cheaper, and more secure than any API—while maintaining full ownership of your AI stack.
Stop renting intelligence. Start owning it.