Optimizing LLM Performance: Framework-Agnostic Techniques for Speed, Scalability, and Cost-Efficient Inference Across PyTorch, ONNX, vLLM, and More - Paperback

Poisson, Peter E.

9798294338459: Optimizing LLM Performance: Framework-Agnostic Techniques for Speed, Scalability, and Cost-Efficient Inference Across PyTorch, ONNX, vLLM, and More

Paperback

ISBN 13: 9798294338459

Publisher: Independently published, 2025

2 Used

From EUR 18.59

6 New

From EUR 19.00

Are you struggling to scale your large language models (LLMs) without breaking the bank or sacrificing latency? This book offers a clear roadmap to optimize inference, reduce costs, and scale seamlessly across platforms like PyTorch, ONNX, vLLM, and more.

Optimizing LLM Performance is your hands-on guide to boosting the efficiency of large language models in production environments. Whether you’re building chatbots, document summarizers, or enterprise AI tools, this book teaches proven methods to accelerate inference while maintaining accuracy. It dives deep into hardware-aware optimizations, quantization, model pruning, compiler acceleration, and memory-efficient runtime strategies without locking you into any single framework.

Written with clarity and real-world use in mind, the book features practical case studies, side-by-side performance comparisons, and up-to-date techniques from the cutting edge of AI deployment. If you're building, serving, or scaling LLMs in 2025, this is the performance engineering guide you've been waiting for.

Key Features:
• Framework-agnostic optimization techniques using PyTorch, ONNX Runtime, vLLM, llama.cpp, and more
• Deep dive into quantization (INT8/4-bit), distillation, pruning, and KV caching
• Hands-on examples with FastAPI, Hugging Face Transformers, and serverless deployment
• Covers performance profiling, streaming, batching, and cost-efficient scaling
• Future-proof insights on compiler-aware models, LoRA 2.0, and edge inference

Ready to build LLM systems that are faster, cheaper, and more scalable?
Grab your copy of Optimizing LLM Performance today and deploy smarter.

"Synopsis" may belong to another edition of this book.

Publisher: Independently published
Publication year: 2025
Language: English
ISBN 13: 9798294338459
Binding: Paperback
Number of pages: 163
Manufacturer contact: not available
Responsible person: not available

Buy used

Condition: As New

Unread book in perfect condition...

EUR 18.59

Shipping: EUR 2.29
Ships within the United States

Buy new

EUR 19.00

Shipping: EUR 2.29
Ships within the United States

Search results for Optimizing LLM Performance: Framework-Agnostic Techniques...

Optimizing LLM Performance: Framework-Agnostic Techniques for Speed, Scalability, and Cost-Efficient Inference Across PyTorch, ONNX, vLLM, and More

Poisson, Peter E.

Published by Independently published, 2025

ISBN 13: 9798294338459

Used, Paperback

Seller: GreatBookPrices, Columbia, MD, United States

Seller rating: 5 out of 5 stars

Condition: As New. Unread book in perfect condition. Item ref. no.: 50955172

Buy used

EUR 18.59

Shipping: EUR 2.29
Ships within the United States

Quantity available: more than 20

Optimizing LLM Performance: Framework-Agnostic Techniques for Speed, Scalability, and Cost-Efficient Inference Across PyTorch, ONNX, vLLM, and More

Poisson, Peter E.

Published by Independently published, 2025

ISBN 13: 9798294338459

New, Paperback

Seller: GreatBookPrices, Columbia, MD, United States

Seller rating: 5 out of 5 stars

Condition: New. Item ref. no.: 50955172-n

Buy new

EUR 19.00

Shipping: EUR 2.29
Ships within the United States

Quantity available: more than 20

Optimizing LLM Performance

Poisson, Peter E.

Published by Independently Published, 2025

ISBN 13: 9798294338459

New, PAP

Seller: PBShop.store US, Wood Dale, IL, United States

Seller rating: 5 out of 5 stars

PAP. Condition: New. New book. Shipped from UK. Established seller since 2000. Item ref. no.: L2-9798294338459

Buy new

EUR 21.37

Free shipping
Ships within the United States

Quantity available: more than 20

Optimizing LLM Performance (Paperback)

Peter E. Poisson

Published by Independently Published, 2025

ISBN 13: 9798294338459

New, Paperback

Print on demand

Seller: Grand Eagle Retail, Bensenville, IL, United States

Seller rating: 5 out of 5 stars

Paperback. Condition: New. This item is printed on demand. Shipping may be from multiple locations in the US or from the UK, depending on stock availability. Item ref. no.: 9798294338459

Buy new

EUR 21.94

Free shipping
Ships within the United States

Quantity available: 1

Optimizing LLM Performance

Poisson, Peter E.

Published by Independently Published, 2025

ISBN 13: 9798294338459

New, PAP

Seller: PBShop.store UK, Fairford, GLOS, United Kingdom

Seller rating: 4 out of 5 stars

PAP. Condition: New. New book. Shipped from UK. Established seller since 2000. Item ref. no.: L2-9798294338459

Buy new

EUR 19.20

Shipping: EUR 4.82
Ships from the United Kingdom to the United States

Quantity available: more than 20

Optimizing LLM Performance: Framework-Agnostic Techniques for Speed, Scalability, and Cost-Efficient Inference Across PyTorch, ONNX, vLLM, and More

Poisson, Peter E.

Published by Independently published, 2025

ISBN 13: 9798294338459

New, Paperback

Seller: GreatBookPricesUK, Woodford Green, United Kingdom

Seller rating: 5 out of 5 stars

Condition: New. Item ref. no.: 50955172-n

Buy new

EUR 19.19

Shipping: EUR 17.38
Ships from the United Kingdom to the United States

Quantity available: more than 20

Optimizing LLM Performance: Framework-Agnostic Techniques for Speed, Scalability, and Cost-Efficient Inference Across PyTorch, ONNX, vLLM, and More

Poisson, Peter E.

Published by Independently published, 2025

ISBN 13: 9798294338459

Used, Paperback

Seller: GreatBookPricesUK, Woodford Green, United Kingdom

Seller rating: 5 out of 5 stars

Condition: As New. Unread book in perfect condition. Item ref. no.: 50955172

Buy used

EUR 20.83

Shipping: EUR 17.38
Ships from the United Kingdom to the United States

Quantity available: more than 20

Optimizing LLM Performance (Paperback)

Peter E. Poisson

Published by Independently Published, 2025

ISBN 13: 9798294338459

New, Paperback

Print on demand

Seller: CitiRetail, Stevenage, United Kingdom

Seller rating: 5 out of 5 stars

Paperback. Condition: New. This item is printed on demand. Shipping may be from our UK warehouse or from our Australian or US warehouses, depending on stock availability. Item ref. no.: 9798294338459

Buy new

EUR 23.26

Shipping: EUR 42.88
Ships from the United Kingdom to the United States

Quantity available: 1