
---
title: Quantization
---

{ #quantization-index }

Quantization trades model precision for a smaller memory footprint, allowing large models to run on a wider range of devices.
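The trade-off can be illustrated with a toy symmetric int8 scheme (a sketch for intuition only; the methods documented here, such as AWQ, GPTQ, and FP8, use far more sophisticated kernels and calibration):

```python
# Toy symmetric int8 quantization: store weights as 8-bit integers plus one
# float scale, then reconstruct approximate floats on the fly.
def quantize_int8(weights):
    # Map the largest-magnitude weight to +/-127.
    scale = max(abs(w) for w in weights) / 127.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize_int8(quantized, scale):
    # Reconstruction is lossy: precision is traded for a 4x smaller footprint
    # versus float32 (1 byte per weight instead of 4, plus one scale).
    return [q * scale for q in quantized]

weights = [0.12, -0.5, 0.33, 1.27]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)
```

Each stored value now fits in one byte, and the reconstruction error is bounded by half the quantization step (`scale / 2`), which is the precision given up for the memory savings.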

Contents:

- [auto_awq.md](auto_awq.md)
- [bitblas.md](bitblas.md)
- [bnb.md](bnb.md)
- [fp8.md](fp8.md)
- [gguf.md](gguf.md)
- [gptqmodel.md](gptqmodel.md)
- [int4.md](int4.md)
- [int8.md](int8.md)
- [modelopt.md](modelopt.md)
- [quantized_kvcache.md](quantized_kvcache.md)
- [quark.md](quark.md)
- [supported_hardware.md](supported_hardware.md)
- [torchao.md](torchao.md)