README.md
---
title: Quantization
---
[](){ #quantization-index }
Quantization trades off model precision for smaller memory footprint, allowing large models to be run on a wider range of devices.
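For orientation, here is a minimal sketch of running a quantized model through vLLM's offline `LLM` API. The model name `TheBloke/Llama-2-7B-Chat-AWQ` is only an assumed example of an AWQ-quantized checkpoint; each page below covers its own method and setup.

```python
from vllm import LLM

# Load an AWQ-quantized checkpoint (example model name assumed).
# vLLM can infer the quantization method from the checkpoint's
# config; passing `quantization="awq"` makes the choice explicit.
llm = LLM(model="TheBloke/Llama-2-7B-Chat-AWQ", quantization="awq")

# Generate from a single prompt and print the completion.
outputs = llm.generate("Quantization reduces memory usage by")
print(outputs[0].outputs[0].text)
```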
Contents:

- [AutoAWQ](auto_awq.md)
- [BitBLAS](bitblas.md)
- [BitsAndBytes](bnb.md)
- [FP8](fp8.md)
- [GGUF](gguf.md)
- [GPTQModel](gptqmodel.md)
- [INT4](int4.md)
- [INT8](int8.md)
- [ModelOpt](modelopt.md)
- [Quantized KV Cache](quantized_kvcache.md)
- [Quark](quark.md)
- [Supported Hardware](supported_hardware.md)
- [TorchAO](torchao.md)