Minor
Signed-off-by: WoosukKwon <woosuk.kwon@berkeley.edu>
parent a8f7abcc58
commit 41e379c103
@@ -109,7 +109,7 @@ V1 supports decoder-only Transformers like Llama, mixture-of-experts (MoE) model
 V1 currently lacks support for log probs, prompt log probs sampling parameters, pipeline parallelism, structured decoding, speculative decoding, prometheus metrics, and LoRA. We are actively working to close this feature gap and add new optimizations. Please stay tuned!

 **Hardware Support:**

-V1 currently supports only Ampere or later NVIDIA GPUs. We are working on support for other hardware backends.
+V1 currently supports only Ampere or later NVIDIA GPUs. We are working on support for other hardware backends such as TPU.

 Finally, please note that you can continue using V0 and maintain backward compatibility by not setting `VLLM_USE_V1=1`.
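The surrounding text notes that V1 is opt-in via the `VLLM_USE_V1` environment variable, with V0 remaining the default. A minimal sketch of the toggle (shell usage is illustrative, not part of this commit):

```shell
# Opt in to the experimental V1 engine; leave VLLM_USE_V1 unset
# to keep the backward-compatible V0 engine.
export VLLM_USE_V1=1
echo "VLLM_USE_V1=${VLLM_USE_V1}"
```

With the variable exported, subsequent vLLM invocations in the same shell session pick up the V1 engine.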