From 4ff97a8f9d9384fc64902575dd2fb88a3fc9f175 Mon Sep 17 00:00:00 2001
From: WoosukKwon
Date: Sun, 26 Jan 2025 22:50:18 -0800
Subject: [PATCH] Minor

Signed-off-by: WoosukKwon
---
 _posts/2025-01-26-v1-alpha-release.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_posts/2025-01-26-v1-alpha-release.md b/_posts/2025-01-26-v1-alpha-release.md
index 0809c8e..62d14b7 100644
--- a/_posts/2025-01-26-v1-alpha-release.md
+++ b/_posts/2025-01-26-v1-alpha-release.md
@@ -128,7 +128,7 @@ We measured the performance of vLLM V0 and V1 on Llama 3.1 8B and Llama 3.3 70B
 
 V1 demonstrated consistently lower latency than V0 especially at high QPS, thanks to the higher throughput it achieves. Given that the kernels used for V0 and V1 are almost identical, the performance difference is mainly due to the architectural improvements (reduced CPU overheads) in V1.
 
-- **Vision-language Models: Qwen2-VL, 1xH100**
+- **Vision-language Models: Qwen2-VL**