Merge pull request #2 from vllm-project/bold

Bold
This commit is contained in:
Zhuohan Li 2023-11-14 14:32:29 -08:00 committed by GitHub
commit 73685a63a9
1 changed file with 3 additions and 3 deletions


@@ -23,8 +23,8 @@ For the majority of workloads, vLLM is faster than (or performs comparably to) D
 We've identified two key differences between vLLM and DeepSpeed in terms of performance optimization:
-1. DeepSpeed adopts a conservative/suboptimal memory allocation scheme, which wastes memory when output lengths are large.
-2. DeepSpeed's Dynamic SplitFuse scheduling gives speedup only when prompt lengths are much greater than output lengths.
+1. **DeepSpeed adopts a conservative/suboptimal memory allocation scheme**, which wastes memory when output lengths are large.
+2. DeepSpeed's Dynamic SplitFuse scheduling gives **speedup only when prompt lengths are much greater than output lengths**.
 As a result, DeepSpeed outperforms when the workload is consistently long prompt and short output.
 In other scenarios, vLLM shows superior performance.
@@ -40,7 +40,7 @@ However, the performance gain we observe isn't as significant as 2x.
 </p>
 #### Scenario 2: Other cases
-In these cases, vLLM is up to 1.8x faster than DeepSpeed.
+In these cases, vLLM is up to **1.8x** faster than DeepSpeed.
 <p align="center">
 <picture>