Default Branch

0e3fe896e2 · Support Llama 4 for fused_marlin_moe (#20457) · Updated 2025-07-04 15:55:10 +08:00

Branches

d3eddd6ef1 · initial · Updated 2025-04-02 07:06:59 +08:00    vLLM

1909 behind · 1 ahead

af985d70bf · change to greedy · Updated 2025-04-02 06:53:26 +08:00    vLLM

1947 behind · 7 ahead

db9dfcfa6a · [Docs] Add Ollama meetup slides (#15905) · Updated 2025-04-02 04:58:59 +08:00    vLLM

1905 behind · 0 ahead · Included

a3fac739b3 · Bump actions/setup-python from 5.4.0 to 5.5.0 · Updated 2025-03-31 12:11:47 +08:00    vLLM

1958 behind · 1 ahead

4c42267293 · updated · Updated 2025-03-28 10:26:20 +08:00    vLLM

2020 behind · 4 ahead

44d638a896 · merge · Updated 2025-03-26 01:26:20 +08:00    vLLM

2083 behind · 4 ahead

25f560a62c · [V1][Spec Decode] Update target_logits in place for rejection sampling (#15427) · Updated 2025-03-25 12:04:41 +08:00    vLLM

2093 behind · 0 ahead · Included

220d694080 · updated · Updated 2025-03-24 09:00:20 +08:00    vLLM

2137 behind · 50 ahead

13d8b590c1 · minor · Updated 2025-03-21 13:59:00 +08:00    vLLM

2175 behind · 20 ahead

8db54c7912 · Merge branch 'main' into v1-sched-interface-2 · Updated 2025-03-21 08:56:13 +08:00    vLLM

2175 behind · 17 ahead

61c7a1b856 · [V1] Minor V1 async engine test refactor (#15075) · Updated 2025-03-20 01:37:17 +08:00    vLLM

2208 behind · 0 ahead · Included

966f933ee1 · [Bugfix] Fix LoRA extra vocab size (#15047) · Updated 2025-03-19 01:51:10 +08:00    vLLM

2249 behind · 9 ahead

031c8b32a4 · Add time comment · Updated 2025-03-17 21:50:44 +08:00    vLLM

2254 behind · 4 ahead

90eb28ca21 · [V1][Scheduler] Use dict for running queue · Updated 2025-03-14 04:11:07 +08:00    vLLM

2350 behind · 1 ahead

bfff9bcd1d · [V1] TPU - Remove self.kv_caches · Updated 2025-03-06 04:42:05 +08:00    vLLM

2533 behind · 1 ahead

3679753af5 · Reduce Scatter Plumbing · Updated 2025-03-01 00:33:52 +08:00    vLLM

2611 behind · 1 ahead

34e3494e70 · Fix failing `MyGemma2Embedding` test (#13820) · Updated 2025-02-26 04:33:03 +08:00    vLLM

2671 behind · 0 ahead · Included

243408b6b4 · Support moe_wna16 as well · Updated 2025-02-13 03:18:29 +08:00    vLLM

2918 behind · 4 ahead

70b4e46e70 · compilation is fixed · Updated 2025-02-07 04:49:29 +08:00    vLLM

3184 behind · 14 ahead

0408efc6d0 · [Misc] Improve error message for incorrect pynvml (#12809) · Updated 2025-02-06 15:23:50 +08:00    vLLM

3018 behind · 0 ahead · Included