Michael Goin
51d7c6a2b2
[Model] Support Mistral3 in the HF Transformers format ( #15505 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-04-01 06:10:05 -07:00
shangmingc
239b7befdd
[V1][Spec Decode] Remove deprecated spec decode config params ( #15466 )
...
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
2025-03-31 09:19:35 -07:00
Cyrus Leung
09e974d483
[Bugfix] Check dimensions of multimodal embeddings in V1 ( #15816 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-31 09:01:35 -07:00
Isotr0py
3c0ff914ac
[Bugfix] Fix Mllama interleaved images input support ( #15564 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
2025-03-29 18:11:15 +00:00
pengyuange
de1cb38769
[Model] Support Skywork-R1V ( #15397 )
...
Signed-off-by: jiacai.liu <932997367@qq.com>
Co-authored-by: jiacai.liu <932997367@qq.com>
2025-03-28 20:39:21 -07:00
Nick Hill
15dac210f0
[V1] AsyncLLM data parallel ( #13923 )
...
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-03-27 16:14:41 -07:00
Chengji Yao
619d3de8bd
[TPU] [V1] fix cases when max_num_reqs is set smaller than MIN_NUM_SEQS ( #15583 )
...
Signed-off-by: Chengji Yao <chengjiyao@google.com>
2025-03-26 22:46:26 -07:00
youkaichao
e64afa455c
multi-node offline DP+EP example ( #15484 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-03-26 23:54:24 +08:00
Reid
4ec2cee000
[Misc] improve example script output ( #15528 )
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-03-26 10:12:47 +00:00
Cyrus Leung
a9e879b316
[Misc] Clean up MiniCPM-V/O code ( #15337 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-25 10:22:52 +00:00
shangmingc
50c9636d87
[V1][Usage] Refactor speculative decoding configuration and tests ( #14434 )
...
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
2025-03-22 19:28:10 -10:00
Woosuk Kwon
e588ac237c
Add an example for reproducibility ( #15262 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-03-20 19:55:47 -07:00
Jee Jee Li
10f55fe6c5
[Misc] Clean up the BitsAndBytes arguments ( #15140 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-03-20 19:17:12 -07:00
Roger Wang
34868b106a
[Doc] Update Mistral Small 3.1/Pixtral example ( #15184 )
...
Signed-off-by: Roger Wang <ywang@roblox.com>
2025-03-20 04:46:06 +00:00
Cyrus Leung
ffa443afed
[Bugfix] Fix embedding assignment for InternVL-based models ( #15086 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-20 03:40:13 +00:00
Jee Jee Li
46c759c165
[Bugfix] Fix LoRA extra vocab size ( #15047 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-03-18 09:40:29 -07:00
Patrick von Platen
f863ffc965
[Mistral-Small 3.1] Update docs and tests ( #14977 )
...
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-03-18 03:29:42 -07:00
Cyrus Leung
6eaf1e5c52
[Misc] Add `--seed` option to offline multi-modal examples ( #14934 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-17 03:00:17 -07:00
Nick Hill
b82662d952
[BugFix] Fix torch distributed stateless PG backend init ( #14870 )
...
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-03-15 20:26:19 -07:00
Rémi Delacourt
61c6a5a796
[VLM] Merged multi-modal processor for Pixtral ( #12211 )
...
Signed-off-by: remi <remi@mistral.ai>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-15 06:28:27 -07:00
Bryan Lu
9ed6ee92d6
[Bugfix] EAGLE output norm bug ( #14464 )
...
Signed-off-by: Bryan Lu <yuzhelu@amazon.com>
2025-03-15 06:50:33 +00:00
Cyrus Leung
382403921f
[VLM] Support pan-and-scan for Gemma3 multi-modal processor ( #14672 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-03-13 02:23:12 -07:00
Woosuk Kwon
c0c25e25fa
[Model] Add support for Gemma 3 ( #14660 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-12 08:36:33 -07:00
Harry Mellor
3b352a2f92
Correct capitalisation: `VLLM` -> `vLLM` ( #14562 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-10 16:36:21 +00:00
Chengji Yao
212007b168
[Hardware][TPU] Fix the recompiling issue in logits processor after warmup ( #14510 )
...
Signed-off-by: Chengji Yao <chengjiyao@google.com>
2025-03-09 05:44:39 -04:00
Isotr0py
03fe18ae0f
[VLM] Add TP support for Phi-4-MM ( #14453 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-03-08 05:57:14 -08:00
Jee Jee Li
952a074980
[Misc] Add Phi4-MM example ( #14343 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-03-07 17:28:52 +00:00
Tyler Michael Smith
cc2f9b32c8
[Distributed] Add enable_expert_parallel arg ( #14305 )
...
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-03-06 18:54:45 +00:00
youkaichao
151b08e0fe
[RLHF] use worker_extension_cls for compatibility with V0 and V1 ( #14185 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-03-07 00:32:46 +08:00
Vincent
a4f1ee35d6
Deprecate `best_of` Sampling Parameter in anticipation for vLLM V1 ( #13997 )
...
Signed-off-by: vincent-4 <vincentzhongy+githubvincent4@gmail.com>
Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-05 20:22:43 +00:00
Isotr0py
f71b00a19e
[Bugfix] Fix broken vision language example ( #14292 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-03-05 15:57:10 +00:00
Tyler Michael Smith
72c62eae5f
[V1] EP/TP MoE + DP Attention ( #13931 )
2025-03-04 21:27:26 -08:00
lkchen
b3cf368d79
[V1][Molmo] Fix get_multimodal_embeddings() in molmo.py ( #14161 )
2025-03-04 15:43:59 +00:00
Harry Mellor
cf069aa8aa
Update deprecated Python 3.8 typing ( #13971 )
2025-03-02 17:34:51 -08:00
Isotr0py
fdcc405346
[Doc] Consolidate `whisper` and `florence2` examples ( #14050 )
2025-02-28 22:49:15 -08:00
Isotr0py
edf309ebbe
[VLM] Support multimodal inputs for Florence-2 models ( #13320 )
2025-02-27 02:06:41 -08:00
Chauncey
d08b285adf
[Misc] fixed qwen_vl_utils parameter error ( #13906 )
2025-02-26 08:31:53 -08:00
Jiayi Yao
2f42a4888c
[Feature] Support KV cache offloading and disagg prefill with LMCache connector. ( #12953 )
2025-02-25 00:38:42 -08:00
youkaichao
2382ad29d1
[ci] fix linter ( #13701 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-22 20:28:59 +08:00
youkaichao
3e472d882a
[core] set up data parallel communication ( #13591 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-22 19:28:59 +08:00
Harry Mellor
992e5c3d34
Merge similar examples in `offline_inference` into single `basic` example ( #12737 )
2025-02-20 04:53:51 -08:00
Cyrus Leung
377d10bd14
[VLM][Bugfix] Pass processor kwargs properly on init ( #13516 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-02-19 13:13:50 +00:00
Roger Wang
b7d309860e
[V1] Update doc and examples for H2O-VL ( #13349 )
...
Signed-off-by: Roger Wang <ywang@roblox.com>
2025-02-16 10:35:54 +00:00
XiaobingZhang
84683fa271
[Bugfix] Offline example of disaggregated prefill ( #13214 )
2025-02-13 20:20:47 -08:00
Cyrus Leung
1bc3b5e71b
[VLM] Separate text-only and vision variants of the same model architecture ( #13157 )
2025-02-13 06:19:15 -08:00
Michael Goin
d88c8666a1
[Bugfix][Example] Fix GCed profiling server for TPU ( #12792 )
...
Signed-off-by: mgoin <michael@neuralmagic.com>
2025-02-13 11:52:11 +08:00
Christian Pinto
974dfd4971
[Model] IBM/NASA Prithvi Geospatial model ( #12830 )
2025-02-11 20:34:30 -08:00
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
6c4dbe23eb
[BugFix] Pop instead of del CUDA_VISIBLE_DEVICES ( #12962 )
...
Signed-off-by: Hollow Man <hollowman@opensuse.org>
2025-02-12 00:21:50 +08:00
Farzad Abdolhosseini
08b2d845d6
[Model] Ultravox Model: Support v0.5 Release ( #12912 )
...
Signed-off-by: Farzad Abdolhosseini <farzad@fixie.ai>
2025-02-10 22:02:48 +00:00
youkaichao
aa0ca5ebb7
[core][rlhf] add colocate example for RLHF ( #12984 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-10 10:28:59 +08:00
Jee Jee Li
86222a3dab
[VLM] Merged multi-modal processor for GLM4V ( #12449 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-02-08 20:32:16 +00:00
Cyrus Leung
8a69e0e20e
[CI/Build] Auto-fix Markdown files ( #12941 )
2025-02-08 04:25:15 -08:00
Shaoting
e31498bdcb
[Misc] Add offline test for disaggregated prefill ( #12418 )
2025-02-08 08:38:20 +00:00
Roger Wang
bf3b79efb8
[VLM] Qwen2.5-VL
2025-02-05 13:31:38 -08:00
youkaichao
bc1bdecebf
[core][distributed] exact ray placement control ( #12732 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-06 02:03:19 +08:00
Thomas Parnell
bb392af434
[Doc] Replace ibm-fms with ibm-ai-platform ( #12709 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-02-04 07:05:04 +00:00
Russell Bryant
e489ad7a21
[Misc] Add SPDX-License-Identifier headers to python source files ( #12628 )
...
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**
commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com>
Date: Fri Jan 31 14:18:24 2025 -0500
Add SPDX license headers to python source files
This commit adds SPDX license headers to python source files as
recommended to
the project by the Linux Foundation. These headers provide a concise way
that is
both human and machine readable for communicating license information
for each
source file. It helps avoid any ambiguity about the license of the code
and can
also be easily used by tools to help manage license compliance.
The Linux Foundation runs license scans against the codebase to help
ensure
we are in compliance with the licenses of the code we use, including
dependencies. Having these headers in place helps that tool do its job.
More information can be found on the SPDX site:
- https://spdx.dev/learn/handling-license-info/
Signed-off-by: Russell Bryant <rbryant@redhat.com>
commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com>
Date: Fri Jan 31 14:36:32 2025 -0500
Check for SPDX headers using pre-commit
Signed-off-by: Russell Bryant <rbryant@redhat.com>
---------
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-02-02 11:58:18 -08:00
Alphi
d93bf4da85
[Model] Refactoring of MiniCPM-V and add MiniCPM-o-2.6 support for vLLM ( #12069 )
...
Signed-off-by: hzh <hezhihui_thu@163.com>
Signed-off-by: Sungjae Lee <33976427+llsj14@users.noreply.github.com>
Signed-off-by: shaochangxu.scx <shaochangxu.scx@antgroup.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
Signed-off-by: Akshat Tripathi <akshat@krai.ai>
Signed-off-by: Oleg Mosalov <oleg@krai.ai>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu>
Signed-off-by: Chenguang Li <757486878@qq.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Shanshan Shen <467638484@qq.com>
Signed-off-by: elijah <f1renze.142857@gmail.com>
Signed-off-by: Yikun <yikunkero@gmail.com>
Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
Co-authored-by: Sungjae Lee <33976427+llsj14@users.noreply.github.com>
Co-authored-by: shaochangxu <85155497+shaochangxu@users.noreply.github.com>
Co-authored-by: shaochangxu.scx <shaochangxu.scx@antgroup.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: sixgod <evethwillbeok@outlook.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Rafael Vasquez <rafvasq21@gmail.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Akshat Tripathi <Akshat.tripathi6568@gmail.com>
Co-authored-by: Oleg Mosalov <oleg@krai.ai>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Avshalom Manevich <12231371+avshalomman@users.noreply.github.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
Co-authored-by: Yangcheng Li <liyangcheng.lyc@alibaba-inc.com>
Co-authored-by: Siyuan Li <94890248+liaoyanqing666@users.noreply.github.com>
Co-authored-by: Concurrensee <yida.wu@amd.com>
Co-authored-by: Chenguang Li <757486878@qq.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Alex Brooks <alex.brooks@ibm.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Shanshan Shen <467638484@qq.com>
Co-authored-by: elijah <30852919+e1ijah1@users.noreply.github.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Steve Luo <36296769+SunflowerAries@users.noreply.github.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Konrad Zawora <kzawora@habana.ai>
Co-authored-by: TJian <tunjian1996@gmail.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: maang-h <55082429+maang-h@users.noreply.github.com>
Co-authored-by: Elfie Guo <164945471+elfiegg@users.noreply.github.com>
Co-authored-by: Rui Qiao <161574667+ruisearch42@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-01-29 09:24:59 +00:00
Michael Goin
fbb5bd4cef
[TPU] Add example for profiling TPU inference ( #12531 )
...
Signed-off-by: mgoin <mgoin@redhat.com>
2025-01-29 03:16:47 +00:00
Pooya Davoodi
0cc6b383d7
[Frontend] Support scores endpoint in run_batch ( #12430 )
...
Signed-off-by: Pooya Davoodi <pooya.davoodi@parasail.io>
2025-01-27 04:30:17 +00:00
zhou fan
528dbcac7d
[Model][Bugfix]: correct Aria model output ( #12309 )
...
Signed-off-by: xffxff <1247714429@qq.com>
2025-01-22 11:39:19 +00:00
youkaichao
c222f47992
[core][bugfix] configure env var during import vllm ( #12209 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-20 19:35:59 +08:00
Cyrus Leung
b37d82791e
[Model] Upgrade Aria to transformers 4.48 ( #12203 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-20 17:58:48 +08:00
Isotr0py
02798ecabe
[Model] Port deepseek-vl2 processor, remove dependency ( #12169 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-18 13:59:39 +08:00
Chen Zhang
d06e824006
[Bugfix] Set enforce_eager automatically for mllama ( #12127 )
...
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-01-16 15:30:08 -05:00
Isotr0py
62b06ba23d
[Model] Add support for deepseek-vl2-tiny model ( #12068 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-16 17:14:48 +00:00
youkaichao
92e793d91a
[core] LLM.collective_rpc interface and RLHF example ( #12084 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-16 20:19:52 +08:00
youkaichao
bf53e0c70b
Support torchrun and SPMD-style offline inference ( #12071 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-16 19:58:53 +08:00
Isotr0py
d14e98d924
[Model] Support GGUF models newly added in `transformers` 4.46.0 ( #9685 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-01-13 00:13:44 +00:00
Isotr0py
f967e51f38
[Model] Initialize support for Deepseek-VL2 models ( #11578 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-01-12 00:17:24 -08:00
Harry Mellor
482cdc494e
[Doc] Rename offline inference examples ( #11927 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-10 23:50:29 +08:00
Harry Mellor
aba8d6ee00
[Doc] Move examples into categories ( #11840 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-08 13:09:53 +00:00