vLLM/vllm - vllm - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Michael Goin	74f8c2cf5f	Add openai.beta.chat.completions.parse example to structured_outputs.rst (#10433 )	2024-11-19 04:37:46 +00:00
ismael-dm	31894a2155	[Doc] Add documentation for Structured Outputs (#9943 ) Signed-off-by: ismael-dm <ismaeldm99@gmail.com>	2024-11-18 09:52:12 -08:00
B-201	4186be8111	[Doc] Update doc for LoRA support in GLM-4V (#10425 ) Signed-off-by: B-201 <Joy25810@foxmail.com>	2024-11-18 15:08:30 +00:00
youkaichao	755b85359b	[doc] add doc for the plugin system (#10372 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-15 21:46:27 -08:00
Cyrus Leung	2ac6d0e75b	[Misc] Consolidate pooler config overrides (#10351 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-15 06:59:00 +00:00
Cyrus Leung	b40cf6402e	[Model] Support Qwen2 embeddings and use tags to select model tests (#10184 )	2024-11-14 20:23:09 -08:00
Cyrus Leung	0b8bb86bf1	[1/N] Initial prototype for multi-modal processor (#10044 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-13 12:39:03 +00:00
B-201	d909acf9fe	[Model][LoRA]LoRA support added for idefics3 (#10281 ) Signed-off-by: B-201 <Joy25810@foxmail.com>	2024-11-13 17:25:59 +08:00
Austin Veselka	1b886aa104	[Model] Adding Support for Qwen2VL as an Embedding Model. Using MrLight/dse-qwen2-2b-mrl-v1 (#9944 ) Signed-off-by: FurtherAI <austin.veselka@lighton.ai> Co-authored-by: FurtherAI <austin.veselka@lighton.ai>	2024-11-13 08:28:13 +00:00
电脑星人	3945c82346	[Model] Add support for Qwen2-VL video embeddings input & multiple image embeddings input with varied resolutions (#10221 ) Signed-off-by: imkero <kerorek@outlook.com>	2024-11-13 07:07:22 +00:00
whyiug	40d0e7411d	[Doc] Update FAQ links in spec_decode.rst (#9662 ) Signed-off-by: whyiug <whyiug@hotmail.com>	2024-11-08 04:44:58 +00:00
Eric	406d4cc480	[Model][LoRA]LoRA support added for Qwen2VLForConditionalGeneration (#10022 ) Signed-off-by: ericperfect <ericperfectttt@gmail.com>	2024-11-06 14:13:15 +00:00
Jee Jee Li	a5bba7d234	[Model] Add Idefics3 support (#9767 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Signed-off-by: B-201 <Joy25810@foxmail.com> Co-authored-by: B-201 <Joy25810@foxmail.com>	2024-11-06 11:41:17 +00:00
Jee Jee Li	2003cc3513	[Model][LoRA]LoRA support added for LlamaEmbeddingModel (#10071 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2024-11-06 09:49:19 +00:00
Roger Wang	6e056bcf04	[Doc] Update VLM doc about loading from local files (#9999 ) Signed-off-by: Roger Wang <ywang@roblox.com>	2024-11-04 19:47:11 +00:00
shanshan wang	54597724f4	[Model] Add support for H2OVL-Mississippi models (#9747 ) Signed-off-by: Shanshan Wang <shanshan.wang@h2o.ai> Signed-off-by: Roger Wang <ywang@roblox.com> Co-authored-by: Roger Wang <ywang@roblox.com>	2024-11-04 00:15:36 +00:00
Nick Hill	eed92f12fc	[Docs] Update Granite 3.0 models in supported models table (#9930 ) Signed-off-by: Nick Hill <nhill@redhat.com> Signed-off-by: Nick Hill <nickhill@us.ibm.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-11-02 09:02:18 +00:00
Cyrus Leung	ba0d892074	[Frontend] Use a proper chat template for VLM2Vec (#9912 )	2024-11-01 14:09:07 +00:00
Cyrus Leung	06386a64dd	[Frontend] Chat-based Embeddings API (#9759 )	2024-11-01 08:13:35 +00:00
Cyrus Leung	d3aa2a8b2f	[Doc] Update multi-input support (#9906 )	2024-11-01 07:34:49 +00:00
Yongzao	2b5bf20988	[torch.compile] Adding torch compile annotations to some models (#9876 ) Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: youkaichao <youkaichao@gmail.com>	2024-11-01 00:25:47 -07:00
Jee Jee Li	5608e611c2	[Doc] Update Qwen documentation (#9869 )	2024-10-31 08:54:18 +00:00
Russell Bryant	c5d7fb9ddc	[Doc] fix third-party model example (#9771 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2024-10-28 19:39:21 -07:00
kakao-kevin-us	6650e6a930	[Model] Add classification Task with Qwen2ForSequenceClassification (#9704 ) Signed-off-by: Kevin-Yang <ykcha9@gmail.com> Co-authored-by: Kevin-Yang <ykcha9@gmail.com>	2024-10-26 17:53:35 +00:00
Cyrus Leung	b979143d5b	[Doc] Move additional tips/notes to the top (#9647 )	2024-10-24 09:43:59 +00:00
Yongzao	8a02cd045a	[torch.compile] Adding torch compile annotations to some models (#9639 ) Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: youkaichao <youkaichao@gmail.com>	2024-10-24 00:54:57 -07:00
Cyrus Leung	836e8ef6ee	[Bugfix] Fix PP for ChatGLM and Molmo (#9422 )	2024-10-24 06:12:05 +00:00
Yunfei Chu	fc6c274626	[Model] Add Qwen2-Audio model support (#9248 ) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-23 17:54:22 +00:00
Cyrus Leung	831540cf04	[Model] Support E5-V (#9576 )	2024-10-23 11:35:29 +08:00
Isotr0py	bb392ea2d2	[Model][VLM] Initialize support for Mono-InternVL model (#9528 )	2024-10-22 16:01:46 +00:00
Rafael Vasquez	f7db5f0fa9	[Doc] Use shell code-blocks and fix section headers (#9508 ) Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>	2024-10-22 06:43:24 +00:00
Dhia Eddine Rhaiem	f6b97293aa	[Model] FalconMamba Support (#9325 )	2024-10-21 12:50:16 -04:00
Michael Goin	3921a2f29e	[Model] Support Pixtral models in the HF Transformers format (#9036 )	2024-10-18 13:29:56 -06:00
Cyrus Leung	051eaf6db3	[Model] Add user-configurable task for models that support both generation and embedding (#9424 )	2024-10-18 11:31:58 -07:00
Kuntai Du	81ede99ca4	[Core] Deprecating block manager v1 and make block manager v2 default (#8704 ) Removing the block manager v1. This is the initial piece of prefix-caching-centric design. In order to achieve prefix-caching-centric design, we need to simplify the code path so that we only use v2 block manager (which has much higher performance on prefix caching).	2024-10-17 11:38:15 -05:00
Junhao Li	5b8a1fde84	[Model][Bugfix] Add FATReLU activation and support for openbmb/MiniCPM-S-1B-sft (#9396 )	2024-10-16 16:40:24 +00:00
Roger Wang	59230ef32b	[Misc] Consolidate example usage of OpenAI client for multimodal models (#9412 ) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-16 11:20:51 +00:00
Cyrus Leung	7abba39ee6	[Model] VLM2Vec, the first multimodal embedding model in vLLM (#9303 )	2024-10-16 14:31:00 +08:00
Michael Goin	8e836d982a	[Doc] Fix code formatting in spec_decode.rst (#9348 )	2024-10-14 21:29:11 -07:00
Tyler Michael Smith	169b530607	[Bugfix] Clean up some cruft in mamba.py (#9343 )	2024-10-15 00:24:25 +00:00
Reza Salehi	dfe43a2071	[Model] Molmo vLLM Integration (#9016 ) Co-authored-by: sanghol <sanghol@allenai.org> Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com> Co-authored-by: Roger Wang <ywang@roblox.com>	2024-10-14 07:56:24 -07:00
Wallas Henrique	8baf85e4e9	[Doc] Compatibility matrix for mutual exclusive features (#8512 ) Signed-off-by: Wallas Santos <wallashss@ibm.com>	2024-10-11 11:18:50 -07:00
sixgod	6cf1167c1a	[Model] Add GLM-4v support and meet vllm==0.6.2 (#9242 )	2024-10-11 17:36:13 +00:00
Tyler Michael Smith	7342a7d7f8	[Model] Support Mamba (#6484 )	2024-10-11 15:40:06 +00:00
Cyrus Leung	e808156f30	[Misc] Collect model support info in a single process per model (#9233 )	2024-10-11 11:08:11 +00:00
whyiug	04de9057ab	[Model] support input image embedding for minicpmv (#9237 )	2024-10-10 15:00:47 +00:00
Jiangtao Hu	dc4aea677a	[Doc] Fix VLM prompt placeholder sample bug (#9170 )	2024-10-09 08:59:42 +00:00
Sayak Paul	1874c6a1b0	[Doc] Update vlm.rst to include an example on videos (#9155 ) Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-10-08 18:12:29 +00:00
Cyrus Leung	151ef4efd2	[Model] Support NVLM-D and fix QK Norm in InternViT (#9045 ) Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2024-10-07 11:55:12 +00:00
Cyrus Leung	b22b798471	[Model] PP support for embedding models and update docs (#9090 ) Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>	2024-10-06 16:35:27 +08:00
Cyrus Leung	f22619fe96	[Misc] Remove user-facing error for removed VLM args (#9104 )	2024-10-06 01:33:52 -07:00
Roger Wang	26aa325f4f	[Core][VLM] Test registration for OOT multimodal models (#8717 ) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-04 10:38:25 -07:00
Cyrus Leung	0e36fd4909	[Misc] Move registry to its own file (#9064 )	2024-10-04 10:01:37 +00:00
Murali Andoorveedu	0f6d7a9a34	[Models] Add remaining model PP support (#7168 ) Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai> Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-04 10:56:58 +08:00
Nick Hill	18c2e30c57	[Doc] Update Granite model docs (#9025 )	2024-10-03 02:42:24 +00:00
Cyrus Leung	4f341bd4bf	[Doc] Update list of supported models (#8987 )	2024-10-02 00:35:39 +08:00
whyiug	e01ab595d8	[Model] support input embeddings for qwen2vl (#8856 )	2024-09-30 03:16:10 +00:00
Cyrus Leung	3b00b9c26c	[Core] rename`PromptInputs` and `inputs` (#8876 )	2024-09-26 20:35:15 -07:00
Roger Wang	4bb98f2190	[Misc] Update config loading for Qwen2-VL and remove Granite (#8837 )	2024-09-26 07:45:30 -07:00
Roger Wang	e2c6e0a829	[Doc] Update doc for Transformers 4.45 (#8817 )	2024-09-25 13:29:48 -07:00
Chen Zhang	770ec6024f	[Model] Add support for the multi-modal Llama 3.2 model (#8811 ) Co-authored-by: simon-mo <xmo@berkeley.edu> Co-authored-by: Chang Su <chang.s.su@oracle.com> Co-authored-by: Simon Mo <simon.mo@hey.com> Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com> Co-authored-by: Roger Wang <ywang@roblox.com>	2024-09-25 13:29:32 -07:00
Simon Mo	4f1ba0844b	Revert "rename PromptInputs and inputs with backward compatibility (#8760 ) (#8810 )	2024-09-25 10:36:26 -07:00
Cyrus Leung	28e1299e60	rename PromptInputs and inputs with backward compatibility (#8760 )	2024-09-25 09:36:47 -07:00
Simon Mo	3185fb0cca	Revert "[Core] Rename `PromptInputs` to `PromptType`, and `inputs` to `prompt`" (#8750 )	2024-09-24 05:45:20 +00:00
litianjian	5b59532760	[Model][VLM] Add LLaVA-Onevision model support (#8486 ) Co-authored-by: litianjian <litianjian@bytedance.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-22 10:51:44 -07:00
Cyrus Leung	0057894ef7	[Core] Rename `PromptInputs` and `inputs`(#8673 )	2024-09-20 19:00:54 -07:00
Niklas Muennighoff	3b63de9353	[Model] Add OLMoE (#7922 )	2024-09-20 09:31:41 -07:00
Jiaxin Shan	260d40b5ea	[Core] Support Lora lineage and base model metadata management (#6315 )	2024-09-20 06:20:56 +00:00
Geun, Lim	e18749ff09	[Model] Support Solar Model (#8386 ) Co-authored-by: Michael Goin <michael@neuralmagic.com>	2024-09-18 11:04:00 -06:00
ywfang	8a0cf1ddc3	[Model] support minicpm3 (#8297 ) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-14 14:50:26 +00:00
Cyrus Leung	a84e598e21	[CI/Build] Reorganize models tests (#7820 )	2024-09-13 10:20:06 -07:00
Alex Brooks	c6202daeed	[Model] Support multiple images for qwen-vl (#8247 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-12 10:10:54 -07:00
Patrick von Platen	d394787e52	Pixtral (#8377 ) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-09-11 14:41:55 -07:00
Yang Fan	3b7fea770f	[Model][VLM] Add Qwen2-VL model support (#7905 ) Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-11 09:31:19 -07:00
Yangshen⚡Deng	6a512a00df	[model] Support for Llava-Next-Video model (#7559 ) Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-09-10 22:21:36 -07:00
Isotr0py	e807125936	[Model][VLM] Support multi-images inputs for InternVL2 models (#8201 )	2024-09-07 16:38:23 +08:00
Cyrus Leung	2f707fcb35	[Model] Multi-input support for LLaVA (#8238 )	2024-09-07 02:57:24 +00:00
Jiaxin Shan	db3bf7c991	[Core] Support load and unload LoRA in api server (#6566 ) Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2024-09-05 18:10:33 -07:00
sroy745	2febcf2777	[Documentation][Spec Decode] Add documentation about lossless guarantees in Speculative Decoding in vLLM (#7962 )	2024-09-05 16:25:29 -04:00
Alex Brooks	9da25a88aa	[MODEL] Qwen Multimodal Support (Qwen-VL / Qwen-VL-Chat) (#8029 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-05 12:48:10 +00:00
Cyrus Leung	288a938872	[Doc] Indicate more information about supported modalities (#8181 )	2024-09-05 10:51:53 +00:00
Wenxiang	1248e8506a	[Model] Adding support for MSFT Phi-3.5-MoE (#7729 ) Co-authored-by: Your Name <you@example.com> Co-authored-by: Zeqi Lin <zelin@microsoft.com> Co-authored-by: Zeqi Lin <Zeqi.Lin@microsoft.com>	2024-08-30 13:42:57 -06:00
Yohan Na	dc13e99348	[MODEL] add Exaone model support (#7819 )	2024-08-29 23:34:20 -07:00
Peter Salas	57792ed469	[Doc] Fix incorrect docs from #7615 (#7788 )	2024-08-22 10:02:06 -07:00
zifeitong	df1a21131d	[Model] Fix Phi-3.5-vision-instruct 'num_crops' issue (#7710 )	2024-08-22 09:36:24 +08:00
Peter Salas	1ca0d4f86b	[Model] Add UltravoxModel and UltravoxConfig (#7615 )	2024-08-21 22:49:39 +00:00
Roger Wang	4506641212	[Doc] Section for Multimodal Language Models (#7719 )	2024-08-20 23:24:01 -07:00
Cyrus Leung	3f674a49b5	[VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126 )	2024-08-14 17:55:42 +00:00
Peter Salas	00c3d68e45	[Frontend][Core] Add plumbing to support audio language models (#7446 )	2024-08-13 17:39:33 +00:00
Roger Wang	e6e42e4b17	[Core][VLM] Support image embeddings as input (#6613 )	2024-08-12 16:16:06 +08:00
Jee Jee Li	757ac70a64	[Model] Rename MiniCPMVQwen2 to MiniCPMV2.6 (#7273 )	2024-08-08 14:02:41 +00:00
Stas Bekman	0e12cd67a8	[Doc] add online speculative decoding example (#7243 )	2024-08-07 09:58:02 -07:00
Thomas Parnell	789937af2e	[Doc] [SpecDecode] Update MLPSpeculator documentation (#7100 ) Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>	2024-08-05 23:29:43 +00:00
Jee Jee Li	179a6a36f2	[Model]Refactor MiniCPMV (#7020 ) Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-08-04 08:12:41 +00:00
Alphi	2f4e108f75	[Bugfix] Clean up MiniCPM-V (#6939 ) Co-authored-by: hezhihui <hzh7269@modelbest.cn> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-07-31 14:39:19 +00:00
Isotr0py	7cbd9ec7a9	[Model] Initialize support for InternVL2 series models (#6514 ) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-07-29 10:16:30 +00:00
Cyrus Leung	1ad86acf17	[Model] Initial support for BLIP-2 (#5920 ) Co-authored-by: ywang96 <ywang@roblox.com>	2024-07-27 11:53:07 +00:00
Roger Wang	ecb33a28cb	[CI/Build][Doc] Update CI and Doc for VLM example changes (#6860 )	2024-07-27 09:54:14 +00:00
Michael Goin	281977bd6e	[Doc] Add Nemotron to supported model docs (#6843 )	2024-07-26 17:32:44 -04:00
Alphi	9e169a4c61	[Model] Adding support for MiniCPM-V (#4087 )	2024-07-24 20:59:30 -07:00

1 2 3 4 5 ...

259 Commits