Commit Graph

1139 Commits

Author SHA1 Message Date
wang.yuqi 6f1229f91d
[Model][2/N] Automatic conversion of CrossEncoding model (#19978)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-07-03 13:59:23 +00:00
Jee Jee Li 1819fbda63
[Quantization] Bump to use latest bitsandbytes (#20424)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-03 21:58:46 +08:00
Cyrus Leung b024a42e93
[Core] Move multimodal placeholder from chat utils to model definition (#20355)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-03 08:18:30 +00:00
Michael Yao cb97f2bfc5
[Docs] Replace two list with tables in intel_gaudi.md (#20414)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-07-03 00:48:25 -07:00
qscqesze 363528de27
[Feature] Support MiniMax-M1 function calls features (#20297)
Signed-off-by: QscQ <qscqesze@gmail.com>
Signed-off-by: qingjun <qingjun@minimaxi.com>
2025-07-03 06:48:27 +00:00
Nicolò Lucchesi 3dd359147d
[Docs] Update EAGLE example (#20375)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-07-02 17:13:51 -07:00
cronoik-inceptionai b95877509b
Documentation update tool_calling: mapping back to function from response (#20373) 2025-07-02 05:55:49 -07:00
CSWYF3634076 e303dcf523
[Model] Add Ernie4.5 and Ernie4.5MoE Model Support (#20220)
Signed-off-by: wangyafeng <wangyafeng@baidu.com>
2025-07-02 03:37:01 -07:00
Michael Yao ae9c4d416f
[Docs] Make TPU ref prettier in google_tpu.md (#20356)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-07-02 02:04:08 -07:00
Michael Yao d853520b3e
[Docs] Fix indentations for 2-level items in deprecation_policy.md (#20352)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-07-01 23:50:31 -07:00
Kwai-Keye 8452946c06
[Model][VLM] Support Keye-VL-8B-Preview (#20126)
Signed-off-by: Kwai-Keye <Keye@kuaishou.com>
2025-07-01 23:35:04 -07:00
QiliangCui b205e8467d
[Doc][TPU] Add models and features supporting matrix. (#20230)
Signed-off-by: Qiliang Cui <cuiq@google.com>
2025-07-02 06:33:20 +00:00
yyzxw be0cfb2b68
fix[Docs]: link anchor is incorrect #20309 (#20315)
Signed-off-by: zxw <1020938856@qq.com>
2025-07-02 06:32:34 +00:00
Shintarou Okada 3d19d47d91
[Frontend] Expand tools even if tool_choice="none" (#17177)
Signed-off-by: okada shintarou <okada@preferred.jp>
2025-07-01 12:47:38 -04:00
aiyiwang2025 ecad851cbd
[Model]Add Tencent HunYuanMoEV1 Model Support (#20114)
Signed-off-by: aiyiwang <aiyiwang@tencent.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: quinnrong <quinnrong@tencent.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-01 07:28:13 -07:00
Yuxuan Zhang ed70f3c64f
Add GLM4.1V model (Draft) (#19331)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-07-01 12:48:26 +00:00
Reid 787b13389e
[doc] fix the incorrect logo in dark mode (#20289)
Signed-off-by: reidliu41 <reid201711@gmail.com>
2025-07-01 08:18:09 +00:00
Li, Jiang 6cc1e7d96d
[CPU] Update custom ops for the CPU backend (#20255)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-01 07:25:03 +00:00
Ernest Wong bd5038af07
[Doc] add config and troubleshooting guide for NCCL & GPUDirect RDMA (#15897)
Signed-off-by: Ernest Wong <chwong719@gmail.com>
2025-06-30 21:44:39 -07:00
Kyle Sayers d8cf819a9a
[Core] [Bugfix] [Multimodal] Fix multimodal profiling and generation for SFT/PTQed models (#20058)
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-06-30 17:26:49 +00:00
Michael Yao 3ee56e26be
[Docs] Fix 1-2-3 list in v1/prefix_caching.md (#20243)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-06-30 11:20:51 +00:00
Reid 022c58b80f
[doc] Add Slack and Forum to the top navigation (#20208)
Signed-off-by: reidliu41 <reid201711@gmail.com>
2025-06-30 07:53:45 +00:00
redmoe-moutain 65b1cbb138
[Model] support dots1 (#18254)
Signed-off-by: redmoe-moutain <agiredmoe@gmail.com>
2025-06-29 19:34:36 -07:00
Robert Shaw d1c956dc0f
Gemma3n (Text-only) (#20134)
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: Roger Wang <hey@rogerw.me>
2025-06-27 07:16:26 +00:00
Yazan Sharaya 6e244ae091
[Perf][Frontend] eliminate api_key and x_request_id headers middleware overhead (#19946)
Signed-off-by: Yazan-Sharaya <yazan.sharaya.yes@gmail.com>
2025-06-27 00:44:14 -04:00
Kyle Yu 07b8fae219
[Doc] correct LoRA capitalization (#20135)
Signed-off-by: kyolebu <kyu@redhat.com>
2025-06-26 15:22:12 -07:00
Cyrus Leung 34878a0b48
[Doc] Rename page titles (#20130)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-06-26 08:18:49 -07:00
Cyrus Leung 6393b03986
[Doc] Auto sign-off for VSCode (#20132)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-06-26 08:18:36 -07:00
wang.yuqi 0907d507bf
[Doc] Automatically signed-off by PyCharm (#20120)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-06-26 14:34:17 +00:00
Michael Yao 84c260caeb
[Docs] Improve frameworks/helm.md (#20113)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-06-26 10:41:51 +00:00
Cyrus Leung 1d7c29f5fe
[Doc] Update docs for New Model Implementation (#20115)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-06-26 00:47:06 -07:00
Nicolò Lucchesi e795d723ed
[Frontend] Add `/v1/audio/translations` OpenAI API endpoint (#19615)
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-06-25 17:54:14 +00:00
Michael Goin bf5181583f
[Doc] Guide for Incremental Compilation Workflow (#19109) 2025-06-25 22:06:46 +09:00
Reid c53fec1fcb
[doc] add reference link for Intel XPU (#20064)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-25 12:24:07 +00:00
Kay Yan 0eed516951
[doc] Fix broken link in the installation for CPU (#19980)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-06-24 12:04:11 +08:00
Lukas Geiger c3649e4fee
[Docs] Fix syntax highlighting of shell commands (#19870)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
2025-06-23 17:59:09 +00:00
Reid 53243e5c42
[doc] improve readability for long commands (#19920)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-23 14:27:07 +00:00
Reid b82e0f82cb
[doc] use MkDocs collapsible blocks - supplement (#19973)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-23 10:54:16 +00:00
Isotr0py 5111642a6f
[Doc] Update V1 status for decoder-only embedding models (#19952)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-06-23 09:31:06 +00:00
Reid f17aec0d63
[doc] Fold long code blocks to improve readability (#19926)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-23 05:24:23 +00:00
amit 4a0f7888a3
[Core] feat: Implement Priority Scheduling in V1 Engine (#19057)
Signed-off-by: amit <amit.man@gmail.com>
Co-authored-by: Roger Wang <Rogerw0108@gmail.com>
2025-06-22 20:18:08 -07:00
Reid ec0db6f51c
[doc] use snippets for contact us (#19944)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-22 10:26:13 +00:00
Adrian 3b1e4c6a23
[Docs] Add GPT2ForSequenceClassification to supported models in docs (#19932)
Signed-off-by: nie3e <adrcwiek@gmail.com>
2025-06-21 20:57:19 +00:00
Reid caa680fd2e
[doc] add contact us in community (#19922)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-21 17:29:06 +00:00
汪志鹏 c3bf9bad11
[New model support]Support Tarsier2 (#19887)
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
2025-06-21 04:01:51 +00:00
22quinn 6f68c49220
[Doc] Update V1 user guide for embedding models (#19842)
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
2025-06-19 09:43:27 +00:00
Zuxin 1d0ae26c85
Add xLAM tool parser support (#17148) 2025-06-19 14:26:41 +08:00
Russell Bryant ffacb222cb
[Docs] Add Huzaifa Sidhpurwala to vuln mgmt team doc (#19808)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-06-18 20:22:28 +00:00
Reid cca91a7a10
[doc] fix the incorrect label (#19787)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-18 10:30:58 +00:00
Zhonghua Deng eccdc8318c
[V1][P/D] An native implementation of xPyD based on P2P NCCL (#18242)
Signed-off-by: Abatom <abzhonghua@gmail.com>
2025-06-18 06:32:36 +00:00