Commit Graph

165 Commits

Author SHA1 Message Date
Feng XiaoLong 4fc1bf813a
[Bugfix] Migrate to REGEX Library to prevent catastrophic backtracking (#18454)
Signed-off-by: Crucifixion-Fxl <xmufxl@gmail.com>
Co-authored-by: Crucifixion-Fxl <xmufxl@gmail.com>
2025-05-23 16:16:26 -07:00
Huy Do 2c4f59afc3
Update PyTorch to 2.7.0 (#16859) 2025-04-29 19:08:04 -07:00
Lucas Wilkinson d8bccde686
[BugFix] Fix vllm_flash_attn install issues (#17267)
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Aaron Pham <contact@aarnphm.xyz>
2025-04-27 17:27:56 -07:00
Aaron Pham e782e0a170
[Chore] added stubs for `vllm_flash_attn` during development mode (#17228)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
2025-04-26 07:45:26 -07:00
Isotr0py 4e5a0f6ae2
[Misc] Allow using OpenCV as video IO fallback (#15055)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-01 15:55:13 +00:00
Yang Chen f3aca1ee30
setup correct nvcc version with CUDA_HOME (#15725)
Signed-off-by: Yang Chen <yangche@fb.com>
2025-04-01 06:09:40 -07:00
yihong e7ae3bf3d6
fix: better install requirement for install in setup.py (#15796)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-31 05:13:32 -07:00
Manish Sethi 761702fd19
[Core] Integrate `fastsafetensors` loader for loading model weights (#10647)
Signed-off-by: Manish Sethi <Manish.sethi1@ibm.com>
2025-03-24 08:08:02 -07:00
Russell Bryant b877031d80
Remove openvino support in favor of external plugin (#15339)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-03-22 14:06:39 -07:00
Russell Bryant 0a74bfce9c
setup.py: drop assumption about local `main` branch (#14692)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-03-17 01:37:42 -07:00
Harry Mellor 206e2577fa
Move requirements into their own directory (#12547)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-08 16:44:35 +00:00
Isotr0py 63137cd922
[Build] Add nightly wheel fallback when latest commit wheel unavailable (#14358)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-03-06 22:10:57 -08:00
Cody Yu f35f8e2242
[Build] Make sure local main branch is synced when VLLM_USE_PRECOMPILED=1 (#13921)
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
2025-03-03 16:43:14 +08:00
Harry Mellor cf069aa8aa
Update deprecated Python 3.8 typing (#13971) 2025-03-02 17:34:51 -08:00
Michael Goin ca377cf1b9
Use CUDA 12.4 as default for release and nightly wheels (#12098) 2025-02-26 19:06:37 -08:00
Lucas Wilkinson f95903909f
[Kernel] FlashMLA integration (#13747)
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2025-02-27 10:35:08 +08:00
Daniele 81dabf24a8
[CI/Build] force writing version file (#13544)
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
2025-02-19 18:48:03 +08:00
Daniele a02c86b4dd
[CI/Build] migrate static project metadata from setup.py to pyproject.toml (#8772) 2025-02-18 08:02:49 -08:00
Russell Bryant d46d490c27
[Frontend] Move CLI code into vllm.cmd package (#12971) 2025-02-12 23:12:21 -08:00
Cody Yu 60c68df6d1
[Build] Automatically use the wheel of the base commit with Python-only build (#13178) 2025-02-12 23:10:28 -08:00
Kevin H. Luu 91e876750e
[misc] Fix setup.py condition to avoid AMD from being mistaken with CPU (#13022)
Signed-off-by: kevin <kevin@anyscale.com>
2025-02-10 18:06:16 -08:00
Liangfu Chen c45d398e6f
[CI] Resolve transformers-neuronx version conflict (#12925) 2025-02-08 01:41:35 -08:00
wangxiyuan 407b5537db
[Build] Make pypi install work on CPU platform (#12874) 2025-02-08 01:15:15 -08:00
Sophie du Couédic 649550f27e
[Build] update requirements of no-device for plugin usage (#12630)
Signed-off-by: Sophie du Couédic <sop@zurich.ibm.com>
2025-02-04 21:19:12 +08:00
Russell Bryant e489ad7a21
[Misc] Add SPDX-License-Identifier headers to python source files (#12628)
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**

commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com>
Date:   Fri Jan 31 14:18:24 2025 -0500

    Add SPDX license headers to python source files
    
This commit adds SPDX license headers to python source files as
recommended to
the project by the Linux Foundation. These headers provide a concise way
that is
both human and machine readable for communicating license information
for each
source file. It helps avoid any ambiguity about the license of the code
and can
    also be easily used by tools to help manage license compliance.
    
The Linux Foundation runs license scans against the codebase to help
ensure
    we are in compliance with the licenses of the code we use, including
dependencies. Having these headers in place helps that tool do its job.
    
    More information can be found on the SPDX site:
    
    - https://spdx.dev/learn/handling-license-info/
    
    Signed-off-by: Russell Bryant <rbryant@redhat.com>

commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com>
Date:   Fri Jan 31 14:36:32 2025 -0500

    Check for SPDX headers using pre-commit
    
    Signed-off-by: Russell Bryant <rbryant@redhat.com>

---------

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-02-02 11:58:18 -08:00
Robert Shaw 9b0c4bab36
[Kernel] Triton Configs for Fp8 Block Quantization (#11589)
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Signed-off-by: mgoin <michael@neuralmagic.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
Co-authored-by: simon-mo <xmo@berkeley.edu>
2025-01-30 11:53:22 -08:00
Harry Mellor 823ab79633
Update `pre-commit` hooks (#12475)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-27 17:23:08 -07:00
Lucas Wilkinson ab5bbf5ae3
[Bugfix][Kernel] Fix CUDA 11.8 being broken by FA3 build (#12375)
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
2025-01-24 15:27:59 +00:00
Lucas Wilkinson 978b45f399
[Kernel] Flash Attention 3 Support (#12093)
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
2025-01-23 06:45:48 -08:00
youkaichao 68ad4e3a8d
[Core] Support fully transparent sleep mode (#11743)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-22 14:39:32 +08:00
Mengqing Cao 4004f144f3
[Build] update requirements of no-device (#12299)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
2025-01-22 14:29:31 +08:00
Yuan Tang b8bfa46a18
[Bugfix] Fix issues in CPU build Dockerfile (#12135)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-01-17 12:54:01 +08:00
Konrad Zawora 1a51b9f872
[HPU][Bugfix] Don't use /dev/accel/accel0 for HPU autodetection in setup.py (#12046)
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
2025-01-15 02:59:18 +00:00
Wallas Henrique cfd3219f58
[Hardware][Apple] Native support for macOS Apple Silicon (#11696)
Signed-off-by: Wallas Santos <wallashss@ibm.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2025-01-08 16:35:49 +08:00
youkaichao ad9f1aa679
[doc] update wheels url (#11830)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-08 14:36:49 +08:00
hj-wei 59d6bb4c86
[Hardware][AMD]: Replace HIPCC version with more precise ROCm version (#11515)
Signed-off-by: hjwei <hjwei_xd@163.com>
2024-12-28 11:17:35 +00:00
Yuan Tang 409475a827
[Bugfix] Fix issues in CPU build Dockerfile. Fixes #9182 (#11435)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2024-12-24 16:53:28 +00:00
omer-dayan 995f56236b
[Core] Loading model from S3 using RunAI Model Streamer as optional loader (#10192)
Signed-off-by: OmerD <omer@run.ai>
2024-12-20 16:46:24 +00:00
Daniele 7c7aa37c69
[CI/Build] fix pre-compiled wheel install for exact tag (#11373)
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
2024-12-21 00:14:40 +08:00
youkaichao 9743d64e4e
[ci][build] add tests for python only compilation (#10915)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-05 08:54:47 -08:00
Daniele e4c34c23de
[CI/Build] improve python-only dev setup (#9621)
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2024-12-04 21:48:13 +00:00
litianjian 28b2877d30
Online video support for VLMs (#10020)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: litianjian <litianjian@bytedance.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-07 20:25:59 +00:00
Russell Bryant 098f94de42
[CI/Build] Drop Python 3.8 support (#10038)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-06 14:31:01 +00:00
Woosuk Kwon 6a585a23d2
[Hotfix] Fix ruff errors (#10073)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2024-11-06 01:24:28 -08:00
Konrad Zawora a02a50e6e5
[Hardware][Intel-Gaudi] Add Intel Gaudi (HPU) inference backend (#6143)
Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Bob Zhu <bob.zhu@intel.com>
Signed-off-by: zehao-intel <zehao.huang@intel.com>
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Sanju C Sudhakaran <scsudhakaran@habana.ai>
Co-authored-by: Michal Adamczyk <madamczyk@habana.ai>
Co-authored-by: Marceli Fylcek <mfylcek@habana.ai>
Co-authored-by: Himangshu Lahkar <49579433+hlahkar@users.noreply.github.com>
Co-authored-by: Vivek Goel <vgoel@habana.ai>
Co-authored-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Dominika Olszewska <dolszewska@habana.ai>
Co-authored-by: barak goldberg <149692267+bgoldberg-habana@users.noreply.github.com>
Co-authored-by: Michal Szutenberg <37601244+szutenberg@users.noreply.github.com>
Co-authored-by: Jan Kaniecki <jkaniecki@habana.ai>
Co-authored-by: Agata Dobrzyniewicz <160237065+adobrzyniewicz-habana@users.noreply.github.com>
Co-authored-by: Krzysztof Wisniewski <kwisniewski@habana.ai>
Co-authored-by: Dudi Lester <160421192+dudilester@users.noreply.github.com>
Co-authored-by: Ilia Taraban <tarabanil@gmail.com>
Co-authored-by: Chendi.Xue <chendi.xue@intel.com>
Co-authored-by: Michał Kuligowski <mkuligowski@habana.ai>
Co-authored-by: Jakub Maksymczuk <jmaksymczuk@habana.ai>
Co-authored-by: Tomasz Zielinski <85164140+tzielinski-habana@users.noreply.github.com>
Co-authored-by: Sun Choi <schoi@habana.ai>
Co-authored-by: Iryna Boiko <iboiko@habana.ai>
Co-authored-by: Bob Zhu <41610754+czhu15@users.noreply.github.com>
Co-authored-by: hlin99 <73271530+hlin99@users.noreply.github.com>
Co-authored-by: Zehao Huang <zehao.huang@intel.com>
Co-authored-by: Andrzej Kotłowski <Andrzej.Kotlowski@intel.com>
Co-authored-by: Yan Tomsinsky <73292515+Yantom1@users.noreply.github.com>
Co-authored-by: Nir David <ndavid@habana.ai>
Co-authored-by: Yu-Zhou <yu.zhou@intel.com>
Co-authored-by: Ruheena Suhani Shaik <rsshaik@habana.ai>
Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>
Co-authored-by: Marcin Swiniarski <mswiniarski@habana.ai>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
Co-authored-by: Jacek Czaja <jczaja@habana.ai>
Co-authored-by: Yuan <yuan.zhou@outlook.com>
2024-11-06 01:09:10 -08:00
Aaron Pham 21063c11c7
[CI/Build] drop support for Python 3.8 EOL (#8464)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
2024-11-06 07:11:55 +00:00
Luka Govedič 51c24c9736
[Build] Fix `FetchContent` multiple build issue (#9596)
Signed-off-by: luka <luka@neuralmagic.com>
2024-10-23 12:43:07 +08:00
bnellnm eca2c5f7c0
[Bugfix] Fix support for dimension like integers and ScalarType (#9299) 2024-10-17 19:08:34 +00:00
Joe Rowell 7627172bf4
[Bugfix][Doc] Report neuron error in output (#9159) 2024-10-08 22:43:34 -07:00
Yuan Tang ffc4b27ea8
Add classifiers in setup.py (#9171) 2024-10-08 19:30:48 -07:00