vllm/tpu at add-utils - vllm

Chengji Yao 7da296be04 [TPU] kv cache update kernel supports dynamic grid (#20235) Signed-off-by: Chengji Yao <chengjiyao@google.com>		2025-07-02 06:33:37 +00:00
worker	[Optimization] Use Shared `CachedRequestData` Instance Across All Requests (#20232)	2025-06-30 09:07:50 -07:00
__init__.py	[V1] TPU - Add tensor parallel support via Ray (#13618)	2025-03-08 08:19:38 -05:00
test_basic.py	[TPU] support attention head dim smaller than 128 (#19620)	2025-06-16 06:40:53 +00:00
test_kv_cache_update_kernel.py	[TPU] kv cache update kernel supports dynamic grid (#20235)	2025-07-02 06:33:37 +00:00
test_mha_attn.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
test_multimodal.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
test_pallas.py	[TPU] add kv cache update kernel (#19928)	2025-06-26 10:01:37 -07:00
test_perf.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
test_sampler.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
test_spmd_model_weight_loading.py	[TPU] Skip hanging tests (#19115)	2025-06-04 01:43:00 -07:00
test_topk_topp_sampler.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
test_tpu_qkv_linear.py	[Hardware][TPU] Initial support of model parallelism with single worker using SPMD (#18011)	2025-06-03 00:06:20 +00:00