Author | Commit | Message | Date
Chengji Yao | a77aea59fd | [TPU] support attention head dim smaller than 128 (#19620) | 2025-06-16 06:40:53 +00:00
    Signed-off-by: Chengji Yao <chengjiyao@google.com>
    Co-authored-by: mgoin <mgoin64@gmail.com>
Simon Mo | 02f0c7b220 | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00
    Signed-off-by: simon-mo <simon.mo@hey.com>
XiongfeiWei | 9765940824 | [TPU] Enable gemma3-27b with TP>1 on multi-chips. (#17335) | 2025-05-05 14:19:58 -07:00
    Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
Chengji Yao | 471fe65630 | [TPU][V1] Implicitly adjust page size when there's SMEM OOM (#16871) | 2025-04-21 15:43:13 -06:00
    Signed-off-by: Chengji Yao <chengjiyao@google.com>
Alexander Matveev | 9a2160fa55 | [V1] TPU CI - Add basic perf regression test (#15414) | 2025-03-31 13:25:20 -04:00
    Signed-off-by: Alexander Matveev <amatveev@redhat.com>
Alexander Matveev | c3f687ac22 | [V1] TPU - Fix the chunked prompt bug (#15713) | 2025-03-28 20:19:04 +00:00
    Signed-off-by: Alexander Matveev <amatveev@redhat.com>
Robert Shaw | 8a49eea74b | [CI][TPU] Temporarily Disable Quant Test on TPU (#15649) | 2025-03-27 19:45:05 -07:00
    Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Alexander Matveev | 72a8639b68 | [V1] TPU - CI/CD use smaller model (#15054) | 2025-03-18 21:39:21 +00:00
    Signed-off-by: Alexander Matveev <amatveev@redhat.com>
Sibi | a73e183e36 | [Misc] Replace os environ to monkeypatch in test suite (#14516) | 2025-03-16 20:35:57 -07:00
    Signed-off-by: sibi <85477603+t-sibiraj@users.noreply.github.com>
    Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
    Co-authored-by: Aaron Pham <contact@aarnphm.xyz>
Alexander Matveev | cb8bdfade2 | [V1] TPU - Add tensor parallel support via Ray (#13618) | 2025-03-08 08:19:38 -05:00
    Signed-off-by: Alexander Matveev <amatveev@redhat.com>