Simon Mo
|
02f0c7b220
|
[Misc] Add SPDX-FileCopyrightText (#19100)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-06-03 11:20:17 -07:00 |
Satyajith Chilappagari
|
e0cbad4e30
|
[Neuron] Support quantization on neuron (#18283)
Signed-off-by: Satyajith Chilappagari <satchill@amazon.com>
|
2025-05-27 22:10:33 +00:00 |
Yong Hoon Shin
|
98c89e16ff
|
Make key optional for rotary embedding (#17566)
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
|
2025-05-07 00:11:46 -07:00 |
Satyajith Chilappagari
|
043e4c4955
|
Add NeuronxDistributedInference support, Speculative Decoding, Dynamic on-device sampling (#16357)
Signed-off-by: Satyajith Chilappagari <satchill@amazon.com>
Co-authored-by: Aaron Dou <yzdou@amazon.com>
Co-authored-by: Shashwat Srijan <sssrijan@amazon.com>
Co-authored-by: Chongming Ni <chongmni@amazon.com>
Co-authored-by: Amulya Ballakur <amulyaab@amazon.com>
Co-authored-by: Patrick Lange <patlange@amazon.com>
Co-authored-by: Elaine Zhao <elaineyz@amazon.com>
Co-authored-by: Lin Lin Pan <tailinpa@amazon.com>
Co-authored-by: Navyadhara Gogineni <navyadha@amazon.com>
Co-authored-by: Yishan McNabb <yishanm@amazon.com>
Co-authored-by: Mrinal Shukla <181322398+mrinalks@users.noreply.github.com>
|
2025-05-07 00:07:30 -07:00 |
Liangfu Chen
|
d2b58ca203
|
[Neuron][kernel] Fuse kv cache into a single tensor (#15911)
Signed-off-by: Liangfu Chen <liangfc@amazon.com>
|
2025-04-03 09:51:32 -07:00 |
Liangfu Chen
|
53a0cf8b95
|
[Neuron] trim attention kernel tests to fit trn1.2x instance (#14988)
Signed-off-by: Liangfu Chen <liangfc@amazon.com>
|
2025-03-18 15:05:52 +08:00 |
Sibi
|
a73e183e36
|
[Misc] Replace os environ to monkeypatch in test suite (#14516)
Signed-off-by: sibi <85477603+t-sibiraj@users.noreply.github.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-03-16 20:35:57 -07:00 |
Liangfu Chen
|
d3d4956261
|
[Neuron] flatten test parameterization for neuron attention kernels (#14712)
|
2025-03-13 20:46:56 -07:00 |