attention | [ROCM] Fix blockReduceSum to use correct warp counts for ROCm and CUDA (#3262) | 2024-03-10 15:27:45 -07:00
moe | Add fused top-K softmax kernel for MoE (#2769) | 2024-02-05 17:38:02 -08:00
punica | [Kernel] support non-zero cuda devices in punica kernels (#3636) | 2024-03-27 00:37:42 +00:00
quantization | Integrate Marlin Kernels for Int4 GPTQ inference (#2497) | 2024-03-01 12:47:51 -08:00
activation_kernels.cu | Add kernel for GeGLU with approximate GELU (#3337) | 2024-03-12 22:06:17 -07:00
cache.h | [Minor] Remove gather_cached_kv kernel (#3043) | 2024-02-26 15:00:54 -08:00
cache_kernels.cu | [Minor] Remove gather_cached_kv kernel (#3043) | 2024-02-26 15:00:54 -08:00
cuda_compat.h | [ROCM] Fix blockReduceSum to use correct warp counts for ROCm and CUDA (#3262) | 2024-03-10 15:27:45 -07:00
cuda_utils.h | [ROCm] add support to ROCm 6.0 and MI300 (#2274) | 2024-01-26 12:41:10 -08:00
cuda_utils_kernels.cu | [ROCm] add support to ROCm 6.0 and MI300 (#2274) | 2024-01-26 12:41:10 -08:00
custom_all_reduce.cu | [BugFix] Some fixes for custom allreduce kernels (#2760) | 2024-03-21 23:02:58 -07:00
custom_all_reduce.cuh | [BugFix] Some fixes for custom allreduce kernels (#2760) | 2024-03-21 23:02:58 -07:00
custom_all_reduce_test.cu | [BugFix] Some fixes for custom allreduce kernels (#2760) | 2024-03-21 23:02:58 -07:00
dispatch_utils.h | DeepseekMoE support with Fused MoE kernel (#2453) | 2024-01-29 21:19:48 -08:00
layernorm_kernels.cu | [FIX] Support non-zero CUDA devices in custom kernels (#1959) | 2024-01-02 19:09:59 -08:00
moe_align_block_size_kernels.cu | [Bugfix] Make moe_align_block_size AMD-compatible (#3470) | 2024-03-18 11:26:24 -07:00
ops.h | Add batched RoPE kernel (#3095) | 2024-03-13 13:45:26 -07:00
pos_encoding_kernels.cu | Add batched RoPE kernel (#3095) | 2024-03-13 13:45:26 -07:00
pybind.cpp | Add batched RoPE kernel (#3095) | 2024-03-13 13:45:26 -07:00
reduction_utils.cuh | [ROCm] Fix warp and lane calculation in blockReduceSum (#3321) | 2024-03-11 13:14:07 -07:00