A high-throughput and memory-efficient inference and serving engine for LLMs
llm, mlops, pytorch, cuda, inference, llama, llm-serving, llmops, model-serving, qwen, rocm, tpu, trainium, transformer, amd, xpu, deepseek, gpt, hpu, inferentia
Updated 2025-07-04 16:00:34 +08:00
Automated Machine Learning on Kubernetes
kubernetes, machine-learning, kubeflow, ai, tensorflow, huggingface, llm, mlops, jax, pytorch, hyperparameter-tuning, neural-architecture-search, automl, scikit-learn
Updated 2025-06-26 22:13:16 +08:00
Distributed ML Training and Fine-Tuning on Kubernetes
kubernetes, huggingface, ai, llm, gpu, jax, kubeflow, distributed, xgboost, machine-learning, mlops, python, pytorch, tensorflow, fine-tuning
Updated 2025-06-20 16:05:11 +08:00