
Andrew Chen	074d8b84bd	2025-05-09 21:08:41 +00:00

Add question-answer example for v2 trainer (#2580)

* Add question-answer example
* chore: remove unused lines, add TODO comment
* chore: update example description
* chore: update question-answering example
  * run train job on CPU
  * reduce batch size, dataset size and train epochs
  * make upload to bucket optional
  * add notebook to e2e-test
  * set model name as trainjob argument
* chore: extend e2e-run-notebook timeout
  * e2e tests fail if trainjobs launched by notebook do not finish in 3s
  * extends the timeout to 5min to block and wait for longer trainjobs until timeout or trainjob completes
* chore: update example to wait for trainjob running status
  * revert change to e2e-run-notebook.sh

Signed-off-by: solanyn <14799876+solanyn@users.noreply.github.com>
Signed-off-by: Andrew Chen <14799876+solanyn@users.noreply.github.com>
Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
deepspeed/text-summarization	feat(runtimes): Support MLX Distributed Runtime with OpenMPI (#2565)	2025-03-27 04:54:22 +00:00
mlx/image-classification	feat(runtimes): Support MLX Distributed Runtime with OpenMPI (#2565)	2025-03-27 04:54:22 +00:00
pytorch	Add question-answer example for v2 trainer (#2580)	2025-05-09 21:08:41 +00:00
README.md	Update the naming conventions for Kubeflow Trainer (#2415)	2025-02-06 13:48:30 +00:00


Kubeflow Trainer Examples

Welcome to Kubeflow Trainer examples!

The Kubeflow Trainer documentation is available on kubeflow.org.