6 Commits

Author SHA1 Message Date
Jagadeesh J 58d22d4ba9
chore(components/pytorch) - Samples fix for PT and PTL Upgrade (#8148)
* WIP: chore(components/pytorch) - Samples fix for PT and PTL Upgrade

* fix: bert example

 - fix minio secret
 - remove pth file upload to minio
 - add captum to pip packages

* fix: bert-dist training args

* fix: cifar10 example

* fix: cifar10 example notebook

* fix: captum example

* fix: gpu fixes for bert and cifar10 example

* fix: bert dist ptl upgrade
2022-09-22 05:29:38 +00:00
Jagadeesh J 49c3587591
chore(components/pytorch):kserve migration (#7615)
* chore(components/pytorch):kserve migration

Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>

* fix: pytorch dist training

 - enable env vars in config.properties
 - upgrade pip in dockerfile

Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>

* Bert - KServe v2 handler changes

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* fix: bert notebook for kserve v2

Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>

* fix: add protocol version to bert gpu yaml

Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>

* Adding utility to convert image to bytes - Cifar

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Cifar10 - captum update

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* fix: cifar10 example

Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>

* fix: predictor component for kserve v2

Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>

* fix: pytorch dist training for kserve v2

Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>

* fix: cifar10 hpo example

Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>

* Bumping pytorch-kfp-components version

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

Co-authored-by: Shrinath Suresh <shrinath@ideas2it.com>
2022-07-08 08:55:52 +00:00
shrinath-suresh f9d47d0ef9
fix(components/pytorch): PyTorch Samples - Generating component.yaml using templates (#6231)
* Adding code to generate component.yaml files from templates

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Adding templates for train and pre process

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Updating the build script to generate component.yaml from templates

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Updating the jupyter notebook to use templates

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Removing all component.yaml files

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Revert "Removing all component.yaml files"

This reverts commit db75951949.

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Fixing bert notebook

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Changing cifar10 pipeline

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Updating cifar10 notebook

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Fixing cifar10 template mapping

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Fixing the cifar10 preprocess input

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* fixing cifar10 notebook arguments

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Adding examples in component.yaml templates

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Updating README.md files

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Updating cifar10 captum insights notebook

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Adding templates for hpo

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Updating distributed training notebook

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Fixing yaml path in dist training

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Removing all component.yaml files

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Moving component.yaml templates into pytorch-kfp-components

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Fixing template path

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Updating cifar10 script argument variable

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Fixing argument for bert dist

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Adding image name to templates

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Adding template for ax train

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Creating template mapping for ax

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Fixing hpo script arguments

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Ax Template mapping fix

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Addressing review comments

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
2021-08-08 23:34:25 -07:00
shrinath-suresh 4d42624d5b
fix(components/pytorch): Pytorch Cifar10 Captum Insights (#6105)
* Fix Captum Insights cifar10

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Fixing handler for GPU scenario

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>

* Removing comments and updating pytorch job operator variable in Bert distributed pipeline notebook

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
2021-07-23 12:18:40 -07:00
Jagadeesh J 01ec07b467
fix(components/pytorch): Clean up notebook and yaml files (#6070)
* Fix: Clean up notebook and yaml files

* Fix: Dockerfile
2021-07-16 10:33:38 -07:00
Jagadeesh J b1d0eb799b
feat(components/pytorch): Pytorch Distributed Training (#6021)
* Feature: Bert distributed training

* Feat: Adds staging volume for pytorch job

* Feat: Add PVC storage URI for KFserving

 - Update copy component

* Fix: gpu explain handler

* fix: notebook cleanup

* Fix: Update Dockerfile, requirement.txt

* Fix: Dockerfile
2021-07-16 05:29:38 -07:00