Jagadeesh J
49c3587591
chore(components/pytorch):kserve migration (#7615)
...
* chore(components/pytorch):kserve migration
Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>
* fix: pytorch dist training
- enable env vars in config.properties
- upgrade pip in dockerfile
Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>
* Bert - KServe v2 handler changes
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
* fix: bert notebook for kserve v2
Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>
* fix: add protocol version to bert gpu yaml
Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>
* Adding utility to convert image to bytes - Cifar
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
* Cifar10 - captum update
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
* fix: cifar10 example
Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>
* fix: predictor component for kserve v2
Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>
* fix: pytorch dist training for kserve v2
Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>
* fix: cifar10 hpo example
Signed-off-by: Jagadeesh J <jagadeeshj@ideas2it.com>
* Bumping pytorch-kfp-components version
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
Co-authored-by: Shrinath Suresh <shrinath@ideas2it.com>
2022-07-08 08:55:52 +00:00
shrinath-suresh
5e597b13c8
chore(components/pytorch): pytorch samples build fixes (#6338)
...
* build dependency fixes
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
* Pinning PTL to 1.4.2
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
* Revert "Pinning PTL to 1.4.2"
This reverts commit 8e207f0d3b.
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
* Pinning PTL to 1.4.2
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
2021-08-15 22:06:05 -07:00
Jagadeesh J
105e10090e
fix(components/pytorch): Update Readme docs (#6186)
...
* Fix: Update readme docs
* fix: Update Dockerfile with PTL latest
- fix typo in readme
* Fix: Add contributing.md
- cleanup
- add description
* Fix: Typo
2021-07-30 22:36:35 -07:00
Jagadeesh J
01ec07b467
fix(components/pytorch): Clean up notebook and yaml files (#6070)
...
* Fix: Clean up notebook and yaml files
* Fix: Dockerfile
2021-07-16 10:33:38 -07:00
Jagadeesh J
b1d0eb799b
feat(components/pytorch): Pytorch Distributed Training (#6021)
...
* Feature: Bert distributed training
* Feat: Adds staging volume for pytorch job
* Feat: Add PVC storage URI for KFserving
- Update copy component
* Fix: gpu explain handler
* fix: notebook cleanup
* Fix: Update Dockerfile, requirement.txt
* Fix: Dockerfile
2021-07-16 05:29:38 -07:00
shrinath-suresh
d88394ba4a
fix(components/pytorch): pytorch kfp components and Sample - GPU updates (#5939)
...
* Updating trainer test with model parameter to support multi gpu multi node scenario
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
* Changing get_model to lightning_module in bert example as PTL is used from source
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
* removing ptl 1.3.x from dependency as Pytorch operator needs ptl to be installed from source
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
* Updating trainer args with gpu parameters
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
* Installing PTL from source
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
* Updating get_model to lightning_module
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
2021-07-11 22:06:26 -07:00
Jagadeesh J
0f222f11fc
Fix(Components/pytorch): Add single docker file for cpu and gpu (#5863)
...
* Fix: Add single docker file for cpu and gpu
* Feat: Add dockerfile for tensorboard with torch profiler plugin
2021-06-26 10:09:12 -07:00