108 Commits

Author SHA1 Message Date
jkuruba 7c349f3f82
feat(components): AWS SageMaker - Changes for updating an existing endpoint (#4424)
* Changes for updating existing endpoint

* Review comments addressed

* Review comments addressed

* Review comments addressed

* Changed awscli and boto3 version. Ran black to format integration tests

* Removing temporarily to debug integration failures

* Adding back integration tests

* Control the number of parallel integration tests to 10

* Third Party License updated

* Version changed to 0.9.0

* Fixed a typo in Changelog
2020-09-18 16:00:28 -07:00
Nicholas Thomson f9e3a249d8
chore(components): AWS SageMaker - Add PYTEST_MARKER to container variable passthrough (#4459) 2020-09-02 18:03:40 -07:00
Nicholas Thomson 2f7a5e5a2b
chore(components): AWS SageMaker - Fix leaking resources (#4457)
* Add try/catch cleanup for integ resources

* Update pytest-xdist to 2.1

* Fix groundtruth workteam invocation
2020-09-02 16:11:40 -07:00
Kartik Kalamadi 05398cf475
[AWS SageMaker] Fix small bugs (#4161)
* fix small bugs

* add SKIP_OIDC_SETUP config

* address comments

* address comments and add KFP_VERSION to .env

* typo
2020-09-01 23:17:06 -07:00
Dustin Luong 3ebd075212
feat(components): AWS SageMaker - Add optional parameter to allow training component to accept parameters related to Debugger (#4283)
* Implemented debugger for training component with sample pipeline, unit tests, and integration test

* Implemented changes from PR, refactored utils.py, made sample pipeline more succinct, removed hardcoding from integration tests

* Added default parameter for sample pipeline and fixed grammar for sample README, refactored _utils.py for fstrings and fixed offset for errors

* Removed aws secret lines

* Terminate debug rules when terminating training job, Terminate debug rules if terminate is pressed after training job has completed, added integration tests for stop_debug_rules, updated READMEs for train and sample, renamed sample pipeline, removed tensorboard, updated sagemaker version to sagemaker 2.1.0.

* Removed extra files, cleaned integration test

* Changed integration test to use sample debugger pipeline

* Processing jobs created from debug rules will not terminate, fixing other small issues

* Removed debug from pipeline definition, removed extra line, removed unused function

* Changelog and image tag updates
2020-08-19 15:41:22 -07:00
Matthew Rose 2dca2b57cc
docs: AWS Sagemaker training job component README typos (#4295) 2020-08-10 13:41:58 -07:00
IvyBazan e3c33a650e
Updated components/aws/sagemaker/README.md (#3983)
* Create README.md

* Added README

Updated page to include information on Amazon SageMaker components

* Update README.md

* Integrated feedback

* Added link to SageMaker Components tutorial

* Update README.md

added link to sample pipelines
2020-08-04 19:42:27 -07:00
Nicholas Thomson 8014a44229
feat(components): AWS SageMaker - Support for assuming a role (#4212)
* Add client assume role functionality

* Add assume_role to component.yaml files

* Update image to personal

* Update input to force NoneType on empty

* Update integration test setup with assumed role

* Add assume role integration test

* Update boto session to use refreshing credentials

* Update assume role relax trust relationship

* Add check for defined assumed role name

* Add processing assume integ test

* Add assume role unit test for main methods

* Add assume_role to all READMEs

* Update session to use AssumeRoleProvider

* Remove region from child calls to session

* Fix extra region_name in test

* Update assume role processing integ test name

* Add processing integ test to list

* Update assumed role to remain if not generated

* Update license version

* Update image tag to new version

* Add new version to Changelog
2020-08-03 10:53:43 -07:00
Suraj Kota 900eeaec16
feat(components): AWS SageMaker - Add functionality to stop SageMaker jobs on run termination (#4167)
* Add functionality to stop SM jobs
	- Unit and Integration tests for the functionality

* unit test update and customer message update

* Changelog and image tag updates

* update version for deploy component and merge conflicts

* Update version in License file

* fix conflicting paths for download, add test for batch
2020-07-17 16:34:50 -07:00
Kartik Kalamadi 799db4714f
[AWS SageMaker] Integ test to check CloudWatch logs print feature (#4056)
* Integ test for cw logs

* Update license file version to 0.5.3

* update version in yaml

* add changelog
2020-07-09 15:14:33 -07:00
Nicholas Thomson c6754e3f13
fix(components): AWS SageMaker - Fix MinIO PID early exit (#4190)
* Add force minio error test

* Remove force error
2020-07-09 11:04:33 -07:00
Nicholas Thomson f0f8e5d178
[AWS SageMaker] De-hardcode output paths in AWS components (#4119)
* Update input arguments

* Remove fileOutputs

* Update outputs to new paths

* Modify integ test artifact path

* Add unit test for new output format

* Add unit test for write_output

* Migrate tests into test_utils

* Add clarifying comment

* Remove output path file extension

* Update license to 0.5.2

* Update component to 0.5.2

* Add 0.5.2 to changelog

* Remove JSON
2020-07-06 14:39:57 -07:00
Matthew Rose 131be23467
[AWS SageMaker] GroundTruth Pre/Post Lambda function region additions (#3932)
* add more region ARNs to groundtruth lambda map as per sagemaker documentation

* changes for patch release 0.5.1

* Make label_category_config optional in groundtruth component

* merge fix

Co-authored-by: Suraj Kota <surakota@amazon.com>
2020-07-01 10:32:17 -07:00
Nicholas Thomson 7619a01546
[AWS SageMaker] Processing job sample (#4009)
* Added preprocessing step to sample pipeline

* Add processing to components README

* Remove unused import

* Add data set prerequisite to training component

* Remove simple HPO pipeline

* Update MNIST header in README

* Remove simple HPO sample integration test

* Empty commit to trigger google bot
2020-06-30 16:28:06 -07:00
Kartik Kalamadi e0e4b982cd
[AWS SageMaker] Add unit tests for cloudwatch logs (#4051)
* add unit tests for cloudwatch logs

* address git comments
2020-06-30 15:06:07 -07:00
Kartik Kalamadi b3d8e04e1e
[AWS SageMaker] Print SageMaker job logs in kfp UI (#3954)
* Print logs for AWS SM Components on KFP UI

* address comments

* update version number to 0.5.0

* update yaml to version 0.5.0

* update changelog
2020-06-19 00:33:58 -07:00
Nicholas Thomson c29ee5de13
[AWS SageMaker] Component 0.4.1 Release (#4011)
* Update license version

* Update component YAML versions

* Add 0.4.1 to changelog
2020-06-17 18:02:04 -07:00
Nicholas Thomson 5717bc0fbb
Add missing import statement (#4010) 2020-06-17 16:16:05 -07:00
Nicholas Thomson bea63652e1
[AWS SageMaker] Processing job component (#3944)
* Add TDD processing definition

* Update README

* Update temporary image

* Update component entrypoint

* Add WORKDIR to fix Docker 18 support

* integration test for processing job

* Remove job links

* Add container outputs and tests

* Update default properties

* Remove max_run_time if none provided

* Update integration readme steps

* Updated README with more resources

* Add CloudWatch link back to logs

* Update input and output config to arrays

* Update processing integration test

* Update process README

* Update unit tests

* Updated license version

* Update component image versions

* Update changelog

Co-authored-by: Suraj Kota <surakota@amazon.com>
2020-06-17 11:29:24 -07:00
Suraj Kota bc6aed9cac
[AWS Sagemaker] aws-samples kmeans-hpo pipeline test (#3905)
* aws-samples kmeans-hpo pipeline test
	- code clean up
	- removed unused args

* add test dependency

* Trigger Build
2020-06-16 10:26:04 -07:00
Kartik Kalamadi 35019eb3ea
[AWS SageMaker] Add integration test for sample pipeline train (#3876)
* add integ test for sample pipeline train

* change docker build context integ test

* add spot test and use train component test for sample train pipeline

* small changes and ran flake8 and black

* address comments
2020-06-12 12:11:55 -07:00
Nicholas Thomson d6920ca2ad
[AWS SageMaker] Update GroundTruth integration test timeout (#3973)
Update the timeout for the GroundTruth integration test from 10 seconds to 1200 seconds.
2020-06-11 17:07:57 -07:00
Gautam Kumar a7be049b6d
Move the minio artifact download under try block (#3889)
If minio artifact download fails the workteam is not being deleted.
2020-06-01 19:06:15 -07:00
Nicholas Thomson 37a63638c7
[AWS SageMaker] Add working FSx setup and test (#3831)
* Add working FSx setup and test

* Removed duplicate test function

* Replaced failure return with exit

* Update parallel methods to export

* Update EKS cluster name outside parallel task

* Add SKIP_FSX_TEST in buildspec

* Add revoke security group ingress

* Add default pytest FSx values
2020-05-29 11:01:15 -07:00
Gautam Kumar 58ff65f330
fixing case when status is None (#3865)
* fixing case when status is None

* Fixing status update
2020-05-28 22:05:14 -07:00
Kartik Kalamadi b50305069b
[AWS SageMaker] Add more unit tests (#3783)
* add more tests for deploy and ground_truth components

* add more tests for workteam component

* add unit tests for model component

* add more unit tests for batchTransform component

* add more tests

* add 'request' function tests

* add more unit tests for ground truth
2020-05-27 10:04:10 -07:00
IvyBazan 3fe9b7e3aa
Added README for Amazon SageMaker Components for Kubeflow Pipelines (#3824)
* Create README.md

* Added README

Updated page to include information on Amazon SageMaker components

* Update README.md

* Integrated feedback
2020-05-26 18:14:41 -07:00
Meghna Baijal fb549531f1
[AWS SageMaker] Integration Test for AWS SageMaker GroundTruth Component (#3830)
* Integration Test for AWS SageMaker GroundTruth Component

* Unfix already fixed bug

* Fix the README I overwrote by mistake

* Remove use of aws-secret for OIDC

* Rev 2: Fix linting errors
2020-05-26 15:44:40 -07:00
Gautam Kumar bbe598db26
Adding HPO unit test (#3791)
* Adding HPO unit test

* Adding best training job

* Addressing comment
2020-05-22 22:57:10 -07:00
Kartik Kalamadi d18ad7a563
AWS SageMaker : Use IAM Roles for Service Account (#3719)
* don't use aws-secret and update readme for sample pipelines

* Addressed comments on PR and few more readme changes

* small changes to readme

* nit change

* Address comments
2020-05-21 10:36:14 -07:00
Nicholas Thomson f2a860b84c
[AWS SageMaker] Integration tests automation (#3768)
* Add initial scripts

* Add working pytest script

* Add environment variable files

* Remove old cluster script

* Update pipeline credentials to OIDC

* Remove debugging mark

* Update example EKS cluster name

* Remove quiet from Docker build

* Manually pass env

* Update env list vars as string

* Update use array directly

* Update variable array to export

* Update to using read for splitting

* Move to helper script

* Update export from CodeBuild

* Add wait for minio

* Update kubectl wait timeout

* Update minor changes for PR

* Update integration test buildspec to quiet build

* Add region to delete EKS

* Add wait for pods

* Updated README

* Add fixed interval wait

* Fix CodeBuild step order

* Add file lock for experiment ID

* Fix missing pytest parameter

* Update run create only once

* Add filelock to conda env

* Update experiment name ensuring creation each time

* Add try/catch with create experiment

* Remove caching from KFP deployment

* Remove disable KFP caching

* Move .gitignore changes to inside component

* Add blank line to default .gitignore
2020-05-20 14:18:19 -07:00
Gautam Kumar 6e2a55cf84
Changing the default volume size to 30 (#3792) 2020-05-20 12:36:20 -07:00
Jiaxin Shan af4e8efa3e
Add more approvers in AWS sagemaker components (#3740) 2020-05-15 11:27:36 -07:00
Suraj Kota bff83921d7
AWS Sagemaker Components - enhance integration test coverage (#3720)
* AWS Sagemaker Components - enhance integration test coverage
	- Add tests for create endpoint, hpo job and batch transform
	- Minor bug fixes and documentation

* rev2: Address comments and clean up generated artifacts

* rev3: address more comments

* rev4: add canary test marker

* Trigger Build
2020-05-15 10:21:36 -07:00
Nicholas Thomson ddd1969b34
[AWS SageMaker] Unit tests for Training component (#3722)
* Added additional training unit tests

* Add main training function tests

* Add full training test coverage

* Fix import sys

* Fix poorly named test
2020-05-13 16:14:22 -07:00
Nicholas Thomson bd8c1ddd38
[AWS SageMaker] Specify component input types (#3683)
* Replace all string types with Python types

* Update HPO yaml

* Update Batch YAML

* Update Deploy YAML

* Update GroundTruth YAML

* Update Model YAML

* Update Train YAML

* Update WorkTeam YAML

* Updated samples to remove strings

* Update to temporary image

* Remove unnecessary imports

* Update image to newer image

* Update components to python3

* Update bool parser type

* Remove empty ContentType in samples

* Update to temporary image

* Update to version 0.3.1

* Update deploy to login

* Update deploy load config path

* Fix export environment variable in deploy

* Fix env name

* Update deploy reflow env paths

* Add debug config line

* Use username and password directly

* Updated to 0.3.1

* Update field types to JsonObject and JsonArray
2020-05-11 22:06:21 -07:00
Suraj Kota 6beab2251d
Integration tests for AWS SageMaker Components (#3654)
* integration tests for aws sagemaker components with comment

* address comment related to S3 dataset creation

* rev3: bug fix in conda env yaml and reuse sagemaker method to get image URI

* Add createModel test

	- reduce code duplication
	- add some utility methods
2020-05-06 22:19:09 -07:00
Nicholas Thomson 9ade740ca6
[AWS SageMaker] Add CodeBuild Steps (#3668)
* Add initial unit test buildspec

* Add docker log output

* Add force no pytest color

* Update docker build to be quiet

* Add pass all environment variables

* Update unit test container env file

* Update env to use different syntax

* Remove daemon mode

* Remove TTY from docker run

* Add dryrun and dockercfg setup

* Update dryrun into CodeBuild logic

* Add mkdir for Docker config

* Update app version temporarily

* Revert app version temporarily

* Update unit test log file

* Add tag minor and major versions

* Update version temporarily

* Add print for major and minor tags

* Revert version back down

* Add deploy version override

* Update path to testing directories

* Fix tab formatting

* Fix pytest log directory
2020-05-04 14:13:07 -07:00
Kartik Kalamadi 2f4eafb031
AWS Sagemaker : Add unit tests (#3642)
* Initial changes

* add one test for each component

* Add readme for unit tests

* add empty string test and dockerfile

* added dockerfile

* use python3 in dockerfile

* add coverage report to unit tests

* update readme for PR

* small changes to resolve git comments

* copy requirements.txt separately in dockerfile

* small changes

* pin pip package versions in unit_tests
2020-04-30 01:32:18 -07:00
Gautam Kumar 45bc582374
Fixing volume size default value from 1 to 30 (#3598) 2020-04-26 17:17:28 -07:00
Kartik Kalamadi 0259fe50b3
AWS Sagemaker : Use json.dumps() to better organize the input and remove data_locations (#3518)
* construct channel input using json.dumps()

* remove data_location parameters

* add changelog

* Update version in license file and small changes to readme
2020-04-23 12:14:07 -07:00
Suraj Kota fbed280e55
add user agent header to boto3 client for aws components (#3487)
* add user agent header to boto client

* add component version according to license file

* fetch version from license file at runtime
2020-04-15 11:25:46 -07:00
Kartik Kalamadi f041b08190
AWS sagemaker: fixed a bug in ground_truth and updated all components to use images from new docker hub repo (#3474)
* Don't leave active_learning_model_arn.txt empty

* updated readme for ground_truth_pipeline_demo

* update docker repo

* Small changes to readme of ground truth sample pipeline
2020-04-14 10:26:13 -07:00
Suraj Kota fc5f977b19
Update documentation for AWS components (#3410)
* deploy_createModel_readme

* readme for batch and minor updates to deploy and create_model

* updates based on review comments 1

* correct SageMaker typo
2020-04-08 09:43:46 -07:00
Kartik Kalamadi 942be78bfe
Make endpoint_url None (#3374) 2020-04-07 13:19:43 -07:00
Kartik Kalamadi 060cabf911
AWS Sagemaker : Updated documents (#3440)
* Initial readme for Train component

* example input

* add train pipeline

* added simple_train_pipeline

* Updated readme to include kmeans-hpo-pipeline.py

* Updated train component readme

* fix typo

* Update details about how to get sample data for Train component

* update comment and give a default path for output

* change s3 bucket to match other sample pipelines
2020-04-07 11:17:44 -07:00
Kartik Kalamadi 956f645503
AWS sagemaker : Added license files and updated Dockerfile to use AmazonLinux (#3397)
* Added new LICENSE file

* added 2 more license files

* copy license files into the docker image

* pinned pip packages and rearranged the dockerfile
2020-04-06 20:55:43 -07:00
Redback 2fe8c0de61 [Component] Add VPC Interface Endpoint Support for SageMaker (#2299)
* Added Private Link Components

* Updated Component Dockerfile

* Added endpoint_url to Samples
2019-10-03 18:11:56 -07:00
Redback 12dde375b8 [Component] Add Managed Spot Training Support for SageMaker (#2219)
* Added Spot Instance Support

* Fixed missing output configuration

* Added spot instance support to example pipelines

* Updated image to new repository
2019-10-03 12:11:56 -07:00
Christian Clauss 8e1e823139 Lint Python code for undefined names (#1721)
* Lint Python code for undefined names

* Lint Python code for undefined names

* Exclude tfdv.py to workaround an overzealous pytest

* Fixup for tfdv.py

* Fixup for tfdv.py

* Fixup for tfdv.py
2019-08-21 15:04:31 -07:00
carolynwang 69ca3c7e4b Update images, bug fixes, clean up code (#1778)
* Update docker images and minor refactoring

* Update image tag, bug fixes, remove unneeded imports

* Revert to using image version, use origin batch transform output method

* Forgot to save 2 changes
2019-08-09 15:25:13 -07:00
carolynwang 351f4562a4 Refactor to match new samples folder structure (#1741) 2019-08-06 01:23:56 -07:00
carolynwang 3f8bcffaa7 Add SageMaker create workteam and Ground Truth components, sample demo pipeline, other minor updates (#1716)
* Add components for workteam and Ground Truth, minor update for HPO and train, add sample pipeline demo for workteam and GT, update images

* Minor style fixes

* Address PR comments

* Refactor for new folder structure
2019-08-05 15:33:50 -07:00
carolynwang 81b0f08a84 Update SageMaker components and sample pipeline (#1682)
* Update HPO, train, batch transform components, add MNIST kmeans with HPO pipeline

* Minor bug fixes

* Minor bug fixes, update Dockerfiles

* Update docker images and sample pipeline

* Update all components and sample pipeline

* Delete Dockerfiles for individual component

* Typo fix in Dockerfile
2019-07-30 12:47:51 -07:00
carolynwang 2778632ba2 Add SageMaker HPO component and sample usage in a pipeline (#1628)
* add HPO component and sample pipeline usage

* Update Dockerfile to include HPO component

* Update docker image used in hpo component

* Update HPO readme, make HPO job name required, allow empty string for int params, reintro some default values

* Resolve a couple todos

* Add Dockerfile for HPO and update docker image used in HPO component

* Add Dockerfile for HPO
2019-07-21 19:06:51 -07:00
Jiaxin Shan d5147b9776 Add HyperParameters back to SageMaker training job (#1377) 2019-05-30 19:24:21 -07:00
tiffany jernigan 778fe2ad7a Fix naming from sagamaker to sagemaker (#1386) 2019-05-24 14:15:30 -07:00
Jiaxin Shan 5374b6b2b4 Add SageMaker components and example pipeline (#1276)
* Add SageMaker components and example pipeline

* Address review feedback

* Expose more training job configs

* Update components/aws/sagemaker/batch_transform/component.yaml

Update components descriptions

Co-Authored-By: Jeffwan <seedjeffwan@gmail.com>
2019-05-03 14:55:37 -07:00