72 Commits

Author SHA1 Message Date
Suraj Kota 2dd8de3f6f
chore(components): SageMaker fix flaky groundtruth test (#5044) 2021-01-27 20:52:01 -08:00
Leonard O' Sullivan 4aa11c3c7f
feat(components) Adds RoboMaker and SageMaker RLEstimator components (#4813)
* Adds RoboMaker and SageMaker RLEstimator components

* Genericise samples

* Genericise samples

* Adds better logging and updates shim component in samples

* Adds fixes for PR comments. Updates tests accordingly

* Adds docker image reference for integration tests. Allows for setting job_name for RLEstimator training jobs

* Separate RM and SM execution roles

* Remove README reference to VPC config items

* Adds more reliable integration test for RoboMaker Simulation Job

* Simplifies integration tests

* Reverted test container entrypoints

* Update black formatting

* Update components for redbackthomson repo

* Prefix RLEstimator job name

* Add RoboMakerFullAccess to generated roles

* Update version to official 1.1.0

* Formatting int test file

* Add PassRole IAM permission to OIDC

* Adds ROBOMAKER_EXECUTION_ROLE_ARN to build vars

Co-authored-by: Nicholas Thomson <nithomso@amazon.com>
2020-12-11 13:27:27 -08:00
Kartik Kalamadi c78781d96c
fix(components): Login to Dockerhub before running tests for Sa… (#4826)
* fix(components): Login to Dockerhub before running integ tests for SageMaker Components

* Add dockerhub login for unit tests
2020-12-09 16:52:53 -08:00
Kartik Kalamadi 008985a576
fix(components): AWS SageMaker - Retry delete EKS Cluster after Integ test failure (#4662)
* fix(components): AWS SageMaker - Retry delete EKS Cluster after Integ test failure

* decrease delete EKS timeout to 15 min
2020-11-06 12:12:29 -08:00
Cesar d6a2c23f56
fix(components): sending pyspark jobs and set generated step_id to /output.txt from the generated EMR step (#4725) 2020-11-05 19:22:50 -08:00
Cesar b9e2259b4b
fix(components): sending pyspark jobs to aws EMR with the correct py … (#4721) 2020-11-05 14:52:50 -08:00
Nicholas Thomson d81c8095d0
refactor(components): AWS SageMaker - Full component refactoring (#4336)
* Temporary rebase commit

* Add yaml compiler

* Add compiler CLI

* Update Dockerfile to copy all files

* Add validate input list vs dict

* Add unit test for new train

* Add minor bug fixes

* Override tag when generating specs

* Update pydocs with formatter

* Add contributing doc

* Add formatters to CONTRIBUTING

* Add working generic logic applied to train

* Update component input and output to inherit

* Downgrade to Python 3.7

* Update add outputValue to arg list

* Updated outputValue to outputPath

* Add empty string default to not-required inputs

* Update path to component relative to root

* Update faulty False-y condition

* Update outputs to write to file

* Update doc formatting

* Update docstrings to match structure

* Add unit tests for component and compiler

* Add unit tests for component

* Add spec unit tests

* Add training unit tests

* Update unit test automation

* Add sample formatting checks

* Remove extra flake8 check in integ tests

* Add unit test black check

* Update black formatting for all files

* Update include black formatting

* Add batch component

* Remove old transform components

* Update region input description

* Add all component specs

* Add deploy component

* Add ground truth component

* Add HPO component

* Add create model component

* Add processing component

* Add workteam component

* Add spec unit tests

* Add deploy unit tests

* Add ground truth unit tests

* Add tuning component unit tests

* Add create model component unit test

* Add process component unit tests

* Add workteam component unit tests

* Remove output_path from required_args

* Remove old component implementations

* Update black formatting

* Add assume role feature

* Compiled all components

* Update doc formatting

* Fix process terminate syntax error

* Update compiler to use kfp structures

* Update nits

* Update unified requirements

* Rebase on debugging commit

* Add debugger unit tests

* Update formatting

* Update component YAML

* Fix unit test Dockerfile relative directory

* Update unit test context to root

* Update Batch to auto-generate name

* Update minor docs and formatting changes

* Update deploy name to common autogenerated

* Add f-strings to logs

* Add update support

* Add Amazon license header

* Update autogen and autoformat

* Rename SpecValidator to SpecInputParser

* Split requirements by dev and prod

* Support for checking generated specs

* Update minor changes

* Update deploy component output description

* Update components to beta repository

* Update fix unit test requirements

* Update unit test build spec for new results path

* Update deploy wait for endpoint complete

* Update component configure AWS clients in new method

* Update boto3 retry method

* Update license version

* Update component YAML versions

* Add new version to Changelog

* Update component spec types

* Update deploy config ignore overwrite

* Update component for debugging

* Update images back to 1.0.0

* Remove coverage from components
2020-10-27 14:17:57 -07:00
Suraj Kota 466147a2d8
chore(components): SageMaker integ test changes (#4603)
- fix clean up in groundtruth
2020-10-09 16:06:47 -07:00
Suraj Kota e87d74f036
chore(components): SageMaker integ tests, fix for unbound variable (#4595) 2020-10-08 17:17:06 -07:00
Meghna Baijal 237795539f
chore(components): AWS SageMaker - Fix leaking Workteam(GroundTruth) resources (#4536) 2020-09-30 13:08:54 -07:00
jkuruba 7c349f3f82
feat(components): AWS SageMaker - Changes for updating an existing endpoint (#4424)
* Changes for updating existing endpoint

* Review comments addressed

* Review comments addressed

* Review comments addressed

* Changed awscli and boto3 version. Ran black to format integration tests

* Removing temporarily to debug integration failures

* Adding back integration tests

* Control the number of parallel integration tests to 10

* Third Party License updated

* Version changed to 0.9.0

* Fixed a typo in Changelog
2020-09-18 16:00:28 -07:00
Nicholas Thomson f9e3a249d8
chore(components): AWS SageMaker - Add PYTEST_MARKER to container variable passthrough (#4459) 2020-09-02 18:03:40 -07:00
Nicholas Thomson 2f7a5e5a2b
chore(components): AWS SageMaker - Fix leaking resources (#4457)
* Add try/catch cleanup for integ resources

* Update pytest-xdist to 2.1

* Fix groundtruth workteam invocation
2020-09-02 16:11:40 -07:00
Kartik Kalamadi 05398cf475
[AWS SageMaker] Fix small bugs (#4161)
* fix small bugs

* add SKIP_OIDC_SETUP config

* address comments

* address comments and add KFP_VERSION to .env

* typo
2020-09-01 23:17:06 -07:00
Dustin Luong 3ebd075212
feat(components): AWS SageMaker - Add optional parameter to allow training component to accept parameters related to Debugger (#4283)
* Implemented debugger for training component with sample pipeline, unit tests, and integration test

* Implemented changes from PR, refactored utils.py, made sample pipeline more succinct, removed hardcoding from integration tests

* Added default parameter for sample pipeline and fixed grammar for sample README, refactored _utils.py for fstrings and fixed offset for errors

* Removed aws secret lines

* Terminate debug rules when terminating training job, Terminate debug rules if terminate is pressed after training job has completed, added integration tests for stop_debug_rules, updated READMEs for train and sample, renamed sample pipeline, removed tensorboard, updated sagemaker version to sagemaker 2.1.0.

* Removed extra files, cleaned integration test

* Changed integration test to use sample debugger pipeline

* Processing jobs created from debug rules will not terminate, fixing other small issues

* Removed debug from pipeline definition, removed extra line, removed unused function

* Changelog and image tag updates
2020-08-19 15:41:22 -07:00
Matthew Rose 2dca2b57cc
docs: AWS Sagemaker training job component README typos (#4295) 2020-08-10 13:41:58 -07:00
Bohdan 262b288e5d
fix(components): remove needless arguments from AWS EMR scripts (#4252)
remove needless submit_pyspark_job arguments
2020-08-10 10:08:20 -07:00
IvyBazan e3c33a650e
Updated components/aws/sagemaker/README.md (#3983)
* Create README.md

* Added README

Updated page to include information on Amazon SageMaker components

* Update README.md

* Integrated feedback

* Added link to SageMaker Components tutorial

* Update README.md

added link to sample pipelines
2020-08-04 19:42:27 -07:00
Nicholas Thomson 8014a44229
feat(components): AWS SageMaker - Support for assuming a role (#4212)
* Add client assume role functionality

* Add assume_role to component.yaml files

* Update image to personal

* Update input to force NoneType on empty

* Update integration test setup with assumed role

* Add assume role integration test

* Update boto session to use refreshing credentials

* Update assume role relax trust relationship

* Add check for defined assumed role name

* Add processing assume integ test

* Add assume role unit test for main methods

* Add assume_role to all READMEs

* Update session to use AssumeRoleProvider

* Remove region from child calls to session

* Fix extra region_name in test

* Update assume role processing integ test name

* Add processing integ test to list

* Update assumed role to remain if not generated

* Update license version

* Update image tag to new version

* Add new version to Changelog
2020-08-03 10:53:43 -07:00
Suraj Kota 900eeaec16
feat(components): AWS SageMaker - Add functionality to stop SageMaker jobs on run termination (#4167)
* Add functionality to stop SM jobs
	- Unit and Integration tests for the functionality

* unit test update and customer message update

* Changelog and image tag updates

* update version for deploy component and merge conflicts

* Update version in License file

* fix conflicting paths for download, add test for batch
2020-07-17 16:34:50 -07:00
Kartik Kalamadi 799db4714f
[AWS SageMaker] Integ test to check CloudWatch logs print feature (#4056)
* Integ test for cw logs

* Update license file version to 0.5.3

* update version in yaml

* add changelog
2020-07-09 15:14:33 -07:00
Nicholas Thomson c6754e3f13
fix(components): AWS SageMaker - Fix MinIO PID early exit (#4190)
* Add force minio error test

* Remove force error
2020-07-09 11:04:33 -07:00
Nicholas Thomson f0f8e5d178
[AWS SageMaker] De-hardcode output paths in AWS components (#4119)
* Update input arguments

* Remove fileOutputs

* Update outputs to new paths

* Modify integ test artifact path

* Add unit test for new output format

* Add unit test for write_output

* Migrate tests into test_utils

* Add clarifying comment

* Remove output path file extension

* Update license to 0.5.2

* Update component to 0.5.2

* Add 0.5.2 to changelog

* Remove JSON
2020-07-06 14:39:57 -07:00
Matthew Rose 131be23467
[AWS SageMaker] GroundTruth Pre/Post Lambda function region additions (#3932)
* add more region ARNs to groundtruth lambda map as per sagemaker documentation

* changes for patch release 0.5.1

* Make label_category_config optional in groundtruth component

* merge fix

Co-authored-by: Suraj Kota <surakota@amazon.com>
2020-07-01 10:32:17 -07:00
Nicholas Thomson 7619a01546
[AWS SageMaker] Processing job sample (#4009)
* Added preprocessing step to sample pipeline

* Add processing to components README

* Remove unused import

* Add data set prerequisite to training component

* Remove simple HPO pipeline

* Update MNIST header in README

* Remove simple HPO sample integration test

* Empty commit to trigger google bot
2020-06-30 16:28:06 -07:00
Kartik Kalamadi e0e4b982cd
[AWS SageMaker] Add unit tests for cloudwatch logs (#4051)
* add unit tests for cloudwatch logs

* address git comments
2020-06-30 15:06:07 -07:00
Kartik Kalamadi b3d8e04e1e
[AWS SageMaker] Print SageMaker job logs in kfp UI (#3954)
* Print logs for AWS SM Components on KFP UI

* address comments

* update version number to 0.5.0

* update yaml to version 0.5.0

* update changelog
2020-06-19 00:33:58 -07:00
Nicholas Thomson c29ee5de13
[AWS SageMaker] Component 0.4.1 Release (#4011)
* Update license version

* Update component YAML versions

* Add 0.4.1 to changelog
2020-06-17 18:02:04 -07:00
Nicholas Thomson 5717bc0fbb
Add missing import statement (#4010) 2020-06-17 16:16:05 -07:00
Nicholas Thomson bea63652e1
[AWS SageMaker] Processing job component (#3944)
* Add TDD processing definition

* Update README

* Update temporary image

* Update component entrypoint

* Add WORKDIR to fix Docker 18 support

* integration test for processing job

* Remove job links

* Add container outputs and tests

* Update default properties

* Remove max_run_time if none provided

* Update integration readme steps

* Updated README with more resources

* Add CloudWatch link back to logs

* Update input and output config to arrays

* Update processing integration test

* Update process README

* Update unit tests

* Updated license version

* Update component image versions

* Update changelog

Co-authored-by: Suraj Kota <surakota@amazon.com>
2020-06-17 11:29:24 -07:00
Suraj Kota bc6aed9cac
[AWS Sagemaker] aws-samples kmeans-hpo pipeline test (#3905)
* aws-samples kmeans-hpo pipeline test
	- code clean up
	- removed unused args

* add test dependency

* Trigger Build
2020-06-16 10:26:04 -07:00
Kartik Kalamadi 35019eb3ea
[AWS SageMaker] Add integration test for sample pipeline train (#3876)
* add integ test for sample pipeline train

* change docker build context integ test

* add spot test and use train component test for sample train pipeline

* small changes and ran flake8 and black

* address comments
2020-06-12 12:11:55 -07:00
Nicholas Thomson d6920ca2ad
[AWS SageMaker] Update GroundTruth integration test timeout (#3973)
Update the timeout for the GroundTruth integration test from 10 seconds to 1200 seconds.
2020-06-11 17:07:57 -07:00
Gautam Kumar a7be049b6d
Move the minio artifact download under try block (#3889)
If minio artifact download fails the workteam is not being deleted.
2020-06-01 19:06:15 -07:00
Nicholas Thomson 37a63638c7
[AWS SageMaker] Add working FSx setup and test (#3831)
* Add working FSx setup and test

* Removed duplicate test function

* Replaced failure return with exit

* Update parallel methods to export

* Update EKS cluster name outside parallel task

* Add SKIP_FSX_TEST in buildspec

* Add revoke security group ingress

* Add default pytest FSx values
2020-05-29 11:01:15 -07:00
Gautam Kumar 58ff65f330
fixing case when status is None (#3865)
* fixing case when status is None

* Fixing status update
2020-05-28 22:05:14 -07:00
Kartik Kalamadi b50305069b
[AWS SageMaker] Add more unit tests (#3783)
* add more tests for deploy and ground_truth components

* add more tests for workteam component

* add unit tests for model component

* add more unit tests for batchTransform component

* add more tests

* add 'request' function tests

* add more unit tests for ground truth
2020-05-27 10:04:10 -07:00
IvyBazan 3fe9b7e3aa
Added README for Amazon SageMaker Components for Kubeflow Pipelines (#3824)
* Create README.md

* Added README

Updated page to include information on Amazon SageMaker components

* Update README.md

* Integrated feedback
2020-05-26 18:14:41 -07:00
Meghna Baijal fb549531f1
[AWS SageMaker] Integration Test for AWS SageMaker GroundTruth Component (#3830)
* Integration Test for AWS SageMaker GroundTruth Component

* Unfix already fixed bug

* Fix the README I overwrote by mistake

* Remove use of aws-secret for OIDC

* Rev 2: Fix linting errors
2020-05-26 15:44:40 -07:00
Gautam Kumar bbe598db26
Adding HPO unit test (#3791)
* Adding HPO unit test

* Adding best training job

* Addressing comment
2020-05-22 22:57:10 -07:00
Kartik Kalamadi d18ad7a563
AWS SageMaker : Use IAM Roles for Service Account (#3719)
* don't use aws-secret and update readme for sample pipelines

* Addressed comments on PR and few more readme changes

* small changes to readme

* nit change

* Address comments
2020-05-21 10:36:14 -07:00
Nicholas Thomson f2a860b84c
[AWS SageMaker] Integration tests automation (#3768)
* Add initial scripts

* Add working pytest script

* Add environment variable files

* Remove old cluster script

* Update pipeline credentials to OIDC

* Remove debugging mark

* Update example EKS cluster name

* Remove quiet from Docker build

* Manually pass env

* Update env list vars as string

* Update use array directly

* Update variable array to export

* Update to using read for splitting

* Move to helper script

* Update export from CodeBuild

* Add wait for minio

* Update kubectl wait timeout

* Update minor changes for PR

* Update integration test buildspec to quiet build

* Add region to delete EKS

* Add wait for pods

* Updated README

* Add fixed interval wait

* Fix CodeBuild step order

* Add file lock for experiment ID

* Fix missing pytest parameter

* Update run create only once

* Add filelock to conda env

* Update experiment name ensuring creation each time

* Add try/catch with create experiment

* Remove caching from KFP deployment

* Remove disable KFP caching

* Move .gitignore changes to inside component

* Add blank line to default .gitignore
2020-05-20 14:18:19 -07:00
Gautam Kumar 6e2a55cf84
Changing the default volume size to 30 (#3792) 2020-05-20 12:36:20 -07:00
Jiaxin Shan af4e8efa3e
Add more approvers in AWS sagemaker components (#3740) 2020-05-15 11:27:36 -07:00
Suraj Kota bff83921d7
AWS Sagemaker Components - enhance integration test coverage (#3720)
* AWS Sagemaker Components - enhance integration test coverage
	- Add tests for create endpoint, hpo job and batch transform
	- Minor bug fixes and documentation

* rev2: Address comments and clean up generated artifacts

* rev3: address more comments

* rev4: add canary test marker

* Trigger Build
2020-05-15 10:21:36 -07:00
Nicholas Thomson ddd1969b34
[AWS SageMaker] Unit tests for Training component (#3722)
* Added additional training unit tests

* Add main training function tests

* Add full training test coverage

* Fix import sys

* Fix poorly named test
2020-05-13 16:14:22 -07:00
Nicholas Thomson bd8c1ddd38
[AWS SageMaker] Specify component input types (#3683)
* Replace all string types with Python types

* Update HPO yaml

* Update Batch YAML

* Update Deploy YAML

* Update GroundTruth YAML

* Update Model YAML

* Update Train YAML

* Update WorkTeam YAML

* Updated samples to remove strings

* Update to temporary image

* Remove unnecessary imports

* Update image to newer image

* Update components to python3

* Update bool parser type

* Remove empty ContentType in samples

* Update to temporary image

* Update to version 0.3.1

* Update deploy to login

* Update deploy load config path

* Fix export environment variable in deploy

* Fix env name

* Update deploy reflow env paths

* Add debug config line

* Use username and password directly

* Updated to 0.3.1

* Update field types to JsonObject and JsonArray
2020-05-11 22:06:21 -07:00
Suraj Kota 6beab2251d
Integration tests for AWS SageMaker Components (#3654)
* integration tests for aws sagemaker components with comment

* address comment related to S3 dataset creation

* rev3: bug fix in conda env yaml and reuse sagemaker method to get image URI

* Add createModel test

	- reduce code duplication
	- add some utility methods
2020-05-06 22:19:09 -07:00
Nicholas Thomson 9ade740ca6
[AWS SageMaker] Add CodeBuild Steps (#3668)
* Add initial unit test buildspec

* Add docker log output

* Add force no pytest color

* Update docker build to be quiet

* Add pass all environment variables

* Update unit test container env file

* Update env to use different syntax

* Remove daemon mode

* Remove TTY from docker run

* Add dryrun and dockercfg setup

* Update dryrun into CodeBuild logic

* Add mkdir for Docker config

* Update app version temporarily

* Revert app version temporarily

* Update unit test log file

* Add tag minor and major versions

* Update version temporarily

* Add print for major and minor tags

* Revert version back down

* Add deploy version override

* Update path to testing directories

* Fix tab formatting

* Fix pytest log directory
2020-05-04 14:13:07 -07:00
Kartik Kalamadi 2f4eafb031
AWS Sagemaker : Add unit tests (#3642)
* Initial changes

* add one test for each component

* Add readme for unit tests

* add empty string test and dockerfile

* added dockerfile

* use python3 in dockerfile

* add coverage report to unit tests

* update readme for PR

* small changes to resolve git comments

* copy requirements.txt separately in dockerfile

* small changes

* pin pip package versions in unit_tests
2020-04-30 01:32:18 -07:00