32 Commits

Author SHA1 Message Date
rd-pong fdb25f6e6d
test(components): fix k8s_client 401 unauthorized error (#9749)
* Initiate a new k8s client when calling _get_resource

* Remove k8s_client for methods that use _get_resource

* Initiate a new k8s client when calling _delete_resource
2023-07-18 18:37:22 +00:00
ananth102 7de50a5839
test(components): Check the Sagemaker component output rather than Controller (#9402)
* update tests to check output

bugfix

fix another bug

* address pr comments

* bug fix

* test fix

* namefix
2023-05-19 00:27:46 +00:00
rd-pong 1fa0893800
test(component): Update integration test for Model Monitor component (#9384)
* Add test that check component outputs

* Remove sagemaker check

* Extract get_output_ack_resource_metadata to a function
Extract "Scheduled" to constant FINAL_STATUS

* Extract to function: verify_monitoring_schedule_component_outputs
2023-05-12 22:29:29 +00:00
rd-pong 07e67bb0ca
feat(components): SageMaker V2 model monitor component and testing (#9253)
* Add model monitor component and integration tests

* Generate model monitor using updated generator

* Add sleep for monitoring schedule

* Update requirements v2

* Change model monitor image url

* minor fix

* minor fix

* minor fix

* Add unit testing for MonitoringSchedule

* Delete assume-role.json

* Add doc and sample pipeline for Monitoring Schedule

* Regenerate using the latest code generator.
Make parameter description 1 sentence long.

* Revert "Add doc and sample pipeline for Monitoring Schedule"

This reverts commit 6b3b7cc6f5.

* Delete print statements

* Update component with new generator

* address comments

* Add retry for _delete_resource

* Add try catch protection for _get_resource and _delete_resource

* Add integration tests for monitoring job definition components

* Update is_endpoint_deleted for new _get_resource

* Add integration test for updating monitoring schedule

* Remove update from canary

* Add doc and sample pipeline for Monitoring Schedule

* Add doc for monitoring job definition
Update doc for monitoring schedule

* Remove sample for monitoring schedule

* Address comments

* Address comments

* Address comment for unit test

* Address doc comments

* Address test comments
2023-05-09 19:42:33 +00:00
ananth102 57b52d29ea
test(components): fix Sagemaker component shallow canary (#9314)
2023-05-03 23:56:44 +00:00
ananth102 4818e849f8
feat(components): Sagemaker V2 Hosting components and tests (#9243)
* Hosting Components and test

* update dependency

* Regenerating with spec trimming

* handle None case

* address pr comments

* another way of handling update not supported

* test changes

* removing unused logic

* Staging pr

* Added READMEs

* Main doc changes

* minor edit
2023-05-03 17:56:15 +00:00
rd-pong bec2834cdc
test: fix "ValueError: Unsupported region". Only populate necessary values for shallow_canary (#9236)
2023-04-25 19:46:45 +00:00
rd-pong 9c7aa16176
test: Upgrade package versions and remove dependency on "sagemaker-sample-data-<region>" bucket (#9204)
* Upgrade package version and change default instance type

Upgrade sagemaker version

Upgrade boto3 version

Upgrade pyyaml version

Change training and endpoint instance type

* Remove dependency on "sagemaker-sample-data-<region>"
2023-04-25 02:31:44 +00:00
rd-pong 3e7354e483
test(components): Add shallow_canary marker (#9180)
* Add shallow_canary marker

* Delete trainingV1 from shallow canary
2023-04-20 17:30:40 +00:00
ananth102 916777e62f
test(components): Added integration test for Sagemaker TrainingJob component v2. (#8247)
* Integration tests for sagemaker training v2

* removed redundant check

* removed redundant print

* added safety check

* pr changes

* updated python and kubernetes

* reverting dependency versions

* Revert "updated python and kubernetes"

This reverts commit e92034d5f9.

* added linting
2022-09-14 17:42:04 -07:00
Suraj Kota 2dd8de3f6f
chore(components): SageMaker fix flaky groundtruth test (#5044)
2021-01-27 20:52:01 -08:00
Leonard O' Sullivan 4aa11c3c7f
feat(components) Adds RoboMaker and SageMaker RLEstimator components (#4813)
* Adds RoboMaker and SageMaker RLEstimator components

* Genericise samples

* Genericise samples

* Adds better logging and updates shim component in samples

* Adds fixes for PR comments. Updates tests accordingly

* Adds docker image reference for integration tests. Allows for setting job_name for RLEstimator training jobs

* Separate RM and SM execution roles

* Remove README reference to VPC config items

* Adds more reliable integration test for RoboMaker Simulation Job

* Simplifies integration tests

* Reverted test container entrypoints

* Update black formatting

* Update components for redbackthomson repo

* Prefix RLEstimator job name

* Add RoboMakerFullAccess to generated roles

* Update version to official 1.1.0

* Formatting int test file

* Add PassRole IAM permission to OIDC

* Adds ROBOMAKER_EXECUTION_ROLE_ARN to build vars

Co-authored-by: Nicholas Thomson <nithomso@amazon.com>
2020-12-11 13:27:27 -08:00
Nicholas Thomson d81c8095d0
refactor(components): AWS SageMaker - Full component refactoring (#4336)
* Temporary rebase commit

* Add yaml compiler

* Add compiler CLI

* Update Dockerfile to copy all files

* Add validate input list vs dict

* Add unit test for new train

* Add minor bug fixes

* Override tag when generating specs

* Update pydocs with formatter

* Add contributing doc

* Add formatters to CONTRIBUTING

* Add working generic logic applied to train

* Update component input and output to inherit

* Downgrade to Python 3.7

* Update add outputValue to arg list

* Updated outputValue to outputPath

* Add empty string default to not-required inputs

* Update path to component relative to root

* Update faulty False-y condition

* Update outputs to write to file

* Update doc formatting

* Update docstrings to match structure

* Add unit tests for component and compiler

* Add unit tests for component

* Add spec unit tests

* Add training unit tests

* Update unit test automation

* Add sample formatting checks

* Remove extra flake8 check in integ tests

* Add unit test black check

* Update black formatting for all files

* Update include black formatting

* Add batch component

* Remove old transform components

* Update region input description

* Add all component specs

* Add deploy component

* Add ground truth component

* Add HPO component

* Add create model component

* Add processing component

* Add workteam component

* Add spec unit tests

* Add deploy unit tests

* Add ground truth unit tests

* Add tuning component unit tests

* Add create model component unit test

* Add process component unit tests

* Add workteam component unit tests

* Remove output_path from required_args

* Remove old component implementations

* Update black formatting

* Add assume role feature

* Compiled all components

* Update doc formatting

* Fix process terminate syntax error

* Update compiler to use kfp structures

* Update nits

* Update unified requirements

* Rebase on debugging commit

* Add debugger unit tests

* Update formatting

* Update component YAML

* Fix unit test Dockerfile relative directory

* Update unit test context to root

* Update Batch to auto-generate name

* Update minor docs and formatting changes

* Update deploy name to common autogenerated

* Add f-strings to logs

* Add update support

* Add Amazon license header

* Update autogen and autoformat

* Rename SpecValidator to SpecInputParser

* Split requirements by dev and prod

* Support for checking generated specs

* Update minor changes

* Update deploy component output description

* Update components to beta repository

* Update fix unit test requirements

* Update unit test build spec for new results path

* Update deploy wait for endpoint complete

* Update component configure AWS clients in new method

* Update boto3 retry method

* Update license version

* Update component YAML versions

* Add new version to Changelog

* Update component spec types

* Update deploy config ignore overwrite

* Update component for debugging

* Update images back to 1.0.0

* Remove coverage from components
2020-10-27 14:17:57 -07:00
Suraj Kota 466147a2d8
chore(components): SageMaker integ test changes (#4603)
- fix clean up in groundtruth
2020-10-09 16:06:47 -07:00
Suraj Kota e87d74f036
chore(components): SageMaker integ tests, fix for unbound variable (#4595)
2020-10-08 17:17:06 -07:00
Meghna Baijal 237795539f
chore(components): AWS SageMaker - Fix leaking Workteam(GroundTruth) resources (#4536)
2020-09-30 13:08:54 -07:00
jkuruba 7c349f3f82
feat(components): AWS SageMaker - Changes for updating an existing endpoint (#4424)
* Changes for updating existing endpoint

* Review comments addressed

* Review comments addressed

* Review comments addressed

* Changed awscli and boto3 version. Ran black to format integration tests

* Removing temporarily to debug integration failures

* Adding back integration tests

* Control the number of parallel integration tests to 10

* Third Party License updated

* Version changed to 0.9.0

* Fixed a typo in Changelog
2020-09-18 16:00:28 -07:00
Nicholas Thomson 2f7a5e5a2b
chore(components): AWS SageMaker - Fix leaking resources (#4457)
* Add try/catch cleanup for integ resources

* Update pytest-xdist to 2.1

* Fix groundtruth workteam invocation
2020-09-02 16:11:40 -07:00
Dustin Luong 3ebd075212
feat(components): AWS SageMaker - Add optional parameter to allow training component to accept parameters related to Debugger (#4283)
* Implemented debugger for training component with sample pipeline, unit tests, and integration test

* Implemented changes from PR, refactored utils.py, made sample pipeline more succinct, removed hardcoding from integration tests

* Added default parameter for sample pipeline and fixed grammar for sample README, refactored _utils.py for fstrings and fixed offset for errors

* Removed aws secret lines

* Terminate debug rules when terminating training job, Terminate debug rules if terminate is pressed after training job has completed, added integration tests for stop_debug_rules, updated READMEs for train and sample, renamed sample pipeline, removed tensorboard, updated sagemaker version to sagemaker 2.1.0.

* Terminate debug rules when terminating training job, Terminate debug rules if terminate is pressed after training job has completed, added integration tests for stop_debug_rules, updated READMEs for train and sample, renamed sample pipeline, removed tensorboard, updated sagemaker version to sagemaker 2.1.0.

* Removed extra files, cleaned integration test

* Changed integration test to use sample debugger pipeline

* Processing jobs created from debug rules will not terminate, fixing other small issues

* Removed debug from pipeline definition, removed extra line, removed unused function

* Changelog and image tag updates
2020-08-19 15:41:22 -07:00
Nicholas Thomson 8014a44229
feat(components): AWS SageMaker - Support for assuming a role (#4212)
* Add client assume role functionality

* Add assume_role to component.yaml files

* Update image to personal

* Update input to force NoneType on empty

* Update integration test setup with assumed role

* Add assume role integration test

* Update boto session to use refreshing credentials

* Update assume role relax trust relationship

* Add check for defined assumed role name

* Add processing assume integ test

* Add assume role unit test for main methods

* Add assume_role to all READMEs

* Update session to use AssumeRoleProvider

* Remove region from child calls to session

* Fix extra region_name in test

* Update assume role processing integ test name

* Add processing integ test to list

* Update assumed role to remain if not generated

* Update license version

* Update image tag to new version

* Add new version to Changelog
2020-08-03 10:53:43 -07:00
Suraj Kota 900eeaec16
feat(components): AWS SageMaker - Add functionality to stop SageMaker jobs on run termination (#4167)
* Add functionality to stop SM jobs
	- Unit and Integration tests for the functionality

* unit test update and customer message update

* Changelog and image tag updates

* update version for deploy component and merge conflicts

* Update version in License file

* fix conflicting paths for download, add test for batch
2020-07-17 16:34:50 -07:00
Kartik Kalamadi 799db4714f
[AWS SageMaker] Integ test to check CloudWatch logs print feature (#4056)
* Integ test for cw logs

* Update license file version to 0.5.3

* update version in yaml

* add changelog
2020-07-09 15:14:33 -07:00
Nicholas Thomson f0f8e5d178
[AWS SageMaker] De-hardcode output paths in AWS components (#4119)
* Update input arguments

* Remove fileOutputs

* Update outputs to new paths

* Modify integ test artifact path

* Add unit test for new output format

* Add unit test for write_output

* Migrate tests into test_utils

* Add clarifying comment

* Remove output path file extension

* Update license to 0.5.2

* Update component to 0.5.2

* Add 0.5.2 to changelog

* Remove JSON
2020-07-06 14:39:57 -07:00
Nicholas Thomson 7619a01546
[AWS SageMaker] Processing job sample (#4009)
* Added preprocessing step to sample pipeline

* Add processing to components README

* Remove unused import

* Add data set prerequisite to training component

* Remove simple HPO pipeline

* Update MNIST header in README

* Remove simple HPO sample integration test

* Empty commit to trigger google bot
2020-06-30 16:28:06 -07:00
Nicholas Thomson bea63652e1
[AWS SageMaker] Processing job component (#3944)
* Add TDD processing definition

* Update README

* Update temporary image

* Update component entrypoint

* Add WORKDIR to fix Docker 18 support

* integration test for processing job

* Remove job links

* Add container outputs and tests

* Update default properties

* Remove max_run_time if none provided

* Update integration readme steps

* Updated README with more resources

* Add CloudWatch link back to logs

* Update input and output config to arrays

* Update processing integration test

* Update process README

* Update unit tests

* Updated license version

* Update component image versions

* Update changelog

Co-authored-by: Suraj Kota <surakota@amazon.com>
2020-06-17 11:29:24 -07:00
Suraj Kota bc6aed9cac
[AWS Sagemaker] aws-samples kmeans-hpo pipeline test (#3905)
* aws-samples kmeans-hpo pipeline test
	- code clean up
	- removed unused args

* add test dependency

* Trigger Build
2020-06-16 10:26:04 -07:00
Kartik Kalamadi 35019eb3ea
[AWS SageMaker] Add integration test for sample pipeline train (#3876)
* add integ test for sample pipeline train

* change docker build context integ test

* add spot test and use train component test for sample train pipeline

* small changes and ran flake8 and black

* address comments
2020-06-12 12:11:55 -07:00
Gautam Kumar a7be049b6d
Move the minio artifact download under try block (#3889)
If minio artifact download fails the workteam is not being deleted.
2020-06-01 19:06:15 -07:00
Nicholas Thomson 37a63638c7
[AWS SageMaker] Add working FSx setup and test (#3831)
* Add working FSx setup and test

* Removed duplicate test function

* Replaced failure return with exit

* Update parallel methods to export

* Update EKS cluster name outside parallel task

* Add SKIP_FSX_TEST in buildspec

* Add revoke security group ingress

* Add default pytest FSx values
2020-05-29 11:01:15 -07:00
Meghna Baijal fb549531f1
[AWS SageMaker] Integration Test for AWS SageMaker GroundTruth Component (#3830)
* Integration Test for AWS SageMaker GroundTruth Component

* Unfix already fixed bug

* Fix the README I overwrote by mistake

* Remove use of aws-secret for OIDC

* Rev 2: Fix linting errors
2020-05-26 15:44:40 -07:00
Suraj Kota bff83921d7
AWS Sagemaker Components - enhance integration test coverage (#3720)
* AWS Sagemaker Components - enhance integration test coverage
	- Add tests for create endpoint, hpo job and batch transform
	- Minor bug fixes and documentation

* rev2: Address comments and clean up generated artifacts

* rev3: address more comments

* rev4: add canary test marker

* Trigger Build
2020-05-15 10:21:36 -07:00
Suraj Kota 6beab2251d
Integration tests for AWS SageMaker Components (#3654)
* integration tests for aws sagemaker components with comment

* address comment related to S3 dataset creation

* rev3: bug fix in conda env yaml and reuse sagemaker method to get image URI

* Add createModel test

	- reduce code duplication
	- add some utility methods
2020-05-06 22:19:09 -07:00