25 Commits

Author SHA1 Message Date
rd-pong 5cfaec6ebb
chore(test): Increase time out for git fetching and test run (#9462)
Increase timeout for rlestimator-training
2023-06-05 18:38:23 +00:00
jsitu777 fe9fc4de79
test(Component): push test stats to cloudwatch (#9130)
* push stats to cloudwatch

* remove h.py

* grab project name from env
2023-04-11 18:29:55 +00:00
Suraj Kota 56f1c80c7e
chore(components): add test image cache (#9111)
2023-04-06 19:26:41 +00:00
ananth102 943fb6bdff
test(components): disable imdsv1 (#8630)
* test: disable imdsv1

* move autokubeconfig

* update timeout

* Fix bad line

* test: removed redundancies
2022-12-28 20:46:19 +00:00
ananth102 6a6cfdbafb
feat(components): New sagemaker training job parameters (#8538)
* unit tests

* feature: generated new sagemaker features

* update unit test

* remove unit tests

* Release: Staging component for release

* reformatted files
2022-12-12 21:32:28 +00:00
ananth102 14ce09d506
test: Make ephemeral sagemaker component tests more stable (#8346)
* increase timeout

Increase timeout to make canaries less flakey

* Increase minio timeout

Make canaries less flakey

* Update run_integration_tests

* correct sleep

* remove unnecessary wait
2022-10-11 01:28:25 +00:00
ananth102 bf2389a66c
test: Upgrade kfp version in sagemaker component test (#8331)
* upgrade kfp

* update eksctl

* upgrade kfp

* downgrade cluster

* upgrade node count

* updated cert-manager
2022-10-06 00:05:50 +00:00
ananth102 916777e62f
test(components): Added integration test for Sagemaker TrainingJob component v2. (#8247)
* Integration tests for sagemaker training v2

* removed redundant check

* removed redundant print

* added safety check

* pr changes

* updated python and kubernetes

* reverting dependency versions

* Revert "updated python and kubernetes"

This reverts commit e92034d5f9.

* added linting
2022-09-14 17:42:04 -07:00
Meghna Baijal 8a4b06754a
chore: update EKS version and increase the EKS cluster creation timeout (#7975)
2022-07-01 01:00:09 +00:00
Meghna Baijal e867d1c31d
chore: update eks version to 1.19 for aws sagemaker integration tests (#7256)
2022-02-05 00:20:15 +00:00
ryansteakley 373dfe3792
chore: update eks version to 1.18 for aws sagemaker integration tests (#6847)
2021-11-01 17:04:59 -07:00
ryansteakley 40d8242bb0
chore: update aws sagemaker components tests to kfp 1.7.0 (#6805)
* Update to support kfp 1.7.0

* use variable for s3 bucket name
2021-10-27 11:08:25 -07:00
Suraj Kota b50a5cfc4e
chore(components): AWS SageMaker update eks cluster version for tests (#5735)
2021-05-25 19:53:40 -07:00
Suraj Kota 52b40ed1ac
chore(components): AWS SageMaker tests enable ssm on eks worker nodes (#5437)
2021-04-06 18:31:01 -07:00
Leonard O' Sullivan 4aa11c3c7f
feat(components) Adds RoboMaker and SageMaker RLEstimator components (#4813)
* Adds RoboMaker and SageMaker RLEstimator components

* Genericise samples

* Genericise samples

* Adds better logging and updates shim component in samples

* Adds fixes for PR comments. Updates tests accordingly

* Adds docker image reference for integration tests. Allows for setting job_name for RLEstimator training jobs

* Separate RM and SM execution roles

* Remove README reference to VPC config items

* Adds more reliable integration test for RoboMaker Simulation Job

* Simplifies integration tests

* Reverted test container entrypoints

* Update black formatting

* Update components for redbackthomson repo

* Prefix RLEstimator job name

* Add RoboMakerFullAccess to generated roles

* Update version to official 1.1.0

* Formatting int test file

* Add PassRole IAM permission to OIDC

* Adds ROBOMAKER_EXECUTION_ROLE_ARN to build vars

Co-authored-by: Nicholas Thomson <nithomso@amazon.com>
2020-12-11 13:27:27 -08:00
Kartik Kalamadi 008985a576
fix(components): AWS SageMaker - Retry delete EKS Cluster after Integ test failure (#4662)
* fix(components): AWS SageMaker - Retry delete EKS Cluster after Integ test failure

* decrease delete eks timeout to 15 min
2020-11-06 12:12:29 -08:00
jkuruba 7c349f3f82
feat(components): AWS SageMaker - Changes for updating an existing endpoint (#4424)
* Changes for updating existing endpoint

* Review comments addressed

* Review comments addressed

* Review comments addressed

* Changed awscli and boto3 version. Ran black to format integration tests

* Removing temporarily to debug integration failures

* Adding back integration tests

* Control the number of parallel integration tests to 10

* Third Party License updated

* Version changed to 0.9.0

* Fixed a typo in Changelog
2020-09-18 16:00:28 -07:00
Nicholas Thomson 2f7a5e5a2b
chore(components): AWS SageMaker - Fix leaking resources (#4457)
* Add try/catch cleanup for integ resources

* Update pytest-xdist to 2.1

* Fix groundtruth workteam invocation
2020-09-02 16:11:40 -07:00
Kartik Kalamadi 05398cf475
[AWS SageMaker] Fix small bugs (#4161)
* fix small bugs

* add SKIP_OIDC_SETUP config

* address comments

* address comments and add KFP_VERSION to .env

* typo
2020-09-01 23:17:06 -07:00
Nicholas Thomson 8014a44229
feat(components): AWS SageMaker - Support for assuming a role (#4212)
* Add client assume role functionality

* Add assume_role to component.yaml files

* Update image to personal

* Update input to force NoneType on empty

* Update integration test setup with assumed role

* Add assume role integration test

* Update boto session to use refreshing credentials

* Update assume role relax trust relationship

* Add check for defined assumed role name

* Add processing assume integ test

* Add assume role unit test for main methods

* Add assume_role to all READMEs

* Update session to use AssumeRoleProvider

* Remove region from child calls to session

* Fix extra region_name in test

* Update assume role processing integ test name

* Add processing integ test to list

* Update assumed role to remain if not generated

* Update license version

* Update image tag to new version

* Add new version to Changelog
2020-08-03 10:53:43 -07:00
Nicholas Thomson c6754e3f13
fix(components): AWS SageMaker - Fix MinIO PID early exit (#4190)
* Add force minio error test

* Remove force error
2020-07-09 11:04:33 -07:00
Nicholas Thomson bea63652e1
[AWS SageMaker] Processing job component (#3944)
* Add TDD processing definition

* Update README

* Update temporary image

* Update component entrypoint

* Add WORKDIR to fix Docker 18 support

* integration test for processing job

* Remove job links

* Add container outputs and tests

* Update default properties

* Remove max_run_time if none provided

* Update integration readme steps

* Updated README with more resources

* Add CloudWatch link back to logs

* Update input and output config to arrays

* Update processing integration test

* Update process README

* Update unit tests

* Updated license version

* Update component image versions

* Update changelog

Co-authored-by: Suraj Kota <surakota@amazon.com>
2020-06-17 11:29:24 -07:00
Kartik Kalamadi 35019eb3ea
[AWS SageMaker] Add integration test for sample pipeline train (#3876)
* add integ test for sample pipeline train

* change docker build context integ test

* add spot test and use train component test for sample train pipeline

* small changes and ran flake8 and black

* address comments
2020-06-12 12:11:55 -07:00
Nicholas Thomson 37a63638c7
[AWS SageMaker] Add working FSx setup and test (#3831)
* Add working FSx setup and test

* Removed duplicate test function

* Replaced failure return with exit

* Update parallel methods to export

* Update EKS cluster name outside parallel task

* Add SKIP_FSX_TEST in buildspec

* Add revoke security group ingress

* Add default pytest FSx values
2020-05-29 11:01:15 -07:00
Nicholas Thomson f2a860b84c
[AWS SageMaker] Integration tests automation (#3768)
* Add initial scripts

* Add working pytest script

* Add environment variable files

* Remove old cluster script

* Update pipeline credentials to OIDC

* Remove debugging mark

* Update example EKS cluster name

* Remove quiet from Docker build

* Manually pass env

* Update env list vars as string

* Update use array directly

* Update variable array to export

* Update to using read for splitting

* Move to helper script

* Update export from CodeBuild

* Add wait for minio

* Update kubectl wait timeout

* Update minor changes for PR

* Update integration test buildspec to quiet build

* Add region to delete EKS

* Add wait for pods

* Updated README

* Add fixed interval wait

* Fix CodeBuild step order

* Add file lock for experiment ID

* Fix missing pytest parameter

* Update run create only once

* Add filelock to conda env

* Update experiment name ensuring creation each time

* Add try/catch with create experiment

* Remove caching from KFP deployment

* Remove disable KFP caching

* Move .gitignore changes to inside component

* Add blank line to default .gitignore
2020-05-20 14:18:19 -07:00