
Simple pipeline for the training component

An example pipeline with only one training component.

Prerequisites

Make sure you have set up your EKS cluster as described in this README.md.

Use the following Python script to copy train_data, test_data, and valid_data.csv to your bucket.
Create a bucket in the us-east-1 region if you don't have one already. For the purposes of this demonstration, all resources are created in the us-east-1 region.
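
If you prefer to create the bucket programmatically, a minimal boto3 call is enough for us-east-1 (the bucket name is a placeholder; buckets in other regions additionally require a CreateBucketConfiguration with a LocationConstraint):

import boto3

# us-east-1 is the default region, so no CreateBucketConfiguration is needed here
boto3.client('s3', region_name='us-east-1').create_bucket(Bucket='<bucket-name>')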

Create a new file named s3_sample_data_creator.py with the following content:

import pickle, gzip, numpy, urllib.request

###################################################################
# This is the only thing that you need to change to run this code.
# Give the name of your S3 bucket.
bucket = '<bucket-name>'

# If you are going to use the default values of the pipeline,
# use a bucket in the us-east-1 region.
###################################################################


# Load the dataset
urllib.request.urlretrieve("http://deeplearning.net/data/mnist/mnist.pkl.gz", "mnist.pkl.gz")
with gzip.open('mnist.pkl.gz', 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')


# Upload dataset to S3
from sagemaker.amazon.common import write_numpy_to_dense_tensor
import io
import boto3

train_data_key = 'mnist_kmeans_example/train_data'
test_data_key = 'mnist_kmeans_example/test_data'
train_data_location = 's3://{}/{}'.format(bucket, train_data_key)
test_data_location = 's3://{}/{}'.format(bucket, test_data_key)
print('training data will be uploaded to: {}'.format(train_data_location))
print('test data will be uploaded to: {}'.format(test_data_location))

# Convert the training data into the format required by the SageMaker KMeans algorithm
buf = io.BytesIO()
write_numpy_to_dense_tensor(buf, train_set[0], train_set[1])
buf.seek(0)

boto3.resource('s3').Bucket(bucket).Object(train_data_key).upload_fileobj(buf)

# Convert the test data into the format required by the SageMaker KMeans algorithm
buf = io.BytesIO()  # use a fresh buffer so the train data is not uploaded again with the test set
write_numpy_to_dense_tensor(buf, test_set[0], test_set[1])
buf.seek(0)

boto3.resource('s3').Bucket(bucket).Object(test_data_key).upload_fileobj(buf)

# Save the validation data as CSV and upload it to S3
numpy.savetxt('valid-data.csv', valid_set[0], delimiter=',', fmt='%g')
s3_client = boto3.client('s3')
input_key = "{}/valid_data.csv".format("mnist_kmeans_example/input")
s3_client.upload_file('valid-data.csv', bucket, input_key)

Run this file with the following command: python3 s3_sample_data_creator.py
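
To confirm the upload worked, you can optionally list the objects under the sample prefix (the bucket name is a placeholder):

import boto3

s3 = boto3.client('s3')
response = s3.list_objects_v2(Bucket='<bucket-name>', Prefix='mnist_kmeans_example/')
for obj in response.get('Contents', []):
    print(obj['Key'], obj['Size'])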

Steps

  1. Compile the pipeline:
    dsl-compile --py training-pipeline.py --output training-pipeline.tar.gz
  2. In the Kubeflow Pipelines UI, upload the compiled pipeline specification (the .tar.gz file) and create a run.
  3. Once the pipeline completes, you can see the outputs under 'Output parameters' in the Training component's Input/Output section.
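
For reference, training-pipeline.py (compiled in step 1) is essentially a one-step Kubeflow pipeline that wraps the SageMaker training component. The sketch below is illustrative only; the component path, the exact parameter list, and the defaults may differ from the actual file in this directory, which is the source of truth.

import kfp
from kfp import components, dsl

# Load the SageMaker training component; the relative path below is an assumption
# and should point at the component.yaml shipped with this repository.
sagemaker_train_op = components.load_component_from_file(
    "../../../../components/aws/sagemaker/train/component.yaml"
)

@dsl.pipeline(name="Training pipeline", description="SageMaker training job example")
def training_pipeline(
    region="us-east-1",
    image="382416733822.dkr.ecr.us-east-1.amazonaws.com/kmeans:1",
    training_input_mode="File",
    hyperparameters='{"k": "10", "feature_dim": "784"}',
    channels="[...]",  # the channels JSON shown under "Example inputs" below
    instance_type="ml.m5.2xlarge",
    model_artifact_path="s3://<your_bucket_name>/mnist_kmeans_example/output",
    role="<your-sagemaker-execution-role-arn>",
):
    sagemaker_train_op(
        region=region,
        image=image,
        training_input_mode=training_input_mode,
        hyperparameters=hyperparameters,
        channels=channels,
        instance_type=instance_type,
        model_artifact_path=model_artifact_path,
        role=role,
    )

if __name__ == "__main__":
    kfp.compiler.Compiler().compile(training_pipeline, "training-pipeline.tar.gz")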

Example inputs to this pipeline:

region : us-east-1
endpoint_url : <leave this empty>
image : 382416733822.dkr.ecr.us-east-1.amazonaws.com/kmeans:1
training_input_mode : File
hyperparameters : {"k": "10", "feature_dim": "784"}
channels : JSON describing the input data channels; along with the other parameters, set S3Uri to the location where you uploaded the training data

                [
                  {
                    "ChannelName": "train",
                    "DataSource": {
                      "S3DataSource": {
                        "S3Uri": "s3://<your_bucket_name>/mnist_kmeans_example/train_data",
                        "S3DataType": "S3Prefix",
                        "S3DataDistributionType": "FullyReplicated"
                      }
                    },
                    "ContentType": "",
                    "CompressionType": "None",
                    "RecordWrapperType": "None",
                    "InputMode": "File"
                  }
                ]

instance_type : ml.m5.2xlarge
instance_count : 1
volume_size : 50
max_run_time : 3600
model_artifact_path : The S3 path where the trained model artifacts will be stored, for example
                      s3://<your_bucket_name>/mnist_kmeans_example/output
output_encryption_key : <leave this empty>
network_isolation : True
traffic_encryption : False
spot_instance : False
max_wait_time : 3600
checkpoint_config : {}
role : Paste the role ARN that you noted down
       (the IAM role with full SageMaker permissions and S3 access)
       Example role input -> arn:aws:iam::999999999999:role/SageMakerExecutorKFP
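
If you only have the role name handy, you can look up its ARN with boto3 (the role name below is just the example from above):

import boto3

iam = boto3.client('iam')
print(iam.get_role(RoleName='SageMakerExecutorKFP')['Role']['Arn'])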

Resources