# SageMaker RLEstimator Kubeflow Pipelines component

## Summary

Component to submit SageMaker RLEstimator (Reinforcement Learning) training jobs directly from a Kubeflow Pipelines workflow. See https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-rl-workflow.html for background.

## Intended Use

For running reinforcement learning training jobs on AWS SageMaker from a Kubeflow Pipelines workflow.
## Runtime Arguments

Argument | Description | Optional | Data type | Accepted values | Default
---|---|---|---|---|---
region | The region where the cluster launches | No | String | ||
endpoint_url | The endpoint URL for the private link VPC endpoint | Yes | String | ||
assume_role | The ARN of an IAM role to assume when connecting to SageMaker | Yes | String | ||
job_name | The name of the training job. Must be unique within the same AWS account and AWS region | Yes | String | | TrainingJob-[datetime]-[random id]
role | The Amazon Resource Name (ARN) that Amazon SageMaker assumes to perform tasks on your behalf | No | String | ||
image | The registry path of the Docker image that contains your custom image, or you can use a prebuilt AWS RL image | Yes | String | ||
entry_point | Path (absolute or relative) to the Python source file which should be executed as the entry point to training | No | String | ||
source_dir | Path (S3 URI) to a directory with any other training source code dependencies aside from the entry point file | Yes | String | ||
toolkit | The RL toolkit you want to use for executing your model training code | Yes | String | ||
toolkit_version | The version of the RL toolkit you want to use for executing your model training code | Yes | String | ||
framework | The framework (MXNet, TensorFlow, or PyTorch) you want to use as the toolkit backend for reinforcement learning training | Yes | String | ||
metric_definitions | A dictionary of name-regex pairs that specifies the metrics the algorithm emits | Yes | Dict | | {}
training_input_mode | The input mode that the algorithm supports | No | String | File, Pipe | File |
hyperparameters | Hyperparameters for the selected algorithm | No | Dict | Depends on Algo | |
instance_type | The ML compute instance type | Yes | String | ml.m4.xlarge, ml.m4.2xlarge, ml.m4.4xlarge, ml.m4.10xlarge, ml.m4.16xlarge, ml.m5.large, ml.m5.xlarge, ml.m5.2xlarge, ml.m5.4xlarge, ml.m5.12xlarge, ml.m5.24xlarge, ml.c4.xlarge, ml.c4.2xlarge, ml.c4.4xlarge, ml.c4.8xlarge, ml.p2.xlarge, ml.p2.8xlarge, ml.p2.16xlarge, ml.p3.2xlarge, ml.p3.8xlarge, ml.p3.16xlarge, ml.c5.xlarge, ml.c5.2xlarge, ml.c5.4xlarge, ml.c5.9xlarge, ml.c5.18xlarge and many more | ml.m4.xlarge |
instance_count | The number of ML compute instances to use in each training job | Yes | Int | ≥ 1 | 1 |
volume_size | The size of the ML storage volume that you want to provision in GB | Yes | Int | ≥ 1 | 30 |
max_run | The maximum run time in seconds per training job | Yes | Int | ≤ 432000 (5 days) | 86400 (1 day) |
model_artifact_path | The S3 path where the trained model artifacts are stored | No | String | ||
output_encryption_key | The AWS KMS key that Amazon SageMaker uses to encrypt the model artifacts | Yes | String | ||
vpc_security_group_ids | A comma-delimited list of security group IDs, in the form sg-xxxxxxxx | Yes | String | ||
vpc_subnets | A comma-delimited list of subnet IDs in the VPC to which you want to connect your RLEstimator job | Yes | String | ||
spot_instance | Use managed spot training if true | No | Boolean | False, True | False |
max_wait_time | The maximum time in seconds you are willing to wait for a managed spot training job to complete | Yes | Int | ≤ 432000 (5 days) | 86400 (1 day) |
checkpoint_config | Dictionary of information about the output location for managed spot training checkpoint data | Yes | Dict | | {}
debug_hook_config | Dictionary of configuration information for the debug hook parameters, collection configurations, and storage paths | Yes | Dict | | {}
debug_rule_config | List of configuration information for debugging rules | Yes | List of Dicts | | []
tags | Key-value pairs to categorize AWS resources | Yes | Dict | | {}
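The dict-typed inputs above (`hyperparameters`, `metric_definitions`, `tags`, and the debug configs) are plain JSON-serializable dictionaries. A minimal sketch, using hypothetical keys that depend entirely on your algorithm and training script:

```python
import json

# Hypothetical hyperparameters; the actual keys depend on your algorithm.
hyperparameters = {"discount_factor": "0.99", "num_iterations": "50"}

# Hypothetical metric definitions: metric name -> regex applied to the
# training logs to extract the metric value.
metric_definitions = {
    "episode_reward_mean": r"episode_reward_mean: ([-+]?[0-9]*\.?[0-9]+)"
}

# Dict inputs must round-trip cleanly through JSON when passed to the component.
print(json.dumps(hyperparameters))
```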
Notes:
- There are two ways to use this component: build your own Docker image with your code baked in, or pass code in via the `source_dir` input. In either case, use `entry_point` to provide the filename of the code entry point.
- The format for the `debug_hook_config` field is:

```
{
    'CollectionConfigurations': [
        {
            'CollectionName': 'string',
            'CollectionParameters': {
                'string': 'string'
            }
        }
    ],
    'HookParameters': {
        'string': 'string'
    },
    'LocalPath': 'string',
    'S3OutputPath': 'string'
}
```
- The format for the `debug_rule_config` field is:

```
[
    {
        'InstanceType': 'string',
        'LocalPath': 'string',
        'RuleConfigurationName': 'string',
        'RuleEvaluatorImage': 'string',
        'RuleParameters': {
            'string': 'string'
        },
        'S3OutputPath': 'string',
        'VolumeSizeInGB': number
    }
]
```
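As a concrete illustration of the two formats above, here is a hypothetical pair of values. The S3 paths, collection name, rule name, and rule evaluator image URI are assumptions for illustration, not values tied to this component:

```python
# Hypothetical debug_hook_config following the format shown above.
debug_hook_config = {
    "S3OutputPath": "s3://my-bucket/debug-output",  # assumed bucket
    "LocalPath": "/opt/ml/output/tensors",
    "HookParameters": {"save_interval": "100"},
    "CollectionConfigurations": [
        {"CollectionName": "losses", "CollectionParameters": {"save_interval": "50"}}
    ],
}

# Hypothetical debug_rule_config following the format shown above.
debug_rule_config = [
    {
        "RuleConfigurationName": "LossNotDecreasing",
        # Rule evaluator image URIs are region- and account-specific;
        # this value is illustrative only.
        "RuleEvaluatorImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/debugger-rules:latest",
        "RuleParameters": {"rule_to_invoke": "LossNotDecreasing"},
        "S3OutputPath": "s3://my-bucket/debug-rules",
        "VolumeSizeInGB": 30,
    }
]
print(len(debug_rule_config))
```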
## Output

Stores the trained model in the S3 bucket you specified via `model_artifact_path`.
## Example code

- Simple example pipeline that uses a custom image: rlestimator_pipeline_custom_image
- Sample pipeline using an image selected for you by the RLEstimator class based on the framework and toolkit you provide: rlestimator_pipeline_toolkit_image
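To show how the runtime arguments fit together, here is a minimal sketch of the keyword arguments a pipeline might pass to this component. All S3 paths, the role ARN, the entry point filename, and the toolkit/version values below are placeholders, not values from this repository; in a real pipeline you would load `component.yaml` (for example with `kfp.components.load_component_from_file`) and pass these as keyword arguments to the resulting op:

```python
# Placeholder argument set for the RLEstimator component; every value
# marked below is an assumption for illustration.
rlestimator_args = {
    "region": "us-east-1",
    "role": "arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder ARN
    "entry_point": "train.py",                               # hypothetical script
    "source_dir": "s3://my-bucket/rl-sources/sources.tar.gz",  # hypothetical archive
    "toolkit": "ray",
    "toolkit_version": "0.8.5",
    "framework": "tensorflow",
    "instance_type": "ml.m4.xlarge",
    "instance_count": 1,
    "model_artifact_path": "s3://my-bucket/model-output/",   # hypothetical bucket
    "spot_instance": False,
}
print(sorted(rlestimator_args))
```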