
README.md

SageMaker Ground Truth Kubeflow Pipelines component

Summary

Component to submit SageMaker Ground Truth labeling jobs directly from a Kubeflow Pipelines workflow.

Details

Intended Use

For Ground Truth jobs using AWS SageMaker.

Runtime Arguments

Argument	Description	Optional	Data type	Accepted values	Default
region	The AWS region where the labeling job runs	No	String
endpoint_url	The endpoint URL for the private link VPC endpoint	Yes	String
assume_role	The ARN of an IAM role to assume when connecting to SageMaker	Yes	String
role	The Amazon Resource Name (ARN) that Amazon SageMaker assumes to perform tasks on your behalf	No	String
job_name	The name of the Ground Truth job. Must be unique within the same AWS account and AWS region	Yes	String		LabelingJob-[datetime]-[random id]
label_attribute_name	The attribute name to use for the label in the output manifest file	Yes	String		job_name
manifest_location	The Amazon S3 location of the manifest file that describes the input data objects	No	String
output_location	The Amazon S3 path where you want Amazon SageMaker to store the results of the labeling job	No	String
output_encryption_key	The AWS KMS key that Amazon SageMaker uses to encrypt the output data	Yes	String
task_type	The labeling task type: built-in image classification, bounding box, text classification, semantic segmentation, or custom. If custom, provide pre- and post-labeling task Lambda functions	No	String	Image Classification, Bounding Box, Text Classification, Semantic Segmentation, Custom
worker_type	The type of workforce used for data labeling	No	String	Public, Private, Vendor
workteam_arn	The ARN of the work team assigned to complete the tasks; specify if worker type is private or vendor	Yes	String
no_adult_content	Whether the data is free of adult content; specify if worker type is public	Yes	Boolean	False, True	False
no_ppi	Whether the data is free of personally identifiable information; specify if worker type is public	Yes	Boolean	False, True	False
label_category_config	The S3 URL of the JSON structured file that defines the categories used to label the data objects	Yes	String
max_human_labeled_objects	The maximum number of objects that can be labeled by human workers	Yes	Int	≥ 1	all objects
max_percent_objects	The maximum percentage of input data objects that should be labeled	Yes	Int	[1, 100]	100
enable_auto_labeling	Enables auto-labeling; only for bounding box, text classification, and image classification	Yes	Boolean	False, True	False
initial_model_arn	The ARN of the final model used for a previous auto-labeling job	Yes	String
resource_encryption_key	The AWS KMS key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance(s)	Yes	String
ui_template	The Amazon S3 bucket location of the UI template	No	String
pre_human_task_function	The ARN of a Lambda function that is run before a data object is sent to a human worker	Yes	String
post_human_task_function	The ARN of a Lambda function that implements the logic for annotation consolidation	Yes	String
task_keywords	Keywords used to describe the task so that workers on Amazon Mechanical Turk can discover the task	Yes	String
title	A title for the task for your human workers	No	String
description	A description of the task for your human workers	No	String
num_workers_per_object	The number of human workers that will label an object	No	Int	[1, 9]
time_limit	The maximum time in seconds that a worker has to complete a task	No	Int	[30, 28800]
task_availibility	The length of time in seconds that a task remains available for labeling by human workers	Yes	Int	Public workforce: [1, 43200], other: [1, 864000]
max_concurrent_tasks	The maximum number of data objects that can be labeled by human workers at the same time	Yes	Int	[1, 1000]
workforce_task_price	The price in USD that you pay for each task performed by a public worker. Specify to a tenth of a cent, formatted as "0.000"	Yes	Float	0.000
tags	Key-value pairs to categorize AWS resources	Yes	Dict		{}
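
The constraints in the table above can be checked before a pipeline is submitted. The following is a minimal sketch, not part of the component: `validate_labeling_args` and the constant names are hypothetical helpers, and the rules are only those stated in the table (required arguments, accepted values, integer ranges, and the conditions on `workteam_arn` and the custom task Lambda functions).

```python
from typing import Any, Dict, List

# Arguments marked "Optional: No" in the table above.
REQUIRED = [
    "region", "role", "manifest_location", "output_location", "task_type",
    "worker_type", "ui_template", "title", "description",
    "num_workers_per_object", "time_limit",
]

TASK_TYPES = {
    "Image Classification", "Bounding Box", "Text Classification",
    "Semantic Segmentation", "Custom",
}
WORKER_TYPES = {"Public", "Private", "Vendor"}

# Inclusive integer ranges from the "Accepted values" column.
INT_RANGES = {
    "max_percent_objects": (1, 100),
    "num_workers_per_object": (1, 9),
    "time_limit": (30, 28800),
    "max_concurrent_tasks": (1, 1000),
}


def validate_labeling_args(args: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems; empty means all checks passed."""
    errors: List[str] = []

    for name in REQUIRED:
        if args.get(name) in (None, ""):
            errors.append(f"missing required argument: {name}")

    task_type = args.get("task_type")
    if task_type is not None and task_type not in TASK_TYPES:
        errors.append(f"task_type must be one of {sorted(TASK_TYPES)}")

    worker_type = args.get("worker_type")
    if worker_type is not None and worker_type not in WORKER_TYPES:
        errors.append(f"worker_type must be one of {sorted(WORKER_TYPES)}")

    # Custom tasks need both Lambda functions.
    if task_type == "Custom":
        for name in ("pre_human_task_function", "post_human_task_function"):
            if not args.get(name):
                errors.append(f"{name} is required when task_type is Custom")

    # Private and vendor workforces need a work team ARN.
    if worker_type in ("Private", "Vendor") and not args.get("workteam_arn"):
        errors.append("workteam_arn is required for Private or Vendor workers")

    # The task_availibility range depends on the workforce type.
    ranges = dict(INT_RANGES)
    ranges["task_availibility"] = (1, 43200) if worker_type == "Public" else (1, 864000)
    for name, (low, high) in ranges.items():
        value = args.get(name)
        if value is not None and not low <= value <= high:
            errors.append(f"{name} must be in [{low}, {high}], got {value}")

    max_objects = args.get("max_human_labeled_objects")
    if max_objects is not None and max_objects < 1:
        errors.append("max_human_labeled_objects must be at least 1")

    return errors
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass. Note that the argument name `task_availibility` is spelled as it appears in the table.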

Outputs

Name	Description
output_manifest_location	URL where labeling results were stored
active_learning_model_arn	ARN of the resulting active learning model
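
Once the file at `output_manifest_location` has been downloaded, the labels can be read back with a few lines of Python. This is a sketch under stated assumptions: it assumes the output manifest is a JSON Lines file in which each record carries a `source-ref` key and a key named after `label_attribute_name`. The function name `read_labels` and the sample records are hypothetical and only illustrate the shape of the data.

```python
import json
from typing import Any, List, Tuple


def read_labels(manifest_text: str, label_attribute_name: str) -> List[Tuple[Any, Any]]:
    """Hypothetical helper: pair each record's source-ref with its label value."""
    pairs: List[Tuple[Any, Any]] = []
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Records without the label attribute are skipped.
        if label_attribute_name in record:
            pairs.append((record.get("source-ref"), record[label_attribute_name]))
    return pairs


# Made-up records for illustration only, not real job output.
sample = "\n".join([
    json.dumps({"source-ref": "s3://example-bucket/img1.jpg", "my-label": 0}),
    json.dumps({"source-ref": "s3://example-bucket/img2.jpg", "my-label": 1}),
    json.dumps({"source-ref": "s3://example-bucket/img3.jpg"}),
])
labels = read_labels(sample, "my-label")
```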
