# SageMaker Hosting Services - Create Endpoint Kubeflow Pipeline component
## Summary
Component to deploy a model to SageMaker Hosting Services from a Kubeflow Pipelines workflow.
## Details
Deploying a model using Amazon SageMaker hosting services is a three-step process:
- Create a model in Amazon SageMaker - Specify the S3 path where the model artifacts are stored and the Docker registry path for the image that contains the inference code
- Create an endpoint configuration for an HTTPS endpoint - Specify the name of the model in the production variants and the type of instance that you want Amazon SageMaker to launch to host the model
- Create an HTTPS endpoint - Launch the ML compute instances and deploy the model as specified in the endpoint configuration
This component handles Steps 2 and 3. Step 1 can be done using the create model component for AWS SageMaker.
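For orientation, here is a minimal sketch of how the two components might be wired together in a Kubeflow pipeline. The component file paths, the create model component's input names, and its `model_name` output are assumptions for illustration (KFP SDK v1 syntax):

```python
import kfp
from kfp import components, dsl

# Hypothetical local paths to the component definitions.
sagemaker_model_op = components.load_component_from_file("model/component.yaml")
sagemaker_deploy_op = components.load_component_from_file("deploy/component.yaml")


@dsl.pipeline(
    name="SageMaker deploy example",
    description="Create a SageMaker model (step 1) and deploy it to an endpoint (steps 2 and 3)",
)
def deploy_pipeline(
    region="us-east-1",
    image="",               # ECR path of the inference image (placeholder)
    model_artifact_url="",  # S3 path of the trained model artifacts (placeholder)
    role_arn="",            # IAM role SageMaker assumes to access the artifacts (placeholder)
):
    # Step 1: create the model (handled by the create model component).
    create_model = sagemaker_model_op(
        region=region,
        image=image,
        model_artifact_url=model_artifact_url,
        role=role_arn,
    )

    # Steps 2 and 3: create the endpoint configuration and the endpoint
    # (handled by this component).
    sagemaker_deploy_op(
        region=region,
        model_name_1=create_model.outputs["model_name"],  # assumed output name
        instance_type_1="ml.m4.xlarge",
    )


if __name__ == "__main__":
    kfp.compiler.Compiler().compile(deploy_pipeline, "deploy_pipeline.zip")
```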
## Intended Use
Create an endpoint in AWS SageMaker Hosting Service for model deployment.
## Runtime Arguments
| Argument | Description | Optional (in pipeline definition) | Optional (in UI) | Data type | Accepted values | Default |
|---|---|---|---|---|---|---|
| region | The region where the endpoint is created | No | No | String | | |
| endpoint_url | The endpoint URL for the private link VPC endpoint | Yes | Yes | String | | |
| endpoint_config_name | The name of the endpoint configuration | Yes | Yes | String | | |
| endpoint_config_tags | Key-value pairs to tag endpoint configurations in AWS | Yes | Yes | Dict | | {} |
| endpoint_tags | Key-value pairs to tag the hosting endpoint in AWS | Yes | Yes | Dict | | {} |
| endpoint_name | The name of the endpoint. The name must be unique within an AWS Region in your AWS account | Yes | Yes | String | | |
In SageMaker, you can create an endpoint that hosts multiple models. Each set of parameters below represents a production variant. A production variant identifies a model that you want to host and the resources (e.g., instance type, initial traffic distribution) to deploy for hosting it. You must specify at least one production variant to create an endpoint.
| Argument | Description | Optional (in pipeline definition) | Optional (in UI) | Data type | Accepted values | Default |
|---|---|---|---|---|---|---|
| model_name_[1, 3] | The name of the model that you want to host. This is the name that you specified when creating the model | No | No | String | | |
| variant_name_[1, 3] | The name of the production variant | Yes | Yes | String | | variant_name_[1, 3] |
| instance_type_[1, 3] | The ML compute instance type | Yes | Yes | String | ml.m4.xlarge, ml.m4.2xlarge, ml.m4.4xlarge, ml.m4.10xlarge, ml.m4.16xlarge, ml.m5.large, ml.m5.xlarge, ml.m5.2xlarge, ml.m5.4xlarge, ml.m5.12xlarge, ml.m5.24xlarge, ml.c4.xlarge, ml.c4.2xlarge, ml.c4.4xlarge, ml.c4.8xlarge, ml.p2.xlarge, ml.p2.8xlarge, ml.p2.16xlarge, ml.p3.2xlarge, ml.p3.8xlarge, ml.p3.16xlarge, ml.c5.xlarge, ml.c5.2xlarge, ml.c5.4xlarge, ml.c5.9xlarge, ml.c5.18xlarge | ml.m4.xlarge |
| initial_instance_count_[1, 3] | The number of instances to launch initially | Yes | Yes | Integer | ≥ 1 | 1 |
| initial_variant_weight_[1, 3] | Determines the initial traffic distribution among all of the models that you specify in the endpoint configuration. The traffic to a production variant is determined by the ratio of its VariantWeight to the sum of all VariantWeight values across all production variants. | Yes | Yes | Float | Minimum value of 0 | |
| accelerator_type_[1, 3] | The size of the Elastic Inference (EI) instance to use for the production variant | Yes | Yes | String | ml.eia1.medium, ml.eia1.large, ml.eia1.xlarge | |
Notes:
- Please use the links in the Resources section for detailed information on each input parameter and SageMaker APIs used in this component
- The parameters `model_name_1` through `model_name_3` are intended to be outputs of the create model component from previous steps in the pipeline. `model_name_[1, 3]` and the other production variant parameters can also be specified directly if the component is used on its own, as in the sketch below.
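As a concrete illustration of the arguments above, this sketch passes the production variant parameters directly instead of wiring them from a create model component. The component path, model name, endpoint names, and tag values are placeholders, and the input names are assumed to match the argument names listed in the tables (KFP SDK v1 syntax):

```python
import kfp
from kfp import components, dsl

# Hypothetical local path to this component's definition.
sagemaker_deploy_op = components.load_component_from_file("deploy/component.yaml")


@dsl.pipeline(name="SageMaker standalone deploy example")
def standalone_deploy_pipeline():
    sagemaker_deploy_op(
        region="us-east-1",
        endpoint_name="my-endpoint",                # must be unique within the region
        endpoint_config_name="my-endpoint-config",
        endpoint_config_tags={"team": "ml"},
        endpoint_tags={"team": "ml"},
        # Production variant 1: an existing SageMaker model created earlier.
        model_name_1="my-existing-sagemaker-model",
        variant_name_1="variant-1",
        instance_type_1="ml.m5.large",
        initial_instance_count_1=1,
        initial_variant_weight_1=1.0,
    )


if __name__ == "__main__":
    kfp.compiler.Compiler().compile(standalone_deploy_pipeline, "standalone_deploy.zip")
```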
## Outputs
| Name | Description |
|---|---|
| endpoint_name | HTTPS Endpoint URL where client applications can send requests using InvokeEndpoint API |
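Once the endpoint is in service, client applications can send requests to it with the InvokeEndpoint API. Below is a minimal sketch using boto3, assuming a CSV-serving model and a placeholder endpoint name; in a pipeline, the name would come from the component's endpoint_name output:

```python
import boto3

# Placeholder; in practice this is the endpoint_name output of this component.
ENDPOINT_NAME = "my-endpoint"

runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")

response = runtime.invoke_endpoint(
    EndpointName=ENDPOINT_NAME,
    ContentType="text/csv",      # must match what the model's inference code expects
    Body=b"5.1,3.5,1.4,0.2",     # example payload; the format is model-specific
)
print(response["Body"].read().decode("utf-8"))
```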
## Requirements
## Samples
### Integrated into a pipeline
MNIST Classification pipeline: Pipeline | Steps
## Resources
- Create Endpoint Configuration
- Create Endpoint