# SageMaker Hosting Services - Create Endpoint Kubeflow Pipeline component
## Summary
Component to deploy a model to SageMaker Hosting Services from a Kubeflow Pipelines workflow.
## Details
Deploying a model using Amazon SageMaker hosting services is a three-step process:
- Create a model in Amazon SageMaker - Specify the S3 path where the model artifacts are stored and the Docker registry path for the image that contains the inference code
- Create an endpoint configuration for an HTTPS endpoint - Specify the name of the model in the production variants and the type of instance that you want Amazon SageMaker to launch to host the model
- Create an HTTPS endpoint - Launch the ML compute instances and deploy the model as specified in the endpoint configuration
This component handles Steps 2 and 3. Step 1 can be done using the create model component for AWS SageMaker.
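For orientation, here is a minimal sketch of how the two components might be wired together in a Kubeflow pipeline. The component file paths, the create model component's input names, and its `model_name` output are assumptions for illustration (KFP SDK v1 syntax):

```python
import kfp
from kfp import components, dsl

# Hypothetical local paths to the component definitions.
sagemaker_model_op = components.load_component_from_file("model/component.yaml")
sagemaker_deploy_op = components.load_component_from_file("deploy/component.yaml")


@dsl.pipeline(
    name="SageMaker deploy example",
    description="Create a SageMaker model (step 1) and deploy it to an endpoint (steps 2 and 3)",
)
def deploy_pipeline(
    region="us-east-1",
    image="",               # ECR path of the inference image (placeholder)
    model_artifact_url="",  # S3 path of the trained model artifacts (placeholder)
    role_arn="",            # IAM role SageMaker assumes to access the artifacts (placeholder)
):
    # Step 1: create the model (handled by the create model component).
    create_model = sagemaker_model_op(
        region=region,
        image=image,
        model_artifact_url=model_artifact_url,
        role=role_arn,
    )

    # Steps 2 and 3: create the endpoint configuration and the endpoint
    # (handled by this component).
    sagemaker_deploy_op(
        region=region,
        model_name_1=create_model.outputs["model_name"],  # assumed output name
        instance_type_1="ml.m4.xlarge",
    )


if __name__ == "__main__":
    kfp.compiler.Compiler().compile(deploy_pipeline, "deploy_pipeline.zip")
```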
## Intended Use
Create an endpoint in AWS SageMaker Hosting Service for model deployment.
## Runtime Arguments
| Argument | Description | Optional (in pipeline definition) | Optional (in UI) | Data type | Accepted values | Default |
|---|---|---|---|---|---|---|
| region | The region where the endpoint is created | No | No | String | | |
| endpoint_url | The endpoint URL for the private link VPC endpoint | Yes | Yes | String | | |
| endpoint_config_name | The name of the endpoint configuration | Yes | Yes | String | | |
| endpoint_config_tags | Key-value pairs to tag endpoint configurations in AWS | Yes | Yes | Dict | | {} |
| endpoint_tags | Key-value pairs to tag the hosting endpoint in AWS | Yes | Yes | Dict | | {} |
| endpoint_name | The name of the endpoint. The name must be unique within an AWS Region in your AWS account | Yes | Yes | String | | |
In SageMaker, you can create an endpoint that hosts multiple models. Each set of parameters below represents a production variant. A production variant identifies a model that you want to host and the resources (e.g., instance type, initial traffic distribution) to deploy for hosting it. You must specify at least one production variant to create an endpoint.
| Argument | Description | Optional (in pipeline definition) | Optional (in UI) | Data type | Accepted values | Default |
|---|---|---|---|---|---|---|
| model_name_[1, 3] | The name of the model that you want to host. This is the name that you specified when creating the model | No | No | String | | |
| variant_name_[1, 3] | The name of the production variant | Yes | Yes | String | | variant_name_[1, 3] |
| instance_type_[1, 3] | The ML compute instance type | Yes | Yes | String | ml.m4.xlarge, ml.m4.2xlarge, ml.m4.4xlarge, ml.m4.10xlarge, ml.m4.16xlarge, ml.m5.large, ml.m5.xlarge, ml.m5.2xlarge, ml.m5.4xlarge, ml.m5.12xlarge, ml.m5.24xlarge, ml.c4.xlarge, ml.c4.2xlarge, ml.c4.4xlarge, ml.c4.8xlarge, ml.p2.xlarge, ml.p2.8xlarge, ml.p2.16xlarge, ml.p3.2xlarge, ml.p3.8xlarge, ml.p3.16xlarge, ml.c5.xlarge, ml.c5.2xlarge, ml.c5.4xlarge, ml.c5.9xlarge, ml.c5.18xlarge | ml.m4.xlarge |
| initial_instance_count_[1, 3] | The number of instances to launch initially | Yes | Yes | Integer | ≥ 1 | 1 |
| initial_variant_weight_[1, 3] | Determines the initial traffic distribution among all of the models that you specify in the endpoint configuration. The traffic to a production variant is determined by the ratio of its VariantWeight to the sum of all VariantWeight values across all production variants. | Yes | Yes | Float | Minimum value of 0 | |
| accelerator_type_[1, 3] | The size of the Elastic Inference (EI) instance to use for the production variant | Yes | Yes | String | ml.eia1.medium, ml.eia1.large, ml.eia1.xlarge | |
Notes:
- Please use the links in the Resources section for detailed information on each input parameter and SageMaker APIs used in this component
- The parameters `model_name_1` through `model_name_3` are intended to be outputs of the create model component from previous steps in the pipeline. `model_name_[1, 3]` and the other production variant parameters can also be specified directly if the component is used on its own, as in the sketch below.
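As a concrete illustration of the arguments above, this sketch passes the production variant parameters directly instead of wiring them from a create model component. The component path, model name, endpoint names, and tag values are placeholders, and the input names are assumed to match the argument names listed in the tables (KFP SDK v1 syntax):

```python
import kfp
from kfp import components, dsl

# Hypothetical local path to this component's definition.
sagemaker_deploy_op = components.load_component_from_file("deploy/component.yaml")


@dsl.pipeline(name="SageMaker standalone deploy example")
def standalone_deploy_pipeline():
    sagemaker_deploy_op(
        region="us-east-1",
        endpoint_name="my-endpoint",                # must be unique within the region
        endpoint_config_name="my-endpoint-config",
        endpoint_config_tags={"team": "ml"},
        endpoint_tags={"team": "ml"},
        # Production variant 1: an existing SageMaker model created earlier.
        model_name_1="my-existing-sagemaker-model",
        variant_name_1="variant-1",
        instance_type_1="ml.m5.large",
        initial_instance_count_1=1,
        initial_variant_weight_1=1.0,
    )


if __name__ == "__main__":
    kfp.compiler.Compiler().compile(standalone_deploy_pipeline, "standalone_deploy.zip")
```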
## Outputs
| Name | Description |
|---|---|
| endpoint_name | HTTPS Endpoint URL where client applications can send requests using InvokeEndpoint API |
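Once the endpoint is in service, client applications can send requests to it with the InvokeEndpoint API. Below is a minimal sketch using boto3, assuming a CSV-serving model and a placeholder endpoint name; in a pipeline, the name would come from the component's endpoint_name output:

```python
import boto3

# Placeholder; in practice this is the endpoint_name output of this component.
ENDPOINT_NAME = "my-endpoint"

runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")

response = runtime.invoke_endpoint(
    EndpointName=ENDPOINT_NAME,
    ContentType="text/csv",      # must match what the model's inference code expects
    Body=b"5.1,3.5,1.4,0.2",     # example payload; the format is model-specific
)
print(response["Body"].read().decode("utf-8"))
```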
## Requirements
## Samples
### Integrated into a pipeline
MNIST Classification pipeline: Pipeline | Steps
## Resources
- Create Endpoint Configuration
- Create Endpoint