MNIST Classification with KMeans
The mnist-classification-pipeline.py sample pipeline shows how to create an end-to-end ML workflow to train and deploy a model on SageMaker. We will train a classification model using the K-Means algorithm on the MNIST dataset with SageMaker. Additionally, this sample demonstrates how to use SageMaker components v1 and v2 together in a Kubeflow Pipelines workflow. This example was taken from an existing SageMaker example and modified to work with the Amazon SageMaker Components for Kubeflow Pipelines.
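For orientation, the pipeline definition follows the usual Kubeflow Pipelines pattern: each SageMaker component is loaded from its component.yaml and the components are wired together inside a `@dsl.pipeline` function. The sketch below only illustrates that pattern; the component file paths and parameter names are placeholders, not the exact ones used by mnist-classification-pipeline.py.

```python
# Sketch of the general pattern only; the component paths and parameter names below
# are hypothetical placeholders -- see mnist-classification-pipeline.py for the real ones.
from kfp import components, dsl

# SageMaker components are distributed as component.yaml definitions that kfp can load.
train_op = components.load_component_from_file("train/component.yaml")    # assumed path
deploy_op = components.load_component_from_file("deploy/component.yaml")  # assumed path

@dsl.pipeline(name="MNIST KMeans example", description="Structure sketch only")
def mnist_pipeline(role_arn: str = "", bucket_name: str = ""):
    # Train a model, then deploy an endpoint from the training output.
    training = train_op(role=role_arn, bucket_name=bucket_name)  # placeholder parameters
    deploy_op(role=role_arn,
              model_artifact_url=training.outputs["model_artifact_url"])  # placeholder output name
```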
Prerequisites
- Make sure you have completed all the pre-requisites mentioned in this README.md.
Sample MNIST dataset
- Clone this repository to use the pipelines and sample scripts.
```
git clone https://github.com/kubeflow/pipelines.git
cd pipelines/samples/contrib/aws-samples/mnist-kmeans-sagemaker
```
The following commands copy the data extraction and pre-processing script to an S3 bucket, which we will use to store artifacts for the pipeline.
- Create a bucket in the `us-east-1` region. For the purposes of this demonstration, all resources will be created in the `us-east-1` region.
- Specify your S3_BUCKET_NAME:

  ```
  export S3_BUCKET_NAME=
  export SAGEMAKER_REGION=us-east-1

  if [[ $SAGEMAKER_REGION == "us-east-1" ]]; then
      aws s3api create-bucket --bucket ${S3_BUCKET_NAME} --region ${SAGEMAKER_REGION}
  else
      aws s3api create-bucket --bucket ${S3_BUCKET_NAME} --region ${SAGEMAKER_REGION} \
        --create-bucket-configuration LocationConstraint=${SAGEMAKER_REGION}
  fi

  echo ${S3_BUCKET_NAME}
  ```
- Upload the `mnist-kmeans-sagemaker/kmeans_preprocessing.py` file to your bucket with the prefix `mnist_kmeans_example/processing_code/kmeans_preprocessing.py`:

  ```
  aws s3 cp kmeans_preprocessing.py s3://${S3_BUCKET_NAME}/mnist_kmeans_example/processing_code/kmeans_preprocessing.py
  ```
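If you would rather do this upload from Python than the AWS CLI, an equivalent boto3 sketch (assuming your AWS credentials and the environment variables above are already set) looks like this:

```python
# Alternative to the "aws s3 cp" command above: upload the preprocessing script with boto3.
import os
import boto3

bucket = os.environ["S3_BUCKET_NAME"]
s3 = boto3.client("s3", region_name=os.environ.get("SAGEMAKER_REGION", "us-east-1"))

# Copy kmeans_preprocessing.py to the prefix the pipeline expects.
s3.upload_file(
    "kmeans_preprocessing.py",
    bucket,
    "mnist_kmeans_example/processing_code/kmeans_preprocessing.py",
)
```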
Compile and run the pipelines
- To compile the pipeline, run `python mnist-classification-pipeline.py`. This will create a `tar.gz` file. (A programmatic way to compile and submit the run with the kfp SDK is sketched after this list.)
- In the Kubeflow Pipelines UI, upload this compiled pipeline specification (the `.tar.gz` file) and click on create run.
- Provide the SageMaker execution `role_arn` you created and the `bucket_name` you created as pipeline inputs.
- Once the pipeline completes, you can go to `batch_transform_output` to check your batch prediction results. You will also have a model endpoint in service. Refer to the Prediction section below to run predictions against your deployed model endpoint.
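As an alternative to the UI steps above, the kfp SDK can submit the compiled package and its inputs programmatically. This is only a sketch; the host URL and the placeholder argument values are assumptions you must replace for your own deployment:

```python
# Sketch of submitting the compiled pipeline with the kfp SDK instead of the UI.
import kfp

# Running "python mnist-classification-pipeline.py" produces this archive.
package = "mnist-classification-pipeline.py.tar.gz"

client = kfp.Client(host="http://localhost:8080")  # assumed port-forwarded KFP endpoint
client.create_run_from_pipeline_package(
    package,
    arguments={
        "role_arn": "arn:aws:iam::123456789012:role/your-sagemaker-execution-role",  # placeholder
        "bucket_name": "your-s3-bucket",  # placeholder
    },
)
```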
Prediction
- Find your endpoint name by checking the `sagemaker_resource_name` field under Output artifacts of the Endpoint component in the pipeline run, and export it:

  ```
  export ENDPOINT_NAME=
  ```
- Set up AWS credentials with `sagemaker:InvokeEndpoint` access. Sample commands
- Run the script below to invoke the endpoint:

  ```
  python invoke_endpoint.py $ENDPOINT_NAME
  ```
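Under the hood, invoking the endpoint is a single call to the SageMaker runtime API. The sketch below shows roughly what that call looks like with boto3; the CSV payload shape is an assumption, so check invoke_endpoint.py for the exact request the sample builds.

```python
# Minimal sketch of calling a SageMaker endpoint directly with boto3.
# The payload is an assumption (one comma-separated 784-pixel MNIST image);
# see invoke_endpoint.py for the exact request the sample sends.
import sys
import boto3

endpoint_name = sys.argv[1]
runtime = boto3.client("sagemaker-runtime")

payload = ",".join(["0"] * 784)  # placeholder: a single all-zero 28x28 image as CSV
response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="text/csv",
    Body=payload,
)
print(response["Body"].read().decode("utf-8"))
```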
Cleaning up the endpoint
You can find the model/endpoint configuration name in the `sagemaker_resource_name` field under Output artifacts of the EndpointConfig/Model component in the pipeline run.

```
export ENDPOINT_CONFIG_NAME=
export MODEL_NAME=
```
To delete all the endpoint resources, use the commands below.
Note: The namespace for the standard Kubeflow installation is "kubeflow". For multi-tenant installations, your namespace is shown on the left in the navigation bar.
```
export MY_KUBEFLOW_NAMESPACE=

kubectl delete endpoint $ENDPOINT_NAME -n $MY_KUBEFLOW_NAMESPACE
kubectl delete endpointconfig $ENDPOINT_CONFIG_NAME -n $MY_KUBEFLOW_NAMESPACE
kubectl delete model $MODEL_NAME -n $MY_KUBEFLOW_NAMESPACE
```
Components source
Hyperparameter Tuning: source code
Training: source code
Endpoint: source code
Endpoint Config: source code
Model: source code
Batch Transformation: source code