pipelines/samples/contrib/aws-samples
Leonard O' Sullivan 4aa11c3c7f
feat(components) Adds RoboMaker and SageMaker RLEstimator components (#4813)
* Adds RoboMaker and SageMaker RLEstimator components

* Genericise samples

* Adds better logging and updates shim component in samples

* Adds fixes for PR comments. Updates tests accordingly

* Adds docker image reference for integration tests. Allows for setting job_name for RLEstimator training jobs

* Separate RM and SM execution roles

* Remove README reference to VPC config items

* Adds more reliable integration test for RoboMaker Simulation Job

* Simplifies integration tests

* Reverted test container entrypoints

* Update black formatting

* Update components for redbackthomson repo

* Prefix RLEstimator job name

* Add RoboMakerFullAccess to generated roles

* Update version to official 1.1.0

* Formatting int test file

* Add PassRole IAM permission to OIDC

* Adds ROBOMAKER_EXECUTION_ROLE_ARN to build vars

Co-authored-by: Nicholas Thomson <nithomso@amazon.com>
2020-12-11 13:27:27 -08:00
ground_truth_pipeline_demo refactor(components): AWS SageMaker - Full component refactoring (#4336) 2020-10-27 14:17:57 -07:00
mnist-kmeans-sagemaker refactor(components): AWS SageMaker - Full component refactoring (#4336) 2020-10-27 14:17:57 -07:00
rlestimator_pipeline feat(components) Adds RoboMaker and SageMaker RLEstimator components (#4813) 2020-12-11 13:27:27 -08:00
robomaker_simulation feat(components) Adds RoboMaker and SageMaker RLEstimator components (#4813) 2020-12-11 13:27:27 -08:00
sagemaker_debugger_demo refactor(components): AWS SageMaker - Full component refactoring (#4336) 2020-10-27 14:17:57 -07:00
simple_train_pipeline refactor(components): AWS SageMaker - Full component refactoring (#4336) 2020-10-27 14:17:57 -07:00
titanic-survival-prediction refactor(components): AWS SageMaker - Full component refactoring (#4336) 2020-10-27 14:17:57 -07:00
OWNERS Add more approvers in AWS sagemaker components (#3740) 2020-05-15 11:27:36 -07:00
README.md feat(components): [AWS SageMaker] Minimize inputs for mnist classification pipeline (#4192) 2020-07-17 13:30:50 -07:00

Sample AWS SageMaker Kubeflow Pipelines

This folder contains many example pipelines which use AWS SageMaker Components for KFP. The following sections explain the setup needed to run these pipelines. Once you are done with the setup, simple_train_pipeline is a good place to start if you have never used these components before.

Prerequisites

  1. You need a cluster with Kubeflow installed on it. See the guide "Install Kubeflow on AWS cluster".
  2. Install the following on your local machine or EC2 instance. These are recommended tools; not all of them are required.
    1. AWS CLI. If you are using an IAM user, configure your Access Key ID, Secret Access Key and preferred AWS Region by running: aws configure
    2. aws-iam-authenticator version 0.1.31 or above
    3. eksctl version above 0.15
    4. kubectl, within one minor version of your Kubernetes cluster version
    5. KFP SDK (installs the dsl-compile and kfp CLIs)
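
As a quick check of the list above, the following sketch reports which of the named tools are missing from PATH. The helper name missing_tools is hypothetical and not part of these samples, and the sketch only checks for presence; it does not verify the version constraints.

```shell
# Hypothetical helper (not part of the samples): prints each named tool
# that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools named in the prerequisites list.
missing="$(missing_tools aws aws-iam-authenticator eksctl kubectl kfp dsl-compile)"
if [ -n "$missing" ]; then
  echo "Not found on PATH:"
  echo "$missing"
else
  echo "All listed tools were found."
fi
```

Since not every tool is required, a missing entry is a hint rather than an error.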

IAM Permissions

To use the AWS KFP Components, the KFP component pods need access to AWS SageMaker. There are two ways to grant this access. Option 1 requires an EKS cluster.

Option 1 (Recommended): IAM roles for service accounts.

  1. Enable OIDC support on the EKS cluster
    eksctl utils associate-iam-oidc-provider --cluster <cluster_name> \
     --region <cluster_region> --approve
    
  2. Note down the OIDC issuer URL, which has the form oidc.eks.<region>.amazonaws.com/id/<OIDC_ID>.
    aws eks describe-cluster --name <cluster_name> --query "cluster.identity.oidc.issuer" --output text
    
  3. Create a file named trust.json with the following content.
    Replace <OIDC_URL> with your OIDC issuer URL (do not include https://) and <AWS_ACCOUNT_NUMBER> with your AWS account number.
    # Replace these two with proper values 
    OIDC_URL="<OIDC_URL>"
    AWS_ACC_NUM="<AWS_ACCOUNT_NUMBER>"
    
    # Run this to create trust.json file
    cat <<EOF > trust.json
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Federated": "arn:aws:iam::$AWS_ACC_NUM:oidc-provider/$OIDC_URL"
          },
          "Action": "sts:AssumeRoleWithWebIdentity",
          "Condition": {
            "StringEquals": {
              "$OIDC_URL:aud": "sts.amazonaws.com",
              "$OIDC_URL:sub": "system:serviceaccount:kubeflow:pipeline-runner"
            }
          }
        }
      ]
    }
    EOF
    
  4. Create an IAM role using trust.json. Make a note of the ARN returned in the output.
    aws iam create-role --role-name kfp-example-pod-role --assume-role-policy-document file://trust.json
    aws iam attach-role-policy --role-name kfp-example-pod-role --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
    aws iam get-role --role-name kfp-example-pod-role --output text --query 'Role.Arn'
    
  5. Edit your pipeline-runner service account.
    kubectl edit -n kubeflow serviceaccount pipeline-runner
    
    Add eks.amazonaws.com/role-arn: <role_arn> under annotations, then save the file. In the example below, only line 5 is added:
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      annotations:
        eks.amazonaws.com/role-arn: <role_arn>
      creationTimestamp: "2020-04-16T05:48:06Z"
      labels:
        app: pipeline-runner
        app.kubernetes.io/component: pipelines-runner
        app.kubernetes.io/instance: pipelines-runner-0.2.0
        app.kubernetes.io/managed-by: kfctl
        app.kubernetes.io/name: pipelines-runner
        app.kubernetes.io/part-of: kubeflow
        app.kubernetes.io/version: 0.2.0
      name: pipeline-runner
      namespace: kubeflow
      resourceVersion: "11787"
      selfLink: /api/v1/namespaces/kubeflow/serviceaccounts/pipeline-runner
      uid: d86234bd-7fa5-11ea-a8f2-02934be6dc88
    secrets:
    - name: pipeline-runner-token-dkjrk
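
Parts of Option 1 can also be scripted. The sketch below is one possible approach, not the method used by these samples. The names strip_scheme, get_oidc_url and annotate_pipeline_runner are hypothetical. It assumes the issuer URL is returned with a leading https://, and it uses kubectl annotate as a non-interactive alternative to kubectl edit.

```shell
# Hypothetical helper: removes a leading "https://" so the value can be
# used as OIDC_URL in trust.json.
strip_scheme() {
  printf '%s\n' "${1#https://}"
}

# Hypothetical wrapper around the describe-cluster command from step 2.
# $1 = cluster name; requires configured AWS credentials.
get_oidc_url() {
  strip_scheme "$(aws eks describe-cluster --name "$1" \
    --query "cluster.identity.oidc.issuer" --output text)"
}

# Hypothetical wrapper: sets the role annotation from step 5 without
# opening an editor. $1 = role ARN; requires access to the cluster.
annotate_pipeline_runner() {
  kubectl annotate serviceaccount pipeline-runner -n kubeflow \
    "eks.amazonaws.com/role-arn=$1" --overwrite
}

# The pure helper can be tried on its own (the ID here is made up).
strip_scheme "https://oidc.eks.us-west-2.amazonaws.com/id/EXAMPLE"
```

With credentials configured, OIDC_URL="$(get_oidc_url <cluster_name>)" would feed step 3, and annotate_pipeline_runner <role_arn> would replace the manual edit in step 5.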
    

Option 2: Store the IAM credentials as an aws-secret in the Kubernetes cluster, then use them in the components.

  1. You need credentials for an IAM user with the AmazonSageMakerFullAccess policy. Apply them to the Kubernetes cluster, replacing <AWS_ACCESS_KEY_IN_BASE64> and <AWS_SECRET_ACCESS_IN_BASE64> with your base64-encoded values.

    Note: To get the base64 string, run echo -n $AWS_ACCESS_KEY_ID | base64

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Secret
    metadata:
      name: aws-secret
      namespace: kubeflow
    type: Opaque
    data:
      AWS_ACCESS_KEY_ID: <AWS_ACCESS_KEY_IN_BASE64>
      AWS_SECRET_ACCESS_KEY: <AWS_SECRET_ACCESS_IN_BASE64>
    EOF
    
  2. Use the stored aws-secret in your pipeline code by adding this call to each component in the pipeline: .apply(use_aws_secret('aws-secret', 'AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY'))
    Kubeflow Document
    Example Code (uncomment this line)
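
The base64 values for the secret in step 1 must be encoded without a trailing newline, which is why the note uses echo -n. A minimal sketch of the same idea using printf follows. The helper name b64 and the two example values are hypothetical placeholders, not real credentials.

```shell
# Hypothetical helper: base64-encodes its argument. printf '%s' adds no
# trailing newline to the input, and tr removes any line wrapping that
# base64 adds to long output.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Placeholder values for illustration only; never commit real credentials.
EXAMPLE_KEY_ID="AKIAEXAMPLEKEYID"
EXAMPLE_SECRET="example-secret-access-key"

# Prints the two lines that go under "data:" in the Secret manifest.
echo "AWS_ACCESS_KEY_ID: $(b64 "$EXAMPLE_KEY_ID")"
echo "AWS_SECRET_ACCESS_KEY: $(b64 "$EXAMPLE_SECRET")"
```

An alternative that avoids manual encoding is kubectl create secret generic aws-secret -n kubeflow with one --from-literal=KEY=VALUE flag per credential, since kubectl then encodes the values itself.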

Inputs to the pipeline

Role Input

Note: Ignore this section if you plan to run the titanic-survival-prediction example.

SageMaker jobs created by KFP use this role to access S3 buckets and other AWS resources. Run these commands to create the sagemaker-execution-role.
Note down the Role ARN printed by the last command. You pass this Role ARN as an input to the pipeline.

TRUST="{ \"Version\": \"2012-10-17\", \"Statement\": [ { \"Effect\": \"Allow\", \"Principal\": { \"Service\": \"sagemaker.amazonaws.com\" }, \"Action\": \"sts:AssumeRole\" } ] }"
aws iam create-role --role-name kfp-example-sagemaker-execution-role --assume-role-policy-document "$TRUST"
aws iam attach-role-policy --role-name kfp-example-sagemaker-execution-role --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
aws iam attach-role-policy --role-name kfp-example-sagemaker-execution-role --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
aws iam get-role --role-name kfp-example-sagemaker-execution-role --output text --query 'Role.Arn'

# note down the Role ARN.
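
Instead of copying the Role ARN by hand, it can be captured in a shell variable and sanity-checked before being passed to a pipeline. This is a sketch under stated assumptions: the helper names is_role_arn and get_role_arn are hypothetical, and the pattern only accepts ARNs in the standard aws partition.

```shell
# Hypothetical helper: succeeds only if its argument looks like an IAM
# role ARN in the standard "aws" partition.
is_role_arn() {
  case "$1" in
    arn:aws:iam::[0-9]*:role/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper around the get-role command shown above.
# $1 = role name; requires configured AWS credentials.
get_role_arn() {
  aws iam get-role --role-name "$1" --output text --query 'Role.Arn'
}

# Usage (not run here because it needs AWS credentials):
#   SAGEMAKER_ROLE_ARN="$(get_role_arn kfp-example-sagemaker-execution-role)"
#   is_role_arn "$SAGEMAKER_ROLE_ARN" || echo "unexpected value: $SAGEMAKER_ROLE_ARN" >&2
```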