feat(components): SageMaker V2 model monitor component and testing (#9253)

* Add model monitor component and integration tests

* Generate model monitor using updated generator

* Add sleep for monitoring schedule

* Update requirements v2

* Change model monitor image url

* minor fix

* minor fix

* minor fix

* Add unit testing for MonitoringSchedule

* Delete assume-role.json

* Add doc and sample pipeline for Monitoring Schedule

* Regenerate using the latest code generator.
Make parameter description 1 sentence long.

* Revert "Add doc and sample pipeline for Monitoring Schedule"

This reverts commit 6b3b7cc6f5.

* Delete print statements

* Update component with new generator

* address comments

* Add retry for _delete_resource

* Add try catch protection for _get_resource and _delete_resource

* Add integration tests for monitoring job definition components

* Update is_endpoint_deleted for new _get_resource

* Add integration test for updating monitoring schedule

* Remove update from canary

* Add doc and sample pipeline for Monitoring Schedule

* Add doc for monitoring job definition
Update doc for monitoring schedule

* Remove sample for monitoring schedule

* Address comments

* Address comments

* Address comment for unit test

* Address doc comments

* Address test comments
rd-pong 2023-05-09 12:42:33 -07:00 committed by GitHub
parent 2bc30e9669
commit 07e67bb0ca
GPG Key ID: 4AEE18F83AFDEB23
46 changed files with 3506 additions and 25 deletions

@@ -0,0 +1,49 @@
# SageMaker Data Quality Job Definition Kubeflow Pipelines component v2
## Overview
Component to create a definition for a job that monitors data quality and drift.
The component can be used with the [Monitoring Schedule component](../MonitoringSchedule) to create a monitoring schedule that regularly starts Amazon SageMaker Processing Jobs to monitor the data captured for an Amazon SageMaker Endpoint.
See the SageMaker Components for Kubeflow Pipelines versions section in [SageMaker Components for Kubeflow Pipelines](https://docs.aws.amazon.com/sagemaker/latest/dg/kubernetes-sagemaker-components-for-kubeflow-pipelines.html#kubeflow-pipeline-components) to learn about the differences between the version 1 and version 2 components.
### Kubeflow Pipelines backend compatibility
SageMaker components are currently supported with the Kubeflow Pipelines backend v1. This means you will need to use the KFP SDK 1.8.x to create your pipelines.
## Getting Started
Follow [this guide](https://github.com/kubeflow/pipelines/tree/master/samples/contrib/aws-samples#prerequisites) to set up the prerequisites for DataQualityJobDefinition depending on your deployment.
## Input Parameters
Find the high-level component input parameters and their descriptions in the [component's input specification](./component.yaml). Parameters with `JsonObject` or `JsonArray` type inputs have nested fields; you will have to refer to the [DataQualityJobDefinition CRD specification](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/dataqualityjobdefinition/) for the respective structure and pass the input in JSON format.
A quick way to see the converted JSON style input is to copy the [sample DataQualityJobDefinition spec](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/dataqualityjobdefinition/#spec) and convert it to JSON using a YAML to JSON converter like [this website](https://jsonformatter.org/yaml-to-json).
For example, `dataQualityBaselineConfig` is of type `object` and has the following structure:
```yaml
dataQualityBaselineConfig:
  baseliningJobName: string
  constraintsResource:
    s3URI: string
  statisticsResource:
    s3URI: string
```
The JSON style input for the above parameter would be:
```python
data_quality_baseline_config = {
    "constraintsResource": {
        "s3URI": "s3://<path-to-file>/constraints.json",
    },
    "statisticsResource": {
        "s3URI": "s3://<path-to-file>/statistics.json"
    },
}
```
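Because the component's `JsonObject` inputs are passed as JSON strings, you can build the configuration as a plain Python dict and serialize it with `json.dumps` before wiring it into the pipeline. A minimal sketch (the S3 prefix is a hypothetical placeholder, not from this PR):

```python
import json

# Hypothetical S3 prefix; replace with your own baselining output path.
baseline_prefix = "s3://my-bucket/baselining/results"

data_quality_baseline_config = {
    "constraintsResource": {"s3URI": f"{baseline_prefix}/constraints.json"},
    "statisticsResource": {"s3URI": f"{baseline_prefix}/statistics.json"},
}

# Serialize to the JSON string the component's JsonObject input expects.
config_json = json.dumps(data_quality_baseline_config)
print(config_json)
```

The same round trip works for any of the `JsonObject`/`JsonArray` parameters listed in the component specification.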
For a more detailed explanation of parameters, please refer to the [AWS SageMaker API Documentation for CreateDataQualityJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateDataQualityJobDefinition.html).
## References
- [Monitor models for data and model quality, bias, and explainability](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor.html)
- [Data Quality Job Definition CRD specification](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/dataqualityjobdefinition/)
- [AWS SageMaker API Documentation for CreateDataQualityJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateDataQualityJobDefinition.html)

@@ -0,0 +1,124 @@
name: "SageMaker - DataQualityJobDefinition"
description: Create DataQualityJobDefinition
inputs:
- {
name: region,
type: String,
description: "The region to use for the monitoring job definition",
}
###########################GENERATED SECTION BELOW############################
- {
name: data_quality_app_specification,
type: JsonObject,
default: '{}',
description: "Specifies the container that runs the monitoring job.",
}
- {
name: data_quality_baseline_config,
type: JsonObject,
default: '{}',
description: "Configures the constraints and baselines for the monitoring job.",
}
- {
name: data_quality_job_input,
type: JsonObject,
default: '{}',
description: "A list of inputs for the monitoring job.",
}
- {
name: data_quality_job_output_config,
type: JsonObject,
default: '{}',
description: "The output configuration for monitoring jobs.",
}
- {
name: job_definition_name,
type: String,
default: '',
description: "The name for the monitoring job definition.",
}
- {
name: job_resources,
type: JsonObject,
default: '{}',
description: "Identifies the resources to deploy for a monitoring job.",
}
- {
name: network_config,
type: JsonObject,
default: '{}',
description: "Specifies networking configuration for the monitoring job.",
}
- {
name: role_arn,
type: String,
default: '',
description: "The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.",
}
- {
name: stopping_condition,
type: JsonObject,
default: '{}',
description: "A time limit for how long the monitoring job is allowed to run before stopping.",
}
- {
name: tags,
type: JsonArray,
default: '[]',
description: "(Optional) An array of key-value pairs.",
}
###########################GENERATED SECTION ABOVE############################
outputs:
###########################GENERATED SECTION BELOW############################
- {
name: ack_resource_metadata,
type: JsonObject,
description: "All CRs managed by ACK have a common `Status.ACKResourceMetadata` member.",
}
- {
name: conditions,
type: JsonArray,
description: "All CRs managed by ACK have a common `Status.Conditions` member.",
}
- {
name: sagemaker_resource_name,
type: String,
description: "Resource name on SageMaker",
}
###########################GENERATED SECTION ABOVE############################
implementation:
container:
image: rdpen/test-modelmoni:31
command: [python3]
args:
- DataQualityJobDefinition/src/DataQualityJobDefinition_component.py
- --region
- { inputValue: region }
###########################GENERATED SECTION BELOW############################
- --data_quality_app_specification
- { inputValue: data_quality_app_specification }
- --data_quality_baseline_config
- { inputValue: data_quality_baseline_config }
- --data_quality_job_input
- { inputValue: data_quality_job_input }
- --data_quality_job_output_config
- { inputValue: data_quality_job_output_config }
- --job_definition_name
- { inputValue: job_definition_name }
- --job_resources
- { inputValue: job_resources }
- --network_config
- { inputValue: network_config }
- --role_arn
- { inputValue: role_arn }
- --stopping_condition
- { inputValue: stopping_condition }
- --tags
- { inputValue: tags }
###########################GENERATED SECTION ABOVE############################

@@ -0,0 +1,133 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
from typing import Dict
import json
from DataQualityJobDefinition.src.DataQualityJobDefinition_spec import (
SageMakerDataQualityJobDefinitionInputs,
SageMakerDataQualityJobDefinitionOutputs,
SageMakerDataQualityJobDefinitionSpec,
)
from commonv2.sagemaker_component import (
SageMakerComponent,
ComponentMetadata,
SageMakerJobStatus,
)
from commonv2 import snake_to_camel
@ComponentMetadata(
name="SageMaker - DataQualityJobDefinition",
description="",
spec=SageMakerDataQualityJobDefinitionSpec,
)
class SageMakerDataQualityJobDefinitionComponent(SageMakerComponent):
"""SageMaker component for DataQualityJobDefinition."""
def Do(self, spec: SageMakerDataQualityJobDefinitionSpec):
self.namespace = self._get_current_namespace()
logging.info("Current namespace: " + self.namespace)
############GENERATED SECTION BELOW############
self.job_name = spec.inputs.job_definition_name = (
spec.inputs.job_definition_name
if spec.inputs.job_definition_name
else SageMakerComponent._generate_unique_timestamped_id(
prefix="data-quality-job-definition"
)
)
self.group = "sagemaker.services.k8s.aws"
self.version = "v1alpha1"
self.plural = "dataqualityjobdefinitions"
self.spaced_out_resource_name = "Data Quality Job Definition"
self.job_request_outline_location = (
"DataQualityJobDefinition/src/DataQualityJobDefinition_request.yaml.tpl"
)
self.job_request_location = (
"DataQualityJobDefinition/src/DataQualityJobDefinition_request.yaml"
)
self.update_supported = False
############GENERATED SECTION ABOVE############
super().Do(spec.inputs, spec.outputs, spec.output_paths)
def _create_job_request(
self,
inputs: SageMakerDataQualityJobDefinitionInputs,
outputs: SageMakerDataQualityJobDefinitionOutputs,
) -> Dict:
return super()._create_job_yaml(inputs, outputs)
def _submit_job_request(self, request: Dict) -> object:
return super()._create_resource(request, 12, 15)
def _on_job_terminated(self):
super()._delete_custom_resource()
def _after_submit_job_request(
self,
job: object,
request: Dict,
inputs: SageMakerDataQualityJobDefinitionInputs,
outputs: SageMakerDataQualityJobDefinitionOutputs,
):
pass
def _get_job_status(self):
return SageMakerJobStatus(is_completed=True, raw_status="Completed")
def _get_upgrade_status(self):
return self._get_job_status()
def _after_job_complete(
self,
job: object,
request: Dict,
inputs: SageMakerDataQualityJobDefinitionInputs,
outputs: SageMakerDataQualityJobDefinitionOutputs,
):
# prepare component outputs (defined in the spec)
ack_statuses = super()._get_resource()["status"]
############GENERATED SECTION BELOW############
outputs.ack_resource_metadata = str(
ack_statuses["ackResourceMetadata"]
if "ackResourceMetadata" in ack_statuses
else None
)
outputs.conditions = str(
ack_statuses["conditions"] if "conditions" in ack_statuses else None
)
outputs.sagemaker_resource_name = self.job_name
############GENERATED SECTION ABOVE############
if __name__ == "__main__":
import sys
spec = SageMakerDataQualityJobDefinitionSpec(sys.argv[1:])
component = SageMakerDataQualityJobDefinitionComponent()
component.Do(spec)

@@ -0,0 +1,17 @@
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: DataQualityJobDefinition
metadata:
name:
annotations:
services.k8s.aws/region:
spec:
dataQualityAppSpecification:
dataQualityBaselineConfig:
dataQualityJobInput:
dataQualityJobOutputConfig:
jobDefinitionName:
jobResources:
networkConfig:
roleARN:
stoppingCondition:
tags:

@@ -0,0 +1,147 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Specification for the SageMaker - DataQualityJobDefinition"""
from dataclasses import dataclass
from typing import List
from commonv2.sagemaker_component_spec import (
SageMakerComponentSpec,
SageMakerComponentBaseOutputs,
)
from commonv2.spec_input_parsers import SpecInputParsers
from commonv2.common_inputs import (
COMMON_INPUTS,
SageMakerComponentCommonInputs,
SageMakerComponentInput as Input,
SageMakerComponentOutput as Output,
SageMakerComponentInputValidator as InputValidator,
SageMakerComponentOutputValidator as OutputValidator,
)
@dataclass(frozen=False)
class SageMakerDataQualityJobDefinitionInputs(SageMakerComponentCommonInputs):
"""Defines the set of inputs for the DataQualityJobDefinition component."""
data_quality_app_specification: Input
data_quality_baseline_config: Input
data_quality_job_input: Input
data_quality_job_output_config: Input
job_definition_name: Input
job_resources: Input
network_config: Input
role_arn: Input
stopping_condition: Input
tags: Input
@dataclass
class SageMakerDataQualityJobDefinitionOutputs(SageMakerComponentBaseOutputs):
"""Defines the set of outputs for the DataQualityJobDefinition component."""
ack_resource_metadata: Output
conditions: Output
sagemaker_resource_name: Output
class SageMakerDataQualityJobDefinitionSpec(
SageMakerComponentSpec[
SageMakerDataQualityJobDefinitionInputs,
SageMakerDataQualityJobDefinitionOutputs,
]
):
INPUTS: SageMakerDataQualityJobDefinitionInputs = SageMakerDataQualityJobDefinitionInputs(
data_quality_app_specification=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Specifies the container that runs the monitoring job.",
required=True,
),
data_quality_baseline_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Configures the constraints and baselines for the monitoring job.",
required=False,
),
data_quality_job_input=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="A list of inputs for the monitoring job.",
required=True,
),
data_quality_job_output_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="The output configuration for monitoring jobs.",
required=True,
),
job_definition_name=InputValidator(
input_type=str,
description="The name for the monitoring job definition.",
required=True,
),
job_resources=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Identifies the resources to deploy for a monitoring job.",
required=True,
),
network_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Specifies networking configuration for the monitoring job.",
required=False,
),
role_arn=InputValidator(
input_type=str,
description="The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.",
required=True,
),
stopping_condition=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="A time limit for how long the monitoring job is allowed to run before stopping.",
required=False,
),
tags=InputValidator(
input_type=SpecInputParsers.yaml_or_json_list,
description="(Optional) An array of key-value pairs.",
required=False,
),
**vars(COMMON_INPUTS),
)
OUTPUTS = SageMakerDataQualityJobDefinitionOutputs(
ack_resource_metadata=OutputValidator(
description="All CRs managed by ACK have a common `Status.ACKResourceMetadata` member.",
),
conditions=OutputValidator(
description="All CRs managed by ACK have a common `Status.Conditions` member.",
),
sagemaker_resource_name=OutputValidator(
description="Resource name on SageMaker",
),
)
def __init__(self, arguments: List[str]):
super().__init__(
arguments,
SageMakerDataQualityJobDefinitionInputs,
SageMakerDataQualityJobDefinitionOutputs,
)
@property
def inputs(self) -> SageMakerDataQualityJobDefinitionInputs:
return self._inputs
@property
def outputs(self) -> SageMakerDataQualityJobDefinitionOutputs:
return self._outputs
@property
def output_paths(self) -> SageMakerDataQualityJobDefinitionOutputs:
return self._output_paths

@@ -0,0 +1,62 @@
# SageMaker Model Bias Job Definition Kubeflow Pipelines component v2
## Overview
Component to create the definition for a model bias job.
The component can be used with the [Monitoring Schedule component](../MonitoringSchedule) to create a monitoring schedule that regularly starts Amazon SageMaker Processing Jobs to monitor the data captured for an Amazon SageMaker Endpoint.
See the SageMaker Components for Kubeflow Pipelines versions section in [SageMaker Components for Kubeflow Pipelines](https://docs.aws.amazon.com/sagemaker/latest/dg/kubernetes-sagemaker-components-for-kubeflow-pipelines.html#kubeflow-pipeline-components) to learn about the differences between the version 1 and version 2 components.
### Kubeflow Pipelines backend compatibility
SageMaker components are currently supported with the Kubeflow Pipelines backend v1. This means you will need to use the KFP SDK 1.8.x to create your pipelines.
## Getting Started
Follow [this guide](https://github.com/kubeflow/pipelines/tree/master/samples/contrib/aws-samples#prerequisites) to set up the prerequisites for ModelBiasJobDefinition depending on your deployment.
## Input Parameters
Find the high-level component input parameters and their descriptions in the [component's input specification](./component.yaml). Parameters with `JsonObject` or `JsonArray` type inputs have nested fields; you will have to refer to the [ModelBiasJobDefinition CRD specification](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/modelbiasjobdefinition/) for the respective structure and pass the input in JSON format.
A quick way to see the converted JSON style input is to copy the [sample ModelBiasJobDefinition spec](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/modelbiasjobdefinition/#spec) and convert it to JSON using a YAML to JSON converter like [this website](https://jsonformatter.org/yaml-to-json).
For example, `modelBiasJobInput` is of type `object` and has the following structure:
```yaml
modelBiasJobInput:
  endpointInput:
    endTimeOffset: string
    endpointName: string
    featuresAttribute: string
    inferenceAttribute: string
    localPath: string
    probabilityAttribute: string
    probabilityThresholdAttribute: number
    s3DataDistributionType: string
    s3InputMode: string
    startTimeOffset: string
  groundTruthS3Input:
    s3URI: string
```
The JSON style input for the above parameter would be:
```python
model_bias_job_input = {
    "endpointInput": {
        "endpointName": "<endpoint-name>",  # change to your endpoint
        "localPath": "<path>",
        "s3InputMode": "File",
        "s3DataDistributionType": "FullyReplicated",
        "probabilityThresholdAttribute": 0.8,
        "startTimeOffset": "-PT1H",
        "endTimeOffset": "-PT0H",
    },
    "groundTruthS3Input": {
        "s3URI": "s3://<path-to-directory>/ground_truth_data"
    },
}
```
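The `startTimeOffset` and `endTimeOffset` values above are negative ISO 8601 durations relative to the monitoring window (e.g. `-PT1H` is one hour before). A small validation sketch, assuming only the hour/minute forms used here (this regex is deliberately simpler than the full ISO 8601 duration grammar):

```python
import re

# Simplified pattern: negative ISO 8601 duration with hours and/or minutes,
# e.g. "-PT1H", "-PT30M", "-PT1H30M". Not the full ISO 8601 grammar.
OFFSET_RE = re.compile(r"^-PT(?:\d+H)?(?:\d+M)?$")

def is_valid_offset(offset: str) -> bool:
    """Return True if offset looks like a negative hour/minute duration."""
    # Guard against the degenerate "-PT" (both optional groups empty).
    return offset != "-PT" and bool(OFFSET_RE.match(offset))

print(is_valid_offset("-PT1H"))  # True
print(is_valid_offset("PT1H"))   # False: missing leading minus
```

Validating these strings up front avoids a round trip to the API just to discover a malformed offset.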
For a more detailed explanation of parameters, please refer to the [AWS SageMaker API Documentation for CreateModelBiasJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelBiasJobDefinition.html).
## References
- [Monitor models for data and model quality, bias, and explainability](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor.html)
- [Model Bias Job Definition CRD specification](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/modelbiasjobdefinition/)
- [AWS SageMaker API Documentation for CreateModelBiasJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelBiasJobDefinition.html)

@@ -0,0 +1,124 @@
name: "SageMaker - ModelBiasJobDefinition"
description: Create ModelBiasJobDefinition
inputs:
- {
name: region,
type: String,
description: "The region to use for the monitoring job definition",
}
###########################GENERATED SECTION BELOW############################
- {
name: job_definition_name,
type: String,
default: '',
description: "The name of the bias job definition.",
}
- {
name: job_resources,
type: JsonObject,
default: '{}',
description: "Identifies the resources to deploy for a monitoring job.",
}
- {
name: model_bias_app_specification,
type: JsonObject,
default: '{}',
description: "Configures the model bias job to run a specified Docker container image.",
}
- {
name: model_bias_baseline_config,
type: JsonObject,
default: '{}',
description: "The baseline configuration for a model bias job.",
}
- {
name: model_bias_job_input,
type: JsonObject,
default: '{}',
description: "Inputs for the model bias job.",
}
- {
name: model_bias_job_output_config,
type: JsonObject,
default: '{}',
description: "The output configuration for monitoring jobs.",
}
- {
name: network_config,
type: JsonObject,
default: '{}',
description: "Networking options for a model bias job.",
}
- {
name: role_arn,
type: String,
default: '',
description: "The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.",
}
- {
name: stopping_condition,
type: JsonObject,
default: '{}',
description: "A time limit for how long the monitoring job is allowed to run before stopping.",
}
- {
name: tags,
type: JsonArray,
default: '[]',
description: "(Optional) An array of key-value pairs.",
}
###########################GENERATED SECTION ABOVE############################
outputs:
###########################GENERATED SECTION BELOW############################
- {
name: ack_resource_metadata,
type: JsonObject,
description: "All CRs managed by ACK have a common `Status.ACKResourceMetadata` member.",
}
- {
name: conditions,
type: JsonArray,
description: "All CRs managed by ACK have a common `Status.Conditions` member.",
}
- {
name: sagemaker_resource_name,
type: String,
description: "Resource name on SageMaker",
}
###########################GENERATED SECTION ABOVE############################
implementation:
container:
image: rdpen/test-modelmoni:31
command: [python3]
args:
- ModelBiasJobDefinition/src/ModelBiasJobDefinition_component.py
- --region
- { inputValue: region }
###########################GENERATED SECTION BELOW############################
- --job_definition_name
- { inputValue: job_definition_name }
- --job_resources
- { inputValue: job_resources }
- --model_bias_app_specification
- { inputValue: model_bias_app_specification }
- --model_bias_baseline_config
- { inputValue: model_bias_baseline_config }
- --model_bias_job_input
- { inputValue: model_bias_job_input }
- --model_bias_job_output_config
- { inputValue: model_bias_job_output_config }
- --network_config
- { inputValue: network_config }
- --role_arn
- { inputValue: role_arn }
- --stopping_condition
- { inputValue: stopping_condition }
- --tags
- { inputValue: tags }
###########################GENERATED SECTION ABOVE############################

@@ -0,0 +1,133 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
from typing import Dict
import json
from ModelBiasJobDefinition.src.ModelBiasJobDefinition_spec import (
SageMakerModelBiasJobDefinitionInputs,
SageMakerModelBiasJobDefinitionOutputs,
SageMakerModelBiasJobDefinitionSpec,
)
from commonv2.sagemaker_component import (
SageMakerComponent,
ComponentMetadata,
SageMakerJobStatus,
)
from commonv2 import snake_to_camel
@ComponentMetadata(
name="SageMaker - ModelBiasJobDefinition",
description="",
spec=SageMakerModelBiasJobDefinitionSpec,
)
class SageMakerModelBiasJobDefinitionComponent(SageMakerComponent):
"""SageMaker component for ModelBiasJobDefinition."""
def Do(self, spec: SageMakerModelBiasJobDefinitionSpec):
self.namespace = self._get_current_namespace()
logging.info("Current namespace: " + self.namespace)
############GENERATED SECTION BELOW############
self.job_name = spec.inputs.job_definition_name = (
spec.inputs.job_definition_name
if spec.inputs.job_definition_name
else SageMakerComponent._generate_unique_timestamped_id(
prefix="model-bias-job-definition"
)
)
self.group = "sagemaker.services.k8s.aws"
self.version = "v1alpha1"
self.plural = "modelbiasjobdefinitions"
self.spaced_out_resource_name = "Model Bias Job Definition"
self.job_request_outline_location = (
"ModelBiasJobDefinition/src/ModelBiasJobDefinition_request.yaml.tpl"
)
self.job_request_location = (
"ModelBiasJobDefinition/src/ModelBiasJobDefinition_request.yaml"
)
self.update_supported = False
############GENERATED SECTION ABOVE############
super().Do(spec.inputs, spec.outputs, spec.output_paths)
def _create_job_request(
self,
inputs: SageMakerModelBiasJobDefinitionInputs,
outputs: SageMakerModelBiasJobDefinitionOutputs,
) -> Dict:
return super()._create_job_yaml(inputs, outputs)
def _submit_job_request(self, request: Dict) -> object:
return super()._create_resource(request, 12, 15)
def _on_job_terminated(self):
super()._delete_custom_resource()
def _after_submit_job_request(
self,
job: object,
request: Dict,
inputs: SageMakerModelBiasJobDefinitionInputs,
outputs: SageMakerModelBiasJobDefinitionOutputs,
):
pass
def _get_job_status(self):
return SageMakerJobStatus(is_completed=True, raw_status="Completed")
def _get_upgrade_status(self):
return self._get_job_status()
def _after_job_complete(
self,
job: object,
request: Dict,
inputs: SageMakerModelBiasJobDefinitionInputs,
outputs: SageMakerModelBiasJobDefinitionOutputs,
):
# prepare component outputs (defined in the spec)
ack_statuses = super()._get_resource()["status"]
############GENERATED SECTION BELOW############
outputs.ack_resource_metadata = str(
ack_statuses["ackResourceMetadata"]
if "ackResourceMetadata" in ack_statuses
else None
)
outputs.conditions = str(
ack_statuses["conditions"] if "conditions" in ack_statuses else None
)
outputs.sagemaker_resource_name = self.job_name
############GENERATED SECTION ABOVE############
if __name__ == "__main__":
import sys
spec = SageMakerModelBiasJobDefinitionSpec(sys.argv[1:])
component = SageMakerModelBiasJobDefinitionComponent()
component.Do(spec)

@@ -0,0 +1,17 @@
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: ModelBiasJobDefinition
metadata:
name:
annotations:
services.k8s.aws/region:
spec:
jobDefinitionName:
jobResources:
modelBiasAppSpecification:
modelBiasBaselineConfig:
modelBiasJobInput:
modelBiasJobOutputConfig:
networkConfig:
roleARN:
stoppingCondition:
tags:

@@ -0,0 +1,146 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Specification for the SageMaker - ModelBiasJobDefinition"""
from dataclasses import dataclass
from typing import List
from commonv2.sagemaker_component_spec import (
SageMakerComponentSpec,
SageMakerComponentBaseOutputs,
)
from commonv2.spec_input_parsers import SpecInputParsers
from commonv2.common_inputs import (
COMMON_INPUTS,
SageMakerComponentCommonInputs,
SageMakerComponentInput as Input,
SageMakerComponentOutput as Output,
SageMakerComponentInputValidator as InputValidator,
SageMakerComponentOutputValidator as OutputValidator,
)
@dataclass(frozen=False)
class SageMakerModelBiasJobDefinitionInputs(SageMakerComponentCommonInputs):
"""Defines the set of inputs for the ModelBiasJobDefinition component."""
job_definition_name: Input
job_resources: Input
model_bias_app_specification: Input
model_bias_baseline_config: Input
model_bias_job_input: Input
model_bias_job_output_config: Input
network_config: Input
role_arn: Input
stopping_condition: Input
tags: Input
@dataclass
class SageMakerModelBiasJobDefinitionOutputs(SageMakerComponentBaseOutputs):
"""Defines the set of outputs for the ModelBiasJobDefinition component."""
ack_resource_metadata: Output
conditions: Output
sagemaker_resource_name: Output
class SageMakerModelBiasJobDefinitionSpec(
SageMakerComponentSpec[
SageMakerModelBiasJobDefinitionInputs, SageMakerModelBiasJobDefinitionOutputs
]
):
INPUTS: SageMakerModelBiasJobDefinitionInputs = SageMakerModelBiasJobDefinitionInputs(
job_definition_name=InputValidator(
input_type=str,
description="The name of the bias job definition.",
required=True,
),
job_resources=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Identifies the resources to deploy for a monitoring job.",
required=True,
),
model_bias_app_specification=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Configures the model bias job to run a specified Docker container image.",
required=True,
),
model_bias_baseline_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="The baseline configuration for a model bias job.",
required=False,
),
model_bias_job_input=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Inputs for the model bias job.",
required=True,
),
model_bias_job_output_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="The output configuration for monitoring jobs.",
required=True,
),
network_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Networking options for a model bias job.",
required=False,
),
role_arn=InputValidator(
input_type=str,
description="The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.",
required=True,
),
stopping_condition=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="A time limit for how long the monitoring job is allowed to run before stopping.",
required=False,
),
tags=InputValidator(
input_type=SpecInputParsers.yaml_or_json_list,
description="(Optional) An array of key-value pairs.",
required=False,
),
**vars(COMMON_INPUTS),
)
OUTPUTS = SageMakerModelBiasJobDefinitionOutputs(
ack_resource_metadata=OutputValidator(
description="All CRs managed by ACK have a common `Status.ACKResourceMetadata` member.",
),
conditions=OutputValidator(
description="All CRs managed by ACK have a common `Status.Conditions` member.",
),
sagemaker_resource_name=OutputValidator(
description="Resource name on SageMaker",
),
)
def __init__(self, arguments: List[str]):
super().__init__(
arguments,
SageMakerModelBiasJobDefinitionInputs,
SageMakerModelBiasJobDefinitionOutputs,
)
@property
def inputs(self) -> SageMakerModelBiasJobDefinitionInputs:
return self._inputs
@property
def outputs(self) -> SageMakerModelBiasJobDefinitionOutputs:
return self._outputs
@property
def output_paths(self) -> SageMakerModelBiasJobDefinitionOutputs:
return self._output_paths

@@ -0,0 +1,42 @@
# SageMaker Model Explainability Job Definition Kubeflow Pipelines component v2
## Overview
Component to create the definition for a model explainability job.
The component can be used with the [Monitoring Schedule component](../MonitoringSchedule) to create a monitoring schedule that regularly starts Amazon SageMaker Processing Jobs to monitor the data captured for an Amazon SageMaker Endpoint.
See the SageMaker Components for Kubeflow Pipelines versions section in [SageMaker Components for Kubeflow Pipelines](https://docs.aws.amazon.com/sagemaker/latest/dg/kubernetes-sagemaker-components-for-kubeflow-pipelines.html#kubeflow-pipeline-components) to learn about the differences between the version 1 and version 2 components.
### Kubeflow Pipelines backend compatibility
SageMaker components are currently supported with the Kubeflow Pipelines backend v1. This means you will have to use the KFP SDK v1.8.x to create your pipelines.
## Getting Started
Follow [this guide](https://github.com/kubeflow/pipelines/tree/master/samples/contrib/aws-samples#prerequisites) to set up the prerequisites for ModelExplainabilityJobDefinition depending on your deployment.
## Input Parameters
Find the high-level component input parameters and their descriptions in the [component's input specification](./component.yaml). Parameters with `JsonObject` or `JsonArray` type inputs have nested fields; refer to the [ModelExplainabilityJobDefinition CRD specification](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/modelexplainabilityjobdefinition/) for their structure and pass the input in JSON format.
A quick way to see the converted JSON style input is to copy the [sample ModelExplainabilityJobDefinition spec](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/modelexplainabilityjobdefinition/#spec) and convert it to JSON using a YAML to JSON converter like [this website](https://jsonformatter.org/yaml-to-json).
For example, `modelExplainabilityAppSpecification` is of type `object` and has the following structure:
```yaml
modelExplainabilityAppSpecification:
  configURI: string
  environment: {}
  imageURI: string
```
The JSON style input for the above parameter would be:
```python
model_explainability_app_specification = {
    "imageURI": "<account-number>.dkr.ecr.<region>.amazonaws.com/sagemaker-clarify-processing:1.0",
    "configURI": "s3://<path-to-file>/analysis_config.json",
}
```
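Since `JsonObject` inputs are passed to the component as strings, the dict is typically serialized with `json.dumps` when building a pipeline; a minimal sketch (the account number, bucket, and path below are placeholders, not real values):

```python
import json

# Placeholder values; substitute your own ECR account/region and S3 path.
model_explainability_app_specification = {
    "imageURI": "111122223333.dkr.ecr.us-east-1.amazonaws.com/sagemaker-clarify-processing:1.0",
    "configURI": "s3://my-bucket/analysis_config.json",
}

# KFP component inputs of type JsonObject are plain strings, so serialize
# the dict before handing it to the component op.
app_spec_json = json.dumps(model_explainability_app_specification)
print(app_spec_json)
```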
For a more detailed explanation of parameters, please refer to the [AWS SageMaker API Documentation for CreateModelExplainabilityJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelExplainabilityJobDefinition.html).
## References
- [Monitor models for data and model quality, bias, and explainability](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor.html)
- [Model Explainability Job Definition CRD specification](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/modelexplainabilityjobdefinition/)
- [AWS SageMaker API Documentation for CreateModelExplainabilityJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelExplainabilityJobDefinition.html)

@@ -0,0 +1,124 @@
name: "Sagemaker - ModelExplainabilityJobDefinition"
description: Create ModelExplainabilityJobDefinition
inputs:
- {
name: region,
type: String,
description: "The region to use for the SageMaker resource",
}
###########################GENERATED SECTION BELOW############################
- {
name: job_definition_name,
type: String,
default: '',
description: "The name of the model explainability job definition.",
}
- {
name: job_resources,
type: JsonObject,
default: '{}',
description: "Identifies the resources to deploy for a monitoring job.",
}
- {
name: model_explainability_app_specification,
type: JsonObject,
default: '{}',
description: "Configures the model explainability job to run a specified Docker container image.",
}
- {
name: model_explainability_baseline_config,
type: JsonObject,
default: '{}',
description: "The baseline configuration for a model explainability job.",
}
- {
name: model_explainability_job_input,
type: JsonObject,
default: '{}',
description: "Inputs for the model explainability job.",
}
- {
name: model_explainability_job_output_config,
type: JsonObject,
default: '{}',
description: "The output configuration for monitoring jobs.",
}
- {
name: network_config,
type: JsonObject,
default: '{}',
description: "Networking options for a model explainability job.",
}
- {
name: role_arn,
type: String,
default: '',
description: "The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.",
}
- {
name: stopping_condition,
type: JsonObject,
default: '{}',
description: "A time limit for how long the monitoring job is allowed to run before stopping.",
}
- {
name: tags,
type: JsonArray,
default: '[]',
description: "(Optional) An array of key-value pairs.",
}
###########################GENERATED SECTION ABOVE############################
outputs:
###########################GENERATED SECTION BELOW############################
- {
name: ack_resource_metadata,
type: JsonObject,
description: "All CRs managed by ACK have a common `Status.ACKResourceMetadata` member.",
}
- {
name: conditions,
type: JsonArray,
description: "All CRs managed by ACK have a common `Status.Conditions` member.",
}
- {
name: sagemaker_resource_name,
type: String,
description: "Resource name on SageMaker",
}
###########################GENERATED SECTION ABOVE############################
implementation:
container:
image: rdpen/test-modelmoni:31
command: [python3]
args:
- ModelExplainabilityJobDefinition/src/ModelExplainabilityJobDefinition_component.py
- --region
- { inputValue: region }
###########################GENERATED SECTION BELOW############################
- --job_definition_name
- { inputValue: job_definition_name }
- --job_resources
- { inputValue: job_resources }
- --model_explainability_app_specification
- { inputValue: model_explainability_app_specification }
- --model_explainability_baseline_config
- { inputValue: model_explainability_baseline_config }
- --model_explainability_job_input
- { inputValue: model_explainability_job_input }
- --model_explainability_job_output_config
- { inputValue: model_explainability_job_output_config }
- --network_config
- { inputValue: network_config }
- --role_arn
- { inputValue: role_arn }
- --stopping_condition
- { inputValue: stopping_condition }
- --tags
- { inputValue: tags }
###########################GENERATED SECTION ABOVE############################

@@ -0,0 +1,129 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
from typing import Dict
import json
from ModelExplainabilityJobDefinition.src.ModelExplainabilityJobDefinition_spec import (
SageMakerModelExplainabilityJobDefinitionInputs,
SageMakerModelExplainabilityJobDefinitionOutputs,
SageMakerModelExplainabilityJobDefinitionSpec,
)
from commonv2.sagemaker_component import (
SageMakerComponent,
ComponentMetadata,
SageMakerJobStatus,
)
from commonv2 import snake_to_camel
@ComponentMetadata(
name="SageMaker - ModelExplainabilityJobDefinition",
description="",
spec=SageMakerModelExplainabilityJobDefinitionSpec,
)
class SageMakerModelExplainabilityJobDefinitionComponent(SageMakerComponent):
"""SageMaker component for ModelExplainabilityJobDefinition."""
def Do(self, spec: SageMakerModelExplainabilityJobDefinitionSpec):
self.namespace = self._get_current_namespace()
logging.info("Current namespace: " + self.namespace)
############GENERATED SECTION BELOW############
self.job_name = spec.inputs.job_definition_name = (
spec.inputs.job_definition_name
if spec.inputs.job_definition_name
else SageMakerComponent._generate_unique_timestamped_id(
prefix="model-explainability-job-definition"
)
)
self.group = "sagemaker.services.k8s.aws"
self.version = "v1alpha1"
self.plural = "modelexplainabilityjobdefinitions"
self.spaced_out_resource_name = "Model Explainability Job Definition"
self.job_request_outline_location = "ModelExplainabilityJobDefinition/src/ModelExplainabilityJobDefinition_request.yaml.tpl"
self.job_request_location = "ModelExplainabilityJobDefinition/src/ModelExplainabilityJobDefinition_request.yaml"
self.update_supported = False
############GENERATED SECTION ABOVE############
super().Do(spec.inputs, spec.outputs, spec.output_paths)
def _create_job_request(
self,
inputs: SageMakerModelExplainabilityJobDefinitionInputs,
outputs: SageMakerModelExplainabilityJobDefinitionOutputs,
) -> Dict:
return super()._create_job_yaml(inputs, outputs)
def _submit_job_request(self, request: Dict) -> object:
return super()._create_resource(request, 12, 15)
def _on_job_terminated(self):
super()._delete_custom_resource()
def _after_submit_job_request(
self,
job: object,
request: Dict,
inputs: SageMakerModelExplainabilityJobDefinitionInputs,
outputs: SageMakerModelExplainabilityJobDefinitionOutputs,
):
pass
def _get_job_status(self):
return SageMakerJobStatus(is_completed=True, raw_status="Completed")
def _get_upgrade_status(self):
return self._get_job_status()
def _after_job_complete(
self,
job: object,
request: Dict,
inputs: SageMakerModelExplainabilityJobDefinitionInputs,
outputs: SageMakerModelExplainabilityJobDefinitionOutputs,
):
# prepare component outputs (defined in the spec)
ack_statuses = super()._get_resource()["status"]
############GENERATED SECTION BELOW############
outputs.ack_resource_metadata = str(
ack_statuses["ackResourceMetadata"]
if "ackResourceMetadata" in ack_statuses
else None
)
outputs.conditions = str(
ack_statuses["conditions"] if "conditions" in ack_statuses else None
)
outputs.sagemaker_resource_name = self.job_name
############GENERATED SECTION ABOVE############
if __name__ == "__main__":
import sys
spec = SageMakerModelExplainabilityJobDefinitionSpec(sys.argv[1:])
component = SageMakerModelExplainabilityJobDefinitionComponent()
component.Do(spec)

@@ -0,0 +1,17 @@
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: ModelExplainabilityJobDefinition
metadata:
name:
annotations:
services.k8s.aws/region:
spec:
jobDefinitionName:
jobResources:
modelExplainabilityAppSpecification:
modelExplainabilityBaselineConfig:
modelExplainabilityJobInput:
modelExplainabilityJobOutputConfig:
networkConfig:
roleARN:
stoppingCondition:
tags:

@@ -0,0 +1,147 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Specification for the SageMaker - ModelExplainabilityJobDefinition"""
from dataclasses import dataclass
from typing import List
from commonv2.sagemaker_component_spec import (
SageMakerComponentSpec,
SageMakerComponentBaseOutputs,
)
from commonv2.spec_input_parsers import SpecInputParsers
from commonv2.common_inputs import (
COMMON_INPUTS,
SageMakerComponentCommonInputs,
SageMakerComponentInput as Input,
SageMakerComponentOutput as Output,
SageMakerComponentInputValidator as InputValidator,
SageMakerComponentOutputValidator as OutputValidator,
)
@dataclass(frozen=False)
class SageMakerModelExplainabilityJobDefinitionInputs(SageMakerComponentCommonInputs):
"""Defines the set of inputs for the ModelExplainabilityJobDefinition component."""
job_definition_name: Input
job_resources: Input
model_explainability_app_specification: Input
model_explainability_baseline_config: Input
model_explainability_job_input: Input
model_explainability_job_output_config: Input
network_config: Input
role_arn: Input
stopping_condition: Input
tags: Input
@dataclass
class SageMakerModelExplainabilityJobDefinitionOutputs(SageMakerComponentBaseOutputs):
"""Defines the set of outputs for the ModelExplainabilityJobDefinition component."""
ack_resource_metadata: Output
conditions: Output
sagemaker_resource_name: Output
class SageMakerModelExplainabilityJobDefinitionSpec(
SageMakerComponentSpec[
SageMakerModelExplainabilityJobDefinitionInputs,
SageMakerModelExplainabilityJobDefinitionOutputs,
]
):
INPUTS: SageMakerModelExplainabilityJobDefinitionInputs = SageMakerModelExplainabilityJobDefinitionInputs(
job_definition_name=InputValidator(
input_type=str,
description="The name of the model explainability job definition.",
required=True,
),
job_resources=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Identifies the resources to deploy for a monitoring job.",
required=True,
),
model_explainability_app_specification=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Configures the model explainability job to run a specified Docker container image.",
required=True,
),
model_explainability_baseline_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="The baseline configuration for a model explainability job.",
required=False,
),
model_explainability_job_input=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Inputs for the model explainability job.",
required=True,
),
model_explainability_job_output_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="The output configuration for monitoring jobs.",
required=True,
),
network_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Networking options for a model explainability job.",
required=False,
),
role_arn=InputValidator(
input_type=str,
description="The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.",
required=True,
),
stopping_condition=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="A time limit for how long the monitoring job is allowed to run before stopping.",
required=False,
),
tags=InputValidator(
input_type=SpecInputParsers.yaml_or_json_list,
description="(Optional) An array of key-value pairs.",
required=False,
),
**vars(COMMON_INPUTS),
)
OUTPUTS = SageMakerModelExplainabilityJobDefinitionOutputs(
ack_resource_metadata=OutputValidator(
description="All CRs managed by ACK have a common `Status.ACKResourceMetadata` member.",
),
conditions=OutputValidator(
description="All CRs managed by ACK have a common `Status.Conditions` member.",
),
sagemaker_resource_name=OutputValidator(
description="Resource name on SageMaker",
),
)
def __init__(self, arguments: List[str]):
super().__init__(
arguments,
SageMakerModelExplainabilityJobDefinitionInputs,
SageMakerModelExplainabilityJobDefinitionOutputs,
)
@property
def inputs(self) -> SageMakerModelExplainabilityJobDefinitionInputs:
return self._inputs
@property
def outputs(self) -> SageMakerModelExplainabilityJobDefinitionOutputs:
return self._outputs
@property
def output_paths(self) -> SageMakerModelExplainabilityJobDefinitionOutputs:
return self._output_paths

@@ -0,0 +1,37 @@
# SageMaker Model Quality Job Definition Kubeflow Pipelines component v2
## Overview
Component to create a definition for a job that monitors model quality and drift.
The component can be used with [Monitoring Schedule component](../MonitoringSchedule) to create a monitoring schedule that regularly starts Amazon SageMaker Processing Jobs to monitor the data captured for an Amazon SageMaker Endpoint.
See the SageMaker Components for Kubeflow Pipelines versions section in [SageMaker Components for Kubeflow Pipelines](https://docs.aws.amazon.com/sagemaker/latest/dg/kubernetes-sagemaker-components-for-kubeflow-pipelines.html#kubeflow-pipeline-components) to learn about the differences between the version 1 and version 2 components.
### Kubeflow Pipelines backend compatibility
SageMaker components are currently supported with the Kubeflow Pipelines backend v1. This means you will have to use the KFP SDK v1.8.x to create your pipelines.
## Getting Started
Follow [this guide](https://github.com/kubeflow/pipelines/tree/master/samples/contrib/aws-samples#prerequisites) to set up the prerequisites for ModelQualityJobDefinition depending on your deployment.
## Input Parameters
Find the high-level component input parameters and their descriptions in the [component's input specification](./component.yaml). Parameters with `JsonObject` or `JsonArray` type inputs have nested fields; refer to the [ModelQualityJobDefinition CRD specification](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/modelqualityjobdefinition/) for their structure and pass the input in JSON format.
A quick way to see the converted JSON style input is to copy the [sample ModelQualityJobDefinition spec](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/modelqualityjobdefinition/#spec) and convert it to JSON using a YAML to JSON converter like [this website](https://jsonformatter.org/yaml-to-json).
For example, `modelQualityAppSpecification` is of type `object` and has the following structure:
```yaml
modelQualityAppSpecification:
  environment: {}
  imageURI: string
  problemType: string
```
The JSON style input for the above parameter would be:
```python
model_quality_app_specification = {
    "imageURI": "<account-number>.dkr.ecr.<region>.amazonaws.com/sagemaker-model-monitor-analyzer",
    "problemType": "BinaryClassification",
}
```
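Inside the component, `JsonObject`/`JsonArray` inputs are parsed back into dicts and lists; a minimal sketch of that round trip (the `parse_json_input` helper and its field values are illustrative, not the component's actual parser):

```python
import json

def parse_json_input(raw: str, expected_type: type):
    """Parse a JSON-string component input and check its top-level type."""
    value = json.loads(raw)
    if not isinstance(value, expected_type):
        raise TypeError(
            f"expected {expected_type.__name__}, got {type(value).__name__}"
        )
    return value

# JsonObject inputs must parse to dicts, JsonArray inputs to lists.
app_spec = parse_json_input('{"imageURI": "<image>", "problemType": "Regression"}', dict)
tags = parse_json_input('[{"key": "team", "value": "ml"}]', list)
print(app_spec["problemType"], len(tags))
```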
For a more detailed explanation of parameters, please refer to the [AWS SageMaker API Documentation for CreateModelQualityJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelQualityJobDefinition.html).
## References
- [Monitor models for data and model quality, bias, and explainability](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor.html)
- [Model Quality Job Definition CRD specification](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/modelqualityjobdefinition/)
- [AWS SageMaker API Documentation for CreateModelQualityJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelQualityJobDefinition.html)

@@ -0,0 +1,124 @@
name: "Sagemaker - ModelQualityJobDefinition"
description: Create ModelQualityJobDefinition
inputs:
- {
name: region,
type: String,
description: "The region to use for the SageMaker resource",
}
###########################GENERATED SECTION BELOW############################
- {
name: job_definition_name,
type: String,
default: '',
description: "The name of the monitoring job definition.",
}
- {
name: job_resources,
type: JsonObject,
default: '{}',
description: "Identifies the resources to deploy for a monitoring job.",
}
- {
name: model_quality_app_specification,
type: JsonObject,
default: '{}',
description: "The container that runs the monitoring job.",
}
- {
name: model_quality_baseline_config,
type: JsonObject,
default: '{}',
description: "Specifies the constraints and baselines for the monitoring job.",
}
- {
name: model_quality_job_input,
type: JsonObject,
default: '{}',
description: "A list of the inputs that are monitored.",
}
- {
name: model_quality_job_output_config,
type: JsonObject,
default: '{}',
description: "The output configuration for monitoring jobs.",
}
- {
name: network_config,
type: JsonObject,
default: '{}',
description: "Specifies the network configuration for the monitoring job.",
}
- {
name: role_arn,
type: String,
default: '',
description: "The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.",
}
- {
name: stopping_condition,
type: JsonObject,
default: '{}',
description: "A time limit for how long the monitoring job is allowed to run before stopping.",
}
- {
name: tags,
type: JsonArray,
default: '[]',
description: "(Optional) An array of key-value pairs.",
}
###########################GENERATED SECTION ABOVE############################
outputs:
###########################GENERATED SECTION BELOW############################
- {
name: ack_resource_metadata,
type: JsonObject,
description: "All CRs managed by ACK have a common `Status.ACKResourceMetadata` member.",
}
- {
name: conditions,
type: JsonArray,
description: "All CRs managed by ACK have a common `Status.Conditions` member.",
}
- {
name: sagemaker_resource_name,
type: String,
description: "Resource name on SageMaker",
}
###########################GENERATED SECTION ABOVE############################
implementation:
container:
image: rdpen/test-modelmoni:31
command: [python3]
args:
- ModelQualityJobDefinition/src/ModelQualityJobDefinition_component.py
- --region
- { inputValue: region }
###########################GENERATED SECTION BELOW############################
- --job_definition_name
- { inputValue: job_definition_name }
- --job_resources
- { inputValue: job_resources }
- --model_quality_app_specification
- { inputValue: model_quality_app_specification }
- --model_quality_baseline_config
- { inputValue: model_quality_baseline_config }
- --model_quality_job_input
- { inputValue: model_quality_job_input }
- --model_quality_job_output_config
- { inputValue: model_quality_job_output_config }
- --network_config
- { inputValue: network_config }
- --role_arn
- { inputValue: role_arn }
- --stopping_condition
- { inputValue: stopping_condition }
- --tags
- { inputValue: tags }
###########################GENERATED SECTION ABOVE############################

@@ -0,0 +1,133 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
from typing import Dict
import json
from ModelQualityJobDefinition.src.ModelQualityJobDefinition_spec import (
SageMakerModelQualityJobDefinitionInputs,
SageMakerModelQualityJobDefinitionOutputs,
SageMakerModelQualityJobDefinitionSpec,
)
from commonv2.sagemaker_component import (
SageMakerComponent,
ComponentMetadata,
SageMakerJobStatus,
)
from commonv2 import snake_to_camel
@ComponentMetadata(
name="SageMaker - ModelQualityJobDefinition",
description="",
spec=SageMakerModelQualityJobDefinitionSpec,
)
class SageMakerModelQualityJobDefinitionComponent(SageMakerComponent):
"""SageMaker component for ModelQualityJobDefinition."""
def Do(self, spec: SageMakerModelQualityJobDefinitionSpec):
self.namespace = self._get_current_namespace()
logging.info("Current namespace: " + self.namespace)
############GENERATED SECTION BELOW############
self.job_name = spec.inputs.job_definition_name = (
spec.inputs.job_definition_name
if spec.inputs.job_definition_name
else SageMakerComponent._generate_unique_timestamped_id(
prefix="model-quality-job-definition"
)
)
self.group = "sagemaker.services.k8s.aws"
self.version = "v1alpha1"
self.plural = "modelqualityjobdefinitions"
self.spaced_out_resource_name = "Model Quality Job Definition"
self.job_request_outline_location = (
"ModelQualityJobDefinition/src/ModelQualityJobDefinition_request.yaml.tpl"
)
self.job_request_location = (
"ModelQualityJobDefinition/src/ModelQualityJobDefinition_request.yaml"
)
self.update_supported = False
############GENERATED SECTION ABOVE############
super().Do(spec.inputs, spec.outputs, spec.output_paths)
def _create_job_request(
self,
inputs: SageMakerModelQualityJobDefinitionInputs,
outputs: SageMakerModelQualityJobDefinitionOutputs,
) -> Dict:
return super()._create_job_yaml(inputs, outputs)
def _submit_job_request(self, request: Dict) -> object:
return super()._create_resource(request, 12, 15)
def _on_job_terminated(self):
super()._delete_custom_resource()
def _after_submit_job_request(
self,
job: object,
request: Dict,
inputs: SageMakerModelQualityJobDefinitionInputs,
outputs: SageMakerModelQualityJobDefinitionOutputs,
):
pass
def _get_job_status(self):
return SageMakerJobStatus(is_completed=True, raw_status="Completed")
def _get_upgrade_status(self):
return self._get_job_status()
def _after_job_complete(
self,
job: object,
request: Dict,
inputs: SageMakerModelQualityJobDefinitionInputs,
outputs: SageMakerModelQualityJobDefinitionOutputs,
):
# prepare component outputs (defined in the spec)
ack_statuses = super()._get_resource()["status"]
############GENERATED SECTION BELOW############
outputs.ack_resource_metadata = str(
ack_statuses["ackResourceMetadata"]
if "ackResourceMetadata" in ack_statuses
else None
)
outputs.conditions = str(
ack_statuses["conditions"] if "conditions" in ack_statuses else None
)
outputs.sagemaker_resource_name = self.job_name
############GENERATED SECTION ABOVE############
if __name__ == "__main__":
import sys
spec = SageMakerModelQualityJobDefinitionSpec(sys.argv[1:])
component = SageMakerModelQualityJobDefinitionComponent()
component.Do(spec)

@@ -0,0 +1,17 @@
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: ModelQualityJobDefinition
metadata:
name:
annotations:
services.k8s.aws/region:
spec:
jobDefinitionName:
jobResources:
modelQualityAppSpecification:
modelQualityBaselineConfig:
modelQualityJobInput:
modelQualityJobOutputConfig:
networkConfig:
roleARN:
stoppingCondition:
tags:

@@ -0,0 +1,147 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Specification for the SageMaker - ModelQualityJobDefinition"""
from dataclasses import dataclass
from typing import List
from commonv2.sagemaker_component_spec import (
SageMakerComponentSpec,
SageMakerComponentBaseOutputs,
)
from commonv2.spec_input_parsers import SpecInputParsers
from commonv2.common_inputs import (
COMMON_INPUTS,
SageMakerComponentCommonInputs,
SageMakerComponentInput as Input,
SageMakerComponentOutput as Output,
SageMakerComponentInputValidator as InputValidator,
SageMakerComponentOutputValidator as OutputValidator,
)
@dataclass(frozen=False)
class SageMakerModelQualityJobDefinitionInputs(SageMakerComponentCommonInputs):
"""Defines the set of inputs for the ModelQualityJobDefinition component."""
job_definition_name: Input
job_resources: Input
model_quality_app_specification: Input
model_quality_baseline_config: Input
model_quality_job_input: Input
model_quality_job_output_config: Input
network_config: Input
role_arn: Input
stopping_condition: Input
tags: Input
@dataclass
class SageMakerModelQualityJobDefinitionOutputs(SageMakerComponentBaseOutputs):
"""Defines the set of outputs for the ModelQualityJobDefinition component."""
ack_resource_metadata: Output
conditions: Output
sagemaker_resource_name: Output
class SageMakerModelQualityJobDefinitionSpec(
SageMakerComponentSpec[
SageMakerModelQualityJobDefinitionInputs,
SageMakerModelQualityJobDefinitionOutputs,
]
):
INPUTS: SageMakerModelQualityJobDefinitionInputs = SageMakerModelQualityJobDefinitionInputs(
job_definition_name=InputValidator(
input_type=str,
description="The name of the monitoring job definition.",
required=True,
),
job_resources=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Identifies the resources to deploy for a monitoring job.",
required=True,
),
model_quality_app_specification=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="The container that runs the monitoring job.",
required=True,
),
model_quality_baseline_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Specifies the constraints and baselines for the monitoring job.",
required=False,
),
model_quality_job_input=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="A list of the inputs that are monitored.",
required=True,
),
model_quality_job_output_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="The output configuration for monitoring jobs.",
required=True,
),
network_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="Specifies the network configuration for the monitoring job.",
required=False,
),
role_arn=InputValidator(
input_type=str,
description="The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.",
required=True,
),
stopping_condition=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="A time limit for how long the monitoring job is allowed to run before stopping.",
required=False,
),
tags=InputValidator(
input_type=SpecInputParsers.yaml_or_json_list,
description="(Optional) An array of key-value pairs.",
required=False,
),
**vars(COMMON_INPUTS),
)
OUTPUTS = SageMakerModelQualityJobDefinitionOutputs(
ack_resource_metadata=OutputValidator(
description="All CRs managed by ACK have a common `Status.ACKResourceMetadata` member.",
),
conditions=OutputValidator(
description="All CRs managed by ACK have a common `Status.Conditions` member.",
),
sagemaker_resource_name=OutputValidator(
description="Resource name on SageMaker",
),
)
def __init__(self, arguments: List[str]):
super().__init__(
arguments,
SageMakerModelQualityJobDefinitionInputs,
SageMakerModelQualityJobDefinitionOutputs,
)
@property
def inputs(self) -> SageMakerModelQualityJobDefinitionInputs:
return self._inputs
@property
def outputs(self) -> SageMakerModelQualityJobDefinitionOutputs:
return self._outputs
@property
def output_paths(self) -> SageMakerModelQualityJobDefinitionOutputs:
return self._output_paths

@@ -0,0 +1,48 @@
# SageMaker Monitoring Schedule Kubeflow Pipelines component v2
Component to create a monitoring schedule that regularly starts Amazon SageMaker Processing Jobs to monitor the data captured for an Amazon SageMaker Endpoint.
The component can be used alone or with one of four job definition components:
- [Data Quality Job Definition](../DataQualityJobDefinition)
- [Model Bias Job Definition](../ModelBiasJobDefinition)
- [Model Explainability Job Definition](../ModelExplainabilityJobDefinition)
- [Model Quality Job Definition](../ModelQualityJobDefinition)
### SageMaker Kubeflow Pipeline component versioning
See the SageMaker Components for Kubeflow Pipelines versions section in [SageMaker Components for Kubeflow Pipelines](https://docs.aws.amazon.com/sagemaker/latest/dg/kubernetes-sagemaker-components-for-kubeflow-pipelines.html#kubeflow-pipeline-components) to learn about the differences between the version 1 and version 2 components.
### Kubeflow Pipelines backend compatibility
SageMaker components are currently supported with the Kubeflow Pipelines backend v1. This means you will have to use the KFP SDK v1.8.x to create your pipelines.
## Getting Started
Follow [this guide](https://github.com/kubeflow/pipelines/tree/master/samples/contrib/aws-samples#prerequisites) to set up the prerequisites for MonitoringSchedule depending on your deployment.
## Input Parameters
Find the high-level component input parameters and their descriptions in the [component's input specification](./component.yaml). Parameters with `JsonObject` or `JsonArray` type inputs have nested fields; refer to the [MonitoringSchedule CRD specification](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/monitoringschedule/) for their structure and pass the input in JSON format.
A quick way to see the converted JSON style input is to copy the [sample MonitoringSchedule spec](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/monitoringschedule/#spec) and convert it to JSON using a YAML to JSON converter like [this website](https://jsonformatter.org/yaml-to-json).
For example, `monitoringScheduleConfig` is of type `object` and has the following structure:
```
monitoringScheduleConfig:
monitoringType: DataQuality
scheduleConfig:
scheduleExpression: cron(0 * ? * * *)
monitoringJobDefinitionName: <job_definition_name>
```
The JSON style input for the above parameter would be:
```
monitoring_schedule_config = {
"monitoringType": "DataQuality",
"scheduleConfig": {"scheduleExpression": "cron(0 * ? * * *)"},
"monitoringJobDefinitionName": <job_definition_name>,
}
```
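In a pipeline script, this conversion is a plain Python dictionary serialized with the standard library's `json` module. The sketch below is a minimal illustration; the job definition name is a placeholder, not a real resource:

```python
import json

# Nested configuration mirroring the MonitoringSchedule CRD spec.
# "my-data-quality-job-definition" is a placeholder value.
monitoring_schedule_config = {
    "monitoringType": "DataQuality",
    "scheduleConfig": {"scheduleExpression": "cron(0 * ? * * *)"},
    "monitoringJobDefinitionName": "my-data-quality-job-definition",
}

# JsonObject-typed component inputs are passed as JSON strings.
config_json = json.dumps(monitoring_schedule_config)
print(config_json)
```

The resulting string can then be supplied to the `monitoring_schedule_config` input of the component.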
For a more detailed explanation of parameters, please refer to the [AWS SageMaker API Documentation for CreateMonitoringSchedule](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateMonitoringSchedule.html).
## References
- [Monitor models for data and model quality, bias, and explainability](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor.html)
- [Monitoring Schedule CRD specification](https://aws-controllers-k8s.github.io/community/reference/sagemaker/v1alpha1/monitoringschedule/)
- [AWS SageMaker API Documentation for CreateMonitoringSchedule](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateMonitoringSchedule.html)


@@ -0,0 +1,93 @@
name: "Sagemaker - MonitoringSchedule"
description: Create MonitoringSchedule
inputs:
- {
name: region,
type: String,
description: "The region to use for the monitoring schedule",
}
###########################GENERATED SECTION BELOW############################
- {
name: monitoring_schedule_config,
type: JsonObject,
default: '{}',
description: "The configuration object that specifies the monitoring schedule and defines the monitoring job.",
}
- {
name: monitoring_schedule_name,
type: String,
default: '',
description: "The name of the monitoring schedule.",
}
- {
name: tags,
type: JsonArray,
default: '[]',
description: "(Optional) An array of key-value pairs.",
}
###########################GENERATED SECTION ABOVE############################
outputs:
###########################GENERATED SECTION BELOW############################
- {
name: ack_resource_metadata,
type: JsonObject,
description: "All CRs managed by ACK have a common `Status.ACKResourceMetadata` member that contains the resource's sync state and ARN.",
}
- {
name: conditions,
type: JsonArray,
description: "All CRs managed by ACK have a common `Status.Conditions` member that describes the terminal states of the CR and its backend AWS resource.",
}
- {
name: creation_time,
type: String,
description: "The time at which the monitoring job was created.",
}
- {
name: failure_reason,
type: String,
description: "A string, up to one KB in size, that contains the reason a monitoring job failed, if it failed.",
}
- {
name: last_modified_time,
type: String,
description: "The time at which the monitoring job was last modified.",
}
- {
name: last_monitoring_execution_summary,
type: JsonObject,
description: "Describes metadata on the last execution to run, if there was one.",
}
- {
name: monitoring_schedule_status,
type: String,
description: "The status of the monitoring schedule.",
}
- {
name: sagemaker_resource_name,
type: String,
description: "Resource name on SageMaker.",
}
###########################GENERATED SECTION ABOVE############################
implementation:
container:
image: rdpen/test-modelmoni:31
command: [python3]
args:
- MonitoringSchedule/src/MonitoringSchedule_component.py
- --region
- { inputValue: region }
###########################GENERATED SECTION BELOW############################
- --monitoring_schedule_config
- { inputValue: monitoring_schedule_config }
- --monitoring_schedule_name
- { inputValue: monitoring_schedule_name }
- --tags
- { inputValue: tags }
###########################GENERATED SECTION ABOVE############################


@@ -0,0 +1,191 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
from typing import Dict
import json
from MonitoringSchedule.src.MonitoringSchedule_spec import (
SageMakerMonitoringScheduleInputs,
SageMakerMonitoringScheduleOutputs,
SageMakerMonitoringScheduleSpec,
)
from commonv2.sagemaker_component import (
SageMakerComponent,
ComponentMetadata,
SageMakerJobStatus,
)
from commonv2 import snake_to_camel
@ComponentMetadata(
name="SageMaker - MonitoringSchedule",
description="",
spec=SageMakerMonitoringScheduleSpec,
)
class SageMakerMonitoringScheduleComponent(SageMakerComponent):
"""SageMaker component for MonitoringSchedule."""
def Do(self, spec: SageMakerMonitoringScheduleSpec):
self.namespace = self._get_current_namespace()
logging.info("Current namespace: " + self.namespace)
############GENERATED SECTION BELOW############
self.job_name = spec.inputs.monitoring_schedule_name = (
spec.inputs.monitoring_schedule_name
if spec.inputs.monitoring_schedule_name
else SageMakerComponent._generate_unique_timestamped_id(
prefix="monitoring-schedule"
)
)
self.group = "sagemaker.services.k8s.aws"
self.version = "v1alpha1"
self.plural = "monitoringschedules"
self.spaced_out_resource_name = "Monitoring Schedule"
self.job_request_outline_location = (
"MonitoringSchedule/src/MonitoringSchedule_request.yaml.tpl"
)
self.job_request_location = (
"MonitoringSchedule/src/MonitoringSchedule_request.yaml"
)
self.update_supported = True
############GENERATED SECTION ABOVE############
super().Do(spec.inputs, spec.outputs, spec.output_paths)
def _create_job_request(
self,
inputs: SageMakerMonitoringScheduleInputs,
outputs: SageMakerMonitoringScheduleOutputs,
) -> Dict:
return super()._create_job_yaml(inputs, outputs)
def _submit_job_request(self, request: Dict) -> object:
if self.resource_upgrade:
ack_resource = self._get_resource()
self.initial_status = ack_resource.get("status", None)
return super()._patch_custom_resource(request)
else:
return super()._create_resource(request, 12, 15)
def _on_job_terminated(self):
super()._delete_custom_resource()
def _after_submit_job_request(
self,
job: object,
request: Dict,
inputs: SageMakerMonitoringScheduleInputs,
outputs: SageMakerMonitoringScheduleOutputs,
):
pass
def _get_job_status(self):
ack_resource = super()._get_resource()
resourceSynced = self._get_resource_synced_status(ack_resource["status"])
sm_job_status = ack_resource["status"]["monitoringScheduleStatus"]
if not resourceSynced:
return SageMakerJobStatus(
is_completed=False,
raw_status=sm_job_status,
)
if sm_job_status == "Scheduled":
return SageMakerJobStatus(
is_completed=True, has_error=False, raw_status="Scheduled"
)
if sm_job_status == "Failed":
message = ack_resource["status"]["failureReason"]
return SageMakerJobStatus(
is_completed=True,
has_error=True,
error_message=message,
raw_status=sm_job_status,
)
if sm_job_status == "Stopped":
message = "The schedule was stopped."
return SageMakerJobStatus(
is_completed=True,
has_error=True,
error_message=message,
raw_status=sm_job_status,
)
return SageMakerJobStatus(is_completed=False, raw_status=sm_job_status)
def _get_upgrade_status(self):
return self._get_job_status()
def _after_job_complete(
self,
job: object,
request: Dict,
inputs: SageMakerMonitoringScheduleInputs,
outputs: SageMakerMonitoringScheduleOutputs,
):
# prepare component outputs (defined in the spec)
ack_statuses = super()._get_resource()["status"]
############GENERATED SECTION BELOW############
outputs.ack_resource_metadata = str(
ack_statuses["ackResourceMetadata"]
if "ackResourceMetadata" in ack_statuses
else None
)
outputs.conditions = str(
ack_statuses["conditions"] if "conditions" in ack_statuses else None
)
outputs.creation_time = str(
ack_statuses["creationTime"] if "creationTime" in ack_statuses else None
)
outputs.failure_reason = str(
ack_statuses["failureReason"] if "failureReason" in ack_statuses else None
)
outputs.last_modified_time = str(
ack_statuses["lastModifiedTime"]
if "lastModifiedTime" in ack_statuses
else None
)
outputs.last_monitoring_execution_summary = str(
ack_statuses["lastMonitoringExecutionSummary"]
if "lastMonitoringExecutionSummary" in ack_statuses
else None
)
outputs.monitoring_schedule_status = str(
ack_statuses["monitoringScheduleStatus"]
if "monitoringScheduleStatus" in ack_statuses
else None
)
outputs.sagemaker_resource_name = self.job_name
############GENERATED SECTION ABOVE############
if __name__ == "__main__":
import sys
spec = SageMakerMonitoringScheduleSpec(sys.argv[1:])
component = SageMakerMonitoringScheduleComponent()
component.Do(spec)


@@ -0,0 +1,10 @@
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: MonitoringSchedule
metadata:
name:
annotations:
services.k8s.aws/region:
spec:
monitoringScheduleConfig:
monitoringScheduleName:
tags:


@@ -0,0 +1,124 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Specification for the SageMaker - MonitoringSchedule"""
from dataclasses import dataclass
from typing import List
from commonv2.sagemaker_component_spec import (
SageMakerComponentSpec,
SageMakerComponentBaseOutputs,
)
from commonv2.spec_input_parsers import SpecInputParsers
from commonv2.common_inputs import (
COMMON_INPUTS,
SageMakerComponentCommonInputs,
SageMakerComponentInput as Input,
SageMakerComponentOutput as Output,
SageMakerComponentInputValidator as InputValidator,
SageMakerComponentOutputValidator as OutputValidator,
)
@dataclass(frozen=False)
class SageMakerMonitoringScheduleInputs(SageMakerComponentCommonInputs):
"""Defines the set of inputs for the MonitoringSchedule component."""
monitoring_schedule_config: Input
monitoring_schedule_name: Input
tags: Input
@dataclass
class SageMakerMonitoringScheduleOutputs(SageMakerComponentBaseOutputs):
"""Defines the set of outputs for the MonitoringSchedule component."""
ack_resource_metadata: Output
conditions: Output
creation_time: Output
failure_reason: Output
last_modified_time: Output
last_monitoring_execution_summary: Output
monitoring_schedule_status: Output
sagemaker_resource_name: Output
class SageMakerMonitoringScheduleSpec(
SageMakerComponentSpec[
SageMakerMonitoringScheduleInputs, SageMakerMonitoringScheduleOutputs
]
):
INPUTS: SageMakerMonitoringScheduleInputs = SageMakerMonitoringScheduleInputs(
monitoring_schedule_config=InputValidator(
input_type=SpecInputParsers.yaml_or_json_dict,
description="The configuration object that specifies the monitoring schedule and defines the monitoring job.",
required=True,
),
monitoring_schedule_name=InputValidator(
input_type=str,
description="The name of the monitoring schedule.",
required=True,
),
tags=InputValidator(
input_type=SpecInputParsers.yaml_or_json_list,
description="(Optional) An array of key-value pairs.",
required=False,
),
**vars(COMMON_INPUTS),
)
OUTPUTS = SageMakerMonitoringScheduleOutputs(
ack_resource_metadata=OutputValidator(
description="All CRs managed by ACK have a common `Status.ACKResourceMetadata` member that contains the resource's sync state and ARN.",
),
conditions=OutputValidator(
description="All CRs managed by ACK have a common `Status.Conditions` member that describes the terminal states of the CR and its backend AWS resource.",
),
creation_time=OutputValidator(
description="The time at which the monitoring job was created.",
),
failure_reason=OutputValidator(
description="A string, up to one KB in size, that contains the reason a monitoring job failed, if it failed.",
),
last_modified_time=OutputValidator(
description="The time at which the monitoring job was last modified.",
),
last_monitoring_execution_summary=OutputValidator(
description="Describes metadata on the last execution to run, if there was one.",
),
monitoring_schedule_status=OutputValidator(
description="The status of the monitoring schedule.",
),
sagemaker_resource_name=OutputValidator(
description="Resource name on SageMaker.",
),
)
def __init__(self, arguments: List[str]):
super().__init__(
arguments,
SageMakerMonitoringScheduleInputs,
SageMakerMonitoringScheduleOutputs,
)
@property
def inputs(self) -> SageMakerMonitoringScheduleInputs:
return self._inputs
@property
def outputs(self) -> SageMakerMonitoringScheduleOutputs:
return self._outputs
@property
def output_paths(self) -> SageMakerMonitoringScheduleOutputs:
return self._output_paths


@@ -48,6 +48,23 @@ For more information about Version 1 of Hosting components see [SageMaker Hosting
The Batch Transform component enables you to run inference jobs for an entire dataset in Amazon SageMaker from a Kubeflow Pipelines workflow. For more information, see [SageMaker Batch Transform Kubeflow Pipeline component version 1](https://github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker/batch_transform).
### Model Monitor components
The Model Monitor components allow you to set up an Amazon SageMaker monitoring job definition and schedule directly from a Kubeflow Pipelines workflow.
#### Monitoring Job Definition
The Monitoring Job Definition components allow you to create a monitoring job definition that can be used to create a monitoring schedule directly from a Kubeflow Pipelines workflow. For more information, see:
- [SageMaker Data Quality Job Definition Kubeflow Pipelines component version 2](./DataQualityJobDefinition)
- [SageMaker Model Bias Job Definition Kubeflow Pipelines component version 2](./ModelBiasJobDefinition)
- [SageMaker Model Explainability Job Definition Kubeflow Pipelines component version 2](./ModelExplainabilityJobDefinition)
- [SageMaker Model Quality Job Definition Kubeflow Pipelines component version 2](./ModelQualityJobDefinition)
#### Monitoring Schedule
The Monitoring Schedule component creates a monitoring schedule that regularly starts Amazon SageMaker Processing Jobs to monitor the data captured for an Amazon SageMaker Endpoint. For more information, see [SageMaker Monitoring Schedule Kubeflow Pipelines component version 2](./MonitoringSchedule).
### Ground Truth components


@@ -0,0 +1,70 @@
import time
import pytest
import os
import utils
from utils import kfp_client_utils
from utils import ack_utils
# Testing the data quality, model explainability, model bias, and model quality job definition components
@pytest.mark.parametrize(
"test_file_dir",
[
pytest.param(
"resources/config/v2-monitoring-job-data-quality", marks=pytest.mark.v2
),
pytest.param(
"resources/config/v2-monitoring-job-model-explainability",
marks=pytest.mark.v2,
),
pytest.param(
"resources/config/v2-monitoring-job-model-bias", marks=pytest.mark.v2
),
pytest.param(
"resources/config/v2-monitoring-job-model-quality", marks=pytest.mark.v2
),
],
)
def test_job_definitions(kfp_client, experiment_id, test_file_dir, deploy_endpoint):
download_dir = utils.mkdir(os.path.join(test_file_dir + "/generated"))
test_params = utils.load_params(
utils.replace_placeholders(
os.path.join(test_file_dir, "config.yaml"),
os.path.join(download_dir, "config.yaml"),
)
)
k8s_client = ack_utils.k8s_client()
job_definition_name = (
utils.generate_random_string(10) + "-v2-" + test_params["TestName"]
)
job_input_name = test_params["JobInputName"]
test_params["Arguments"]["job_definition_name"] = job_definition_name
test_params["Arguments"][job_input_name]["endpointInput"][
"endpointName"
] = deploy_endpoint
try:
_, _, _ = kfp_client_utils.compile_run_monitor_pipeline(
kfp_client,
experiment_id,
test_params["PipelineDefinition"],
test_params["Arguments"],
download_dir,
test_params["TestName"],
test_params["Timeout"],
)
job_definition_describe = ack_utils._get_resource(
k8s_client, job_definition_name, test_params["Plural"]
)
# Check if the job definition is created
assert (
job_definition_name
in job_definition_describe["status"]["ackResourceMetadata"]["arn"]
)
finally:
ack_utils._delete_resource(
k8s_client, job_definition_name, test_params["Plural"]
)


@@ -0,0 +1,261 @@
import time
import pytest
import os
import utils
from utils import kfp_client_utils
from utils import ack_utils
from utils import sagemaker_utils
# Testing monitoring schedule with model bias job definition
@pytest.mark.parametrize(
"test_file_dir",
[
pytest.param(
"resources/config/v2-monitoring-schedule",
marks=[pytest.mark.canary_test, pytest.mark.v2],
),
],
)
def test_v2_monitoring_schedule(
kfp_client, experiment_id, test_file_dir, deploy_endpoint, sagemaker_client
):
download_dir = utils.mkdir(os.path.join(test_file_dir + "/generated"))
test_params = utils.load_params(
utils.replace_placeholders(
os.path.join(test_file_dir, "config.yaml"),
os.path.join(download_dir, "config.yaml"),
)
)
k8s_client = ack_utils.k8s_client()
# parameters for model bias job definition
job_definition_name = (
utils.generate_random_string(10) + "-v2-model-bias-job-definition"
)
test_params["Arguments"]["job_definition_name"] = job_definition_name
test_params["Arguments"]["model_bias_job_input"]["endpointInput"][
"endpointName"
] = deploy_endpoint
# parameters for monitoring schedule
monitoring_schedule_name = (
utils.generate_random_string(10) + "-v2-monitoring-schedule"
)
test_params["Arguments"]["monitoring_schedule_name"] = monitoring_schedule_name
test_params["Arguments"]["monitoring_schedule_config"][
"monitoringJobDefinitionName"
] = job_definition_name
try:
_, _, _ = kfp_client_utils.compile_run_monitor_pipeline(
kfp_client,
experiment_id,
test_params["PipelineDefinition"],
test_params["Arguments"],
download_dir,
test_params["TestName"],
test_params["Timeout"],
)
job_definition_describe = ack_utils._get_resource(
k8s_client, job_definition_name, "modelbiasjobdefinitions"
)
# Check if the job definition is created
assert job_definition_describe["status"]["ackResourceMetadata"]["arn"] is not None
# Verify if monitoring schedule is created with correct name and endpoint
monitoring_schedule_describe = sagemaker_utils.describe_monitoring_schedule(
sagemaker_client, monitoring_schedule_name
)
assert (
monitoring_schedule_name
in monitoring_schedule_describe["MonitoringScheduleArn"]
)
assert monitoring_schedule_describe["MonitoringScheduleStatus"] == "Scheduled"
assert monitoring_schedule_describe["EndpointName"] == deploy_endpoint
finally:
ack_utils._delete_resource(
k8s_client,
job_definition_name,
"modelbiasjobdefinitions",
wait_periods=10,
period_length=30,
)
ack_utils._delete_resource(
k8s_client,
monitoring_schedule_name,
"monitoringschedules",
wait_periods=10,
period_length=30,
)
# Test updating monitoring schedule using the same pipeline
# Steps:
# Prepare pipeline inputs for job_definition_1 and monitoring schedule
# Run the pipeline
# Update pipeline input (instanceType) for job_definition_2
# Rerun the same pipeline
@pytest.mark.parametrize(
"test_file_dir",
[
pytest.param(
"resources/config/v2-monitoring-schedule-update",
marks=[pytest.mark.v2],
),
],
)
def test_v2_monitoring_schedule_update(
kfp_client, experiment_id, test_file_dir, deploy_endpoint, sagemaker_client
):
download_dir = utils.mkdir(os.path.join(test_file_dir + "/generated"))
test_params = utils.load_params(
utils.replace_placeholders(
os.path.join(test_file_dir, "config.yaml"),
os.path.join(download_dir, "config.yaml"),
)
)
k8s_client = ack_utils.k8s_client()
# parameters for job definition
test_params["Arguments"][test_params["JobInputName"]]["endpointInput"][
"endpointName"
] = deploy_endpoint
job_definition_name_1 = (
utils.generate_random_string(10) + "-v2-model-data-quality-defi"
)
job_definition_name_2 = (
utils.generate_random_string(10) + "-v2-model-data-quality-defi"
)
# parameter for monitoring schedule
monitoring_schedule_name = (
utils.generate_random_string(10) + "-v2-monitoring-schedule"
)
try:
test_params["Arguments"]["job_definition_name"] = job_definition_name_1
test_params["Arguments"]["job_resources"]["clusterConfig"][
"instanceType"
] = "ml.m5.large"
test_params["Arguments"]["monitoring_schedule_name"] = monitoring_schedule_name
_, _, _ = kfp_client_utils.compile_run_monitor_pipeline(
kfp_client,
experiment_id,
test_params["PipelineDefinition"],
test_params["Arguments"],
download_dir,
test_params["TestName"],
test_params["Timeout"],
)
job_definition_1_describe_ack = ack_utils._get_resource(
k8s_client, job_definition_name_1, test_params["Plural"]
)
# Check if the job definition is created
assert (
job_definition_1_describe_ack["status"]["ackResourceMetadata"]["arn"]
is not None
)
# Verify if monitoring schedule is created with correct name and endpoint
monitoring_schedule_describe = sagemaker_utils.describe_monitoring_schedule(
sagemaker_client, monitoring_schedule_name
)
assert (
monitoring_schedule_name
in monitoring_schedule_describe["MonitoringScheduleArn"]
)
assert monitoring_schedule_describe["MonitoringScheduleStatus"] == "Scheduled"
assert monitoring_schedule_describe["EndpointName"] == deploy_endpoint
assert (
monitoring_schedule_describe["MonitoringScheduleConfig"][
"MonitoringJobDefinitionName"
]
== job_definition_name_1
)
# Verify if job definition is created with correct instance type
job_definition_1_describe = (
sagemaker_utils.describe_data_quality_job_definition(
sagemaker_client, job_definition_name_1
)
)
assert (
job_definition_1_describe["JobResources"]["ClusterConfig"]["InstanceType"]
== "ml.m5.large"
)
# Update monitoring schedule using new job definition
test_params["Arguments"]["job_definition_name"] = job_definition_name_2
test_params["Arguments"]["job_resources"]["clusterConfig"][
"instanceType"
] = "ml.m5.xlarge"
_, _, _ = kfp_client_utils.compile_run_monitor_pipeline(
kfp_client,
experiment_id,
test_params["PipelineDefinition"],
test_params["Arguments"],
download_dir,
test_params["TestName"],
test_params["Timeout"],
)
monitoring_schedule_updated_describe = (
sagemaker_utils.describe_monitoring_schedule(
sagemaker_client, monitoring_schedule_name
)
)
assert (
monitoring_schedule_updated_describe["MonitoringScheduleConfig"][
"MonitoringJobDefinitionName"
]
== job_definition_name_2
)
# Verify if job definition is created with correct instance type
job_definition_2_describe = (
sagemaker_utils.describe_data_quality_job_definition(
sagemaker_client, job_definition_name_2
)
)
assert (
job_definition_2_describe["JobResources"]["ClusterConfig"]["InstanceType"]
== "ml.m5.xlarge"
)
finally:
ack_utils._delete_resource(
k8s_client,
job_definition_name_1,
test_params["Plural"],
wait_periods=10,
period_length=30,
)
ack_utils._delete_resource(
k8s_client,
job_definition_name_2,
test_params["Plural"],
wait_periods=10,
period_length=30,
)
ack_utils._delete_resource(
k8s_client,
monitoring_schedule_name,
"monitoringschedules",
wait_periods=10,
period_length=30,
)


@@ -2,10 +2,13 @@ import pytest
import boto3
import kfp
import os
from utils import sagemaker_utils
import utils
from datetime import datetime
from filelock import FileLock
from sagemaker import image_uris
from botocore.config import Config
def pytest_addoption(parser):
@@ -180,3 +183,80 @@ def experiment_id(kfp_client, tmp_path_factory, worker_id):
data = get_experiment_id(kfp_client)
fn.write_text(data)
return data
# Deploy an endpoint for testing model monitoring
@pytest.fixture(scope="session")
def deploy_endpoint(sagemaker_client, s3_data_bucket, sagemaker_role_arn, region):
model_name = "model-monitor-" + utils.generate_random_string(5) + "-model"
endpoint_name = "model-monitor-" + utils.generate_random_string(5) + "-endpoint-v2"
endpoint_config_name = (
"model-monitor-" + utils.generate_random_string(5) + "-endpoint-config"
)
# create sagemaker model
# TODO upgrade to a newer xgboost version
image_uri = image_uris.retrieve("xgboost", region, "0.90-1")
create_model_api_response = sagemaker_client.create_model(
ModelName=model_name,
PrimaryContainer={
"Image": image_uri,
"ModelDataUrl": f"s3://{s3_data_bucket}/model-monitor/xgb-churn-prediction-model.tar.gz",
"Environment": {},
},
ExecutionRoleArn=sagemaker_role_arn,
)
# create sagemaker endpoint config
create_endpoint_config_api_response = sagemaker_client.create_endpoint_config(
EndpointConfigName=endpoint_config_name,
ProductionVariants=[
{
"VariantName": "variant-1",
"ModelName": model_name,
"InitialInstanceCount": 1,
"InstanceType": "ml.m5.large",
},
],
DataCaptureConfig={
"EnableCapture": True,
"CaptureOptions": [{"CaptureMode": "Input"}, {"CaptureMode": "Output"}],
"InitialSamplingPercentage": 100,
"DestinationS3Uri": f"s3://{s3_data_bucket}/model-monitor/datacapture",
},
)
# create sagemaker endpoint
create_endpoint_api_response = sagemaker_client.create_endpoint(
EndpointName=endpoint_name,
EndpointConfigName=endpoint_config_name,
)
try:
sagemaker_client.get_waiter("endpoint_in_service").wait(
EndpointName=endpoint_name
)
finally:
resp = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
endpoint_status = resp["EndpointStatus"]
endpoint_arn = resp["EndpointArn"]
print(f"Deployed endpoint {endpoint_arn}, ended with status {endpoint_status}")
if endpoint_status != "InService":
message = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)[
"FailureReason"
]
print(
"Endpoint deployment failed with the following error: {}".format(
message
)
)
raise Exception("Endpoint deployment failed")
yield endpoint_name
# delete model and endpoint config
print("deleting endpoint.................")
sagemaker_utils.delete_endpoint(sagemaker_client, endpoint_name)
sagemaker_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
sagemaker_client.delete_model(ModelName=model_name)


@@ -0,0 +1,36 @@
PipelineDefinition: resources/definition/data_quality_job_defi_v2_pipeline.py
TestName: v2-data-quality-job-definition
Timeout: 4000
Plural: dataqualityjobdefinitions
JobInputName: data_quality_job_input
ModelImage: ((XGBOOST_REGISTRY)).dkr.ecr.((REGION)).amazonaws.com/sagemaker-xgboost:0.90-1-cpu-py3
ModelUrl: s3://((DATA_BUCKET))/model-monitor/xgboost-churn-model.tar.gz
EndpointCapture: s3://((DATA_BUCKET))/model-monitor/endpoint_capture
Arguments:
region: ((REGION))
data_quality_app_specification:
imageURI: ((MODEL_MONITOR_IMAGE)).dkr.ecr.((REGION)).amazonaws.com/sagemaker-model-monitor-analyzer
data_quality_baseline_config:
constraintsResource:
s3URI: s3://((DATA_BUCKET))/model-monitor/baselining/data_quality/constraints.json
statisticsResource:
s3URI: s3://((DATA_BUCKET))/model-monitor/baselining/data_quality/statistics.json
data_quality_job_input:
endpointInput:
localPath: "/opt/ml/processing/input/endpoint"
s3InputMode: File
s3DataDistributionType: FullyReplicated
data_quality_job_output_config:
monitoringOutputs:
- s3Output:
localPath: /opt/ml/processing/output
s3URI: s3://((DATA_BUCKET))/model-monitor/reports/data-quality-job-definition/
s3UploadMode: Continuous
job_resources:
clusterConfig:
instanceCount: 1
instanceType: ml.m5.large
volumeSizeInGB: 30
role_arn: ((SAGEMAKER_ROLE_ARN))
stopping_condition:
maxRuntimeInSeconds: 3600


@@ -0,0 +1,38 @@
PipelineDefinition: resources/definition/model_bias_job_defi_v2_pipeline.py
TestName: v2-model-bias-job-definition
Timeout: 2000
Plural: modelbiasjobdefinitions
JobInputName: model_bias_job_input
ModelImage: ((XGBOOST_REGISTRY)).dkr.ecr.((REGION)).amazonaws.com/sagemaker-xgboost:0.90-1-cpu-py3
ModelUrl: s3://((DATA_BUCKET))/model-monitor/xgboost-churn-model.tar.gz
EndpointCapture: s3://((DATA_BUCKET))/model-monitor/endpoint_capture
Arguments:
region: ((REGION))
model_bias_app_specification:
imageURI: ((CLARIFY_IMAGE)).dkr.ecr.((REGION)).amazonaws.com/sagemaker-clarify-processing:1.0
configURI: s3://((DATA_BUCKET))/model-monitor/baselining/model_bias/analysis_config.json
model_bias_job_input:
endpointInput:
localPath: "/opt/ml/processing/input/endpoint"
s3InputMode: File
s3DataDistributionType: FullyReplicated
probabilityThresholdAttribute: 0.8
groundTruthS3Input:
s3URI: s3://((DATA_BUCKET))/model-monitor/ground_truth_data
model_bias_baseline_config:
constraintsResource:
s3URI: s3://((DATA_BUCKET))/model-monitor/baselining/model_bias/constraints.json
model_bias_job_output_config:
monitoringOutputs:
- s3Output:
localPath: /opt/ml/processing/output
s3URI: s3://((DATA_BUCKET))/model-monitor/reports/model-bias-job-definition/
s3UploadMode: Continuous
job_resources:
clusterConfig:
instanceCount: 1
instanceType: ml.m5.large
volumeSizeInGB: 30
role_arn: ((SAGEMAKER_ROLE_ARN))
stopping_condition:
maxRuntimeInSeconds: 1800


@@ -0,0 +1,33 @@
PipelineDefinition: resources/definition/model_explainability_job_defi_v2_pipeline.py
TestName: v2-model-explainability-job-definition
Timeout: 4000
Plural: modelexplainabilityjobdefinitions
JobInputName: model_explainability_job_input
ModelImage: ((XGBOOST_REGISTRY)).dkr.ecr.((REGION)).amazonaws.com/sagemaker-xgboost:0.90-1-cpu-py3
ModelUrl: s3://((DATA_BUCKET))/model-monitor/xgboost-churn-model.tar.gz
EndpointCapture: s3://((DATA_BUCKET))/model-monitor/endpoint_capture
Arguments:
region: ((REGION))
model_explainability_app_specification:
imageURI: ((MODEL_MONITOR_IMAGE)).dkr.ecr.((REGION)).amazonaws.com/sagemaker-model-monitor-analyzer
configURI: s3://((DATA_BUCKET))/model-monitor/baselining/model_explainability/analysis_config.json
model_explainability_job_input:
endpointInput:
localPath: "/opt/ml/processing/input/endpoint"
s3InputMode: File
s3DataDistributionType: FullyReplicated
probabilityThresholdAttribute: 0.8
model_explainability_job_output_config:
monitoringOutputs:
- s3Output:
localPath: /opt/ml/processing/output
s3URI: s3://((DATA_BUCKET))/model-monitor/reports/model-explainability-job-definition/
s3UploadMode: Continuous
job_resources:
clusterConfig:
instanceCount: 1
instanceType: ml.m5.large
volumeSizeInGB: 30
role_arn: ((SAGEMAKER_ROLE_ARN))
stopping_condition:
maxRuntimeInSeconds: 3600


@@ -0,0 +1,39 @@
PipelineDefinition: resources/definition/model_quality_job_defi_v2_pipeline.py
TestName: v2-model-quality-job-definition
Timeout: 2000
Plural: modelqualityjobdefinitions
JobInputName: model_quality_job_input
ModelImage: ((XGBOOST_REGISTRY)).dkr.ecr.((REGION)).amazonaws.com/sagemaker-xgboost:0.90-1-cpu-py3
ModelUrl: s3://((DATA_BUCKET))/model-monitor/xgboost-churn-model.tar.gz
EndpointCapture: s3://((DATA_BUCKET))/model-monitor/endpoint_capture
Arguments:
region: ((REGION))
model_quality_app_specification:
imageURI: ((CLARIFY_IMAGE)).dkr.ecr.((REGION)).amazonaws.com/sagemaker-clarify-processing:1.0
problemType: "BinaryClassification"
model_quality_baseline_config:
constraintsResource:
s3URI: "s3://((DATA_BUCKET))/model-monitor/baselining/model_quality/constraints.json"
model_quality_job_input:
endpointInput:
localPath: "/opt/ml/processing/input/endpoint"
s3InputMode: File
s3DataDistributionType: FullyReplicated
probabilityAttribute: "0"
probabilityThresholdAttribute: 0.5
groundTruthS3Input:
s3URI: s3://((DATA_BUCKET))/model-monitor/ground_truth_data
model_quality_job_output_config:
monitoringOutputs:
- s3Output:
localPath: /opt/ml/processing/output
s3URI: s3://((DATA_BUCKET))/model-monitor/reports/model-quality-job-definition/
s3UploadMode: Continuous
job_resources:
clusterConfig:
instanceCount: 1
instanceType: ml.m5.large
volumeSizeInGB: 30
role_arn: ((SAGEMAKER_ROLE_ARN))
stopping_condition:
maxRuntimeInSeconds: 1800


@@ -0,0 +1,37 @@
PipelineDefinition: resources/definition/monitoring_schedule_update_v2_pipeline.py
TestName: v2-monitoring-schedule-update
Timeout: 4000
Plural: dataqualityjobdefinitions
JobInputName: data_quality_job_input
ModelImage: ((XGBOOST_REGISTRY)).dkr.ecr.((REGION)).amazonaws.com/sagemaker-xgboost:0.90-1-cpu-py3
ModelUrl: s3://((DATA_BUCKET))/model-monitor/xgboost-churn-model.tar.gz
EndpointCapture: s3://((DATA_BUCKET))/model-monitor/endpoint_capture
Arguments:
region: ((REGION))
data_quality_app_specification:
imageURI: ((MODEL_MONITOR_IMAGE)).dkr.ecr.((REGION)).amazonaws.com/sagemaker-model-monitor-analyzer
data_quality_baseline_config:
constraintsResource:
s3URI: s3://((DATA_BUCKET))/model-monitor/baselining/data_quality/constraints.json
statisticsResource:
s3URI: s3://((DATA_BUCKET))/model-monitor/baselining/data_quality/statistics.json
data_quality_job_input:
endpointInput:
localPath: "/opt/ml/processing/input/endpoint"
s3InputMode: File
s3DataDistributionType: FullyReplicated
data_quality_job_output_config:
monitoringOutputs:
- s3Output:
localPath: /opt/ml/processing/output
s3URI: s3://((DATA_BUCKET))/model-monitor/reports/data-quality-job-definition/
s3UploadMode: Continuous
job_resources:
clusterConfig:
instanceCount: 1
volumeSizeInGB: 30
role_arn: ((SAGEMAKER_ROLE_ARN))
stopping_condition:
maxRuntimeInSeconds: 3600
monitoring_type: DataQuality
schedule_expression: cron(0 * ? * * *)


@ -0,0 +1,40 @@
PipelineDefinition: resources/definition/monitoring_schedule_v2_pipeline.py
TestName: v2-monitoring-schedule
Timeout: 4000
ModelImage: ((XGBOOST_REGISTRY)).dkr.ecr.((REGION)).amazonaws.com/sagemaker-xgboost:0.90-1-cpu-py3
ModelUrl: s3://((DATA_BUCKET))/model-monitor/xgboost-churn-model.tar.gz
EndpointCapture: s3://((DATA_BUCKET))/model-monitor/endpoint_capture
Arguments:
region: ((REGION))
model_bias_app_specification:
imageURI: ((MODEL_MONITOR_IMAGE)).dkr.ecr.((REGION)).amazonaws.com/sagemaker-model-monitor-analyzer
configURI: s3://((DATA_BUCKET))/model-monitor/baselining/model_bias/analysis_config.json
model_bias_job_input:
endpointInput:
localPath: "/opt/ml/processing/input/endpoint"
s3InputMode: File
s3DataDistributionType: FullyReplicated
probabilityThresholdAttribute: 0.8
startTimeOffset: "-PT1H"
endTimeOffset: "-PT0H"
groundTruthS3Input:
s3URI: s3://((DATA_BUCKET))/model-monitor/ground_truth_data
model_bias_job_output_config:
monitoringOutputs:
- s3Output:
localPath: /opt/ml/processing/output
s3URI: s3://((DATA_BUCKET))/model-monitor/reports/model-bias-job-definition/
s3UploadMode: Continuous
job_resources:
clusterConfig:
instanceCount: 1
instanceType: ml.m5.large
volumeSizeInGB: 30
role_arn: ((SAGEMAKER_ROLE_ARN))
stopping_condition:
maxRuntimeInSeconds: 3600
monitoring_schedule_config:
monitoringType: ModelBias
scheduleConfig:
scheduleExpression: "cron(0 12 ? * * *)"


@ -0,0 +1,38 @@
import kfp
from kfp import components
from kfp import dsl
sagemaker_DataQualityJobDefinition_op = components.load_component_from_file(
"../../DataQualityJobDefinition/component.yaml"
)
@dsl.pipeline(
name="DataQualityJobDefinition",
description="SageMaker DataQualityJobDefinition component",
)
def DataQualityJobDefinition(
region="",
data_quality_app_specification="",
data_quality_baseline_config="",
data_quality_job_input="",
data_quality_job_output_config="",
job_definition_name="",
job_resources="",
role_arn="",
stopping_condition="",
):
DataQualityJobDefinition = sagemaker_DataQualityJobDefinition_op(
region=region,
data_quality_app_specification=data_quality_app_specification,
data_quality_baseline_config=data_quality_baseline_config,
data_quality_job_input=data_quality_job_input,
data_quality_job_output_config=data_quality_job_output_config,
job_definition_name=job_definition_name,
job_resources=job_resources,
role_arn=role_arn,
stopping_condition=stopping_condition,
)
kfp.compiler.Compiler().compile(DataQualityJobDefinition, __file__ + ".tar.gz")


@ -0,0 +1,38 @@
import kfp
from kfp import components
from kfp import dsl
sagemaker_ModelBiasJobDefinition_op = components.load_component_from_file(
"../../ModelBiasJobDefinition/component.yaml"
)
@dsl.pipeline(
name="ModelBiasJobDefinition",
description="SageMaker ModelBiasJobDefinition component",
)
def ModelBiasJobDefinition(
region="",
model_bias_app_specification="",
model_bias_baseline_config="",
model_bias_job_input="",
model_bias_job_output_config="",
job_definition_name="",
job_resources="",
role_arn="",
stopping_condition="",
):
    ModelBiasJobDefinition = sagemaker_ModelBiasJobDefinition_op(
region=region,
model_bias_app_specification=model_bias_app_specification,
model_bias_baseline_config=model_bias_baseline_config,
model_bias_job_input=model_bias_job_input,
model_bias_job_output_config=model_bias_job_output_config,
job_definition_name=job_definition_name,
job_resources=job_resources,
role_arn=role_arn,
stopping_condition=stopping_condition,
)
kfp.compiler.Compiler().compile(ModelBiasJobDefinition, __file__ + ".tar.gz")


@ -0,0 +1,36 @@
import kfp
from kfp import components
from kfp import dsl
sagemaker_ModelExplainabilityJobDefinition_op = components.load_component_from_file(
"../../ModelExplainabilityJobDefinition/component.yaml"
)
@dsl.pipeline(
name="ModelExplainabilityJobDefinition",
description="SageMaker ModelExplainabilityJobDefinition component",
)
def ModelExplainabilityJobDefinition(
region="",
model_explainability_app_specification="",
model_explainability_job_input="",
model_explainability_job_output_config="",
job_definition_name="",
job_resources="",
role_arn="",
stopping_condition="",
):
    ModelExplainabilityJobDefinition = sagemaker_ModelExplainabilityJobDefinition_op(
region=region,
model_explainability_app_specification=model_explainability_app_specification,
model_explainability_job_input=model_explainability_job_input,
model_explainability_job_output_config=model_explainability_job_output_config,
job_definition_name=job_definition_name,
job_resources=job_resources,
role_arn=role_arn,
stopping_condition=stopping_condition,
)
kfp.compiler.Compiler().compile(ModelExplainabilityJobDefinition, __file__ + ".tar.gz")


@ -0,0 +1,40 @@
import kfp
from kfp import components
from kfp import dsl
sagemaker_ModelQualityJobDefinition_op = components.load_component_from_file(
"../../ModelQualityJobDefinition/component.yaml"
)
@dsl.pipeline(
name="ModelQualityJobDefinition",
description="SageMaker ModelQualityJobDefinition component",
)
def ModelQualityJobDefinition(
region="",
job_definition_name="",
job_resources="",
model_quality_app_specification="",
model_quality_baseline_config="",
model_quality_job_input="",
model_quality_job_output_config="",
network_config="",
role_arn="",
stopping_condition="",
):
    ModelQualityJobDefinition = sagemaker_ModelQualityJobDefinition_op(
region=region,
job_definition_name=job_definition_name,
job_resources=job_resources,
model_quality_app_specification=model_quality_app_specification,
model_quality_baseline_config=model_quality_baseline_config,
model_quality_job_input=model_quality_job_input,
model_quality_job_output_config=model_quality_job_output_config,
network_config=network_config,
role_arn=role_arn,
stopping_condition=stopping_condition,
)
kfp.compiler.Compiler().compile(ModelQualityJobDefinition, __file__ + ".tar.gz")


@ -0,0 +1,57 @@
import kfp
from kfp import components
from kfp import dsl
sagemaker_MonitoringSchedule_op = components.load_component_from_file(
"../../MonitoringSchedule/component.yaml"
)
sagemaker_DataQualityJobDefinition_op = components.load_component_from_file(
"../../DataQualityJobDefinition/component.yaml"
)
@dsl.pipeline(
name="MonitoringSchedule", description="SageMaker MonitoringSchedule component"
)
def MonitoringSchedule(
region="",
data_quality_app_specification="",
data_quality_baseline_config="",
data_quality_job_input="",
data_quality_job_output_config="",
job_definition_name="",
job_resources="",
role_arn="",
stopping_condition="",
monitoring_schedule_name="",
monitoring_type="",
schedule_expression="",
):
DataQualityJobDefinition = sagemaker_DataQualityJobDefinition_op(
region=region,
job_definition_name=job_definition_name,
job_resources=job_resources,
data_quality_app_specification=data_quality_app_specification,
data_quality_baseline_config=data_quality_baseline_config,
data_quality_job_input=data_quality_job_input,
data_quality_job_output_config=data_quality_job_output_config,
role_arn=role_arn,
stopping_condition=stopping_condition,
)
monitoring_schedule_config = {
"monitoringType": monitoring_type,
"scheduleConfig": {"scheduleExpression": schedule_expression},
"monitoringJobDefinitionName": DataQualityJobDefinition.outputs[
"sagemaker_resource_name"
],
}
MonitoringSchedule = sagemaker_MonitoringSchedule_op(
region=region,
monitoring_schedule_config=monitoring_schedule_config,
monitoring_schedule_name=monitoring_schedule_name,
)
kfp.compiler.Compiler().compile(MonitoringSchedule, __file__ + ".tar.gz")
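For reference, the `monitoring_schedule_config` handed to the MonitoringSchedule op in this pipeline is a plain dict whose keys mirror the camelCase fields of the underlying SageMaker API; a minimal sketch of assembling it outside the DSL (the job definition name below is a hypothetical stand-in for the `sagemaker_resource_name` output of the DataQualityJobDefinition step):

```python
def build_monitoring_schedule_config(
    monitoring_type, schedule_expression, job_definition_name
):
    """Assemble the config dict consumed by the MonitoringSchedule component."""
    return {
        "monitoringType": monitoring_type,
        "scheduleConfig": {"scheduleExpression": schedule_expression},
        "monitoringJobDefinitionName": job_definition_name,
    }


config = build_monitoring_schedule_config(
    "DataQuality",
    "cron(0 * ? * * *)",  # hourly, in SageMaker's six-field cron syntax
    "my-data-quality-job-definition",  # hypothetical resource name
)
```

In the pipeline above, the dict is built at compile time with pipeline parameters substituted for these literal values.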


@ -0,0 +1,46 @@
import kfp
from kfp import components
from kfp import dsl
sagemaker_MonitoringSchedule_op = components.load_component_from_file(
"../../MonitoringSchedule/component.yaml"
)
sagemaker_ModelBiasJobDefinition_op = components.load_component_from_file(
"../../ModelBiasJobDefinition/component.yaml"
)
@dsl.pipeline(
name="MonitoringSchedule", description="SageMaker MonitoringSchedule component"
)
def MonitoringSchedule(
region="",
model_bias_app_specification="",
model_bias_job_input="",
model_bias_job_output_config="",
job_definition_name="",
job_resources="",
role_arn="",
stopping_condition="",
monitoring_schedule_name="",
monitoring_schedule_config="",
):
ModelBiasJobDefinition = sagemaker_ModelBiasJobDefinition_op(
region=region,
job_definition_name=job_definition_name,
job_resources=job_resources,
model_bias_app_specification=model_bias_app_specification,
model_bias_job_input=model_bias_job_input,
model_bias_job_output_config=model_bias_job_output_config,
role_arn=role_arn,
stopping_condition=stopping_condition,
)
MonitoringSchedule = sagemaker_MonitoringSchedule_op(
region=region,
monitoring_schedule_config=monitoring_schedule_config,
monitoring_schedule_name=monitoring_schedule_name,
).after(ModelBiasJobDefinition)
kfp.compiler.Compiler().compile(MonitoringSchedule, __file__ + ".tar.gz")


@ -97,6 +97,10 @@ def replace_placeholders(input_filename, output_filename, shallow_canary=False):
"xgboost", region, "1.0-1"
),
"((BUILTIN_RULE_IMAGE))": get_algorithm_image_registry("debugger", region),
"((MODEL_MONITOR_IMAGE))": get_algorithm_image_registry(
"model-monitor", region
),
"((CLARIFY_IMAGE))": get_algorithm_image_registry("clarify", region),
"((FSX_ID))": get_fsx_id(),
"((FSX_SUBNET))": get_fsx_subnet(),
"((FSX_SECURITY_GROUP))": get_fsx_security_group(),


@ -10,29 +10,33 @@ def k8s_client():
def _get_resource(k8s_client, job_name, plural):
"""Get the custom resource detail similar to: kubectl describe <resource> JOB_NAME -n NAMESPACE.
Returns:
        None or object: None if the resource doesn't exist in server or there is an error, otherwise the
custom object.
"""
_api = client.CustomObjectsApi(k8s_client)
namespace = os.environ.get("NAMESPACE")
    try:
        job_description = _api.get_namespaced_custom_object(
            "sagemaker.services.k8s.aws",
            "v1alpha1",
            namespace.lower(),
            plural,
            job_name.lower(),
        )
except Exception as e:
print(f"Exception occurred while getting resource {job_name}: {e}")
return None
return job_description
def _delete_resource(k8s_client, job_name, plural):
def _delete_resource(k8s_client, job_name, plural, wait_periods=10, period_length=20):
"""Delete the custom resource
Returns:
        True or False: True if the resource is deleted, False if the resource deletion times out
"""
_api = client.CustomObjectsApi(k8s_client)
namespace = os.environ.get("NAMESPACE")
try:
_api.delete_namespaced_custom_object(
"sagemaker.services.k8s.aws",
@ -41,9 +45,17 @@ def _delete_resource(k8s_client, job_name, plural):
plural,
job_name.lower(),
)
    except Exception as e:
        print(f"Exception occurred while deleting resource {job_name}: {e}")
    for _ in range(wait_periods):
        sleep(period_length)
        if _get_resource(k8s_client, job_name, plural) is None:
            print(f"Resource {job_name} deleted successfully.")
            return True
    print(f"Wait for resource deletion timed out, resource name: {job_name}")
    return False
# TODO: Make this a generalized function for non-job resources.
@ -81,13 +93,8 @@ def does_endpoint_exist(k8s_client, endpoint_name):
def is_endpoint_deleted(k8s_client, endpoint_name):
    response = _get_resource(k8s_client, endpoint_name, "endpoints")
    if response:
        return False
    if response is None:
        return True


@ -11,6 +11,7 @@ import json
from utils import get_s3_data_bucket
def describe_training_job(client, training_job_name):
return client.describe_training_job(TrainingJobName=training_job_name)
@ -37,6 +38,18 @@ def delete_endpoint(client, endpoint_name):
waiter.wait(EndpointName=endpoint_name)
def describe_monitoring_schedule(client, monitoring_schedule_name):
return client.describe_monitoring_schedule(
MonitoringScheduleName=monitoring_schedule_name
)
def describe_data_quality_job_definition(client, job_definition_name):
return client.describe_data_quality_job_definition(
JobDefinitionName=job_definition_name
)
def describe_hpo_job(client, job_name):
return client.describe_hyper_parameter_tuning_job(
HyperParameterTuningJobName=job_name
@ -93,6 +106,7 @@ def stop_labeling_job(client, labeling_job_name):
def describe_processing_job(client, processing_job_name):
return client.describe_processing_job(ProcessingJobName=processing_job_name)
def run_predict_mnist(boto3_session, endpoint_name, download_dir):
"""https://github.com/awslabs/amazon-sagemaker-
examples/blob/a8c20eeb72dc7d3e94aaaf28be5bf7d7cd5695cb.
@ -119,6 +133,8 @@ def run_predict_mnist(boto3_session, endpoint_name, download_dir):
payload = np2csv(train_set[0][30:31])
response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=payload,
)
    return json.loads(response["Body"].read().decode())


@ -0,0 +1,178 @@
from commonv2.sagemaker_component import SageMakerJobStatus
from unittest.mock import patch, MagicMock
import unittest
from MonitoringSchedule.src.MonitoringSchedule_spec import (
SageMakerMonitoringScheduleSpec,
)
from MonitoringSchedule.src.MonitoringSchedule_component import (
SageMakerMonitoringScheduleComponent,
)
class MonitoringScheduleComponentTestCase(unittest.TestCase):
REQUIRED_ARGS = [
"--region",
"us-west-1",
"--monitoring_schedule_name",
"test",
"--monitoring_schedule_config",
"{'test': 'test'}",
]
    def setUp(self):
        self.component = SageMakerMonitoringScheduleComponent()
        self.component.job_name = "test"
@patch("MonitoringSchedule.src.MonitoringSchedule_component.super", MagicMock())
def test_do_sets_name(self):
named_spec = SageMakerMonitoringScheduleSpec(self.REQUIRED_ARGS)
with patch(
"MonitoringSchedule.src.MonitoringSchedule_component.SageMakerComponent._get_current_namespace"
) as mock_namespace:
mock_namespace.return_value = "test-namespace"
self.component.Do(named_spec)
self.assertEqual("test", self.component.job_name)
def test_get_job_status(self):
with patch(
"MonitoringSchedule.src.MonitoringSchedule_component.SageMakerComponent._get_resource"
) as mock_get_resource:
with patch(
"MonitoringSchedule.src.MonitoringSchedule_component.SageMakerComponent._get_resource_synced_status"
) as mock_resource_sync:
mock_resource_sync.return_value = False
mock_get_resource.return_value = {
"status": {"monitoringScheduleStatus": "Pending"}
}
self.assertEqual(
self.component._get_job_status(),
SageMakerJobStatus(is_completed=False, raw_status="Pending"),
)
mock_resource_sync.return_value = True
mock_get_resource.return_value = {
"status": {"monitoringScheduleStatus": "Scheduled"}
}
self.assertEqual(
self.component._get_job_status(),
SageMakerJobStatus(is_completed=True, raw_status="Scheduled"),
)
mock_get_resource.return_value = {
"status": {"monitoringScheduleStatus": "Stopped"}
}
self.assertEqual(
self.component._get_job_status(),
SageMakerJobStatus(
is_completed=True,
raw_status="Stopped",
has_error=True,
error_message="The schedule was stopped.",
),
)
mock_get_resource.return_value = {
"status": {
"failureReason": "crash",
"monitoringScheduleStatus": "Failed",
}
}
self.assertEqual(
self.component._get_job_status(),
SageMakerJobStatus(
is_completed=True,
raw_status="Failed",
has_error=True,
error_message="crash",
),
)
def test_after_job_completed(self):
spec = SageMakerMonitoringScheduleSpec(self.REQUIRED_ARGS)
statuses = {
"status": {
"ackResourceMetadata": 1,
"conditions": 2,
}
}
with patch(
"MonitoringSchedule.src.MonitoringSchedule_component.SageMakerComponent._get_resource",
MagicMock(return_value=statuses),
):
self.component._after_job_complete({}, {}, spec.inputs, spec.outputs)
self.assertEqual(spec.outputs.ack_resource_metadata, "1")
self.assertEqual(spec.outputs.conditions, "2")
self.assertEqual(spec.outputs.sagemaker_resource_name, "test")
def test_get_upgrade_status(self):
with patch(
"MonitoringSchedule.src.MonitoringSchedule_component.SageMakerComponent._get_resource"
) as mock_get_resource:
with patch(
"MonitoringSchedule.src.MonitoringSchedule_component.SageMakerComponent._get_conditions_of_type"
) as mock_condition_list:
mock_get_resource.return_value = {
"status": {
"monitoringScheduleStatus": "Scheduled",
"conditions": [
{"type": "ACK.ResourceSynced", "status": "True"}
],
}
}
mock_condition_list.return_value = []
self.assertEqual(
self.component._get_upgrade_status(),
SageMakerJobStatus(
is_completed=True, raw_status="Scheduled", has_error=False
),
)
mock_get_resource.return_value = {
"status": {"monitoringScheduleStatus": "Updating"}
}
self.assertEqual(
self.component._get_upgrade_status(),
SageMakerJobStatus(
is_completed=False, raw_status="Updating", has_error=False
),
)
mock_get_resource.return_value = {
"status": {
"monitoringScheduleStatus": "Failed",
"failureReason": "invalid config",
"conditions": [
{"type": "ACK.ResourceSynced", "status": "True"}
],
}
}
                # Failed monitoring schedule
self.assertEqual(
self.component._get_upgrade_status(),
SageMakerJobStatus(
is_completed=True,
raw_status="Failed",
has_error=True,
error_message="invalid config",
),
)
if __name__ == "__main__":
unittest.main()


@ -0,0 +1,35 @@
from MonitoringSchedule.src.MonitoringSchedule_spec import (
SageMakerMonitoringScheduleSpec,
)
import unittest
class MonitoringScheduleSpecTestCase(unittest.TestCase):
REQUIRED_ARGS = [
"--region",
"us-west-1",
"--monitoring_schedule_name",
"test_monitoring_schedule_name",
"--monitoring_schedule_config",
"{'monitoringType': 'DataQuality', 'monitoringJobDefinitionName': 'test_monitoring_name', 'scheduleConfig': '' }",
]
INCORRECT_ARGS = [
"--region",
"us-west-1",
"--monitoring_schedule_name",
"test_monitoring_schedule_name",
]
def test_minimum_required_args(self):
# Will raise an exception if the inputs are incorrect
spec = SageMakerMonitoringScheduleSpec(self.REQUIRED_ARGS)
def test_incorrect_args(self):
# Will raise an exception if the inputs are incorrect
with self.assertRaises(SystemExit):
spec = SageMakerMonitoringScheduleSpec(self.INCORRECT_ARGS)
if __name__ == "__main__":
unittest.main()