Deploy a Model from the Registry with Python - Amazon SageMaker AI

After you register a model version and approve it for deployment, deploy it to a SageMaker AI endpoint for real-time inference. You can deploy your model by using the Amazon SageMaker Python SDK or the AWS SDK for Python (Boto3).
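
Before deploying, it can help to confirm which version in a Model Group is actually approved. The following is a minimal sketch rather than a documented procedure: it assumes you pass in a SageMaker AI Boto3 client, and the function name `latest_approved_model_package_arn` is hypothetical.

```python
from typing import Any, Optional


def latest_approved_model_package_arn(sm_client: Any, group_name: str) -> Optional[str]:
    """Return the ARN of the newest approved version in a Model Group, or None."""
    response = sm_client.list_model_packages(
        ModelPackageGroupName=group_name,
        ModelApprovalStatus="Approved",  # skip pending and rejected versions
        SortBy="CreationTime",
        SortOrder="Descending",  # newest version first
        MaxResults=1,
    )
    summaries = response.get("ModelPackageSummaryList", [])
    return summaries[0]["ModelPackageArn"] if summaries else None
```

With a real client you would call it as `latest_approved_model_package_arn(boto3.client("sagemaker"), group_name)`. A return value of `None` means that the group has no approved version to deploy yet.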

When you create a machine learning operations (MLOps) project and choose an MLOps project template that includes model deployment, approved model versions in the Model Registry are automatically deployed to production. For information about using SageMaker AI MLOps projects, see MLOps Automation With SageMaker Projects.

You can also enable an AWS account to deploy model versions that were created in a different account by adding a cross-account resource policy. For example, one team in your organization might be responsible for training models, and a different team is responsible for deploying and updating models.

Deploy a Model from the Registry (SageMaker SDK)

To deploy a model version using the Amazon SageMaker Python SDK, use the code snippet that matches your SDK version:

SageMaker Python SDK v3
from sagemaker.serve import ModelBuilder
# Assumes that role (an IAM execution role ARN) and sagemaker_session are already defined.
# Placeholder ARN: replace it with the ARN of your approved model version.
model_package_arn = 'arn:aws:sagemaker:us-east-2:12345678901:model-package/modeltest/1'
# In V3, deploy a model package through ModelBuilder
model_builder = ModelBuilder(
    model=model_package_arn,
    role_arn=role,
    sagemaker_session=sagemaker_session,
    instance_type='ml.m5.xlarge'
)
model_builder.build()
endpoint = model_builder.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge'
)
SageMaker Python SDK v2 (Legacy)
from sagemaker import ModelPackage
# Assumes that role (an IAM execution role ARN) and sagemaker_session are already defined.
# Placeholder ARN: replace it with the ARN of your approved model version.
model_package_arn = 'arn:aws:sagemaker:us-east-2:12345678901:model-package/modeltest/1'
model = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,
    sagemaker_session=sagemaker_session
)
model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge')
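
Both snippets identify the model version by its ARN, so a malformed ARN only surfaces when the deployment call fails. As an illustrative sketch, the following helper validates a versioned model package ARN up front and splits it into its parts. The names `ModelPackageArn` and `parse_model_package_arn` are hypothetical, and the pattern assumes the usual layout with a 12-digit AWS account ID.

```python
import re
from typing import NamedTuple


class ModelPackageArn(NamedTuple):
    region: str
    account_id: str
    group_name: str
    version: int


# Assumed layout of a versioned model package ARN:
# arn:<partition>:sagemaker:<region>:<12-digit account>:model-package/<group>/<version>
_ARN_PATTERN = re.compile(
    r"^arn:[^:]+:sagemaker:([a-z0-9-]+):(\d{12}):model-package/([^/]+)/(\d+)$"
)


def parse_model_package_arn(arn: str) -> ModelPackageArn:
    """Split a versioned model package ARN, raising ValueError if it is malformed."""
    match = _ARN_PATTERN.match(arn)
    if match is None:
        raise ValueError(f"Not a versioned model package ARN: {arn!r}")
    region, account_id, group_name, version = match.groups()
    return ModelPackageArn(region, account_id, group_name, int(version))
```

Calling `parse_model_package_arn` before `deploy` turns a typo in the account ID or a missing version number into an immediate, local error instead of a failed API call.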

Deploy a Model from the Registry (Boto3)

To deploy a model version using the AWS SDK for Python (Boto3), complete the following steps:

  1. The following code snippet assumes you already created the SageMaker AI Boto3 client sm_client and a model version whose ARN is stored in the variable model_version_arn.

    Create a model object from the model version by calling the create_model API operation. Pass the Amazon Resource Name (ARN) of the model version as part of the Containers for the model object:

    SageMaker Python SDK v3
    from time import gmtime, strftime
    # Assumes that sm_client, role, and model_version_arn are already defined.
    model_name = 'DEMO-modelregistry-model-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
    print("Model name : {}".format(model_name))
    container_list = [{'ModelPackageName': model_version_arn}]
    create_model_response = sm_client.create_model(
        ModelName = model_name,
        ExecutionRoleArn = role,
        Containers = container_list
    )
    print("Model arn : {}".format(create_model_response["ModelArn"]))
    SageMaker Python SDK v2 (Legacy)
    # Setup: creates the role and sm_client that the snippets in this procedure assume.
    import time
    import os
    from sagemaker import get_execution_role, session
    import boto3
    region = boto3.Session().region_name
    role = get_execution_role()
    sm_client = boto3.client('sagemaker', region_name=region)
  2. Create an endpoint configuration by calling create_endpoint_config. The endpoint configuration specifies the number and type of Amazon EC2 instances to use for the endpoint.

    from time import gmtime, strftime
    # Assumes that sm_client and model_name (from the previous step) are already defined.
    endpoint_config_name = 'DEMO-modelregistry-EndpointConfig-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
    print(endpoint_config_name)
    create_endpoint_config_response = sm_client.create_endpoint_config(
        EndpointConfigName = endpoint_config_name,
        ProductionVariants=[{
            'InstanceType':'ml.m4.xlarge',
            'InitialVariantWeight':1,
            'InitialInstanceCount':1,
            'ModelName':model_name,
            'VariantName':'AllTraffic'}])
  3. Create the endpoint by calling create_endpoint.

    SageMaker Python SDK v3
    from time import gmtime, strftime
    # Assumes that sm_client and endpoint_config_name (from the previous step) are already defined.
    endpoint_name = 'DEMO-modelregistry-endpoint-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
    print("EndpointName={}".format(endpoint_name))
    create_endpoint_response = sm_client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name)
    print(create_endpoint_response['EndpointArn'])
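
The three steps above can be combined into a single helper. The following is a minimal sketch, not the documented procedure: the function name `deploy_model_version`, the default instance type, and the name prefix are assumptions chosen for illustration. Because `create_endpoint` returns before the endpoint is ready, the sketch also waits with the Boto3 `endpoint_in_service` waiter.

```python
from time import gmtime, strftime
from typing import Any


def deploy_model_version(
    sm_client: Any,
    model_version_arn: str,
    role: str,
    instance_type: str = "ml.m5.xlarge",
    prefix: str = "DEMO-modelregistry",
) -> str:
    """Create the model, endpoint configuration, and endpoint; return the endpoint name."""
    suffix = strftime("%Y-%m-%d-%H-%M-%S", gmtime())
    model_name = f"{prefix}-model-{suffix}"
    endpoint_config_name = f"{prefix}-EndpointConfig-{suffix}"
    endpoint_name = f"{prefix}-endpoint-{suffix}"

    # Step 1: create a model object that points at the registered model version.
    sm_client.create_model(
        ModelName=model_name,
        ExecutionRoleArn=role,
        Containers=[{"ModelPackageName": model_version_arn}],
    )
    # Step 2: describe the instances that will host the model.
    sm_client.create_endpoint_config(
        EndpointConfigName=endpoint_config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 1,
        }],
    )
    # Step 3: create the endpoint, then block until it reaches InService.
    sm_client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name,
    )
    sm_client.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
    return endpoint_name
```

Passing the client in as an argument keeps the helper easy to test with a stub and lets you reuse one client across several deployments.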

Deploy a Model Version from a Different Account

You can permit an AWS account to deploy model versions that were created in a different account by adding a cross-account resource policy. For example, one team in your organization might be responsible for training models, and a different team is responsible for deploying and updating models. When you create these resource policies, you apply the policy to the specific resource to which you want to grant access. For more information about cross-account resource policies in AWS, see Cross-account policy evaluation logic in the AWS Identity and Access Management User Guide.

Note

For cross-account model deployment, you must use a KMS key to encrypt the training output, which you specify in the output data configuration (OutputDataConfig) of the training job.
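
As a small illustration of this requirement, the following sketch builds the `OutputDataConfig` argument that `create_training_job` accepts, with the `KmsKeyId` field set. The helper name `encrypted_output_data_config` is hypothetical, and the check on the S3 URI is an added safeguard rather than something the API requires you to do yourself.

```python
from typing import Dict


def encrypted_output_data_config(s3_output_path: str, kms_key_id: str) -> Dict[str, str]:
    """Build an OutputDataConfig that encrypts training output with a KMS key."""
    if not s3_output_path.startswith("s3://"):
        raise ValueError("s3_output_path must be an s3:// URI")
    if not kms_key_id:
        raise ValueError("kms_key_id is required for cross-account deployment")
    return {"S3OutputPath": s3_output_path, "KmsKeyId": kms_key_id}
```

You would pass the result as `OutputDataConfig=` when calling `sm_client.create_training_job`, and later grant the deploying account access to the same key.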

To enable cross-account model deployment in SageMaker AI, you have to provide a cross-account resource policy for the Model Group that contains the model versions you want to deploy, the Amazon ECR repository where the inference image for the Model Group resides, and the Amazon S3 bucket where the model versions are stored.

To be able to deploy a model that was created in a different account, you must have a role that has access to SageMaker AI actions, such as a role with the AmazonSageMakerFullAccess managed policy. For information about SageMaker AI managed policies, see AWS managed policies for Amazon SageMaker AI.
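
To check this prerequisite programmatically, you could inspect the managed policies attached to the deployment role. The sketch below assumes an IAM Boto3 client and the standard `aws` partition. The function name `has_sagemaker_full_access` is hypothetical, and the check only covers directly attached managed policies, so a role that gets equivalent permissions from an inline or custom policy would still return `False`.

```python
from typing import Any

# Managed policy ARN in the standard "aws" partition (assumption: not aws-cn or aws-us-gov).
SAGEMAKER_FULL_ACCESS_ARN = "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"


def has_sagemaker_full_access(iam_client: Any, role_name: str) -> bool:
    """Return True if AmazonSageMakerFullAccess is attached directly to the role."""
    paginator = iam_client.get_paginator("list_attached_role_policies")
    for page in paginator.paginate(RoleName=role_name):
        for policy in page.get("AttachedPolicies", []):
            if policy["PolicyArn"] == SAGEMAKER_FULL_ACCESS_ARN:
                return True
    return False
```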

The following example creates cross-account policies for all three of these resources, and applies the policies to the resources. The example also assumes that you previously defined the following variables:

  • bucket – The Amazon S3 bucket where the model versions are stored.

  • kms_key_id – The KMS key used to encrypt the training output.

  • sm_client – A SageMaker AI Boto3 client.

  • model_package_group_name – The Model Group to which you want to grant cross-account access.

  • model_package_group_arn – The Model Group ARN to which you want to grant cross-account access.
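
The example code also refers to `account` and `region`, which are not in this list. One possible way to derive them, together with `model_package_group_arn`, is sketched below. It assumes you pass in AWS STS and SageMaker AI Boto3 clients created in the account that owns the Model Group; the function name `resolve_registry_context` is hypothetical.

```python
from typing import Any, Dict


def resolve_registry_context(
    sts_client: Any, sm_client: Any, model_package_group_name: str
) -> Dict[str, str]:
    """Look up the account ID, Region, and ARN for a Model Group."""
    # The account that the current credentials belong to.
    account = sts_client.get_caller_identity()["Account"]
    group = sm_client.describe_model_package_group(
        ModelPackageGroupName=model_package_group_name
    )
    model_package_group_arn = group["ModelPackageGroupArn"]
    # ARN layout: arn:<partition>:sagemaker:<region>:<account>:model-package-group/<name>
    region = model_package_group_arn.split(":")[3]
    return {
        "account": account,
        "region": region,
        "model_package_group_arn": model_package_group_arn,
    }
```

With real clients, this would be called as `resolve_registry_context(boto3.client("sts"), sm_client, model_package_group_name)`.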

SageMaker Python SDK v3
import json
import boto3
# Also assumes that account (the ID of the account that owns these resources)
# and region are already defined.
# The cross-account id to grant access to
cross_account_id = "123456789012"
# Create the policy for access to the ECR repository
ecr_repository_policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Sid': 'AddPerm',
        'Effect': 'Allow',
        'Principal': {
            'AWS': f'arn:aws:iam::{cross_account_id}:root'
        },
        'Action': ['ecr:*']
    }]
}
# Convert the ECR policy from JSON dict to string
ecr_repository_policy = json.dumps(ecr_repository_policy)
# Set the new ECR policy
ecr = boto3.client('ecr')
response = ecr.set_repository_policy(
    registryId = account,
    repositoryName = 'decision-trees-sample',
    policyText = ecr_repository_policy
)
# Create a policy for accessing the S3 bucket
bucket_policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Sid': 'AddPerm',
        'Effect': 'Allow',
        'Principal': {
            'AWS': f'arn:aws:iam::{cross_account_id}:root'
        },
        'Action': 's3:*',
        'Resource': f'arn:aws:s3:::{bucket}/*'
    }]
}
# Convert the policy from JSON dict to string
bucket_policy = json.dumps(bucket_policy)
# Set the new policy
s3 = boto3.client('s3')
response = s3.put_bucket_policy(
    Bucket = bucket,
    Policy = bucket_policy)
# Create the KMS grant for encryption in the source account to the
# Model Registry account Model Group
client = boto3.client('kms')
response = client.create_grant(
    # KMS expects the grantee as a principal ARN, not a bare account ID
    GranteePrincipal=f'arn:aws:iam::{cross_account_id}:root',
    KeyId=kms_key_id,
    Operations=[
        'Decrypt',
        'GenerateDataKey',
    ],
)
# Create a policy for access to the Model Group
model_package_group_policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Sid': 'AddPermModelPackageGroup',
        'Effect': 'Allow',
        'Principal': {
            'AWS': f'arn:aws:iam::{cross_account_id}:root'
        },
        'Action': ['sagemaker:DescribeModelPackageGroup'],
        'Resource': f'arn:aws:sagemaker:{region}:{account}:model-package-group/{model_package_group_name}'
    },{
        'Sid': 'AddPermModelPackageVersion',
        'Effect': 'Allow',
        'Principal': {
            'AWS': f'arn:aws:iam::{cross_account_id}:root'
        },
        'Action': ["sagemaker:DescribeModelPackage",
                   "sagemaker:ListModelPackages",
                   "sagemaker:UpdateModelPackage",
                   "sagemaker:CreateModel"],
        'Resource': f'arn:aws:sagemaker:{region}:{account}:model-package/{model_package_group_name}/*'
    }]
}
# Convert the policy from JSON dict to string
model_package_group_policy = json.dumps(model_package_group_policy)
# Set the policy to the Model Group
response = sm_client.put_model_package_group_policy(
    ModelPackageGroupName = model_package_group_name,
    ResourcePolicy = model_package_group_policy)
print('ModelPackageGroupArn : {}'.format(model_package_group_arn))
print("Success! You are all set to proceed for cross-account deployment.")
SageMaker Python SDK v2 (Legacy)
import json
import boto3
# The Model Registry account id of the Model Group
model_registry_account = "111111111111"
# The model training account id where training happens
model_training_account = "222222222222"
# 1. Create a policy for access to the ECR repository
# in the model training account for the Model Registry account Model Group
ecr_repository_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AddPerm",
        "Effect": "Allow",
        "Principal": {
            "AWS": f"arn:aws:iam::{model_registry_account}:root"
        },
        "Action": [
            "ecr:BatchGetImage",
            "ecr:Describe*"
        ]
    }]
}
# Convert the ECR policy from JSON dict to string
ecr_repository_policy = json.dumps(ecr_repository_policy)
# Set the new ECR policy
ecr = boto3.client('ecr')
response = ecr.set_repository_policy(
    registryId = model_training_account,
    repositoryName = "decision-trees-sample",
    policyText = ecr_repository_policy
)
# 2. Create a policy in the model training account for access to the S3 bucket
# where the model is present in the Model Registry account Model Group
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AddPerm",
        "Effect": "Allow",
        "Principal": {
            "AWS": f"arn:aws:iam::{model_registry_account}:root"
        },
        "Action": [
            "s3:GetObject",
            "s3:GetBucketAcl",
            "s3:GetObjectAcl"
        ],
        # Object-level actions apply to the objects, bucket-level actions to the bucket
        "Resource": [
            f"arn:aws:s3:::{bucket}/*",
            f"arn:aws:s3:::{bucket}"
        ]
    }]
}
# Convert the S3 policy from JSON dict to string
bucket_policy = json.dumps(bucket_policy)
# Set the new bucket policy
s3 = boto3.client("s3")
response = s3.put_bucket_policy(
    Bucket = bucket,
    Policy = bucket_policy)
# 3. Create the KMS grant for the key used during training for encryption
# in the model training account to the Model Registry account Model Group
client = boto3.client("kms")
response = client.create_grant(
    # KMS expects the grantee as a principal ARN, not a bare account ID
    GranteePrincipal=f"arn:aws:iam::{model_registry_account}:root",
    KeyId=kms_key_id,
    Operations=[
        "Decrypt",
        "GenerateDataKey",
    ],
)
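
After a resource policy has been applied to a Model Group, you may want to confirm which principals it actually allows. The following sketch reads the policy back with `get_model_package_group_policy` and lists the AWS principals named in its `Allow` statements. The function name `allowed_principals_in_group_policy` is hypothetical, and the sketch assumes the policy uses the `Principal: {"AWS": ...}` form shown in the examples above.

```python
import json
from typing import Any, List


def allowed_principals_in_group_policy(sm_client: Any, model_package_group_name: str) -> List[str]:
    """Return the sorted AWS principals that the Model Group policy allows."""
    response = sm_client.get_model_package_group_policy(
        ModelPackageGroupName=model_package_group_name
    )
    # The API returns the policy as a JSON string.
    policy = json.loads(response["ResourcePolicy"])
    principals = set()
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        if not isinstance(principal, dict):
            continue  # for example, a wildcard "*" principal
        aws = principal.get("AWS", [])
        # A single principal may be a plain string instead of a list.
        principals.update([aws] if isinstance(aws, str) else aws)
    return sorted(principals)
```

If the account you granted access to does not appear in the result, the deploying account will not be able to describe or deploy the model versions in that group.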