AWS AI Deployment Pipeline

Registry‑Driven Endpoint Promotion · Pipeline‑3

Automated promotion of approved models from SageMaker Model Registry to real‑time endpoints using AWS CDK and CI/CD (GitHub Actions + OIDC). Supports three environments (dev, pre‑prod, prod) with least‑privilege IAM, KMS encryption, and traffic management.

Project Summary

MLOps + Cloud · cross‑industry

Category

MLOps + Cloud · AI Platform Engineering

Domain

Model Deployment & Serving · Enterprise AI

Method

CDK (IaC) · GitHub Actions · SageMaker

Key Technologies

AWS + CI/CD stack

AWS CDK · SageMaker Endpoints · Model Registry · EndpointConfig · GitHub Actions · OIDC → IAM · CloudFormation · S3 / ECR · KMS encryption · Multi‑env · CloudWatch · YAML config

Problem & Objective

Problem

  • Manual model deployment → inconsistent environments, security risks, slow promotion, lack of governance.

Objective

  • Secure, repeatable, registry‑driven promotion to SageMaker endpoints via CDK + CI/CD (GitHub Actions + OIDC).

Solution & Architecture

Overview

A CDK pipeline fetches the latest Approved model package from SageMaker Model Registry and provisions a Model, EndpointConfig, and Endpoint in Dev, Pre‑Prod, and Prod. Deployments are triggered by GitHub Actions, which authenticates to AWS via OIDC. Endpoint settings (instance type, traffic weight) are YAML‑driven. This completes the lifecycle started in Pipeline‑2.
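The registry lookup can be sketched as below. This is a minimal illustration, not the repository's code: the function name `latest_approved_package_arn` is hypothetical, and the client is passed in so the logic can be exercised without AWS credentials. The request parameters and the `ModelPackageSummaryList` response key follow the SageMaker `ListModelPackages` API.

```python
from typing import Any, Optional


def latest_approved_package_arn(sm_client: Any, group_name: str) -> Optional[str]:
    """Return the ARN of the newest Approved package in a group, or None.

    `sm_client` is expected to behave like boto3.client("sagemaker").
    """
    response = sm_client.list_model_packages(
        ModelPackageGroupName=group_name,
        ModelApprovalStatus="Approved",  # ignore Pending / Rejected versions
        SortBy="CreationTime",
        SortOrder="Descending",          # newest first
        MaxResults=1,
    )
    summaries = response.get("ModelPackageSummaryList", [])
    return summaries[0]["ModelPackageArn"] if summaries else None
```

With boto3 installed, a call would look like `latest_approved_package_arn(boto3.client("sagemaker"), "my-model-group")`, where the group name is a placeholder. Returning `None` when nothing is approved lets the caller fail the deployment early instead of promoting an unapproved version.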

GitHub → OIDC → IAM → CDK → CloudFormation → SageMaker Endpoint

  1. GitHub Actions
  2. CDK synth
  3. Model Registry
  4. EndpointConfig
  5. Real‑time Endpoint

Components

  • AWS CDK (Python)
  • SageMaker Model Registry
  • Endpoint + EndpointConfig
  • IAM / KMS / S3 / ECR
  • CloudFormation backend

Scalability & reliability

  • Managed SageMaker endpoints (HA)
  • Config‑driven instance sizing
  • CloudFormation rollback
  • Multi‑environment isolation

MLOps & Automation

AI/ML type

MLOps Deployment / Model Serving Automation

Pipeline automation

Pull latest Approved model → create Model + EndpointConfig + Endpoint → CDK deployment

CI/CD & containerisation

  • GitHub Actions (orchestration) + OIDC → IAM
  • AWS CDK (deployment as code)
  • Docker/ECR for custom inference containers

Monitoring & optimisation

  • CloudWatch metrics & logs for endpoints
  • GitHub Actions + CloudFormation logs
  • YAML‑driven instance tuning
  • Cost guardrails via fixed defaults
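One way to combine YAML‑driven tuning with fixed defaults as a cost guardrail is sketched below. The key names (`instance_type`, `instance_count`, `variant_weight`), the default values, the allowlist, and the instance cap are all assumptions made for illustration; the project's actual YAML schema is not shown here. The input is a plain dict, as would be produced by a YAML parser such as PyYAML's `yaml.safe_load`.

```python
from typing import Any, Dict, Mapping

# Hypothetical guardrails: small fixed defaults plus an allowlist and a cap.
DEFAULTS: Dict[str, Any] = {
    "instance_type": "ml.m5.large",
    "instance_count": 1,
    "variant_weight": 1.0,
}
ALLOWED_INSTANCE_TYPES = {"ml.t2.medium", "ml.m5.large", "ml.m5.xlarge"}
MAX_INSTANCE_COUNT = 4


def resolve_endpoint_config(config: Mapping[str, Any], env: str) -> Dict[str, Any]:
    """Merge one environment's overrides onto the defaults and validate them."""
    if env not in config:
        raise KeyError(f"no endpoint settings for environment {env!r}")
    # An empty YAML mapping parses to None, so fall back to an empty dict.
    settings = {**DEFAULTS, **(config[env] or {})}
    if settings["instance_type"] not in ALLOWED_INSTANCE_TYPES:
        raise ValueError(f"instance type not allowed: {settings['instance_type']}")
    if not 1 <= settings["instance_count"] <= MAX_INSTANCE_COUNT:
        raise ValueError(f"instance count out of range: {settings['instance_count']}")
    if settings["variant_weight"] < 0:
        raise ValueError("variant weight must be non-negative")
    return settings
```

Validating at synth time means an oversized instance request fails in CI before CloudFormation creates anything.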

Skills & Technologies

Primary skills

  • MLOps Deployment / Model Serving – Advanced
  • AWS CDK (IaC) – Advanced
  • SageMaker Endpoints – Advanced
  • CI/CD GitHub Actions – Advanced

Secondary tools

  • AWS IAM, KMS, S3
  • CloudFormation
  • Docker / ECR

Languages

Python (primary), YAML (CI/CD & config)

Challenges & Resolutions

Challenges

  • Registry‑driven without hardcoding versions
  • Secure CI/CD auth for prod
  • Environment‑specific config
  • IAM least‑privilege scoping

Resolutions

  • Lookup latest Approved model
  • OIDC‑based IAM roles for GitHub
  • YAML‑driven endpoint configs
  • Least‑privilege IAM + KMS encryption
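The OIDC resolution can be illustrated with the trust policy that lets a GitHub Actions job assume an IAM role without long‑lived keys. The sketch below builds that policy as a Python dict; the function name, account ID, repository, and environment name are placeholders, not values from the project. The provider host, the `aud` value, and the `sub` claim format follow GitHub's documented OIDC integration with AWS.

```python
import json
from typing import Any, Dict

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id: str, repo: str, environment: str) -> Dict[str, Any]:
    """Trust policy allowing one repo + GitHub environment to assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Token must be issued for AWS STS ...
                        f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com",
                        # ... and come from this repo's named environment only.
                        f"{GITHUB_OIDC_HOST}:sub": f"repo:{repo}:environment:{environment}",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    policy = github_oidc_trust_policy("111122223333", "example-org/example-repo", "prod")
    print(json.dumps(policy, indent=2))
```

Pinning the `sub` claim to a single environment is what keeps a dev workflow from assuming the prod role; a separate role per environment would each get its own policy.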

CI/CD · AWS Mapping

Pipeline‑3 constructs

Architecture Block | AWS CI/CD Construct (Pipeline‑3)
Source Repository | GitHub (deployment/IaC repo)
Trigger | GitHub Actions (manual / approval)
CI Runner | GitHub Linux runner (ubuntu‑latest)
Orchestration | AWS CDK (synth / deploy)
Infra Backend | AWS CloudFormation
Model Source | SageMaker Model Registry (approved packages)
Model Packaging | SageMaker Model (CfnModel from Registry)
Endpoint Configuration | SageMaker EndpointConfig (instance, weight)
Deployment Target | SageMaker Real‑time Endpoint (Dev/Pre‑Prod/Prod)
Artifact Storage | Amazon S3 (model artifacts)
Container Registry | Amazon ECR (custom images)
Security & Auth | OIDC (GitHub→AWS) + IAM roles

Process flow: GitHub → GitHub Actions → OIDC → IAM → CDK → CloudFormation → Model/EndpointConfig → Real‑time Endpoint.
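The last two hops of this flow (Model/EndpointConfig → Endpoint) correspond to three CloudFormation resources chained by references. The sketch below builds that fragment as a plain dict to show the dependency order; it approximates what a CDK stack using `CfnModel`, `CfnEndpointConfig`, and `CfnEndpoint` might synthesize and is not taken from the project. The function name, logical IDs, endpoint naming scheme, and the `variant` keys are assumptions.

```python
from typing import Any, Dict, Mapping


def endpoint_resources(
    env: str,
    model_package_arn: str,
    execution_role_arn: str,
    kms_key_id: str,
    variant: Mapping[str, Any],
) -> Dict[str, Any]:
    """Return Model -> EndpointConfig -> Endpoint as CloudFormation resources."""
    return {
        "Model": {
            "Type": "AWS::SageMaker::Model",
            "Properties": {
                "ExecutionRoleArn": execution_role_arn,
                # Registry-driven: the container comes from the model package.
                "Containers": [{"ModelPackageName": model_package_arn}],
            },
        },
        "EndpointConfig": {
            "Type": "AWS::SageMaker::EndpointConfig",
            "Properties": {
                "KmsKeyId": kms_key_id,  # encrypts the instance storage volume
                "ProductionVariants": [
                    {
                        "VariantName": "AllTraffic",
                        "ModelName": {"Fn::GetAtt": ["Model", "ModelName"]},
                        "InstanceType": variant["instance_type"],
                        "InitialInstanceCount": variant["instance_count"],
                        "InitialVariantWeight": variant["variant_weight"],
                    }
                ],
            },
        },
        "Endpoint": {
            "Type": "AWS::SageMaker::Endpoint",
            "Properties": {
                "EndpointName": f"{env}-endpoint",  # hypothetical naming scheme
                "EndpointConfigName": {
                    "Fn::GetAtt": ["EndpointConfig", "EndpointConfigName"]
                },
            },
        },
    }
```

Because each resource refers to the previous one through `Fn::GetAtt`, CloudFormation infers the creation order and can roll the chain back if endpoint creation fails.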

Assets & References

GitHub Repository

https://github.com/03sarah/mlops-cdk-github-action

Study Material – Pipeline‑3

  • Official CDK & YAML reference: downloadable PDF (public)
  • CDK file specific (restricted): detailed CDK constructs, IAM policies (PDF)
  • Architecture diagrams (P3): GitHub → OIDC → CDK → CloudFormation flow (PDF)
  • SageMaker EndpointConfig YAML examples: instance types, traffic weights, env vars (PDF)