AI‑Kubeflow Pipeline‑2

Training, Evaluation & Model Selection · Kubernetes‑Native MLOps

Production‑style ML training and evaluation pipelines using Kubeflow Pipelines with containerized components, artifact lineage, and metric‑driven model selection (KFP v2, MinIO, MLMD).

Project Summary

MLOps / Platform Engineering

Category

AI + MLOps + Platform Engineering (Kubernetes / Kubeflow)

Industry

Cross‑industry Enterprise AI Platforms / MLOps Infrastructure

Domain

Model Training Engineering / Pipeline Orchestration

Key Technologies & Concepts

Kubeflow native keywords

MLOps keywords

Kubeflow Pipelines (KFP v2) · Pipeline as Code · Containerized ML components · Training pipelines (LR, DT) · Evaluation metrics (accuracy) · Experiment tracking · Artifact lineage (MLMD) · Model comparison · MinIO artifact store · Argo Workflows

Problem & Objective

Why this project exists

Problems Solved

  • Notebook‑driven training lacks reproducibility & governance
  • Manual execution → inconsistent environments, fragmented metrics
  • Limited visibility into experiment lineage and outcomes

Primary Objective

  • Reproducible, pipeline‑driven ML training & evaluation on Kubeflow
  • Containerized components + standardized artifact storage + metric‑driven model selection

Solution & Architecture

Kubeflow pipeline flow

Orchestration overview

A Kubeflow Pipelines workflow orchestrates containerized components: data ingestion → training (Logistic Regression / Decision Tree) → evaluation → metric‑based comparison.

Flow: Data → Train (LR, DT) → Evaluate → Compare Metrics → (conditional selection)
Kubeflow DAG (KFP v2)
1. Data ingestion
2. Train LR
3. Train DT
4. Evaluate
5. Compare / select

Components

  • Download data (component)
  • Training (scikit‑learn containers)
  • Evaluation metrics (accuracy)
  • Kubeflow Pipelines orchestration
  • MinIO artifact storage

Platform & services

  • Kubernetes + Kubeflow (KFP v2)
  • Argo Workflows (execution engine)
  • Docker (component images)
  • MinIO + ML Metadata Store
  • RBAC / K8s secrets

Skills & Technologies

Proficiencies demonstrated

Primary skills

  • Kubeflow Pipelines (advanced)
  • Kubernetes‑Native MLOps (advanced)
  • scikit‑learn (model training)
  • Docker / containerization
  • YAML component specs

Secondary tools

  • Argo Workflows
  • MinIO, ML Metadata
  • Python (core)
  • Kubeflow UI / SDK
  • Docker Hub

Kubeflow DevOps — Architecture & YAML mapping

Reference: Pipeline‑2 modelling

Architecture Block · Kubeflow CI/CD / MLOps Construct (Pipeline‑2)

  • Source Repository: GitHub (Kubeflow modeling / pipelines repo)
  • Source Trigger: Manual (Kubeflow UI/SDK) or CI trigger (GitHub Actions)
  • CI Runner: GitHub Actions Linux runner (optional, for pipeline compilation)
  • Build / Pipeline Execution: Kubeflow Pipelines (KFP v2: Data → Train → Evaluate → Condition)
  • Training Orchestration: KFP v2 on Kubernetes (Argo Workflows)
  • Data Processing: Kubeflow pipeline component (Python + sklearn)
  • Model Evaluation: Python + sklearn evaluation; accuracy metric
  • Artifact Storage: MinIO (S3‑compatible, for datasets, models, metrics)
  • Container Registry: Docker Hub (versioned component images)
  • Model Registry: KFP run history + ML Metadata Store + MinIO artifact versions
  • Approval Gate: Pipeline condition (metric threshold gate)
  • Security & Auth: Kubernetes ServiceAccounts + RBAC
  • Secrets / Config: Kubernetes Secrets + environment variables
  • Monitoring & Logs: Kubeflow Pipelines UI + pod logs
  • Lineage & Governance: ML Metadata Store (inputs/outputs, lineage)
  • Infrastructure Backend: Self‑managed Kubernetes (Minikube/EKS/AKS/GKE) + Kubeflow manifests

Pipeline‑2 implements Kubernetes‑native ML training and evaluation as an open‑source equivalent of Vertex AI Pipelines.

Challenges & Resolutions

Technical hurdles

Key challenges

  • Packaging training code into reproducible container images → standardised Dockerfiles per component
  • Wiring artifact paths between components → Kubeflow artifact inputs/outputs
  • Consistent metric logging across variants → centralised accuracy logging
  • Experiment lineage → ML Metadata Store + KFP run history
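Stripped of the KFP plumbing, the centralised comparison step reduces to picking the variant with the best logged accuracy. The function name and run values below are illustrative:

```python
# Plain-Python core of the "Compare / select" step: choose the model
# variant whose logged accuracy is highest.
def select_best(accuracies: dict) -> str:
    """Return the name of the variant with the highest accuracy."""
    if not accuracies:
        raise ValueError("no evaluated variants to compare")
    return max(accuracies, key=accuracies.get)

# Illustrative metric values, not results from the repo:
runs = {"logistic_regression": 0.91, "decision_tree": 0.88}
best = select_best(runs)
print(best)  # prints "logistic_regression"
```

In the pipeline this comparison consumes the accuracy outputs of the evaluation steps, so the selection is reproducible from run history rather than from ad-hoc notebook state.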

Assets & References

Code, diagrams, study material

GitHub Repository

Kubeflow Pipelines MLOps project: training, evaluation, and metric‑driven model selection.


MLOps Study Resources

Kubeflow Pipelines – restricted & public materials


Note: Additional project materials (KFP v2 YAML, component specs) are shared selectively for interview / evaluation.

Study Material – Kubeflow Pipelines v2

  • KFP official documentation + YAML guide – downloadable PDF (public): KFP v2, Python SDK
  • Restricted: KFP Pipeline‑2 component specs – access limited to authorised users (container definitions, metrics)
  • Architecture diagrams (training & evaluation) – high‑res: Data → Train (LR/DT) → Evaluate → Compare
  • ML Metadata & lineage best practices – MLMD + MinIO artifact tracking
  • Kubeflow + Argo Workflows deep dive – execution engine & pod isolation