MLOps

BTP Service

SAP AI Core

The enterprise MLOps and model serving platform on SAP BTP. Manages the full AI lifecycle — from Git-synced training pipelines through auto-scaled inference endpoints — and serves as the execution engine for SAP Generative AI Hub and SAP Joule.

Overview

SAP AI Core is the foundational AI infrastructure service on SAP BTP that provides managed compute for training machine learning models and serving inference endpoints at enterprise scale. It is built on Kubernetes and uses Argo Workflows as the pipeline engine, enabling declarative, Git-driven ML pipelines that teams treat as code.

Multi-tenancy is achieved through Resource Groups — isolated compute, storage, secrets, and model registries per tenant or environment. A single AI Core service instance can host dozens of independent Resource Groups, making it suitable for platform teams building shared ML infrastructure for multiple business units.

All operations are available through the AI API (a unified REST control plane) and the SAP AI SDK for Python and JavaScript. The visual interface — SAP AI Launchpad — provides data scientists with a web GUI over the same capabilities.

GitOps ML Pipelines

Argo Workflow YAML stored in Git, auto-synced to AI Core

Resource Group Isolation

Separate compute, storage, and registry per tenant

Auto-Scaled Serving

Inference endpoints with health monitoring and canary rollouts

Artefact Store

SAP-managed S3-compatible object storage per Resource Group

Model Registry

Versioned model artefact lifecycle with metadata tracking

Gen AI Hub Hosting

20+ foundation models via SAP-managed endpoints

Runtime Architecture

SAP AI Core — Runtime Architecture

Control Plane

AI API + AI Launchpad manage Resource Groups, applications, executions, and deployments via REST. All state is persisted in SAP-managed infrastructure.

Data Plane

Argo Workflow engine runs training pipelines and manages inference serving pods on Kubernetes. Each Resource Group is a Kubernetes namespace.

Gen AI Hub Layer

Foundation model router and Orchestration Service run as a tenant within AI Core. Prompts never leave SAP infrastructure — data sovereignty at the network layer.

Core Concepts

Resource Group

The isolation unit in AI Core. Each Resource Group has dedicated compute quotas, a separate Artefact Store (object storage), its own model registry, and isolated secrets, which prevents cross-tenant data access.

Application (GitOps)

A registered Git repository containing Argo Workflow YAML templates. AI Core polls the repository (or receives webhooks) and registers all discovered WorkflowTemplate objects as executable pipelines.

Execution (Training Run)

An invocation of a registered workflow template with specific input parameters and artefact bindings. Executions run on managed Kubernetes compute and write output artefacts to the Artefact Store.

Artefact

A versioned file or directory stored in the Artefact Store. Input artefacts (datasets, feature stores) are mounted read-only. Output artefacts (trained models, evaluation reports) are written by executions.

Configuration

A named parameter set that binds a workflow template to specific input values. Configurations enable reproducible executions — the same template run with different hyperparameters or dataset versions.

Deployment

A running inference container that exposes a versioned model as an HTTPS REST endpoint. Deployments auto-scale replicas, report health status, and support graceful rolling updates.
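As a minimal sketch of how a Configuration binds a template to concrete values, the helper below assembles the JSON body for a configuration request. The field names follow the REST example later in this document; the helper name build_configuration_payload is hypothetical and not part of any SAP SDK.

```python
import json


def build_configuration_payload(name, executable_id, scenario_id,
                                parameters, input_artifacts):
    """Assemble a configuration body (hypothetical helper, not an SDK call)."""
    return {
        "name": name,
        "executableId": executable_id,
        "scenarioId": scenario_id,
        # The REST example in this document passes values as strings,
        # so numbers are coerced to strings here.
        "parameterBindings": [
            {"key": key, "value": str(value)}
            for key, value in parameters.items()
        ],
        "inputArtifactBindings": [
            {"key": key, "artifactId": artifact_id}
            for key, artifact_id in input_artifacts.items()
        ],
    }


payload = build_configuration_payload(
    name="fraud-detection-v2-config",
    executable_id="fraud-detection-training",
    scenario_id="fraud-detection-scenario",
    parameters={"learning_rate": 0.001, "batch_size": 64, "epochs": 100},
    input_artifacts={"training-data": "<dataset-artefact-id>"},
)
print(json.dumps(payload, indent=2))
```

Keeping the payload construction in one function makes it easy to create several configurations from the same template that differ only in hyperparameters or dataset version.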

Multi-Tenancy via Resource Groups

Resource Group Isolation Model

ISV and platform team pattern: Each customer or business unit gets a dedicated Resource Group. The platform team administers the AI Core service instance; tenants interact only within their own Resource Group boundary. Compute quotas are set at the Resource Group level.
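The tenant-per-Resource-Group pattern can be sketched as a small mapping layer in the calling application. The naming convention (environment prefix plus tenant slug) and both function names are assumptions for illustration; only the AI-Resource-Group header name is taken from the examples in this document.

```python
import re


def resource_group_id_for(tenant: str, environment: str = "production") -> str:
    """Derive an ID such as 'production-team-a' (hypothetical convention)."""
    slug = re.sub(r"[^a-z0-9]+", "-", tenant.lower()).strip("-")
    if not slug:
        raise ValueError("tenant name has no usable characters")
    return f"{environment}-{slug}"


def tenant_headers(token: str, tenant: str) -> dict:
    """Build request headers that scope a call to the tenant's Resource Group."""
    return {
        "Authorization": f"Bearer {token}",
        "AI-Resource-Group": resource_group_id_for(tenant),
    }


headers = tenant_headers("<token>", "Team A")
print(headers["AI-Resource-Group"])  # production-team-a
```

Deriving the Resource Group ID in one place keeps tenant code from ever addressing another tenant's boundary by accident.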

ML Pipeline Lifecycle

Training Pipeline — GitOps to Inference Endpoint

MLOps Lifecycle Phases

Template Authoring

Write Argo Workflow YAML. Define container images, input parameters, artefact paths, and resource requirements.

Git + YAML

Application Registration

Register the Git repository as an application so that AI Core syncs its workflow templates.

AI API / SDK

Configuration

Create named configurations that bind templates to parameter values and input artefact IDs.

AI API / SDK

Execution

Trigger training runs via API. Argo Workflows orchestrates containers on Kubernetes-managed compute.

AI API / SDK

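Model Registration

Register the output artefact of a completed execution as a model so that a serving configuration can reference it.

AI API / SDK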

Deployment

Create a deployment from a model configuration. AI Core starts the serving pod and returns the inference URL.

AI API / SDK

Inference

Client applications call the inference URL with an Authorization header and the AI-Resource-Group header. AI Core routes the request to the running deployment.

HTTPS REST

SAP AI Launchpad

SAP AI Launchpad is a separate BTP subscription that provides a visual web interface over the same AI Core control plane. It is designed for data scientists and ML engineers who prefer a GUI for exploring executions, comparing run metrics, monitoring deployments, and inspecting logs — without writing API code.

AI Launchpad Capabilities

Execution Explorer: Browse training runs, filter by scenario, compare metric outputs across runs
Deployment Manager: Monitor inference deployments, view health status, scale replicas
Log Viewer: Real-time and historical log streaming from executions and serving pods
Model Registry: Browse versioned models, inspect artefact metadata, promote to deployment
ML Operations: Manage configurations, register artefacts, trigger executions via UI
Gen AI Hub UI: Test foundation model prompts, manage prompt templates, view token usage

When to Use AI Launchpad vs. AI API

MLOps Engineer / Automator

AI API + SAP AI SDK — programmatic, CI/CD integration, scripted pipelines

Data Scientist

AI Launchpad — visual exploration, run comparison, interactive log review

Prompt Engineer

AI Launchpad Gen AI Hub UI — interactive prompt testing and template management

Platform Administrator

AI API — Resource Group provisioning, quota management, bulk operations

Inference Deployment Lifecycle

Deployment State Machine

Auto-Scaling

AI Core adjusts replica count based on CPU/memory utilisation and request queue depth. Scale-to-zero is supported for dev/test Resource Groups.

Health Monitoring

Kubernetes liveness and readiness probes. If a deployment enters DEAD state, the restart policy re-schedules the pod automatically.

Canary Rollouts

Deploy a new model version alongside the current one. Route a configurable percentage of traffic to the new version before full cut-over.
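To make the idea of a percentage-based traffic split concrete, here is a minimal sketch of deterministic canary routing. It is an illustration of the general technique, for example in a client or gateway that fronts two deployments, and not a description of how AI Core routes traffic internally. The function name and the request ID format are hypothetical.

```python
import hashlib


def choose_version(request_id: str, canary_percent: int) -> str:
    """Return 'canary' for roughly canary_percent of request IDs, else 'stable'."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the request ID to a stable bucket in the range 0..99.
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


sample = [choose_version(f"req-{i}", 10) for i in range(10_000)]
print(sample.count("canary") / len(sample))
```

Hashing the request ID, rather than drawing a random number, sends the same caller to the same version on every retry, which makes canary results easier to compare.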

SAP AI SDK — Setup

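The examples below import ai_core_sdk, which is published on PyPI as ai-core-sdk and wraps the AI API in an idiomatic Python client. Generative AI Hub clients ship in the separate generative-ai-hub-sdk package. Both authenticate with the OAuth client credentials contained in the BTP service key.

Installation

# Install the SAP AI Core SDK for Python (provides the ai_core_sdk module)
pip install ai-core-sdk

# Optional: Generative AI Hub clients with all extras
pip install "generative-ai-hub-sdk[all]"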

Resource Groups & GitOps Applications

resource_groups.py

from ai_core_sdk.ai_core_v2_client import AICoreV2Client

# Authenticate with the OAuth client credentials from the BTP service key
client = AICoreV2Client(
    base_url="https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2",
    auth_url="https://<your-tenant>.authentication.eu10.hana.ondemand.com/oauth/token",
    client_id="<client-id>",
    client_secret="<client-secret>",
)

# Create a Resource Group for isolated tenant workloads
client.resource_groups.create(resource_group_id="production-team-a")

# Register a Git application (GitOps: pipeline templates sync from the repo)
client.applications.create(
    application_name="fraud-detection-pipelines",
    repository_url="https://github.com/myorg/ai-core-templates",
    revision="main",
    path="/workflows",
)
print("Application registered: fraud-detection-pipelines")

# List the executables (workflow templates synced from Git) for one scenario.
# Method and keyword names can differ between SDK versions; check your installed release.
templates = client.executable.query(
    scenario_id="fraud-detection-scenario",
    resource_group="production-team-a",
)
for t in templates.resources:
    print(f"  Pipeline: {t.name} (scenario: {t.scenario_id})")

Argo Workflow Pipeline Template

Pipeline templates are stored as Argo Workflow YAML in a Git repository. AI Core syncs the repository and registers each WorkflowTemplate as an executable pipeline. The template below defines a training container with GPU support and structured input/output artefact paths.

workflows/training-pipeline.yaml

# workflows/training-pipeline.yaml
# Stored in Git; AI Core syncs this automatically (GitOps)
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: fraud-detection-training
  annotations:
    scenarios.ai.sap.com/description: "Fraud detection model training pipeline"
    scenarios.ai.sap.com/name: "Fraud Detection"
    executables.ai.sap.com/description: "Trains the fraud detection model"
    executables.ai.sap.com/name: "fraud-training"
  labels:
    scenarios.ai.sap.com/id: "fraud-detection-scenario"
    ai.sap.com/version: "1.0.0"
spec:
  entrypoint: fraud-detection-training   # template that Argo runs first
  templates:
    - name: fraud-detection-training
      inputs:
        parameters:
          - name: learning_rate
            default: "0.001"
          - name: batch_size
            default: "64"
          - name: epochs
            default: "50"
        artifacts:
          - name: training-data
            path: /data/train
      outputs:
        artifacts:
          - name: trained-model
            path: /output/model
            archive:
              none: {}
      container:
        image: "<registry>/fraud-trainer:1.0"
        command: [python, train.py]
        args:
          - "--learning-rate={{inputs.parameters.learning_rate}}"
          - "--batch-size={{inputs.parameters.batch_size}}"
          - "--epochs={{inputs.parameters.epochs}}"
          - "--data-path=/data/train"
          - "--output-path=/output/model"
        resources:
          requests:
            memory: "4Gi"
            cpu: "2"
          limits:
            nvidia.com/gpu: "1"   # Request GPU for training

The scenarios.ai.sap.com and executables.ai.sap.com annotations, together with the scenarios.ai.sap.com/id and ai.sap.com/version labels, are the SAP AI Core metadata required for template registration. Without them, AI Core ignores the YAML file during sync.
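Because a template with missing metadata is skipped during sync, a pre-commit check can catch the problem before the push. The sketch below works on an already parsed template (a plain dict) and takes the required annotation prefixes as an argument rather than hard-coding them; the function name and the prefix list passed in the usage example are assumptions to adapt to your AI Core version.

```python
def missing_annotation_prefixes(template: dict, required_prefixes: tuple) -> list:
    """Return the required prefixes that no annotation key starts with."""
    annotations = (template.get("metadata") or {}).get("annotations") or {}
    return [
        prefix
        for prefix in required_prefixes
        if not any(key.startswith(prefix) for key in annotations)
    ]


# A parsed template that only carries scenario metadata.
template = {
    "kind": "WorkflowTemplate",
    "metadata": {
        "name": "fraud-detection-training",
        "annotations": {"scenarios.ai.sap.com/name": "Fraud Detection"},
    },
}

# Assumed prefix list for illustration; pass whatever your setup requires.
required = ("scenarios.ai.sap.com/", "executables.ai.sap.com/")
print(missing_annotation_prefixes(template, required))
```

An empty result means every required prefix is present; a CI job could fail the build on any non-empty list.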

Triggering Training Executions

training_execution.py

import time
from datetime import datetime, timezone

from ai_api_client_sdk.models.status import Status

# Trigger a training execution from an existing configuration.
# Replace the value with the configuration ID returned when the configuration was created.
execution = client.execution.create(
    configuration_id="fraud-detection-v2-config",
    resource_group="production-team-a",
)
print(f"Execution started: {execution.id}, status: {execution.status}")

# Poll for completion (production use: implement proper event loop / webhook)
while True:
    status = client.execution.get(
        execution_id=execution.id,
        resource_group="production-team-a",
    )
    print(f"  Status: {status.status}")
    # status.status is a Status enum member, so compare against the enum, not a string
    if status.status in (Status.COMPLETED, Status.DEAD, Status.STOPPED):
        break
    time.sleep(30)

# Fetch execution logs written since the given start time
logs = client.execution.query_logs(
    execution_id=execution.id,
    resource_group="production-team-a",
    start=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
for log in logs.data.result:
    print(f"[{log.timestamp}] {log.msg}")
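The polling loop above runs forever if the execution never reaches a terminal state. A minimal sketch of a reusable wait helper with a timeout is shown below. The helper name wait_for_status and its parameters are hypothetical; the status source is injected as a callable, so the same function can wrap an SDK call or, as here, a stand-in iterator.

```python
import time


def wait_for_status(get_status, terminal, interval=30.0, timeout=3600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until it returns a terminal value or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in terminal:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(interval)


# Stand-in status source; with the SDK, get_status would wrap the execution lookup.
fake_statuses = iter(["PENDING", "RUNNING", "COMPLETED"])
final = wait_for_status(
    lambda: next(fake_statuses),
    terminal={"COMPLETED", "DEAD", "STOPPED"},
    interval=0,
    timeout=5,
)
print(final)  # COMPLETED
```

Injecting sleep and clock keeps the helper testable without real waiting, and the timeout turns a stuck pipeline into an explicit error instead of a hung job.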

Model Deployment & Inference

deployment.py

import time

import requests
from ai_api_client_sdk.models.status import Status

# Create an inference deployment from a serving configuration.
# Replace the value with the ID of your serving configuration.
deployment = client.deployment.create(
    configuration_id="fraud-detection-serving-config",
    resource_group="production-team-a",
)
print(f"Deployment ID: {deployment.id}")

# Wait for the deployment to reach RUNNING state
while True:
    d = client.deployment.get(
        deployment_id=deployment.id,
        resource_group="production-team-a",
    )
    print(f"  Deployment status: {d.status}")
    if d.status == Status.RUNNING:
        print(f"  Inference URL: {d.deployment_url}")
        break
    if d.status in (Status.DEAD, Status.STOPPED):
        raise RuntimeError(f"Deployment failed: {d.status}")
    time.sleep(15)

# Fetch an OAuth token with the same client credentials used for the SDK client
# (standard client-credentials flow, avoids relying on private SDK helpers)
token_response = requests.post(
    "https://<your-tenant>.authentication.eu10.hana.ondemand.com/oauth/token",
    data={"grant_type": "client_credentials"},
    auth=("<client-id>", "<client-secret>"),
    timeout=30,
)
token_response.raise_for_status()
token = token_response.json()["access_token"]

# Call the inference endpoint; the path after deployment_url depends on the serving container
inference_url = f"{d.deployment_url}/v1/models/fraud-detector:predict"

response = requests.post(
    inference_url,
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "AI-Resource-Group": "production-team-a",
    },
    json={
        "instances": [
            {"amount": 4500.00, "merchant": "ONLINE_RETAIL", "location": "DE"}
        ]
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())
# Example response shape:
# { "predictions": [{"fraud_probability": 0.03, "label": "LEGITIMATE"}] }

AI API — REST Reference

All SDK operations map directly to REST endpoints. The base URL follows the pattern https://api.ai.<region>.ml.hana.ondemand.com/v2. Authentication uses a standard OAuth 2.0 client credentials flow against the BTP UAA tenant.

ai-api-examples.sh

### AI API — REST Examples (curl)
### Base URL: https://api.ai.<region>.ml.hana.ondemand.com/v2

# 1. List resource groups
curl -X GET \
  "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2/admin/resourceGroups" \
  -H "Authorization: Bearer ${AI_CORE_TOKEN}"

# 2. Create a configuration (links a pipeline template to parameter values)
curl -X POST \
  ".../v2/lm/configurations" \
  -H "Authorization: Bearer ${AI_CORE_TOKEN}" \
  -H "AI-Resource-Group: production-team-a" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "fraud-detection-v2-config",
    "executableId": "fraud-detection-training",
    "scenarioId": "fraud-detection-scenario",
    "parameterBindings": [
      { "key": "learning_rate", "value": "0.001" },
      { "key": "batch_size",    "value": "64" },
      { "key": "epochs",        "value": "100" }
    ],
    "inputArtifactBindings": [
      { "key": "training-data", "artifactId": "<dataset-artefact-id>" }
    ]
  }'

# 3. Trigger execution
curl -X POST \
  ".../v2/lm/executions" \
  -H "Authorization: Bearer ${AI_CORE_TOKEN}" \
  -H "AI-Resource-Group: production-team-a" \
  -H "Content-Type: application/json" \
  -d '{ "configurationId": "<config-id>" }'

# 4. Get execution logs
curl -X GET \
  ".../v2/lm/executions/<exec-id>/logs?start=2025-01-01T00:00:00Z" \
  -H "Authorization: Bearer ${AI_CORE_TOKEN}" \
  -H "AI-Resource-Group: production-team-a"

# 5. Create deployment (inference serving)
curl -X POST \
  ".../v2/lm/deployments" \
  -H "Authorization: Bearer ${AI_CORE_TOKEN}" \
  -H "AI-Resource-Group: production-team-a" \
  -H "Content-Type: application/json" \
  -d '{ "configurationId": "<serving-config-id>" }'
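The curl calls above assume that AI_CORE_TOKEN already holds a bearer token. As a sketch of the client credentials flow that produces it, the snippet below builds the token request with the Python standard library without sending it. The function name build_token_request and the example host are hypothetical; the grant type and the access_token response field are standard OAuth 2.0.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(auth_url: str, client_id: str,
                        client_secret: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth 2.0 client-credentials token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        auth_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Placeholder tenant host and credentials; take the real values from the service key.
request = build_token_request(
    "https://example.authentication.eu10.hana.ondemand.com/oauth/token",
    "id",
    "secret",
)
print(request.get_method(), request.full_url)
# To send it: json.load(urllib.request.urlopen(request))["access_token"]
```

Tokens expire, so scripts that run for a long time should request a fresh token rather than reuse one value for the whole run.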

Supported ML Frameworks & Runtimes

SAP AI Core runs any Docker container — there is no restriction on the ML framework used inside training pipelines or serving containers. The following frameworks are validated by SAP and referenced in official documentation:

PyTorch

Deep learning, transformers

TensorFlow / Keras

Neural networks, TFX pipelines

scikit-learn

Classical ML, preprocessing

XGBoost

Gradient boosting, tabular data

Hugging Face

Open-weight LLMs, tokenisers

ONNX Runtime

Cross-framework inference

LangChain

LLM orchestration (via Gen AI Hub)

Custom Docker

Any framework or runtime

BTP Service Connectivity

AI Core → SAP HANA Cloud

Training pipelines access HANA Cloud via the BTP Destination Service. The HANA Vector Engine is used for embedding storage and similarity search during RAG pipeline steps.

AI Core → SAP Datasphere

Feature data and training datasets are consumed from Datasphere via OData APIs or the Datasphere Consumption API. Data preparation pipelines run as Argo Workflow steps.

AI Core → S/4HANA / SuccessFactors

Training data extraction from SAP backend systems uses the Integration Suite as a data pipeline layer. Real-time inference calls can be initiated by SAP backend events via Integration Suite.

AI Core → SAP Joule

Joule uses AI Core as the execution engine for its custom skills (Joule Studio) and routes LLM calls through the Generative AI Hub tenant hosted on AI Core.

Road Map

Status: Generally Available | Planned | Roadmap | Future Direction

Generally Available

AI Core Service — Multi-region deployment

Generally Available across all major BTP regions (EU10, US10, AP10, JP10).

Generally Available

AI Launchpad — Visual MLOps UI

Generally Available as BTP subscription. Manage pipelines, deployments, and logs via web UI.

Generally Available

Free Tier plan for exploration

AI Core Free Tier available for development and proof-of-concept workloads.

Generally Available

GPU compute for training pipelines

NVIDIA GPU nodes available in AI Core Standard plan for deep learning training.

Generally Available

Bring Your Own Model (BYOM) fine-tuning

Fine-tune Llama 3 and other open-weight models on proprietary data within AI Core.

Planned

ML Pipeline event triggers (S/4HANA events)

Trigger training pipelines from SAP Event Mesh business events. Planned — SAP Road Map.

Roadmap

AI Core — Enhanced observability dashboard

Unified metrics, tracing, and cost attribution per Resource Group. On the SAP Road Map.

Future Direction

Automated model evaluation and retraining

Automated drift detection triggering retraining pipelines. Future Direction — SAP Innovation.

Licensing & Commercial Model

AI Core

Generally Available (GA)

SAP's MLOps service on SAP BTP — providing infrastructure for AI model training, deployment, serving, and lifecycle management including access to the Generative AI Hub.

CPEA

Resource Units, Inference Units, Storage (GB)

CPEA consumption-based: Resource Units for model training/serving, Inference Units for production AI workloads. Storage charged separately.

Generative AI Hub

Generally Available (GA): 20+ foundation models available; model catalogue continuously updated

SAP's curated access point for 20+ foundation models (GPT-4o, Claude, Gemini, Llama, DALL-E, and SAP-specific models) — with data privacy, usage tracking, and SAP context grounding.

CPEA

Tokens, Inference Units

Access via SAP AI Core (Standard plan). Token consumption billed per model per 1,000 tokens. All inference processed within SAP-operated infrastructure for data sovereignty.

SAP AI Core — Plan Comparison

Feature	Free Tier (exploration only, Generally Available)	Standard (CPEA consumption-based, Generally Available)
Compute & Scale
Purpose	Development & exploration	Production MLOps workloads
Resource Groups	1 (default)	Unlimited
Concurrent Executions	Limited	Based on CPEA quota
Inference Deployments	1 deployment	Unlimited deployments
GPU compute		NVIDIA A10G, V100 (by region)
AI Capabilities
Generative AI Hub access	Limited model access	Full 20+ model catalogue
Custom model training (BYOM)
Model Registry	Basic	Full lifecycle management
AI Launchpad	Separate subscription	Separate subscription
Enterprise
SLA	None (best-effort)	SAP standard BTP SLA
Data residency controls	SAP-managed defaults	Region-pinned deployment
Commercial model	Free (with BTP account)	CPEA (Resource Units + Inference Units + Storage GB)

SAP AI Core Standard is a CPEA consumption-based service. Costs accrue for Resource Units (compute time during training), Inference Units (serving pod uptime), and Storage GB (Artefact Store). The Generative AI Hub token consumption is billed separately per 1,000 tokens per model. For current rates, see SAP Discovery Center and your CPEA contract.
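The billing dimensions described above can be combined into a rough estimate. The sketch below only shows the arithmetic: every price in it is a placeholder invented for illustration and not an SAP list price, and both function names are hypothetical.

```python
from decimal import Decimal


def token_cost(tokens: int, rate_per_1k: Decimal) -> Decimal:
    """Cost of `tokens` tokens at a price quoted per 1,000 tokens."""
    return (Decimal(tokens) / Decimal(1000)) * rate_per_1k


def monthly_estimate(resource_units, inference_units, storage_gb, rates):
    """Sum the three metered dimensions; `rates` maps each metric to a unit price."""
    return (
        Decimal(resource_units) * rates["resource_unit"]
        + Decimal(inference_units) * rates["inference_unit"]
        + Decimal(storage_gb) * rates["storage_gb"]
    )


# Placeholder prices for illustration only; use the rates from your CPEA contract.
rates = {
    "resource_unit": Decimal("1.00"),
    "inference_unit": Decimal("2.00"),
    "storage_gb": Decimal("0.10"),
}
print(monthly_estimate(100, 50, 200, rates))   # 220.00
print(token_cost(250_000, Decimal("0.01")))    # 2.50
```

Decimal is used instead of float so that currency amounts do not pick up binary rounding errors.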

SAP Official References

SAP AI Core — Help Portal

Official product documentation including API reference, tutorials, and release notes.

SAP AI SDK for Python

Python client library for AI API and Generative AI Hub. PyPI package with full documentation.

SAP AI Core on Discovery Center

Service overview, pricing, regions, and trial information.

SAP AI Launchpad — Help Portal

AI Launchpad setup, user guide, and troubleshooting.

Argo Workflows Documentation

Open-source pipeline engine used by SAP AI Core. YAML reference and examples.

SAP AI Core Road Map

Official SAP product road map for AI Core — planned features and timelines.