MLOps
BTP Service
GA

SAP AI Core

The enterprise MLOps and model serving platform on SAP BTP. Manages the full AI lifecycle — from Git-synced training pipelines through auto-scaled inference endpoints — and serves as the execution engine for SAP Generative AI Hub and SAP Joule.

Overview

SAP AI Core is the foundational AI infrastructure service on SAP BTP that provides managed compute for training machine learning models and serving inference endpoints at enterprise scale. It is built on Kubernetes and uses Argo Workflows as the pipeline engine, enabling declarative, Git-driven ML pipelines that teams treat as code.

Multi-tenancy is achieved through Resource Groups — isolated compute, storage, secrets, and model registries per tenant or environment. A single AI Core service instance can host dozens of independent Resource Groups, making it suitable for platform teams building shared ML infrastructure for multiple business units.

All operations are available through the AI API (a unified REST control plane) and the SAP AI SDK for Python and JavaScript. The visual interface — SAP AI Launchpad — provides data scientists with a web GUI over the same capabilities.

GitOps ML Pipelines
Argo Workflow YAML stored in Git, auto-synced to AI Core
Resource Group Isolation
Separate compute, storage, and registry per tenant
Auto-Scaled Serving
Inference endpoints with health monitoring and canary rollouts
Artefact Store
SAP-managed S3-compatible object storage per Resource Group
Model Registry
Versioned model artefact lifecycle with metadata tracking
Gen AI Hub Hosting
20+ foundation models via SAP-managed endpoints

Runtime Architecture

[Diagram: SAP AI Core Runtime Architecture]
Control Plane

AI API + AI Launchpad manage Resource Groups, applications, executions, and deployments via REST. All state is persisted in SAP-managed infrastructure.

Data Plane

Argo Workflow engine runs training pipelines and manages inference serving pods on Kubernetes. Each Resource Group is a Kubernetes namespace.

Gen AI Hub Layer

Foundation model router and Orchestration Service run as a tenant within AI Core. Prompts never leave SAP infrastructure — data sovereignty at the network layer.

Core Concepts

Resource Group

The isolation unit in AI Core. Each Resource Group has dedicated compute quotas, a separate Artefact Store (object storage), its own model registry, and isolated secrets, so workloads in one Resource Group cannot read another group's data.

Application (GitOps)

A registered Git repository containing Argo Workflow YAML templates. AI Core polls the repository (or receives webhooks) and registers all discovered WorkflowTemplate objects as executable pipelines.

Execution (Training Run)

An invocation of a registered workflow template with specific input parameters and artefact bindings. Executions run on managed Kubernetes compute and write output artefacts to the Artefact Store.

Artefact

A versioned file or directory stored in the Artefact Store. Input artefacts (datasets, feature stores) are mounted read-only. Output artefacts (trained models, evaluation reports) are written by executions.

Configuration

A named parameter set that binds a workflow template to specific input values. Configurations enable reproducible executions — the same template run with different hyperparameters or dataset versions.
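
As a minimal sketch of how one template can be bound to different values, the snippet below assembles two configuration request bodies. It assumes the JSON field names used in the REST examples later in this document (name, executableId, scenarioId, parameterBindings, inputArtifactBindings). The helper build_configuration and the configuration names are hypothetical and not part of any SAP SDK.

```python
import json


def build_configuration(name, executable_id, scenario_id, parameters, input_artifacts):
    """Assemble a configuration request body (hypothetical helper, not an SDK call)."""
    return {
        "name": name,
        "executableId": executable_id,
        "scenarioId": scenario_id,
        # Parameter values are sent as strings, mirroring the REST example.
        "parameterBindings": [
            {"key": key, "value": str(value)} for key, value in parameters.items()
        ],
        "inputArtifactBindings": [
            {"key": key, "artifactId": artifact_id}
            for key, artifact_id in input_artifacts.items()
        ],
    }


# Same template, two hyperparameter sets: two separately reproducible configurations.
shared = dict(
    executable_id="fraud-detection-training",
    scenario_id="fraud-detection-scenario",
    input_artifacts={"training-data": "<dataset-artefact-id>"},
)
config_a = build_configuration(
    "fraud-lr-0.001", parameters={"learning_rate": 0.001, "epochs": 50}, **shared
)
config_b = build_configuration(
    "fraud-lr-0.01", parameters={"learning_rate": 0.01, "epochs": 50}, **shared
)
print(json.dumps(config_a, indent=2))
```

Keeping the template reference and artefact bindings in one shared dictionary makes it obvious that only the hyperparameters differ between the two runs.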

Deployment

A running inference container that exposes a versioned model as an HTTPS REST endpoint. Deployments auto-scale replicas, report health status, and support graceful rolling updates.

Multi-Tenancy via Resource Groups

[Diagram: Resource Group Isolation Model]
ISV and platform team pattern: Each customer or business unit gets a dedicated Resource Group. The platform team administers the AI Core service instance; tenants interact only within their own Resource Group boundary. Compute quotas are set at the Resource Group level.
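
The per-tenant boundary can be sketched in client code as follows. This is an illustration only: the TenantContext type and the "<environment>-<tenant>" naming convention are assumptions, while the AI-Resource-Group header is the one used in the request examples elsewhere in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantContext:
    """Maps one tenant to its Resource Group (hypothetical helper type)."""

    tenant: str
    environment: str

    @property
    def resource_group_id(self) -> str:
        # Assumed naming convention: "<environment>-<tenant>", e.g. "production-team-a".
        return f"{self.environment}-{self.tenant}"

    def headers(self, token: str) -> dict:
        # The AI-Resource-Group header scopes a request to this tenant's Resource Group.
        return {
            "Authorization": f"Bearer {token}",
            "AI-Resource-Group": self.resource_group_id,
        }


team_a = TenantContext(tenant="team-a", environment="production")
team_b = TenantContext(tenant="team-b", environment="production")
print(team_a.headers(token="<token>"))
print(team_b.headers(token="<token>"))
```

Deriving the header from a single object per tenant keeps application code from accidentally mixing Resource Group IDs across tenants.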

ML Pipeline Lifecycle

[Diagram: Training Pipeline, GitOps to Inference Endpoint]
MLOps Lifecycle Phases
1. Template Authoring (Git + YAML): Write Argo Workflow YAML. Define container images, input parameters, artefact paths, and resource requirements.
2. Application Registration (AI API / SDK): Register the Git repo with AI Core. AI Core syncs WorkflowTemplate objects into the target Resource Group.
3. Configuration (AI API / SDK): Create named configurations that bind templates to parameter values and input artefact IDs.
4. Execution (AI API / SDK): Trigger training runs via API. Argo Workflows orchestrates containers on Kubernetes-managed compute.
5. Model Registration (AI API / SDK): Register output artefacts as versioned model objects in the Model Registry with metadata.
6. Deployment (AI API / SDK): Create a deployment from a model configuration. AI Core starts the serving pod and returns the inference URL.
7. Inference (HTTPS REST): Client applications call the inference URL with an Authorization header. AI Core routes the request to the running deployment.

SAP AI Launchpad

SAP AI Launchpad is a separate BTP subscription that provides a visual web interface over the same AI Core control plane. It is designed for data scientists and ML engineers who prefer a GUI for exploring executions, comparing run metrics, monitoring deployments, and inspecting logs — without writing API code.

AI Launchpad Capabilities

Execution Explorer
Browse training runs, filter by scenario, compare metric outputs across runs
Deployment Manager
Monitor inference deployments, view health status, scale replicas
Log Viewer
Real-time and historical log streaming from executions and serving pods
Model Registry
Browse versioned models, inspect artefact metadata, promote to deployment
ML Operations
Manage configurations, register artefacts, trigger executions via UI
Gen AI Hub UI
Test foundation model prompts, manage prompt templates, view token usage

When to Use AI Launchpad vs. AI API

MLOps Engineer / Automator
AI API + SAP AI SDK — programmatic, CI/CD integration, scripted pipelines
Data Scientist
AI Launchpad — visual exploration, run comparison, interactive log review
Prompt Engineer
AI Launchpad Gen AI Hub UI — interactive prompt testing and template management
Platform Administrator
AI API — Resource Group provisioning, quota management, bulk operations

Inference Deployment Lifecycle

[Diagram: Deployment State Machine]
Auto-Scaling

AI Core adjusts replica count based on CPU/memory utilisation and request queue depth. Scale-to-zero is supported for dev/test Resource Groups.
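
The exact scaling algorithm is not described here. As a rough illustration, the sketch below uses the formula documented for the Kubernetes Horizontal Pod Autoscaler, desired = ceil(current * observed / target), and assumes utilisation is given as a percentage. The function name, defaults, and bounds are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_utilisation, target_utilisation,
                     min_replicas=1, max_replicas=10):
    """Return ceil(current * observed / target), clamped to [min_replicas, max_replicas]."""
    if target_utilisation <= 0:
        raise ValueError("target_utilisation must be positive")
    raw = math.ceil(current_replicas * current_utilisation / target_utilisation)
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(2, 90, 60))                  # 3: scale out under load
print(desired_replicas(4, 30, 60))                  # 2: scale in when idle
print(desired_replicas(3, 0, 60, min_replicas=0))   # 0: scale-to-zero for dev/test
```

With this formula a deployment that is already at zero replicas stays at zero, so waking it up needs a separate trigger such as the request queue depth mentioned above.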

Health Monitoring

Kubernetes liveness and readiness probes. If a deployment enters DEAD state, the restart policy re-schedules the pod automatically.

Canary Rollouts

Deploy a new model version alongside the current one. Route a configurable percentage of traffic to the new version before full cut-over.
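
One common way to implement a percentage split is to hash a request identifier into a bucket, which keeps routing stable for a given request. The sketch below shows that idea only; it is an assumption about how such a split could work, not a description of AI Core's internal router, and pick_version is a hypothetical name.

```python
import hashlib


def pick_version(request_id: str, canary_percent: int) -> str:
    """Route a stable share of requests to the canary by hashing the request ID."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


# Simulate 10,000 requests with 10% of traffic configured for the new version.
counts = {"stable": 0, "canary": 0}
for i in range(10_000):
    counts[pick_version(f"req-{i}", canary_percent=10)] += 1
print(counts)
```

Raising canary_percent step by step to 100 corresponds to the full cut-over; setting it back to 0 is the rollback.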

SAP AI SDK — Setup

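The SAP AI SDK for Python wraps the AI API and Gen AI Hub in idiomatic Python clients. It authenticates with the credentials from the BTP service key (AICORE_SERVICE_KEY). The examples in this section import from the ai_core_sdk module, which is published on PyPI as ai-core-sdk.

Installation
# Install the AI Core SDK for Python (provides the ai_core_sdk module used below)
pip install ai-core-sdk

# Optional: Gen AI Hub client libraries with all optional extras
pip install "generative-ai-hub-sdk[all]"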

Resource Groups & GitOps Applications

resource_groups.py
from ai_core_sdk.ai_core_v2_client import AICoreV2Client

# Authenticate using BTP service key (set env vars or pass directly)
client = AICoreV2Client(
    base_url="https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2",
    auth_url="https://<your-tenant>.authentication.eu10.hana.ondemand.com/oauth/token",
    client_id="<client-id>",
    client_secret="<client-secret>",
)

# Create a Resource Group for isolated tenant workloads
client.resource_groups.create(resource_group_id="production-team-a")

# Register a Git application (GitOps — pipeline templates sync from repo)
app = client.applications.create(
    application_name="fraud-detection-pipelines",
    repository_url="https://github.com/myorg/ai-core-templates",
    revision="main",
    path="/workflows",
)
print(f"Application synced: {app.application_name}")

# List registered workflow templates (synced from Git)
templates = client.workflow_specs.query(resource_group_id="production-team-a")
for t in templates.resources:
    print(f"  Pipeline: {t.name} (scenario: {t.scenario_id})")

Argo Workflow Pipeline Template

Pipeline templates are stored as Argo Workflow YAML in a Git repository. AI Core syncs the repository and registers each WorkflowTemplate as an executable pipeline. The template below defines a training container with GPU support and structured input/output artefact paths.

workflows/training-pipeline.yaml
# workflows/training-pipeline.yaml
# Stored in Git — AI Core syncs this automatically (GitOps)
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: fraud-detection-training
  annotations:
    scenarios.ai.sap.com/description: "Fraud detection model training pipeline"
    scenarios.ai.sap.com/name: "Fraud Detection"
    executables.ai.sap.com/description: "Trains the fraud detection model"
    executables.ai.sap.com/name: "fraud-training"
  labels:
    scenarios.ai.sap.com/id: "fraud-detection-scenario"
    ai.sap.com/version: "1.0.0"
spec:
  entrypoint: fraud-detection-training
  templates:
    - name: fraud-detection-training
      inputs:
        parameters:
          - name: learning_rate
            default: "0.001"
          - name: batch_size
            default: "64"
          - name: epochs
            default: "50"
        artifacts:
          - name: training-data
            path: /data/train
      outputs:
        artifacts:
          - name: trained-model
            path: /output/model
            archive:
              none: {}
      container:
        image: "<registry>/fraud-trainer:1.0"
        command: [python, train.py]
        args:
          - "--learning-rate={{inputs.parameters.learning_rate}}"
          - "--batch-size={{inputs.parameters.batch_size}}"
          - "--epochs={{inputs.parameters.epochs}}"
          - "--data-path=/data/train"
          - "--output-path=/output/model"
        resources:
          requests:
            memory: "4Gi"
            cpu: "2"
          limits:
            nvidia.com/gpu: "1"   # Request GPU for training

The scenarios.ai.sap.com and executables.ai.sap.com annotations, together with the scenarios.ai.sap.com/id and ai.sap.com/version labels, are SAP AI Core metadata required for template registration. Without them, AI Core does not register the template during sync.

Triggering Training Executions

training_execution.py
from ai_core_sdk.models import ExecutionCreationRequest

# `client` is the AICoreV2Client created in resource_groups.py

# Trigger a training execution
execution = client.execution.create(
    resource_group_id="production-team-a",
    body=ExecutionCreationRequest(
        configuration_id="fraud-detection-v2-config",
    ),
)
print(f"Execution started: {execution.id} — status: {execution.status}")

# Poll for completion (production use: implement proper event loop / webhook)
import time

while True:
    status = client.execution.get(
        execution_id=execution.id,
        resource_group_id="production-team-a",
    )
    print(f"  Status: {status.status}")
    if status.status in ("COMPLETED", "DEAD", "STOPPED"):
        break
    time.sleep(30)

# Stream execution logs
logs = client.execution.query_logs(
    execution_id=execution.id,
    resource_group_id="production-team-a",
    start="2025-01-01T00:00:00Z",
)
for log in logs.data.result:
    print(f"[{log.timestamp}] {log.msg}")
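
The polling loop above runs until a terminal status appears, which can block forever if a run hangs. A minimal sketch of a bounded variant follows, assuming the same terminal states (COMPLETED, DEAD, STOPPED). The wait_for_terminal helper is hypothetical and independent of the SDK: it takes any callable that returns the current status string, and the stand-in status source below exists only to make the example runnable.

```python
import time

TERMINAL_STATES = {"COMPLETED", "DEAD", "STOPPED"}


def wait_for_terminal(fetch_status, timeout_s=3600.0, interval_s=30.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it returns a terminal state or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        # Give up if the next poll would land after the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} seconds")
        sleep(interval_s)


# Stand-in status source; in practice fetch_status would wrap the SDK status call.
fake_statuses = iter(["PENDING", "RUNNING", "RUNNING", "COMPLETED"])
result = wait_for_terminal(lambda: next(fake_statuses), interval_s=0.0,
                           sleep=lambda _: None)
print(result)  # COMPLETED
```

Injecting sleep and clock as parameters keeps the helper testable without real waiting.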

Model Deployment & Inference

deployment.py
from ai_core_sdk.models import DeploymentCreationRequest

# Create an inference deployment from a registered model
deployment = client.deployment.create(
    resource_group_id="production-team-a",
    body=DeploymentCreationRequest(
        configuration_id="fraud-detection-serving-config",
    ),
)
print(f"Deployment ID: {deployment.id}")

# Wait for deployment to reach RUNNING state
import time

while True:
    d = client.deployment.get(
        deployment_id=deployment.id,
        resource_group_id="production-team-a",
    )
    print(f"  Deployment status: {d.status}")
    if d.status == "RUNNING":
        print(f"  Inference URL: {d.deployment_url}")
        break
    if d.status in ("DEAD", "STOPPED"):
        raise RuntimeError(f"Deployment failed: {d.status}")
    time.sleep(15)

# Call the inference endpoint
import requests

token = client._get_token()   # Reuse SDK token helper
inference_url = f"{d.deployment_url}/v1/models/fraud-detector:predict"

response = requests.post(
    inference_url,
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "AI-Resource-Group": "production-team-a",
    },
    json={
        "instances": [
            {"amount": 4500.00, "merchant": "ONLINE_RETAIL", "location": "DE"}
        ]
    },
    timeout=30,
)
print(response.json())
# { "predictions": [{"fraud_probability": 0.03, "label": "LEGITIMATE"}] }
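
A caller usually acts on the prediction rather than printing it. The sketch below assumes the response shape shown in the comment above (a predictions list whose items carry fraud_probability). The flag_suspicious helper, the 0.5 threshold, and the second sample record with the "FRAUD" label are made up for illustration.

```python
import json


def flag_suspicious(response_body: str, threshold: float = 0.5):
    """Return indexes of predictions whose fraud_probability meets the threshold."""
    predictions = json.loads(response_body).get("predictions", [])
    return [i for i, p in enumerate(predictions) if p["fraud_probability"] >= threshold]


# Sample body: the first record mirrors the example above, the second is invented.
sample = json.dumps({
    "predictions": [
        {"fraud_probability": 0.03, "label": "LEGITIMATE"},
        {"fraud_probability": 0.91, "label": "FRAUD"},
    ]
})
print(flag_suspicious(sample))  # [1]
```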

AI API — REST Reference

All SDK operations map directly to REST endpoints. The base URL follows the pattern https://api.ai.<region>.ml.hana.ondemand.com/v2. Authentication uses a standard OAuth 2.0 client credentials flow against the BTP UAA tenant.

ai-api-examples.sh
### AI API — REST Examples (curl)
### Base URL: https://api.ai.<region>.ml.hana.ondemand.com/v2

# 1. List resource groups
curl -X GET \
  "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2/admin/resourceGroups" \
  -H "Authorization: Bearer ${AI_CORE_TOKEN}"

# 2. Create a configuration (links a pipeline template to parameter values)
curl -X POST \
  ".../v2/lm/configurations" \
  -H "Authorization: Bearer ${AI_CORE_TOKEN}" \
  -H "AI-Resource-Group: production-team-a" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "fraud-detection-v2-config",
    "executableId": "fraud-detection-training",
    "scenarioId": "fraud-detection-scenario",
    "parameterBindings": [
      { "key": "learning_rate", "value": "0.001" },
      { "key": "batch_size",    "value": "64" },
      { "key": "epochs",        "value": "100" }
    ],
    "inputArtifactBindings": [
      { "key": "training-data", "artifactId": "<dataset-artefact-id>" }
    ]
  }'

# 3. Trigger execution
curl -X POST \
  ".../v2/lm/executions" \
  -H "Authorization: Bearer ${AI_CORE_TOKEN}" \
  -H "AI-Resource-Group: production-team-a" \
  -H "Content-Type: application/json" \
  -d '{ "configurationId": "<config-id>" }'

# 4. Get execution logs
curl -X GET \
  ".../v2/lm/executions/<exec-id>/logs?start=2025-01-01T00:00:00Z" \
  -H "Authorization: Bearer ${AI_CORE_TOKEN}" \
  -H "AI-Resource-Group: production-team-a"

# 5. Create deployment (inference serving)
curl -X POST \
  ".../v2/lm/deployments" \
  -H "Authorization: Bearer ${AI_CORE_TOKEN}" \
  -H "AI-Resource-Group: production-team-a" \
  -H "Content-Type: application/json" \
  -d '{ "configurationId": "<serving-config-id>" }'
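
The curl examples assume that AI_CORE_TOKEN already holds a bearer token. As a sketch of the OAuth 2.0 client credentials flow mentioned above, the snippet below builds the token request with the Python standard library without sending it. It assumes the token endpoint accepts the client ID and secret via HTTP Basic authentication, which is the standard form of this grant. The build_token_request name and the host and credentials in the usage lines are placeholders.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(auth_url, client_id, client_secret):
    """Build (but do not send) an OAuth 2.0 client-credentials token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        auth_url,
        data=body,
        method="POST",
        headers={
            "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Placeholder values; take the real ones from your BTP service key.
request = build_token_request(
    "https://my-tenant.authentication.eu10.hana.ondemand.com/oauth/token",
    "my-client-id",
    "my-client-secret",
)
print(request.get_method(), request.full_url)
# Sending it would look like this (requires network access and valid credentials):
#   with urllib.request.urlopen(request, timeout=30) as resp:
#       token = json.load(resp)["access_token"]
```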

Supported ML Frameworks & Runtimes

SAP AI Core runs any Docker container — there is no restriction on the ML framework used inside training pipelines or serving containers. The following frameworks are validated by SAP and referenced in official documentation:

PyTorch
Deep learning, transformers
TensorFlow / Keras
Neural networks, TFX pipelines
scikit-learn
Classical ML, preprocessing
XGBoost
Gradient boosting, tabular data
Hugging Face
Open-weight LLMs, tokenisers
ONNX Runtime
Cross-framework inference
LangChain
LLM orchestration (via Gen AI Hub)
Custom Docker
Any framework or runtime

BTP Service Connectivity

AI Core → SAP HANA Cloud

Training pipelines access HANA Cloud via the BTP Destination Service. The HANA Vector Engine is used for embedding storage and similarity search during RAG pipeline steps.

AI Core → SAP Datasphere

Feature data and training datasets are consumed from Datasphere via OData APIs or the Datasphere Consumption API. Data preparation pipelines run as Argo Workflow steps.

AI Core → S/4HANA / SuccessFactors

Training data extraction from SAP backend systems uses the Integration Suite as a data pipeline layer. Real-time inference calls can be initiated by SAP backend events via Integration Suite.

AI Core → SAP Joule

Joule uses AI Core as the execution engine for its custom skills (Joule Studio) and routes LLM calls through the Generative AI Hub tenant hosted on AI Core.

Road Map

Status legend: Generally Available, Planned, Roadmap, Future Direction

[Generally Available] AI Core Service: Multi-region deployment. Generally Available across all major BTP regions (EU10, US10, AP10, JP10).
[Generally Available] AI Launchpad: Visual MLOps UI. Generally Available as BTP subscription. Manage pipelines, deployments, and logs via web UI.
[Generally Available] Free Tier plan for exploration. AI Core Free Tier available for development and proof-of-concept workloads.
[Generally Available] GPU compute for training pipelines. NVIDIA GPU nodes available in AI Core Standard plan for deep learning training.
[Generally Available] Bring Your Own Model (BYOM) fine-tuning. Fine-tune Llama 3 and other open-weight models on proprietary data within AI Core.
[Planned] ML Pipeline event triggers (S/4HANA events). Trigger training pipelines from SAP Event Mesh business events. Planned (SAP Road Map).
[Roadmap] AI Core: Enhanced observability dashboard. Unified metrics, tracing, and cost attribution per Resource Group. On the SAP Road Map.
[Future Direction] Automated model evaluation and retraining. Automated drift detection triggering retraining pipelines. Future Direction (SAP Innovation).

Licensing & Commercial Model

AI

AI Core

Generally Available · GA

SAP's MLOps service on SAP BTP — providing infrastructure for AI model training, deployment, serving, and lifecycle management including access to the Generative AI Hub.

CPEA
Resource Units, Inference Units, Storage (GB)

CPEA consumption-based: Resource Units for model training/serving, Inference Units for production AI workloads. Storage charged separately.

AI

Generative AI Hub

Generally Available · GA (20+ foundation models available; model catalogue continuously updated)

SAP's curated access point for 20+ foundation models (GPT-4o, Claude, Gemini, Llama, DALL-E, and SAP-specific models) — with data privacy, usage tracking, and SAP context grounding.

CPEA
Tokens, Inference Units

Access via SAP AI Core (Standard plan). Token consumption billed per model per 1,000 tokens. All inference processed within SAP-operated infrastructure for data sovereignty.

SAP AI Core — Plan Comparison

Feature | Free Tier (exploration only; Generally Available) | Standard (CPEA consumption-based; Generally Available)

Compute & Scale
Purpose | Development & exploration | Production MLOps workloads
Resource Groups | 1 (default) | Unlimited
Concurrent Executions | Limited | Based on CPEA quota
Inference Deployments | 1 deployment | Unlimited deployments
GPU compute | (value not captured) | NVIDIA A10G, V100 (by region)

AI Capabilities
Generative AI Hub access | Limited model access | Full 20+ model catalogue
Custom model training (BYOM) | (value not captured) | (value not captured)
Model Registry | Basic | Full lifecycle management
AI Launchpad | Separate subscription | Separate subscription

Enterprise
SLA | None (best-effort) | SAP standard BTP SLA
Data residency controls | SAP-managed defaults | Region-pinned deployment
Commercial model | Free (with BTP account) | CPEA (Resource Units + Inference Units + Storage GB)

SAP AI Core Standard is a CPEA consumption-based service. Costs accrue for Resource Units (compute time during training), Inference Units (serving pod uptime), and Storage GB (Artefact Store). The Generative AI Hub token consumption is billed separately per 1,000 tokens per model. For current rates, see SAP Discovery Center and your CPEA contract.
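
The cost structure described above is a sum of four metered components, which can be sketched as simple arithmetic. Every unit price below is a made-up placeholder, not an SAP rate, and the single per-1,000-token price is a simplification because token billing varies per model. The Rates type and estimate_monthly_cost function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Placeholder unit prices; real values come from your CPEA contract."""

    per_resource_unit: float
    per_inference_unit: float
    per_storage_gb: float
    per_1k_tokens: float


def estimate_monthly_cost(rates, resource_units, inference_units, storage_gb, tokens):
    """Sum the four metered components: training, serving, storage, and tokens."""
    return (
        resource_units * rates.per_resource_unit
        + inference_units * rates.per_inference_unit
        + storage_gb * rates.per_storage_gb
        + (tokens / 1000) * rates.per_1k_tokens
    )


# Made-up numbers purely to show the arithmetic.
rates = Rates(per_resource_unit=1.0, per_inference_unit=0.5,
              per_storage_gb=0.25, per_1k_tokens=0.01)
total = estimate_monthly_cost(rates, resource_units=200, inference_units=720,
                              storage_gb=40, tokens=2_000_000)
print(f"Estimated monthly cost: {total:.2f}")  # 200 + 360 + 10 + 20 = 590.00
```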
