last updated: 2026-01-26 21:32:45

SAP-RPT-1-OSS Predictor

SAP-RPT-1-OSS is SAP's open source tabular foundation model (Apache 2.0) for predictions on structured business data. Unlike LLMs that predict text, RPT-1 predicts field values in table rows using in-context learning—no model training required.

Repository: https://github.com/SAP-samples/sap-rpt-1-oss
Model: https://huggingface.co/SAP/sap-rpt-1-oss

Setup

1. Install Package

pip install git+https://github.com/SAP-samples/sap-rpt-1-oss

2. Hugging Face Authentication

Model weights require HF login and license acceptance:

# Install HF CLI
pip install huggingface_hub

# Login (current huggingface_hub versions store the token under ~/.cache/huggingface/; older versions used ~/.huggingface/token)
huggingface-cli login

Then accept model terms at: https://huggingface.co/SAP/sap-rpt-1-oss

3. Hardware Requirements

Config    GPU Memory        Context Size  Bagging  Use Case
Optimal   80GB (A100)       8192          8        Production, best accuracy
Standard  40GB (A6000)      4096          4        Good balance
Minimal   24GB (RTX 4090)   2048          2        Development
CPU       N/A               1024          1        Testing only (slow)
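
The tiers above can be encoded as a small lookup so that scripts pick `max_context_size` and `bagging` from the GPU memory available. This is only a sketch: `pick_config` is a hypothetical helper, not part of `sap_rpt_oss`, and treating any GPU below 24GB like the CPU tier is an assumption made here.

```python
# Hypothetical helper (not part of sap_rpt_oss); thresholds mirror the hardware table.
HARDWARE_TIERS = [
    # (minimum GPU memory in GB, max_context_size, bagging)
    (80, 8192, 8),
    (40, 4096, 4),
    (24, 2048, 2),
]
CPU_FALLBACK = (1024, 1)  # assumption: also used for GPUs below 24GB


def pick_config(gpu_memory_gb):
    """Return (max_context_size, bagging) for the given GPU memory in GB.

    Pass None or 0 when no GPU is available.
    """
    for min_gb, context_size, bagging in HARDWARE_TIERS:
        if gpu_memory_gb and gpu_memory_gb >= min_gb:
            return context_size, bagging
    return CPU_FALLBACK
```

The returned pair could then be passed on, for example `SAP_RPT_OSS_Classifier(max_context_size=ctx, bagging=bag)`.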

Quick Start

Classification (Customer Churn, Payment Default)

import pandas as pd
from sap_rpt_oss import SAP_RPT_OSS_Classifier

# Load SAP data export
df = pd.read_csv("sap_customers.csv")
X = df.drop(columns=["CHURN_STATUS"])
y = df["CHURN_STATUS"]

# Split data
X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

# Initialize and predict
clf = SAP_RPT_OSS_Classifier(max_context_size=4096, bagging=4)
clf.fit(X_train, y_train)

predictions = clf.predict(X_test)
probabilities = clf.predict_proba(X_test)
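
The split above uses the first 400 rows as context, which can skew results if the export is sorted (for example by date or customer number). A minimal sketch of a seeded, shuffled split follows. `shuffled_split_indices` is a hypothetical helper that uses only the standard library, and the row counts are illustrative.

```python
import random


def shuffled_split_indices(n_rows, n_train, seed=0):
    """Return (train_idx, test_idx): disjoint row positions in random order."""
    if not 0 < n_train < n_rows:
        raise ValueError("n_train must be between 1 and n_rows - 1")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = shuffled_split_indices(n_rows=500, n_train=400, seed=42)
# With pandas: X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
#              y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]
```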

Regression (Delivery Delay Days, Demand Quantity)

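from sap_rpt_oss import SAP_RPT_OSS_Regressor

# For regression, y_train must hold a numeric target (e.g. delay in days),
# not the CHURN_STATUS labels from the classification example.
reg = SAP_RPT_OSS_Regressor(max_context_size=4096, bagging=4)
reg.fit(X_train, y_train)
predictions = reg.predict(X_test)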

Core Workflow

  1. Extract SAP data → Export to CSV from relevant tables
  2. Prepare dataset → Include 50-500 rows with known outcomes
  3. Rename fields → Use semantic names (see Data Preparation)
  4. Run prediction → Fit on training data, predict on new data
  5. Interpret results → Probabilities for classification, values for regression
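
For step 5, classification probabilities usually have to be turned into a decision. The sketch below assumes `predict_proba` follows the scikit-learn convention of one row per sample and one column per class; that is an assumption about `sap_rpt_oss`, not something confirmed here. `flag_high_risk` is a hypothetical helper, and the probabilities are made-up illustration values, not model output.

```python
def flag_high_risk(probabilities, positive_index=1, threshold=0.5):
    """Return one boolean per row: True when the positive-class
    probability meets or exceeds the threshold."""
    return [row[positive_index] >= threshold for row in probabilities]


# Made-up probabilities for three customers: [P(stays), P(churns)]
example = [[0.9, 0.1], [0.35, 0.65], [0.5, 0.5]]
flags = flag_high_risk(example, threshold=0.6)
```

Raising the threshold trades recall for precision, so the right value depends on what a missed churner or a false alarm costs the business.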

SAP Use Cases

See references/sap-use-cases.md for detailed extraction queries:

  • FI-AR: Payment default probability (BSID, BSAD, KNA1)
  • FI-GL: Journal entry anomaly detection (ACDOCA, BKPF)
  • SD: Delivery delay prediction (VBAK, VBAP, LIKP)
  • SD: Customer churn likelihood (VBRK, VBRP, KNA1)
  • MM: Vendor performance scoring (EKKO, EKPO, EBAN)
  • PP: Production delay risk (AFKO, AFPO)

Data Preparation

Semantic Column Names (Important!)

RPT-1-OSS uses an LLM to embed column names and values. Descriptive names improve accuracy:

# Good: Model understands business context
CUSTOMER_CREDIT_LIMIT, DAYS_SINCE_LAST_ORDER, PAYMENT_DELAY_DAYS

# Bad: Generic names lose semantic value
COL1, VALUE, FIELD_A

Use scripts/prepare_sap_data.py to rename SAP technical fields:

from scripts.prepare_sap_data import SAPDataPrep

prep = SAPDataPrep()
df = prep.rename_sap_fields(df)  # BUKRS → COMPANY_CODE, etc.
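
To illustrate the kind of renaming involved, here is a minimal standard-library sketch. The mapping and the `rename_columns` helper are hypothetical and only show the idea; the actual mapping in scripts/prepare_sap_data.py may differ.

```python
# Illustrative mapping of SAP technical field names to descriptive names.
# Target names are assumptions for this sketch, not the script's real mapping.
SAP_FIELD_NAMES = {
    "BUKRS": "COMPANY_CODE",
    "KUNNR": "CUSTOMER_NUMBER",
    "NETWR": "NET_VALUE",
    "ERDAT": "CREATED_ON_DATE",
}


def rename_columns(columns, mapping=SAP_FIELD_NAMES):
    """Return column names with known SAP fields replaced; others unchanged."""
    return [mapping.get(name.upper(), name) for name in columns]


renamed = rename_columns(["BUKRS", "kunnr", "NETWR", "CUSTOM_SCORE"])
# With pandas: df.columns = rename_columns(df.columns)
```

Unknown columns pass through untouched, so custom fields that already have descriptive names are preserved.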

Dataset Size

  • Minimum: 50 training examples
  • Recommended: 200-500 examples
  • Maximum context: 8192 rows (GPU dependent)
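
These limits can be checked before fitting. The sketch below rejects datasets under the 50-row minimum and randomly samples the context down to `max_context_size` when there are more labeled rows than fit. `select_context_rows` is a hypothetical helper; whether the library subsamples on its own is not stated here, so sampling up front is an assumption.

```python
import random

MIN_ROWS = 50  # minimum number of training examples stated above


def select_context_rows(n_rows, max_context_size=4096, seed=0):
    """Return sorted row positions to use as context, sampled down if needed."""
    if n_rows < MIN_ROWS:
        raise ValueError(f"need at least {MIN_ROWS} labeled rows, got {n_rows}")
    if n_rows <= max_context_size:
        return list(range(n_rows))
    # More rows than the context can hold: draw a reproducible random subset.
    return sorted(random.Random(seed).sample(range(n_rows), max_context_size))
```

With pandas, the result could be applied as `X.iloc[select_context_rows(len(X))]`, using the same positions for `y`.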

Scripts

  • scripts/rpt1_oss_predict.py - Local model prediction wrapper
  • scripts/prepare_sap_data.py - SAP field renaming and SQL templates
  • scripts/batch_predict.py - Chunked processing for large datasets
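
The idea behind chunked processing can be sketched in a few lines. This is not the contents of scripts/batch_predict.py, which are not shown here; `iter_chunks` and `predict_in_chunks` are hypothetical helpers, and the default chunk size is arbitrary.

```python
def iter_chunks(rows, chunk_size):
    """Yield consecutive slices of `rows`, each with at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


def predict_in_chunks(predict, rows, chunk_size=1000):
    """Call `predict` on each chunk and concatenate the results in order."""
    results = []
    for chunk in iter_chunks(rows, chunk_size):
        results.extend(predict(chunk))
    return results
```

A call such as `predict_in_chunks(clf.predict, X_test, chunk_size=1000)` would keep each prediction call small, since slicing a DataFrame with `[start:stop]` selects rows by position.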

Alternative: RPT Playground API

For users with SAP access, the closed-source RPT-1 is available via API:

from scripts.rpt1_api import RPT1Client

client = RPT1Client(token="YOUR_RPT_TOKEN")  # Get from rpt-playground.sap.com
result = client.predict(data="data.csv", target_column="TARGET", task_type="classification")

See references/api-reference.md for RPT Playground API documentation.

Limitations

  • Tabular data only (no images, text documents)
  • Requires labeled examples for in-context learning
  • First prediction is slow (model loading)
  • GPU strongly recommended for production use