TabPFN provides powerful regression capabilities for predicting continuous numerical values with high accuracy and interpretability.

Key Capabilities

  • Zero-shot regression - Predicts continuous targets instantly in a single forward pass.
  • Calibrated uncertainty - Produces reliable mean, median, or quantile-based predictions for confidence estimation.
  • Robust to real-world noise - Handles outliers, missing values, and mixed data types.
  • Text-aware inputs - Detects textual columns, extracts embeddings, and integrates them into predictions.
  • Minimal preprocessing - Works directly with raw numerical and categorical data (see the sketch after this list).
  • Fast inference - Zero-shot predictions complete in seconds, even on mid-range GPUs.
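
As a minimal illustration of the robustness and preprocessing points above, TabPFN can be fit on a feature matrix that still contains missing values, with no imputation step beforehand. The sketch below uses hypothetical synthetic data and is independent of the examples that follow.
import numpy as np
from tabpfn import TabPFNRegressor

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = X_demo @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)
X_demo[rng.random(X_demo.shape) < 0.1] = np.nan  # ~10% missing values, left as-is

demo_model = TabPFNRegressor(device="auto")
demo_model.fit(X_demo[:150], y_demo[:150])   # NaNs are handled internally
print(demo_model.predict(X_demo[150:])[:5])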

Getting Started

First, load your training and test datasets. The training dataset is used for in-context learning - it’s passed directly into the model’s forward pass, allowing TabPFN to condition its predictions on the data distribution without performing gradient-based optimization. The test dataset is then used for making predictions, leveraging what the model inferred from the context provided by your training data.
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# Load example dataset
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
Next, run inference using TabPFN.
from tabpfn import TabPFNRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Initialize and fit
model = TabPFNRegressor(device="auto")
model.fit(X_train, y_train)

# Predict
preds = model.predict(X_test)

# Evaluate
print("MSE:", mean_squared_error(y_test, preds))
print("MAE:", mean_absolute_error(y_test, preds))
print("R²:", r2_score(y_test, preds))

Advanced Features

Auto fine-tuning

The AutoTabPFNRegressor automatically builds ensembles of strong models to maximize accuracy:
  • Runs an automated hyperparameter search for optimal settings.
  • Builds a Post-Hoc Ensemble (PHE) for improved calibration and performance.
  • Balances accuracy vs. latency with the max_time parameter.
from tabpfn_extensions.post_hoc_ensembles.sklearn_interface import AutoTabPFNRegressor
from sklearn.metrics import mean_squared_error, r2_score

reg = AutoTabPFNRegressor(max_time=30)  # runtime budget in seconds
reg.fit(X_train, y_train)
preds = reg.predict(X_test)

print("MSE:", mean_squared_error(y_test, preds))
print("R²:", r2_score(y_test, preds))

Quantile Regression

# Get predictions for specific quantiles (10th, 50th, and 90th percentiles)
quantiles = model.predict_quantiles(X_test, quantiles=[0.1, 0.5, 0.9])
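
A quick way to sanity-check these quantiles is to measure their empirical coverage on held-out data. The sketch below assumes predict_quantiles returns one prediction array per requested quantile, in the order given; verify the return format of your installed version before relying on it.
import numpy as np

# Assumption: one array per requested quantile, in the order passed above.
q10, q50, q90 = model.predict_quantiles(X_test, quantiles=[0.1, 0.5, 0.9])

# Fraction of test targets inside the nominal 80% interval [q10, q90]
coverage = np.mean((y_test >= q10) & (y_test <= q90))
print(f"80% interval coverage: {coverage:.2f}")
print(f"Mean interval width: {np.mean(q90 - q10):.2f}")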

Beyond fixed quantiles, TabPFN can return its full predictive distribution, which enables loss-aware predictions without retraining: for any custom loss, you can approximate the Bayes-optimal point prediction by evaluating the expected loss over candidate values sampled from the distribution and choosing the minimizer. The example below does this for MAPE (mean absolute percentage error), without modifying the model.
import numpy as np

# Full output: a bar-distribution "criterion" plus per-sample logits
out = model.predict(X_test, output_type="full")
criterion, logits = out["criterion"], out["logits"]

# Candidate values drawn from the predictive distribution via its inverse CDF
q = np.linspace(0.01, 0.99, 101)
samples = np.stack(
    [criterion.icdf(logits, float(x)).cpu().numpy() for x in q], axis=1
)

# Expected MAPE of each candidate, averaged over the sampled "true" values
denom = np.maximum(np.abs(samples), 1e-6)
diffs = np.abs(samples[:, :, None] - samples[:, None, :])
exp_mape = (diffs / denom[:, :, None]).mean(axis=1)

# For each test point, pick the candidate with the lowest expected MAPE
best_idx = exp_mape.argmin(axis=1)
y_hat = samples[np.arange(samples.shape[0]), best_idx]
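
To see the effect, compare the loss-aware prediction against the default mean prediction on the test set, using scikit-learn's MAPE metric:
from sklearn.metrics import mean_absolute_percentage_error

print("MAPE (mean prediction):", mean_absolute_percentage_error(y_test, model.predict(X_test)))
print("MAPE (loss-aware y_hat):", mean_absolute_percentage_error(y_test, y_hat))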

GPU Memory Usage

GPU memory can grow gradually across repeated predict() calls because cached tensors are not cleared between predictions. Regression is affected in particular, since distributional predictions retain more GPU buffers. As a workaround, clear the CUDA cache after each prediction:
import torch

for _ in range(10):
    y_pred = model.predict(X_test)
    torch.cuda.empty_cache()  # release cached GPU buffers after each prediction
This prevents gradual VRAM growth until automatic cleanup is added in future releases.