TabPFN provides a powerful interface for classification on tabular data, supporting binary, multi-class, and even text-enhanced inputs - all without model training or hyperparameter tuning.

Key Capabilities

  • Zero-shot classification - Predicts instantly through a single forward pass, no gradient descent required.
  • Calibrated probabilities - Outputs reliable class probabilities for uncertainty-aware decision-making.
  • Robust to noise and missing data - Handles categorical variables, uninformative columns, and outliers natively.
  • Text-aware inputs - This API-only feature automatically detects textual fields, extracts embeddings, and includes them in the forward pass for classification.
  • Seamless multi-class support - Works out-of-the-box for binary and multi-class datasets.

Getting Started

from tabpfn import TabPFNClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = TabPFNClassifier(device="cuda")
model.fit(X_train, y_train)
preds = model.predict(X_test)

print("Accuracy:", accuracy_score(y_test, preds))
For datasets with more than 10 classes, use the ManyClassClassifier extension - available via Prior Labs Extensions. It extends TabPFN to large multi-class problems using Error-Correcting Output Codes (ECOC), which:
  • Encodes the multi-class task into multiple binary or small-class subtasks.
  • Trains the base TabPFNClassifier on these subtasks.
  • Decodes the results back into the original class space.
This approach enables TabPFN to scale to hundreds of classes efficiently while maintaining accuracy and calibration.
from tabpfn_extensions.manyclass_classifier import ManyClassClassifier
from tabpfn import TabPFNClassifier

estimator = TabPFNClassifier(device="cuda")
classifier = ManyClassClassifier(base_estimator=estimator)
classifier.fit(X_train, y_train)
predictions = classifier.predict(X_test)
Each predict() re-encodes the full training set in-context. Performance depends on:
  • Dataset size (e.g. TabPFN-2 is optimized for ≤10k samples)
  • Fit mode (low_memory vs fit_with_cache)
  • Device (CPU/MPS/GPU)
classifier = TabPFNClassifier(fit_mode="fit_with_cache", device="cuda")
fit_with_cache retains preprocessing and significantly speeds up repeated predictions.
TabPFNClassifier now supports native handling of missing values (pd.NA). Older versions required manual preprocessing, but the current release integrates this automatically. We recommend upgrading to the latest version:
pip install -U tabpfn
After each inference, TabPFN’s inference engine automatically moves back to CPU to free GPU memory. While this conserves resources, it slows down repeated predictions and can confuse users expecting persistent GPU behavior.
# Keeps model on GPU or MPS between predictions
estimator.executor_.model = estimator.executor_.model.to("cuda")