Fine-tuning lets you adapt TabPFN’s pretrained foundation models to your own data. It updates the pretrained transformer parameters via gradient descent on a user-provided dataset, retaining TabPFN’s learned priors while aligning the model more closely with the target data distribution. Both classifiers and regressors can be fine-tuned. Fine-tuning helps especially when:
  • Your data represents an edge case or niche distribution not well covered by TabPFN’s priors.
  • You want to specialize the model for a single domain (e.g., healthcare, finance, IoT sensors).
Recommended setup: Fine-tuning requires GPU acceleration for efficient training.
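As a quick sanity check (a minimal sketch using standard PyTorch calls, not TabPFN-specific API), you can confirm that a GPU is visible before starting:

import torch

# Confirm that PyTorch can see a CUDA GPU before launching a fine-tuning run.
if torch.cuda.is_available():
    print("GPU available:", torch.cuda.get_device_name(0))
else:
    print("No GPU detected; fine-tuning will run on CPU and be much slower.")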

Getting Started

The fine-tuning process is similar for classifiers and regressors:
  1. Prepare your dataset: Load, subset, and split your data into train and validation sets.
  2. Configure your model: Initialize a TabPFNClassifier or TabPFNRegressor with fine-tuning-specific hyperparameters. Use a low learning rate (e.g., 1e-5 to 1e-6) to avoid catastrophic forgetting.
  3. Create a fine-tuning dataloader: Use get_preprocessed_datasets() and meta_dataset_collator to prepare batches.
  4. Run the fine-tuning loop: Iterate for several epochs, performing backpropagation and optimizer updates.
  5. Evaluate performance: Clone the fine-tuned model and test it on held-out validation data.

The complete classifier example below walks through these steps end to end:
from tabpfn import TabPFNClassifier
from tabpfn.utils import meta_dataset_collator
from tabpfn.finetune_utils import clone_model_for_evaluation
from torch.utils.data import DataLoader
from torch.optim import Adam
from sklearn.datasets import fetch_covtype
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, log_loss
from tqdm import tqdm
import torch, numpy as np

# --- Load and split data ---
X, y = fetch_covtype(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y)
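
# Optional (not in the original example): sub-sample the training split for faster
# experimentation; Covertype has ~580k rows, which makes full fine-tuning slow.
# rng = np.random.default_rng(0)
# subset = rng.choice(len(X_train), size=50_000, replace=False)
# X_train, y_train = X_train[subset], y_train[subset]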

# --- Configure model ---
clf = TabPFNClassifier(
    device="cuda" if torch.cuda.is_available() else "cpu",
    n_estimators=2,
    ignore_pretraining_limits=True,
    fit_mode="batched",
    differentiable_input=False,
)
# --- Prepare datasets ---
# get_preprocessed_datasets splits the training data with the provided split
# function and caps each preprocessed dataset at roughly 10,000 samples.
training_datasets = clf.get_preprocessed_datasets(X_train, y_train, train_test_split, 10000)
dataloader = DataLoader(training_datasets, batch_size=1, collate_fn=meta_dataset_collator)

# Build the optimizer after dataset preparation so the underlying transformer
# weights (clf.model_) have been loaded.
optimizer = Adam(clf.model_.parameters(), lr=1e-5)
loss_fn = torch.nn.CrossEntropyLoss()

# --- Fine-tuning loop ---
for epoch in range(1, 6):
    for X_tr, X_te, y_tr, y_te, cat_ixs, confs in tqdm(dataloader, desc=f"Epoch {epoch}"):
        optimizer.zero_grad()
        # Fit on this batch's preprocessed support split, then compute logits on its held-out split
        clf.fit_from_preprocessed(X_tr, y_tr, cat_ixs, confs)
        preds = clf.forward(X_te, return_logits=True)
        loss = loss_fn(preds, y_te.to(clf.device))
        loss.backward()
        optimizer.step()

# --- Evaluation ---
# Clone the fine-tuned weights into a fresh classifier for standard fit/predict
# evaluation; the dict can carry inference-time settings (left empty here).
eval_clf = clone_model_for_evaluation(clf, {}, TabPFNClassifier)
eval_clf.fit(X_train, y_train)
probs = eval_clf.predict_proba(X_test)

print("ROC AUC:", roc_auc_score(y_test, probs, multi_class="ovr", average="weighted"))
print("Log Loss:", log_loss(y_test, probs))

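The process is the same for regression: only the estimator and the loss change. The snippet below is a minimal sketch of the analogous TabPFNRegressor configuration; the data preparation and training loop mirror the classifier example above, but CrossEntropyLoss must be replaced with a regression objective (see the repository's fine-tuning examples for the loss used there).

from tabpfn import TabPFNRegressor

# Same batched fine-tuning configuration, but for regression targets.
reg = TabPFNRegressor(
    device="cuda" if torch.cuda.is_available() else "cpu",
    n_estimators=2,
    ignore_pretraining_limits=True,
    fit_mode="batched",
    differentiable_input=False,
)
# From here, prepare datasets with reg.get_preprocessed_datasets(...) and reuse
# the loop above, swapping the classification loss for a regression loss.
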
GitHub Examples

See more examples and fine-tuning utilities in our TabPFN GitHub repository.