The AutoTabPFN Ensembles Extension leverages AutoGluon to perform hyperparameter search and ensembling. It automatically explores the TabPFN hyperparameter space, trains multiple candidate models, and constructs an optimized weighted ensemble from the best performers. How it works:
- Randomly sample configurations from the TabPFN hyperparameter space.
- Train a TabPFN model for each sampled configuration (`n_ensemble_models` total).
- Use AutoGluon to evaluate, select, and weight the top-performing models.
- Combine them into a final, optimized meta-ensemble of TabPFNs.
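The steps above can be sketched conceptually in a few lines. This is a minimal illustration of the sample-train-weight loop, not the extension's actual internals; `SEARCH_SPACE`, `sample_config`, and the uniform weights are all illustrative assumptions.

```python
import random

# Illustrative search space; the real TabPFN space has more dimensions.
SEARCH_SPACE = {
    "n_estimators": [4, 8, 16],
    "softmax_temperature": [0.75, 0.9, 1.0],
}

def sample_config(rng):
    # Step 1: randomly sample one configuration from the space.
    return {key: rng.choice(values) for key, values in SEARCH_SPACE.items()}

def search_and_ensemble(n_ensemble_models, seed=0):
    rng = random.Random(seed)
    # Step 2: one candidate configuration per model to be trained.
    candidates = [sample_config(rng) for _ in range(n_ensemble_models)]
    # Steps 3-4: in the real extension, AutoGluon scores each trained model on
    # validation data and learns the ensemble weights. Uniform weights here
    # only illustrate the shape of the final (config, weight) ensemble.
    weight = 1.0 / len(candidates)
    return [(config, weight) for config in candidates]
```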
Getting Started
To install the extension, include the `post_hoc_ensembles` extra:
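The exact command is not shown on this page; assuming the package is published as `tabpfn-extensions` (the name used by the project's repository), a typical pip invocation would be:

```shell
pip install "tabpfn-extensions[post_hoc_ensembles]"
```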
This installs all dependencies, including AutoGluon and the TabPFN core library.
Core Parameters
The interface is sklearn-compatible and built around two parameter sets: AutoGluon control parameters and TabPFN model parameters. See the GitHub repository for more details.
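A hedged usage sketch of the sklearn-compatible interface. The import path and the `AutoTabPFNClassifier` name follow the tabpfn-extensions repository, and `max_time` is assumed to be the search budget in seconds; verify both against your installed version. The import is guarded so the sketch stays runnable without the extension.

```python
# Hedged sketch: import path and class name assumed from tabpfn-extensions.
try:
    from tabpfn_extensions.post_hoc_ensembles.sklearn_interface import (
        AutoTabPFNClassifier,
    )
except ImportError:  # extension not installed in this environment
    AutoTabPFNClassifier = None

def build_classifier(max_time=120, presets="medium_quality"):
    """Return a configured auto-ensemble classifier, or the kwargs as a
    fallback when the extension is unavailable."""
    kwargs = {"max_time": max_time, "presets": presets}
    if AutoTabPFNClassifier is None:
        return kwargs
    return AutoTabPFNClassifier(**kwargs)

clf = build_classifier()
# With the extension installed, the object follows the scikit-learn API:
# clf.fit(X_train, y_train); clf.predict(X_test); clf.predict_proba(X_test)
```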
AutoGluon Parameters
| Parameter | Description |
|---|---|
| `presets` | Controls the trade-off between training time and predictive accuracy. Common options: `'medium_quality'`, `'best_quality'`. |
| `phe_init_args` | Dictionary of arguments passed directly to `AutoGluon.TabularPredictor()` for advanced customization. |
| `phe_fit_args` | Arguments passed to `AutoGluon.TabularPredictor.fit()` to control training specifics such as early stopping, validation splits, and resource usage. |
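For illustration, the three AutoGluon parameters might be assembled like this. The keys inside `phe_fit_args` follow AutoGluon's `TabularPredictor.fit()` signature (`time_limit` in seconds); check the AutoGluon docs for your version before relying on them.

```python
# Illustrative AutoGluon-side parameters for the ensemble interface.
autogluon_params = {
    "presets": "best_quality",          # favor accuracy over training time
    "phe_init_args": {"verbosity": 1},  # forwarded to TabularPredictor()
    "phe_fit_args": {
        "time_limit": 600,              # stop the search after 10 minutes
    },
}
```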
TabPFN Model Parameters
| Parameter | Description |
|---|---|
| `n_estimators` | Number of internal transformers to ensemble within each individual TabPFN model. Increasing this can boost performance at the cost of compute time. (int, default=8) |
| `balance_probabilities` | Balances class probabilities for imbalanced datasets. Recommended for skewed classification tasks. (bool, default=False) |
| `ignore_pretraining_limits` | Bypasses TabPFN’s dataset size and feature limits (50k samples / 2k features). Use with caution - performance beyond these limits may degrade. (bool, default=False) |
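A short example of how the model-side parameters from this table might be set for a skewed classification task; how they are forwarded to each candidate model is assumed to mirror the plain TabPFN estimators.

```python
# TabPFN model parameters tuned for an imbalanced classification task.
tabpfn_params = {
    "n_estimators": 16,                  # more internal transformers, more compute
    "balance_probabilities": True,       # recommended for skewed class ratios
    "ignore_pretraining_limits": False,  # stay within 50k samples / 2k features
}
```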
Best Practices
- Start small: Try `max_time=300` to quickly explore configurations.
- Use for accuracy-critical tasks: The ensemble adds compute cost but yields higher accuracy and better calibration.
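The two-phase workflow implied by the tips above can be made explicit. Budget values and the idea of promoting the exploratory settings to a longer final run are illustrative, not prescribed by the extension.

```python
# Illustrative two-phase budgets (max_time in seconds, as used above).
EXPLORE_BUDGET = 300    # quick first pass over configurations
FINAL_BUDGET = 3600     # longer, accuracy-critical run (illustrative value)

def budget_for(phase):
    """Pick a search budget for the given workflow phase."""
    return EXPLORE_BUDGET if phase == "explore" else FINAL_BUDGET
```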