AutoTabPFN Ensembles provide an automated, high-performance system for building Post-Hoc Ensembles (PHE) of TabPFN models. This extension uses AutoGluon to perform hyperparameter search and ensembling, delivering state-of-the-art accuracy on tabular classification and regression tasks with minimal configuration.

Overview

The AutoTabPFN ensemble framework automatically explores the TabPFN hyperparameter space, trains multiple candidate models, and builds an optimized weighted ensemble from the best performers. This process combines TabPFN’s zero-shot modeling with AutoGluon’s ensemble optimization, yielding robust predictions that outperform single-model configurations.

How it works:
  1. Random search is performed across the TabPFN hyperparameter space.
  2. A fixed number of configurations (n_ensemble_models) is sampled.
  3. A TabPFN model is trained for each sampled configuration.
  4. AutoGluon selects and weights the best-performing models into a final ensemble.
The result is a meta-ensemble of TabPFNs tuned automatically for your dataset.

To install the extension, include the post_hoc_ensembles extras module:
pip install "tabpfn-extensions[post_hoc_ensembles]"
This installs all dependencies, including AutoGluon and the TabPFN core library.
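The search-then-ensemble loop described above can be sketched with stand-in models. This is an illustration of the general post-hoc ensembling idea, not the extension's actual internals: decision trees stand in for TabPFN, the sampled `max_depth` space is invented, and the greedy weighting step is a simplified version of ensemble selection (the approach AutoGluon's weighted ensembles are based on).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=400, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# Steps 1-3: randomly sample configurations and fit one model per config.
# (DecisionTreeClassifier is a stand-in for TabPFN here.)
configs = [{"max_depth": int(rng.integers(1, 8))} for _ in range(8)]
models = [DecisionTreeClassifier(**c, random_state=0).fit(X_tr, y_tr) for c in configs]
val_probas = [m.predict_proba(X_val) for m in models]

# Step 4: greedy ensemble selection - repeatedly add (with replacement) the
# model whose inclusion most improves validation log-loss; selection counts
# become the ensemble weights.
picked = []
for _ in range(10):
    best = min(
        range(len(models)),
        key=lambda i: log_loss(y_val, np.mean([val_probas[j] for j in picked + [i]], axis=0)),
    )
    picked.append(best)
weights = np.bincount(picked, minlength=len(models)) / len(picked)
print("ensemble weights:", weights)
```

Models that never improve the validation score receive zero weight, which is how the final ensemble keeps only the best performers.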

Getting Started

Here’s a quick example showing how to build an AutoTabPFN ensemble for classification.
from tabpfn_extensions.post_hoc_ensembles.sklearn_interface import AutoTabPFNClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, accuracy_score
import numpy as np

# Load dataset
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# Initialize AutoTabPFN with 30-second search time
clf = AutoTabPFNClassifier(device="auto", max_time=30)
clf.fit(X_train, y_train)

# Predict probabilities and evaluate
probas = clf.predict_proba(X_test)
preds = np.argmax(probas, axis=1)

print("ROC AUC:", roc_auc_score(y_test, probas[:, 1]), "Accuracy:", accuracy_score(y_test, preds))

Core Parameters

The interface is scikit-learn compatible and built around two parameter sets: AutoGluon control parameters and TabPFN model parameters.

AutoGluon Parameters

  • presets: Controls the trade-off between training time and predictive accuracy. Common options: 'medium_quality', 'best_quality'.
  • phe_init_args: Dictionary of arguments passed directly to AutoGluon.TabularPredictor() for advanced customization.
  • phe_fit_args: Arguments passed to AutoGluon.TabularPredictor.fit() to control training specifics such as early stopping, validation splits, and resource usage.
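Putting these together, a configuration might look like the sketch below. The specific keys shown are illustrative assumptions; the set of valid keys is whatever AutoGluon's TabularPredictor() constructor and its fit() method accept. The dicts are built standalone here, with the constructor call shown as a comment, so the sketch does not depend on the library being installed.

```python
# Illustrative key names - consult AutoGluon's TabularPredictor docs for the
# full set of accepted arguments.
phe_init_args = {"eval_metric": "roc_auc"}        # forwarded to TabularPredictor()
phe_fit_args = {"num_cpus": 4, "time_limit": 60}  # forwarded to TabularPredictor.fit()

ensemble_kwargs = dict(
    presets="best_quality",
    phe_init_args=phe_init_args,
    phe_fit_args=phe_fit_args,
)
# clf = AutoTabPFNClassifier(device="auto", max_time=60, **ensemble_kwargs)
print(sorted(ensemble_kwargs))
```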

TabPFN Model Parameters

  • n_estimators (int, default=16): Number of internal transformers ensembled within each individual TabPFN model. Increasing this can boost performance at the cost of compute time.
  • balance_probabilities (bool, default=False): Balances class probabilities for imbalanced datasets. Recommended for skewed classification tasks.
  • ignore_pretraining_limits (bool, default=False): Bypasses TabPFN’s dataset size and feature limits (10k samples / 500 features). Use with caution: performance beyond these limits may degrade.
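One common way to implement the kind of adjustment that balance_probabilities describes is to divide each predicted probability by its class's training frequency and renormalize the rows. The sketch below illustrates that general technique; it is not TabPFN's exact implementation.

```python
import numpy as np

def balance_probas(probas, class_counts):
    """Reweight predicted probabilities by inverse class frequency, then renormalize rows."""
    priors = np.asarray(class_counts) / np.sum(class_counts)
    adjusted = probas / priors  # broadcasts the per-class priors over columns
    return adjusted / adjusted.sum(axis=1, keepdims=True)

# On a 90/10 imbalanced problem, a prediction that merely echoes the prior
# is pushed back toward an uninformative 50/50.
probas = np.array([[0.9, 0.1]])
print(balance_probas(probas, class_counts=[90, 10]))  # → [[0.5 0.5]]
```

Predictions more confident than the prior remain skewed toward the majority class after balancing; only the prior's contribution is cancelled out.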

Best Practices

  • Start small: Try max_time=30 to explore configurations quickly.
  • Use for accuracy-critical tasks: The ensemble adds compute cost but yields higher accuracy and better-calibrated probabilities.
  • Classification vs. regression: Works seamlessly through both AutoTabPFNClassifier and AutoTabPFNRegressor.
  • Parallelism: AutoGluon handles parallel execution internally; no manual threading is needed.
  • Reproducibility: Results are deterministic when a random seed is specified.