TabPFN works well out of the box and natively handles many steps that traditional ML pipelines require you to do manually. We recommend feeding in data as raw as possible, since additional processing can hurt performance: avoid scaling with StandardScaler or MinMaxScaler, imputing missing values, or one-hot encoding categorical features.
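As a minimal sketch of what "raw" means here: missing values stay as NaN and categoricals stay as simple integer codes, with no scaler or encoder in front of the model. The data setup below is runnable as-is; the TabPFN calls are shown as comments and assume the standard `tabpfn` package's `TabPFNClassifier`.

```python
import numpy as np

# Raw feature matrix: NaNs left in place, last column is a categorical
# kept as integer codes -- no StandardScaler, no imputer, no one-hot.
X = np.array([
    [5.1, np.nan, 0],
    [4.9, 3.0,    1],
    [6.2, 3.4,    1],
    [5.8, np.nan, 2],
])
y = np.array([0, 0, 1, 1])

# With tabpfn installed, this is all that is needed:
# from tabpfn import TabPFNClassifier
# clf = TabPFNClassifier()
# clf.fit(X, y)
# proba = clf.predict_proba(X)  # NaNs and categoricals handled internally
```

The point of the sketch is what is absent: there is no preprocessing pipeline between the raw matrix and `fit`.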

Escalation Path

When the default TabPFN does not meet your needs, try these approaches in roughly this order:
1. Feature engineering: Add domain features, extract datetime components, and encode text meaningfully. This is usually the highest-impact change. See Feature Engineering.
2. Feature selection: If you have many features (100+), filter down to the most informative ones. See Feature Selection.
3. Metric tuning: Use eval_metric and tuning_config to optimize for your specific evaluation metric. See Model Parameters.
4. Preprocessing transforms: Experiment with different PREPROCESS_TRANSFORMS and target transforms. See Preprocessing Transforms.
5. Hyperparameter optimization: Use the HPO extension for an automated search over the TabPFN hyperparameter space.
6. AutoTabPFN ensembles: Use the AutoTabPFN extension for an automatically tuned ensemble of TabPFN models; this typically gives a boost of a few percent.
7. Fine-tuning: Fine-tune the pretrained model on your data when you have a specialized domain or a distribution shift.
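The datetime extraction mentioned in the feature-engineering step can be sketched with the standard library alone (a generic illustration, not the Feature Engineering guide's exact recipe): expand each raw timestamp string into numeric components before the matrix goes to TabPFN.

```python
from datetime import datetime

# Raw timestamps as they might appear in a CSV column.
timestamps = [
    "2024-01-15 08:30:00",
    "2024-06-03 17:45:00",
    "2024-12-24 23:10:00",
]

def datetime_features(ts: str) -> list[int]:
    """Expand one timestamp into model-ready numeric components."""
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    # [month, day, weekday (Mon=0), hour, is_weekend]
    return [dt.month, dt.day, dt.weekday(), dt.hour, int(dt.weekday() >= 5)]

features = [datetime_features(ts) for ts in timestamps]
```

Components like weekday and hour carry cyclical structure the model cannot recover from an opaque timestamp string.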
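For the feature-selection step, a simple filter can be sketched in plain NumPy (a generic univariate filter under assumed synthetic data, not the Feature Selection guide's exact method): score every column by its absolute correlation with the target and keep the top k.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 50
X = rng.normal(size=(n, d))
y = (X[:, 3] + X[:, 7] > 0).astype(int)  # only columns 3 and 7 carry signal

# Score each feature by |correlation with the target|, keep the top k.
centered_y = y - y.mean()
scores = np.abs((X - X.mean(axis=0)).T @ centered_y) / (n * X.std(axis=0) * y.std())
top_k = np.argsort(scores)[::-1][:10]
X_reduced = X[:, top_k]
```

On this synthetic data the two informative columns rank at the top, and the reduced matrix is what would be handed to TabPFN.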

Guides

- Feature Engineering: Encode domain knowledge into features that TabPFN cannot learn from raw columns alone.
- Feature Selection: Reduce the feature count to improve attention efficiency and predictive power.
- Preprocessing Transforms: Configure TabPFN's internal preprocessing pipeline for maximum ensemble diversity.
- Model Parameters: Tune softmax temperature, metric optimization, and class-imbalance handling.
- Fine-Tuning: Adapt TabPFN's pretrained weights to your domain.
- AutoTabPFN Ensembles: Automated ensembling for maximum accuracy.
- Hyperparameter Optimization: Bayesian optimization over TabPFN's hyperparameter space.