TabPFN works well out of the box and natively handles many steps that traditional ML pipelines require you to do manually. We recommend feeding in data as raw as possible, since additional processing can hurt performance: avoid scaling with StandardScaler or MinMaxScaler, imputing missing values, or one-hot encoding categorical features.
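As a minimal sketch of what "raw" means here: missing values stay as NaN and categoricals stay as simple integer codes, with no scaler or encoder in front of the model. The data setup below is runnable as-is; the TabPFN calls are shown as comments and assume the standard `tabpfn` package's `TabPFNClassifier`.

```python
import numpy as np

# Raw feature matrix: NaNs left in place, last column is a categorical
# kept as integer codes -- no StandardScaler, no imputer, no one-hot.
X = np.array([
    [5.1, np.nan, 0],
    [4.9, 3.0,    1],
    [6.2, 3.4,    1],
    [5.8, np.nan, 2],
])
y = np.array([0, 0, 1, 1])

# With tabpfn installed, this is all that is needed:
# from tabpfn import TabPFNClassifier
# clf = TabPFNClassifier()
# clf.fit(X, y)
# proba = clf.predict_proba(X)  # NaNs and categoricals handled internally
```

The point of the sketch is what is absent: there is no preprocessing pipeline between the raw matrix and `fit`.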

Escalation Path

When the default TabPFN does not meet your needs, try these approaches in roughly this order:
1. Feature engineering: Add domain features, extract datetime components, and encode text meaningfully. This is usually the highest-impact change. See Feature Engineering.
2. Feature selection: If you have many features (100+), filter down to the most informative ones. See Feature Selection.
3. Metric tuning: Use eval_metric and tuning_config to optimize for your specific evaluation metric. See Model Parameters.
4. Preprocessing transforms: Experiment with different PREPROCESS_TRANSFORMS and target transforms. See Preprocessing Transforms.
5. Hyperparameter optimization: Use the HPO extension for an automated search over the TabPFN hyperparameter space.
6. AutoTabPFN ensembles: Use the AutoTabPFN extension for an automatically tuned ensemble of TabPFN models; this typically gives a boost of a few percent.
7. Fine-tuning: Fine-tune the pretrained model on your data when you have a specialized domain or a distribution shift.
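The datetime extraction mentioned in the feature-engineering step can be sketched with the standard library alone (a generic illustration, not the Feature Engineering guide's exact recipe): expand each raw timestamp string into numeric components before the matrix goes to TabPFN.

```python
from datetime import datetime

# Raw timestamps as they might appear in a CSV column.
timestamps = [
    "2024-01-15 08:30:00",
    "2024-06-03 17:45:00",
    "2024-12-24 23:10:00",
]

def datetime_features(ts: str) -> list[int]:
    """Expand one timestamp into model-ready numeric components."""
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    # [month, day, weekday (Mon=0), hour, is_weekend]
    return [dt.month, dt.day, dt.weekday(), dt.hour, int(dt.weekday() >= 5)]

features = [datetime_features(ts) for ts in timestamps]
```

Components like weekday and hour carry cyclical structure the model cannot recover from an opaque timestamp string.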
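For the feature-selection step, a simple filter can be sketched in plain NumPy (a generic univariate filter under assumed synthetic data, not the Feature Selection guide's exact method): score every column by its absolute correlation with the target and keep the top k.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 50
X = rng.normal(size=(n, d))
y = (X[:, 3] + X[:, 7] > 0).astype(int)  # only columns 3 and 7 carry signal

# Score each feature by |correlation with the target|, keep the top k.
centered_y = y - y.mean()
scores = np.abs((X - X.mean(axis=0)).T @ centered_y) / (n * X.std(axis=0) * y.std())
top_k = np.argsort(scores)[::-1][:10]
X_reduced = X[:, top_k]
```

On this synthetic data the two informative columns rank at the top, and the reduced matrix is what would be handed to TabPFN.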

Guides

- Feature Engineering: Encode domain knowledge into features that TabPFN cannot learn from raw columns alone.
- Feature Selection: Reduce the feature count to improve attention efficiency and predictive power.
- Preprocessing Transforms: Configure TabPFN's internal preprocessing pipeline for maximum ensemble diversity.
- Model Parameters: Tune softmax temperature, metric optimization, and class-imbalance handling.
- Fine-Tuning: Adapt TabPFN's pretrained weights to your domain.
- AutoTabPFN Ensembles: Automated ensembling for maximum accuracy.
- Hyperparameter Optimization: Bayesian optimization over TabPFN's hyperparameter space.