You can calibrate `TabPFNClassifier`'s output probabilities or tune its decision thresholds to optimize for specific evaluation metrics. This matters when your metric is not maximized by simply predicting the most probable class, e.g. when evaluating F1 or balanced accuracy.
Automated Tuning
We allow you to automatically optimize your predictions by specifying the metric you care about (`eval_metric`) and enabling tuning (by passing a `tuning_config`) during initialization. When you call `.fit()`, the classifier will:
- Automatically split the training data into an internal training and validation set.
- Fit on the internal training set and make predictions on the validation set.
- Find the optimal prediction settings (like temperature and decision thresholds) that maximize your `eval_metric` on the validation set.
- Store these optimal settings and apply them automatically during `.predict()` and `.predict_proba()`.
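Putting the pieces together, a minimal sketch of this workflow (assuming the standard `from tabpfn import TabPFNClassifier` import and the constructor arguments described on this page; the dataset and metric string are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

from tabpfn import TabPFNClassifier  # import path assumed

# Illustrative imbalanced binary classification problem.
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.85, 0.15], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Name the metric to optimize and enable tuning at initialization.
clf = TabPFNClassifier(
    eval_metric="balanced_accuracy",
    tuning_config={"tune_decision_thresholds": True},
)

# .fit() splits off an internal validation set, searches for the best
# prediction settings, and stores them on the fitted classifier.
clf.fit(X_train, y_train)

# .predict() and .predict_proba() apply the tuned settings automatically.
y_pred = clf.predict(X_test)
y_proba = clf.predict_proba(X_test)
```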
- Temperature Calibration
  - What it is: Finds the optimal `softmax_temperature` to make the model’s probabilities as accurate as possible.
  - Parameter: `tuning_config={"calibrate_temperature": True}` (example below)
  - Best for Metrics: Log-Loss, Brier Score, or any metric that relies on well-calibrated probabilities.
  - Note: This overrides the manually set `softmax_temperature` parameter.
- Decision Threshold Tuning
  - What it is: Finds the best decision boundary for each class. Instead of a plain `argmax(probabilities)`, it finds optimal thresholds (e.g., “predict class 1 if prob > 0.4, not 0.5”).
  - Parameter: `tuning_config={"tune_decision_thresholds": True}` (example below)
  - Best for Metrics: Threshold-sensitive metrics like F1 Score, Balanced Accuracy, Precision, or Recall, especially on imbalanced datasets.
Balancing for Balanced Metrics
Setting `balance_probabilities=True` is a simple heuristic for imbalanced data. It re-weights the output probabilities based on the class frequencies in the training data.
This is a simpler, non-optimized alternative to `eval_metric="balanced_accuracy"` with `tune_decision_thresholds=True`.
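For comparison, a short sketch contrasting the heuristic with the tuned alternative (same import assumption as above; `balance_probabilities` and the tuning arguments are the constructor options described on this page):

```python
from tabpfn import TabPFNClassifier  # import path assumed

# Heuristic: re-weight output probabilities by the training class
# frequencies. No internal validation split or threshold search happens.
balanced_clf = TabPFNClassifier(balance_probabilities=True)

# Optimized alternative targeting the same goal: tune decision thresholds
# against balanced accuracy on an internal validation set.
tuned_clf = TabPFNClassifier(
    eval_metric="balanced_accuracy",
    tuning_config={"tune_decision_thresholds": True},
)
```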