You can calibrate the TabPFNClassifier’s output probabilities or tune its decision thresholds to optimize for specific evaluation metrics. This matters when your metric is not maximized by simply predicting the class with the highest probability, e.g. when evaluating F1 score or balanced accuracy.

Automated Tuning

You can automatically optimize your predictions by specifying the metric you care about (eval_metric) and enabling tuning (by passing a tuning_config) during initialization. When you call .fit(), the classifier will:
  1. Automatically split the training data into an internal training and validation set.
  2. Fit on the internal training set and make predictions on the validation set.
  3. Find the optimal prediction settings (like temperature and decision thresholds) that maximize your eval_metric on the validation set.
  4. Store these optimal settings and apply them automatically during .predict() and .predict_proba().
See below for a full example:
from tabpfn import TabPFNClassifier  # assumes the `tabpfn` package is installed

# Initialize the model with `eval_metric` and `tuning_config`
model = TabPFNClassifier(
    eval_metric="f1",
    tuning_config={
        "calibrate_temperature": True,
        "tune_decision_thresholds": True,
    }
)

# .fit() runs the automated tuning process
model.fit(X_train, y_train)

# .predict() and .predict_proba() will use the tuned settings
preds = model.predict(X_test)
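To make the steps above concrete, here is a minimal sketch of what such a tuning loop generally looks like, using a scikit-learn classifier as a stand-in; this is not TabPFN’s internal implementation, and names like stand_in and candidate_thresholds are purely illustrative:
# Conceptual sketch only: a generic classifier stands in for TabPFN,
# and only a single decision threshold is searched.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 1. Split the training data into an internal train/validation set
X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, random_state=0)

# 2. Fit on the internal training set and predict on the validation set
stand_in = LogisticRegression().fit(X_tr, y_tr)
val_proba = stand_in.predict_proba(X_val)[:, 1]

# 3. Find the decision threshold that maximizes the chosen metric (F1 here)
candidate_thresholds = np.linspace(0.05, 0.95, 19)
scores = [f1_score(y_val, (val_proba >= t).astype(int)) for t in candidate_thresholds]
best_threshold = candidate_thresholds[int(np.argmax(scores))]

# 4. Store the best setting and apply it at prediction time
test_proba = stand_in.predict_proba(X_test)[:, 1]
preds = (test_proba >= best_threshold).astype(int)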
What is Being Tuned?
  1. Temperature Calibration
  • What it is: Finds the optimal softmax_temperature so that the model’s predicted probabilities are as well calibrated as possible (illustrated, together with threshold tuning, in the sketch after this list).
  • Parameter: tuning_config={"calibrate_temperature": True}
  • Best for Metrics: Log-Loss, Brier Score, or any metric that relies on well-calibrated probabilities.
  • Note: This overrides the manually set softmax_temperature parameter.
  2. Decision Threshold Tuning
  • What it is: This finds the best decision boundary for each class. Instead of just argmax(probabilities), it finds optimal thresholds (e.g., “predict class 1 if prob > 0.4, not 0.5”).
  • Parameter: tuning_config={"tune_decision_thresholds": True}
  • Best for Metrics: Threshold-sensitive metrics like F1 Score, Balanced Accuracy, Precision, or Recall, especially on imbalanced datasets.
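For intuition, the following sketch shows both mechanisms on made-up numbers: dividing the raw scores by a temperature before the softmax, and replacing argmax with a tuned threshold (the logits and the 0.4 threshold are illustrative values, not TabPFN output):
# Illustrative only: made-up logits and threshold, not TabPFN output
import numpy as np

logits = np.array([1.0, 0.3])  # raw scores for classes 0 and 1

def softmax_with_temperature(z, temperature):
    # A higher temperature flattens the distribution; a lower one sharpens it
    z = z / temperature
    z = z - z.max()  # for numerical stability
    e = np.exp(z)
    return e / e.sum()

print(softmax_with_temperature(logits, 1.0))  # default probabilities, ~[0.67, 0.33]
print(softmax_with_temperature(logits, 2.0))  # softer probabilities, ~[0.59, 0.41]

# Threshold tuning: argmax would predict class 0, but a tuned threshold of
# 0.4 for class 1 flips the decision because P(class 1) exceeds it
proba = softmax_with_temperature(logits, 2.0)
print(int(proba[1] > 0.4))  # prints 1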

Balancing for Balanced Metrics

Setting balance_probabilities=True is a simple heuristic for imbalanced data. It re-weights the output probabilities based on the class frequencies observed in the training data. This is a lighter-weight, non-optimized alternative to combining eval_metric="balanced_accuracy" with tune_decision_thresholds=True. A conceptual sketch of the re-weighting follows the example below.
# Use the built-in heuristic for balanced metrics
balanced_model = TabPFNClassifier(balance_probabilities=True)
balanced_model.fit(X_train, y_train)
balanced_preds = balanced_model.predict(X_test)
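For intuition only, one plausible form of such class-frequency re-weighting is to divide each predicted probability by its class prior and renormalize; this is an assumption about the general idea, not necessarily the exact formula behind balance_probabilities:
# Illustrative assumption: divide by class priors and renormalize; the actual
# re-weighting used by balance_probabilities may differ
import numpy as np

class_priors = np.array([0.9, 0.1])  # class frequencies in the training data
proba = np.array([0.7, 0.3])         # raw predicted probabilities for one sample

reweighted = proba / class_priors    # down-weight the majority class
reweighted /= reweighted.sum()       # renormalize to sum to 1
print(reweighted)                    # ~[0.21, 0.79]: the minority class now dominates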