Thinking mode is only available through the TabPFN API (via tabpfn-client or the REST endpoints). It is not part of the open-source tabpfn package.
Thinking mode introduces test-time compute scaling to TabPFN. Instead of a single forward pass, thinking mode for TabPFN-3-Plus applies additional inference-time computation to push prediction quality further.
TabPFN-3-Plus with thinking mode beats all non-TabPFN models by over 200 Elo on the standard TabArena benchmark, rising to 420 Elo on the largest data subset. It outperforms AutoGluon 1.5 Extreme in 82% of cases, in less than a tenth of its runtime, without using LLMs, real data, internet search, or any other model besides TabPFN. Check out our model reports for details.
Thinking mode builds on top of TabPFN-3-Plus with native text-feature support, so a single call can handle mixed numerical, categorical, and text columns.
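To put the Elo gaps quoted above in perspective, the sketch below converts an Elo difference into an expected head-to-head score. It assumes the conventional logistic Elo formula with a 400-point scale; the exact calibration used by TabArena may differ, so treat the numbers as a rough reading aid rather than a reported result. The function name `elo_expected_score` is illustrative.

```python
def elo_expected_score(elo_gap: float, scale: float = 400.0) -> float:
    """Expected score of the higher-rated model under the standard Elo formula.

    Assumption: conventional logistic Elo with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / scale))


for gap in (200, 420):
    print(f"{gap} Elo gap -> expected score {elo_expected_score(gap):.3f}")
# 200 Elo gap -> expected score 0.760
# 420 Elo gap -> expected score 0.918
```

Under that assumption, a 200 Elo lead corresponds to roughly a 76% expected score in a pairwise comparison, and a 420 Elo lead to roughly 92%.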
Thinking fits consume a monthly quota separate from prediction tokens (default: 20 per month). Exceeding the limit returns HTTP 429. See API metering for current limits and how to request more.

When to use it

| Situation | Guidance |
| --- | --- |
| High-ROI use cases where small accuracy gains matter (finance, healthcare) | Use thinking mode; the one-time fit cost fuels recurring predictions |
| Pipelines with highly unstable ground truth data | Leave thinking off; use default TabPFN fit |

Quickstart

Thinking mode is available only in the API with TabPFN-3-Plus. Local execution is not supported.
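from tabpfn_client import TabPFNClassifier

# X_train, y_train, and X_test are your own feature matrices and labels.
clf = TabPFNClassifier(
    thinking_mode=True,            # enable thinking mode
    thinking_effort="high",        # "medium" or "high"
    thinking_metric="accuracy",    # metric to optimize during fitting
)
clf.fit(X_train, y_train)          # runs via the API; consumes one thinking fit from the monthly quota
preds = clf.predict(X_test)        # predicted class labels
probs = clf.predict_proba(X_test)  # class probabilities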

Choosing effort and metric

thinking_effort controls how much compute is spent during fitting. thinking_metric sets the metric that this extra compute optimizes.

Effort levels:
| Use case | Recommended effort |
| --- | --- |
| You need maximum accuracy and can trade off fit time | Enable thinking mode and set thinking_effort="high" |
| You want a balance between quality and speed | Start with default thinking mode (thinking_effort="medium") |

Supported metrics:
| Task | Metrics |
| --- | --- |
| Classification | accuracy, log_loss, roc_auc |
| Regression | rmse, mae |

You can also set thinking_timeout_s to cap the wall-clock time spent on optimization.
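Because the valid metric depends on the task, it can help to check the combination locally before spending a thinking fit from the quota. The following is a minimal sketch, not part of tabpfn-client: `thinking_kwargs`, `SUPPORTED_METRICS`, and `SUPPORTED_EFFORTS` are hypothetical names, the allowed values are copied from the tables above, and the requirement that the timeout be positive is an assumption.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the effort and metric tables in this section.
SUPPORTED_METRICS = {
    "classification": {"accuracy", "log_loss", "roc_auc"},
    "regression": {"rmse", "mae"},
}
SUPPORTED_EFFORTS = {"medium", "high"}


def thinking_kwargs(
    task: str,
    effort: str = "medium",
    metric: Optional[str] = None,
    timeout_s: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate thinking settings and return estimator kwargs."""
    if task not in SUPPORTED_METRICS:
        raise ValueError(f"unknown task: {task!r}")
    if effort not in SUPPORTED_EFFORTS:
        raise ValueError(f"unsupported thinking_effort: {effort!r}")
    if metric is not None and metric not in SUPPORTED_METRICS[task]:
        raise ValueError(f"metric {metric!r} is not listed for {task}")
    # Assumption: a wall-clock budget only makes sense if it is positive.
    if timeout_s is not None and timeout_s <= 0:
        raise ValueError("thinking_timeout_s must be positive")

    kwargs: Dict[str, Any] = {"thinking_mode": True, "thinking_effort": effort}
    if metric is not None:
        kwargs["thinking_metric"] = metric
    if timeout_s is not None:
        kwargs["thinking_timeout_s"] = float(timeout_s)
    return kwargs


print(thinking_kwargs("classification", effort="high", metric="roc_auc", timeout_s=300))
```

The returned dictionary could then be unpacked into the estimator, for example `TabPFNClassifier(**thinking_kwargs("classification", "high", "roc_auc", timeout_s=300))`.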

REST API

Call POST /tabpfn/fit with the thinking parameters in the JSON body:
{
  "train_set_upload_id": "<your-upload-id>",
  "task": "classification",
  "thinking_effort": "high",
  "thinking_timeout_s": 300,
  "thinking_effort_metric": "log_loss"
}
See the API reference for the full endpoint documentation and upload flow.
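As a rough illustration, the request body above can be assembled in Python before it is sent. This sketch only builds the JSON; it deliberately leaves out the base URL, authentication, and upload step, which are covered by the API reference. `build_fit_body` is a hypothetical helper name. Note that it maps the client-style `thinking_metric` argument to the REST field `thinking_effort_metric`.

```python
import json
from typing import Any, Dict, Optional


def build_fit_body(
    train_set_upload_id: str,
    task: str,
    thinking_effort: Optional[str] = None,
    thinking_metric: Optional[str] = None,
    thinking_timeout_s: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the JSON body for POST /tabpfn/fit."""
    body: Dict[str, Any] = {
        "train_set_upload_id": train_set_upload_id,
        "task": task,
    }
    if thinking_effort is not None:
        body["thinking_effort"] = thinking_effort
    if thinking_timeout_s is not None:
        body["thinking_timeout_s"] = thinking_timeout_s
    if thinking_metric is not None:
        # The REST field name differs from the Python client parameter name.
        body["thinking_effort_metric"] = thinking_metric
    return body


body = build_fit_body(
    "<your-upload-id>",
    "classification",
    thinking_effort="high",
    thinking_metric="log_loss",
    thinking_timeout_s=300,
)
print(json.dumps(body, indent=2))
```

Optional fields are omitted when unset, so the same helper also produces a plain fit request without thinking parameters.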

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| thinking_mode | bool | False | Enable thinking mode. Setting thinking_effort also enables it. |
| thinking_effort | "medium" or "high" | None | Controls the effort & compute spent at fit time. Higher tends to give better results. |
| thinking_timeout_s | float | None | Wall-clock time budget in seconds |
| thinking_metric | str | None | Target metric to optimize (see supported metrics above) |
On the REST API, the metric parameter is called thinking_effort_metric.

Limits

Thinking fits have a separate monthly quota from prediction tokens. The default is 20 thinking fits per month. When the quota is exhausted, POST /tabpfn/fit with thinking enabled returns HTTP 429. If you need higher limits, see API metering for details or contact Prior Labs.
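One way to handle an exhausted quota is to fall back to a default fit when the thinking fit is rejected. The sketch below shows that pattern with plain callables, so it makes no claims about how tabpfn-client surfaces the error. `ThinkingQuotaExhausted`, `raise_for_fit_status`, and `fit_with_fallback` are hypothetical names, and treating every HTTP 429 as quota exhaustion is a simplifying assumption.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class ThinkingQuotaExhausted(RuntimeError):
    """Hypothetical exception for an HTTP 429 on a thinking fit."""


def raise_for_fit_status(status_code: int) -> None:
    """Translate the status code of a fit response into an exception."""
    if status_code == 429:
        # Assumption: any 429 on a thinking fit means the monthly quota is used up.
        raise ThinkingQuotaExhausted("monthly thinking fit quota exhausted")
    if status_code >= 400:
        raise RuntimeError(f"fit request failed with HTTP {status_code}")


def fit_with_fallback(thinking_fit: Callable[[], T], default_fit: Callable[[], T]) -> T:
    """Try the thinking fit first; use the default fit if the quota is exhausted."""
    try:
        return thinking_fit()
    except ThinkingQuotaExhausted:
        return default_fit()


# Stand-ins for real fit calls, used here only to exercise the control flow.
def fake_thinking_fit() -> str:
    raise_for_fit_status(429)
    return "thinking model"


result = fit_with_fallback(fake_thinking_fit, lambda: "default model")
print(result)  # default model
```

In real code, the two callables would wrap the actual fit requests, with and without the thinking parameters.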

Related pages

- TabPFN-3 changelog: full release notes, including thinking mode.
- API metering: token budgets, thinking fit limits, and usage tracking.
- Classification: binary and multi-class classification guide.
- Regression: point estimates, quantiles, and full distributions.