
Documentation Index

Fetch the complete documentation index at: https://docs.priorlabs.ai/llms.txt

Use this file to discover all available pages before exploring further.

Thinking mode is only available through the TabPFN API (via tabpfn-client or the REST endpoints). It is not part of the open-source tabpfn package.
Thinking mode introduces test-time compute scaling to TabPFN. Instead of a single forward pass, Thinking mode for TabPFN-3-Plus applies additional inference-time computation to push prediction quality further. TabPFN-3-Plus with thinking mode beats all non-TabPFN models by over 200 Elo on the standard TabArena benchmark, rising to 420 Elo on the largest data subset. It outperforms AutoGluon 1.5 Extreme in 82% of the cases, in less than a tenth of its runtime, without using LLMs, real data, internet search, or any other model besides TabPFN. Thinking mode builds on top of TabPFN-3-Plus with native text-feature support, so a single call can handle mixed numerical, categorical, and text columns.
Thinking fits consume a monthly quota separate from prediction tokens (default: 20 per month). Exceeding the limit returns HTTP 429. See API metering for current limits and how to request more.
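The mixed-column support mentioned above is easiest to see with a concrete input. This is an illustrative sketch only: the column names and values are made up, but they show the kind of table (numerical, categorical, and free-text columns side by side) that a single thinking-mode fit can consume.

```python
import pandas as pd

# Illustrative only: a small mixed-type table of the kind a single
# thinking-mode fit can handle. Column names and values are made up.
X = pd.DataFrame(
    {
        "age": [34, 51, 29],                      # numerical
        "segment": ["retail", "sme", "retail"],   # categorical
        "notes": [                                # free text
            "long-standing customer, low risk",
            "new account, incomplete records",
            "frequent small transactions",
        ],
    }
)
y = pd.Series([0, 1, 0], name="default")
print(X.dtypes)
```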

When to use it

| Situation | Guidance |
| --- | --- |
| High-ROI use cases where small accuracy gains matter (finance, healthcare) | Use thinking mode: the one-time fit cost fuels recurring predictions |
| Pipelines with highly unstable ground-truth data | Leave thinking off; use the default TabPFN fit |
Prediction latency remains close to vanilla TabPFN — you pay the optimization cost once at fit time, and all subsequent predictions remain fast.

Quickstart

Thinking mode is available only in the API with TabPFN-3-Plus. Local execution is not supported.
```python
from tabpfn_client import TabPFNClassifier

clf = TabPFNClassifier(
    thinking_mode=True,
    thinking_effort="high",
    thinking_metric="accuracy",
)
clf.fit(X_train, y_train)        # X_train, y_train: your training data
preds = clf.predict(X_test)
probs = clf.predict_proba(X_test)
```
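`predict_proba` returns per-class probabilities. As a minimal sketch of how you might post-process them (with a dummy array standing in for real model output), you can take the argmax for hard labels or apply a custom decision threshold:

```python
import numpy as np

# Dummy probabilities standing in for clf.predict_proba(X_test) output;
# the real array has shape (n_samples, n_classes).
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.55, 0.45]])

# Hard labels via argmax over classes...
labels = probs.argmax(axis=1)

# ...or a custom decision threshold on the positive class.
flagged = probs[:, 1] >= 0.3
print(labels, flagged)
```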

Choosing effort and metric

`thinking_effort` controls how much compute is spent during fitting. `thinking_metric` sets the target. Effort levels:

| Use case | Recommended effort |
| --- | --- |
| You need maximum accuracy and can trade off fit time | Enable thinking mode and set `thinking_effort="high"` |
| You want a balance between quality and speed | Start with default thinking mode (`thinking_effort="medium"`) |
Supported metrics:

| Task | Metrics |
| --- | --- |
| Classification | `accuracy`, `log_loss`, `roc_auc` |
| Regression | `rmse`, `mae` |
You can also set `thinking_timeout_s` to cap the wall-clock time spent on optimization.
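Since unsupported effort/metric combinations only fail once the request reaches the API, a small client-side check can catch them earlier. This helper is hypothetical (not part of `tabpfn_client`); it simply encodes the documented values above:

```python
# Hypothetical client-side validation (not part of tabpfn_client):
# encodes the documented effort levels and per-task metrics.
SUPPORTED_METRICS = {
    "classification": {"accuracy", "log_loss", "roc_auc"},
    "regression": {"rmse", "mae"},
}
EFFORT_LEVELS = {"medium", "high"}

def check_thinking_params(task: str, effort: str, metric: str) -> None:
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"thinking_effort must be one of {sorted(EFFORT_LEVELS)}")
    if metric not in SUPPORTED_METRICS.get(task, set()):
        raise ValueError(f"{metric!r} is not a supported metric for {task!r}")

check_thinking_params("classification", "high", "roc_auc")  # passes silently
```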

REST API

Call `POST /tabpfn/fit` with the thinking parameters in the JSON body:

```json
{
  "train_set_upload_id": "<your-upload-id>",
  "task": "classification",
  "thinking_effort": "high",
  "thinking_timeout_s": 300,
  "thinking_effort_metric": "log_loss"
}
```
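If you assemble the request body programmatically, note that the REST parameter is `thinking_effort_metric`, not `thinking_metric` as in the Python client. A minimal sketch (the upload id is a placeholder from the example above):

```python
import json

# Build the fit request body shown above. The upload id is a placeholder;
# the REST API expects thinking_effort_metric, not thinking_metric.
body = {
    "train_set_upload_id": "<your-upload-id>",
    "task": "classification",
    "thinking_effort": "high",
    "thinking_timeout_s": 300,
    "thinking_effort_metric": "log_loss",
}
payload = json.dumps(body)
print(payload)
```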
See the API reference for the full endpoint documentation and upload flow.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `thinking_mode` | bool | `False` | Enable thinking mode. Setting `thinking_effort` also enables it. |
| `thinking_effort` | `"medium"` or `"high"` | `None` | Controls the effort and compute spent at fit time. Higher tends to give better results. |
| `thinking_timeout_s` | float | `None` | Wall-clock time budget in seconds. |
| `thinking_metric` | str | `None` | Target metric to optimize (see supported metrics above). |

On the REST API, the metric parameter is called `thinking_effort_metric`.

Limits

Thinking fits have a separate monthly quota from prediction tokens. The default is 20 thinking fits per month. When the quota is exhausted, POST /tabpfn/fit with thinking enabled returns HTTP 429. If you need higher limits, see API metering for details or contact Prior Labs.
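Because the quota error is a distinct status code, pipelines can degrade gracefully rather than fail outright. This decision function is a hypothetical sketch (not part of `tabpfn_client`): on HTTP 429 for a thinking-enabled fit, it falls back to a default fit, which consumes no thinking quota:

```python
# Hypothetical retry policy (not part of tabpfn_client): decide what to do
# after a fit request, given the HTTP status and whether thinking was on.
def next_action(status_code: int, thinking_enabled: bool) -> str:
    if status_code == 429 and thinking_enabled:
        return "retry_without_thinking"   # thinking quota exhausted
    if status_code == 429:
        return "backoff"                  # generic rate limit: wait and retry
    return "ok" if status_code < 400 else "raise"

print(next_action(429, True))
```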

TabPFN-3 changelog

Full release notes including thinking mode.

API metering

Token budgets, thinking fit limits, and usage tracking.

Classification

Binary and multi-class classification guide.

Regression

Point estimates, quantiles, and full distributions.