
Documentation Index

Fetch the complete documentation index at: https://docs.priorlabs.ai/llms.txt

Use this file to discover all available pages before exploring further.

Thinking mode is only available through the TabPFN API (via tabpfn-client or the REST endpoints). It is not part of the open-source tabpfn package.
Thinking mode introduces test-time compute scaling to TabPFN. Instead of a single forward pass, Thinking mode for TabPFN-3-Plus applies additional inference-time computation to push prediction quality further. TabPFN-3-Plus with thinking mode beats all non-TabPFN models by over 200 Elo on the standard TabArena benchmark, rising to 420 Elo on the largest data subset. It outperforms AutoGluon 1.5 Extreme in 82% of the cases, in less than a tenth of its runtime, without using LLMs, real data, internet search, or any other model besides TabPFN. Thinking mode builds on top of TabPFN-3-Plus with native text-feature support, so a single call can handle mixed numerical, categorical, and text columns.
Thinking fits consume a monthly quota separate from prediction tokens (default: 20 per month). Exceeding the limit returns HTTP 429. See API metering for current limits and how to request more.
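The mixed-column support mentioned above is easiest to see with a concrete input. This is an illustrative sketch only: the column names and values are made up, but they show the kind of table (numerical, categorical, and free-text columns side by side) that a single thinking-mode fit can consume.

```python
import pandas as pd

# Illustrative only: a small mixed-type table of the kind a single
# thinking-mode fit can handle. Column names and values are made up.
X = pd.DataFrame(
    {
        "age": [34, 51, 29],                      # numerical
        "segment": ["retail", "sme", "retail"],   # categorical
        "notes": [                                # free text
            "long-standing customer, low risk",
            "new account, incomplete records",
            "frequent small transactions",
        ],
    }
)
y = pd.Series([0, 1, 0], name="default")
print(X.dtypes)
```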

When to use it

| Situation | Guidance |
| --- | --- |
| High-ROI use cases where small accuracy gains matter (finance, healthcare) | Use thinking mode: the one-time fit cost fuels recurring predictions |
| Pipelines with highly unstable ground-truth data | Leave thinking off; use the default TabPFN fit |
Prediction latency remains close to vanilla TabPFN — you pay the optimization cost once at fit time, and all subsequent predictions remain fast.

Quickstart

Thinking mode is available only in the API with TabPFN-3-Plus. Local execution is not supported.
```python
from tabpfn_client import TabPFNClassifier

clf = TabPFNClassifier(
    thinking_mode=True,
    thinking_effort="high",
    thinking_metric="accuracy",
)
clf.fit(X_train, y_train)        # X_train, y_train: your training data
preds = clf.predict(X_test)
probs = clf.predict_proba(X_test)
```
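`predict_proba` returns per-class probabilities. As a minimal sketch of how you might post-process them (with a dummy array standing in for real model output), you can take the argmax for hard labels or apply a custom decision threshold:

```python
import numpy as np

# Dummy probabilities standing in for clf.predict_proba(X_test) output;
# the real array has shape (n_samples, n_classes).
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.55, 0.45]])

# Hard labels via argmax over classes...
labels = probs.argmax(axis=1)

# ...or a custom decision threshold on the positive class.
flagged = probs[:, 1] >= 0.3
print(labels, flagged)
```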

Choosing effort and metric

`thinking_effort` controls how much compute is spent during fitting. `thinking_metric` sets the target. Effort levels:

| Use case | Recommended effort |
| --- | --- |
| You need maximum accuracy and can trade off fit time | Enable thinking mode and set `thinking_effort="high"` |
| You want a balance between quality and speed | Start with default thinking mode (`thinking_effort="medium"`) |
Supported metrics:

| Task | Metrics |
| --- | --- |
| Classification | `accuracy`, `log_loss`, `roc_auc` |
| Regression | `rmse`, `mae` |
You can also set `thinking_timeout_s` to cap the wall-clock time spent on optimization.
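Since unsupported effort/metric combinations only fail once the request reaches the API, a small client-side check can catch them earlier. This helper is hypothetical (not part of `tabpfn_client`); it simply encodes the documented values above:

```python
# Hypothetical client-side validation (not part of tabpfn_client):
# encodes the documented effort levels and per-task metrics.
SUPPORTED_METRICS = {
    "classification": {"accuracy", "log_loss", "roc_auc"},
    "regression": {"rmse", "mae"},
}
EFFORT_LEVELS = {"medium", "high"}

def check_thinking_params(task: str, effort: str, metric: str) -> None:
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"thinking_effort must be one of {sorted(EFFORT_LEVELS)}")
    if metric not in SUPPORTED_METRICS.get(task, set()):
        raise ValueError(f"{metric!r} is not a supported metric for {task!r}")

check_thinking_params("classification", "high", "roc_auc")  # passes silently
```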

REST API

Call `POST /tabpfn/fit` with the thinking parameters in the JSON body:

```json
{
  "train_set_upload_id": "<your-upload-id>",
  "task": "classification",
  "thinking_effort": "high",
  "thinking_timeout_s": 300,
  "thinking_effort_metric": "log_loss"
}
```
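If you assemble the request body programmatically, note that the REST parameter is `thinking_effort_metric`, not `thinking_metric` as in the Python client. A minimal sketch (the upload id is a placeholder from the example above):

```python
import json

# Build the fit request body shown above. The upload id is a placeholder;
# the REST API expects thinking_effort_metric, not thinking_metric.
body = {
    "train_set_upload_id": "<your-upload-id>",
    "task": "classification",
    "thinking_effort": "high",
    "thinking_timeout_s": 300,
    "thinking_effort_metric": "log_loss",
}
payload = json.dumps(body)
print(payload)
```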
See the API reference for the full endpoint documentation and upload flow.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `thinking_mode` | bool | `False` | Enable thinking mode. Setting `thinking_effort` also enables it. |
| `thinking_effort` | `"medium"` or `"high"` | `None` | Controls the effort and compute spent at fit time. Higher tends to give better results. |
| `thinking_timeout_s` | float | `None` | Wall-clock time budget in seconds. |
| `thinking_metric` | str | `None` | Target metric to optimize (see supported metrics above). |

On the REST API, the metric parameter is called `thinking_effort_metric`.

Limits

Thinking fits have a separate monthly quota from prediction tokens. The default is 20 thinking fits per month. When the quota is exhausted, POST /tabpfn/fit with thinking enabled returns HTTP 429. If you need higher limits, see API metering for details or contact Prior Labs.
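Because the quota error is a distinct status code, pipelines can degrade gracefully rather than fail outright. This decision function is a hypothetical sketch (not part of `tabpfn_client`): on HTTP 429 for a thinking-enabled fit, it falls back to a default fit, which consumes no thinking quota:

```python
# Hypothetical retry policy (not part of tabpfn_client): decide what to do
# after a fit request, given the HTTP status and whether thinking was on.
def next_action(status_code: int, thinking_enabled: bool) -> str:
    if status_code == 429 and thinking_enabled:
        return "retry_without_thinking"   # thinking quota exhausted
    if status_code == 429:
        return "backoff"                  # generic rate limit: wait and retry
    return "ok" if status_code < 400 else "raise"

print(next_action(429, True))
```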

TabPFN-3 changelog

Full release notes including thinking mode.

API metering

Token budgets, thinking fit limits, and usage tracking.

Classification

Binary and multi-class classification guide.

Regression

Point estimates, quantiles, and full distributions.