TabPFN is available on the AWS SageMaker Marketplace. By subscribing to a listing, you provision and run TabPFN inside your own AWS account — data never leaves your private AWS network. Two listings are available:
  • TabPFN-3-Plus (Public, free for non-commercial use) — recommended.
  • TabPFN-2.5 (Public, free for non-commercial use) — still available.

TabPFN-3-Plus

Using TabPFN-3-Plus on the AWS SageMaker Marketplace is free of charge; you only pay for the underlying AWS compute. Model weights are released under the TABPFN-3.0 License v1.0, which is permissive for research and internal evaluation. For production or commercial use, we offer a Commercial Enterprise License that includes dedicated support, integration tooling, and other internal models. Contact sales@priorlabs.ai for commercial licensing.

What’s bundled in the listing

TabPFN-3-Plus on SageMaker bundles every base TabPFN-3 capability:
  • State-of-the-art classification and regression in a single forward pass — tops the public TabArena benchmark for classification and regression.
  • Scales to 1M training rows at 200 features, 100k at 2000 features, or 1k at 20000 features.
  • Native handling of mixed feature types — numerical, categorical, and free-text (string-valued) columns — plus missing values, outliers, and uninformative columns. No bespoke preprocessing required.
  • Up to 20× faster than TabPFN-2.5 at scale; 1M rows fits on a single H100.
  • Native many-class classification (up to 160 classes).
  • Thinking mode — additional inference-time compute for higher prediction quality — is also available via a separate enterprise listing (see below).
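The size tiers listed above can be turned into a quick pre-flight check before you send data to an endpoint. The following is a minimal sketch that reads the three documented tiers as a step function, which is an assumption (the listing may accept intermediate shapes). `max_rows_for` and `within_documented_limits` are hypothetical helper names and are not part of tabpfn-client.

```python
# Documented tiers from the capability list: (max_features, max_training_rows).
DOCUMENTED_TIERS = [
    (200, 1_000_000),
    (2_000, 100_000),
    (20_000, 1_000),
]


def max_rows_for(n_features: int) -> int:
    """Return the row cap of the first tier that covers n_features, or 0 if none does."""
    for max_features, max_rows in DOCUMENTED_TIERS:
        if n_features <= max_features:
            return max_rows
    return 0  # wider than the widest documented tier


def within_documented_limits(n_rows: int, n_features: int) -> bool:
    """Conservative check: True only if the shape fits one of the documented tiers."""
    return n_rows <= max_rows_for(n_features)
```

For example, `within_documented_limits(500_000, 150)` is true under this reading, while `within_documented_limits(500_000, 500)` is not, because 500 features falls into the 100k-row tier.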

Subscribe and deploy

Deploying on SageMaker consists of subscribing to the listing, creating a deployable model, and creating an inference endpoint:
  1. Open the TabPFN-3-Plus Marketplace listing and click View Purchase Options → Subscribe.
  2. Open the Amazon SageMaker AI console, switch to the AWS region where you want to deploy, and open Deployments & inference → Deployable models → Create model. Under Container definition, choose Use a model package subscription, and select TabPFN-3-Plus. Create the model.
  3. Select the model you just created and choose Create endpoint. Pick an endpoint name, then create an endpoint configuration or pick an existing one. When creating the endpoint configuration, we recommend also configuring Async invocation config in the same step: it enables use_async=True in the Python SDK, which is required to use the full TabPFN-3-Plus limits (see below).
  4. Under Variants, you can set the instance type (Actions → Edit). You must use a GPU instance such as ml.g5.xlarge. CPU instances such as the console default ml.m4.xlarge will not work.
  5. After clicking Submit, SageMaker provisions the endpoint. The endpoint is ready when its status transitions to InService (typically 6–10 minutes).
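If you prefer scripting to the console, steps 2 to 5 map onto three boto3 SageMaker calls plus a waiter. This is a sketch under stated assumptions, not an official deployment script: the function names, the variant name, and the choice to reuse one name for the model, endpoint configuration, and endpoint are illustrative. The model package ARN and the execution role ARN are placeholders you must take from your own subscription and account.

```python
from typing import Optional


def build_deploy_requests(
    model_package_arn: str,
    role_arn: str,
    name: str,
    instance_type: str = "ml.g5.xlarge",  # GPU instance, as required by the listing
    async_output_s3_uri: Optional[str] = None,
):
    """Build the keyword arguments for create_model, create_endpoint_config, create_endpoint."""
    model = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {"ModelPackageName": model_package_arn},
        # Assumption: enabled because Marketplace model packages generally
        # expect to run with network isolation.
        "EnableNetworkIsolation": True,
    }
    config = {
        "EndpointConfigName": name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # illustrative variant name
                "ModelName": name,
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
            }
        ],
    }
    if async_output_s3_uri is not None:
        # Equivalent of "Async invocation config" in the console.
        config["AsyncInferenceConfig"] = {
            "OutputConfig": {"S3OutputPath": async_output_s3_uri}
        }
    endpoint = {"EndpointName": name, "EndpointConfigName": name}
    return model, config, endpoint


def deploy(sagemaker_client, model: dict, config: dict, endpoint: dict) -> None:
    """Create the resources and block until the endpoint reports InService."""
    sagemaker_client.create_model(**model)
    sagemaker_client.create_endpoint_config(**config)
    sagemaker_client.create_endpoint(**endpoint)
    waiter = sagemaker_client.get_waiter("endpoint_in_service")
    waiter.wait(EndpointName=endpoint["EndpointName"])
```

A typical call would be `deploy(boto3.client("sagemaker", region_name="us-east-1"), *build_deploy_requests(...))`. Keeping the request construction separate from the API calls makes it easy to inspect or log exactly what will be created before anything is billed.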

Using your endpoint

You can use tabpfn-client (recommended) or invoke the endpoint directly, for example with boto3.
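As a rough illustration of direct invocation, the sketch below calls the endpoint through the boto3 `sagemaker-runtime` client's `invoke_endpoint`. The request body schema (the `X_train`, `y_train`, and `X_test` keys) and the use of `application/json` for TabPFN-3-Plus are assumptions made for illustration only; check the Marketplace listing or the sample notebook for the actual request format. `build_invoke_request` and `invoke` are hypothetical helper names.

```python
import json


def build_invoke_request(endpoint_name: str, X_train, y_train, X_test) -> dict:
    """Build keyword arguments for sagemaker-runtime invoke_endpoint.

    HYPOTHETICAL body schema: the key names below are placeholders, not the
    documented TabPFN request format.
    """
    body = json.dumps({"X_train": X_train, "y_train": y_train, "X_test": X_test})
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",  # assumption for TabPFN-3-Plus
        "Accept": "application/json",
        "Body": body.encode("utf-8"),
    }


def invoke(runtime_client, request: dict):
    """Send the request and decode the JSON response body."""
    response = runtime_client.invoke_endpoint(**request)
    # The response body is a streaming object; read it fully before decoding.
    return json.loads(response["Body"].read())
```

With a real client this would be used as `invoke(boto3.client("sagemaker-runtime", region_name="us-east-1"), build_invoke_request(...))`.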

Async inference for larger datasets

SageMaker real-time invocations cap a single request at 6 MB payload and 60 s processing time. To use the full TabPFN-3-Plus limits (up to 1M rows × 200 features, 100K × 2,000, or 1K × 20,000), we recommend deploying the endpoint with AsyncInferenceConfig and using the SDK’s use_async=True switch. Async invocations support up to 1 GB payload and 60 min processing.

from tabpfn_client.sagemaker import TabPFNClassifier

clf = TabPFNClassifier(
    endpoint_name="your-tabpfn-3-plus-async-endpoint",
    region_name="us-east-1",
    use_async=True,
    s3_bucket="your-async-io-bucket",        # bucket the endpoint role can read+write
)
# X_train, y_train, and X_test are your own training and test data.
clf.fit(X_train, y_train)
proba = clf.predict_proba(X_test)

The SDK stages the input in S3 and polls for the output transparently, so the calling code stays the same as for a real-time endpoint. See the sample notebook in the tabpfn-client repo for the end-to-end deploy → invoke → teardown walkthrough.

Thinking mode

Thinking mode on SageMaker is available through a separate enterprise listing. Thinking mode applies additional inference-time computation on top of TabPFN-3-Plus to push prediction quality further — on the public TabArena benchmark it beats every non-TabPFN model by over 200 Elo overall and by 420 Elo on the largest data subset. To get access for your AWS account, contact sales@priorlabs.ai.
Once you’ve been granted access, you use the Thinking listing through the same tabpfn-client SDK; the only change is an optional thinking_effort constructor keyword argument on TabPFNClassifier / TabPFNRegressor that engages Thinking mode per request. Omitting it falls back to the standard TabPFN-3-Plus single-forward-pass behavior, so a single endpoint serves both modes. See the Thinking mode page for capability details, benchmark numbers, and parameter reference.

Limitations

  • Payload caps. Real-time invocations are capped by AWS at 6 MB payload / 60 s processing. Use async inference (see above) for anything bigger or slower.
  • GPU-only. The model package only declares GPU instance types as supported. The SageMaker console default (ml.m4.xlarge, CPU) is rejected with a ValidationException; see the Marketplace listing’s Recommended instance types section for the GPU-only allowlist.
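The payload caps above suggest a simple routing rule based on the serialized request size. The helper below is a minimal sketch of that rule. Reading "6 MB" and "1 GB" as binary units (MiB and GiB) is an assumption, and `choose_invocation_mode` is a hypothetical name. Note that size alone cannot predict the 60 s real-time processing limit, so a small but slow request may still need async inference.

```python
# Assumption: the documented caps are interpreted as binary units.
REALTIME_MAX_BYTES = 6 * 1024 * 1024   # 6 MB real-time payload cap
ASYNC_MAX_BYTES = 1024 ** 3            # 1 GB async payload cap


def choose_invocation_mode(payload_bytes: int) -> str:
    """Return 'realtime' or 'async' for a serialized payload of the given size."""
    if payload_bytes < 0:
        raise ValueError("payload_bytes must be non-negative")
    if payload_bytes <= REALTIME_MAX_BYTES:
        return "realtime"
    if payload_bytes <= ASYNC_MAX_BYTES:
        return "async"
    raise ValueError("payload exceeds the 1 GB async limit; reduce or split the data")
```

In practice you would pass `len(body)` for the encoded request body and use the result to decide whether to set use_async=True.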

Sample notebook

End-to-end deploy → invoke → teardown walkthrough using tabpfn-client.sagemaker.

tabpfn-client repo

Source for the Python SDK that wraps SageMaker endpoints in a scikit-learn-style interface.

TabPFN-2.5

The earlier TabPFN-2.5 SageMaker listing is still available. For new deployments, we recommend TabPFN-3-Plus (see above).

Subscribe and deploy

  1. Open the TabPFN-2.5 Marketplace listing and click View Purchase Options → Subscribe.
  2. In the SageMaker AI console, navigate to AWS Marketplace resources → AWS Marketplace subscriptions → TabPFN-2.5, then Actions → Create endpoint. TabPFN-2.5 requires at least one NVIDIA T4 or P4 GPU instance; see the listing’s Recommended instance types for the full list.

Example code

Step-by-step instructions for running inference with TabPFN-2.5 on SageMaker.

Getting Started Notebook

A guided notebook demonstrating how to use TabPFN-2.5 for inference on SageMaker.

Limitations

TabPFN-2.5 supports two input formats for inference:
  • application/json — a JSON-encoded request body.
  • multipart/form-data — containing the dataset as Parquet files.
Both formats must remain within SageMaker’s 6 MB real-time payload limit. Because Parquet is compressed, the multipart/form-data option generally allows you to send more rows or features within the same size constraint.