- TabPFN-3-Plus (Public, free for non-commercial use) — recommended.
- TabPFN-2.5 (Public, free for non-commercial use) — still available.
TabPFN-3-Plus
Using TabPFN-3-Plus on the AWS SageMaker Marketplace is free of charge; you only pay for the underlying AWS compute. Model weights are released under the TABPFN-3.0 License v1.0, which is permissive for research and internal evaluation.
For production or commercial use, we offer a Commercial Enterprise License that includes dedicated support, integration tooling, and other internal models. Contact sales@priorlabs.ai for commercial licensing.
What’s bundled in the listing
TabPFN-3-Plus on SageMaker bundles every base TabPFN-3 capability:
- State-of-the-art classification and regression in a single forward pass; tops the public TabArena benchmark for classification and regression.
- Scales to 1M training rows at 200 features, 100k at 2000 features, or 1k at 20000 features.
- Native handling of mixed feature types — numerical, categorical, and free-text (string-valued) columns — plus missing values, outliers, and uninformative columns. No bespoke preprocessing required.
- Up to 20× faster than TabPFN-2.5 at scale; 1M rows fit on a single H100.
- Native many-class classification (up to 160 classes).
- Thinking mode — additional inference-time compute for higher prediction quality — is also available via a separate enterprise listing (see below).
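As a rough pre-flight check, the size limits listed above can be encoded as tiers. This is a minimal sketch and not part of tabpfn-client: `LIMIT_TIERS` and `fits_listed_limits` are hypothetical names, and the rule used here (a dataset fits when at least one listed tier covers both its row count and its feature count) is a conservative assumption, since the listing does not say how the limits behave between tiers.

```python
# Tiers copied from the capability list: (max training rows, max features).
LIMIT_TIERS = [
    (1_000_000, 200),
    (100_000, 2_000),
    (1_000, 20_000),
]


def fits_listed_limits(n_rows: int, n_features: int) -> bool:
    """Return True if some listed tier covers both dimensions.

    Assumption: shapes between tiers are treated as out of range unless a
    single tier dominates them. The real service may be more permissive.
    """
    return any(
        n_rows <= max_rows and n_features <= max_features
        for max_rows, max_features in LIMIT_TIERS
    )


if __name__ == "__main__":
    for shape in [(500_000, 150), (500_000, 500), (1_000, 20_000)]:
        print(shape, fits_listed_limits(*shape))
```

A shape such as 500,000 rows × 500 features is rejected by this check because no single listed tier covers it, even though it may be accepted in practice.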
Subscribe and deploy
Deploying on SageMaker consists of subscribing to the listing, creating a deployable model, and creating an inference endpoint:
- Open the TabPFN-3-Plus Marketplace listing and click View Purchase Options → Subscribe.
- Open the Amazon SageMaker AI console, switch to the AWS region where you want to deploy, and open Deployments & inference → Deployable models → Create model. Under Container definition, choose Use a model package subscription, and select TabPFN-3-Plus. Create the model.
- Select the model just created, and choose Create endpoint. Pick an endpoint name, and create or pick an existing endpoint configuration. When creating the endpoint configuration, we recommend configuring Async invocation config in the same step. This enables use_async=True in the Python SDK, which is required to use the full TabPFN-3-Plus limits (see below).
- Under Variants, you can set the instance type (Actions → Edit). You must use a GPU instance such as ml.g5.xlarge. CPU instances like the default ml.m4.xlarge will not work.
- After clicking Submit, SageMaker provisions the endpoint. The endpoint is ready when its status transitions to InService (typically 6–10 minutes).
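The console steps above (after subscribing in the Marketplace) map roughly onto three boto3 SageMaker calls. The following is a sketch under stated assumptions, not the documented deployment path: the model package ARN, IAM role, resource names, and S3 output path are placeholders you must supply, the `AllTraffic` variant name is arbitrary, and `EnableNetworkIsolation=True` is assumed because Marketplace model packages typically require it.

```python
def build_deploy_requests(model_package_arn, role_arn, name,
                          instance_type="ml.g5.xlarge",
                          async_output_s3_uri=None):
    """Build keyword arguments for create_model, create_endpoint_config
    and create_endpoint. All names and ARNs are caller-supplied placeholders."""
    model = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        # Use the subscribed Marketplace model package as the container.
        "PrimaryContainer": {"ModelPackageName": model_package_arn},
        # Assumption: required for Marketplace model packages.
        "EnableNetworkIsolation": True,
    }
    config = {
        "EndpointConfigName": f"{name}-config",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InstanceType": instance_type,  # must be a GPU instance type
            "InitialInstanceCount": 1,
        }],
    }
    if async_output_s3_uri:
        # Enables async invocations; results are written to this S3 prefix.
        config["AsyncInferenceConfig"] = {
            "OutputConfig": {"S3OutputPath": async_output_s3_uri}
        }
    endpoint = {
        "EndpointName": name,
        "EndpointConfigName": config["EndpointConfigName"],
    }
    return model, config, endpoint


def deploy(sm, model, config, endpoint):
    """Create the resources and block until the endpoint is InService.

    `sm` is a SageMaker client, e.g. boto3.client("sagemaker", region_name=...).
    """
    sm.create_model(**model)
    sm.create_endpoint_config(**config)
    sm.create_endpoint(**endpoint)
    sm.get_waiter("endpoint_in_service").wait(
        EndpointName=endpoint["EndpointName"]
    )
```

Keeping the request construction separate from the API calls makes it easy to inspect the configuration (for example, to confirm the instance type is a GPU type) before anything is provisioned.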
Using your endpoint
You can use tabpfn-client (recommended), or invoke the endpoint directly, for example using boto3:
Install the SageMaker extra of tabpfn-client. The SDK mirrors the scikit-learn surface. AWS credentials are resolved through the standard boto3 credential chain (env vars, ~/.aws/credentials, instance profile, SSO).
Async inference for larger datasets
SageMaker real-time invocations cap a single request at 6 MB payload and 60 s processing time. To use the full TabPFN-3-Plus limits (up to 1M rows × 200 features, 100K × 2,000, or 1K × 20,000), we recommend deploying the endpoint with AsyncInferenceConfig and using the SDK’s use_async=True switch. Async invocations support up to 1 GB payload and 60 min processing.
See the tabpfn-client repo for the end-to-end deploy → invoke → teardown walkthrough.
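If you invoke the endpoint directly with boto3 instead of the SDK, the payload caps described above suggest a simple routing rule. The sketch below is illustrative only: `choose_invocation_mode` and `invoke` are hypothetical helpers, the caps are interpreted as binary megabytes and gigabytes (an assumption), and the request body and content type expected by the endpoint are not specified here, so both are left to the caller. It also covers only the payload cap, not the 60 s processing cap.

```python
REALTIME_MAX_BYTES = 6 * 1024 * 1024  # 6 MB real-time cap (assumed binary MB)
ASYNC_MAX_BYTES = 1024 ** 3           # 1 GB async cap (assumed binary GB)


def choose_invocation_mode(payload: bytes) -> str:
    """Pick "realtime" or "async" from the serialized payload size."""
    size = len(payload)
    if size <= REALTIME_MAX_BYTES:
        return "realtime"
    if size <= ASYNC_MAX_BYTES:
        return "async"
    raise ValueError(f"payload of {size} bytes exceeds the async cap")


def invoke(runtime, endpoint_name, payload, content_type, input_s3_uri=None):
    """Route a request through the real-time or async API.

    `runtime` is a SageMaker runtime client, e.g.
    boto3.client("sagemaker-runtime"). For async invocations the payload
    must already be uploaded to `input_s3_uri`; the endpoint reads it from
    there and writes the result to its configured S3 output path.
    """
    if choose_invocation_mode(payload) == "realtime":
        return runtime.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType=content_type,
            Body=payload,
        )
    if input_s3_uri is None:
        raise ValueError("async invocation needs the payload uploaded to S3")
    return runtime.invoke_endpoint_async(
        EndpointName=endpoint_name,
        ContentType=content_type,
        InputLocation=input_s3_uri,
    )
```

The async path only works if the endpoint configuration was created with an async invocation config; a small request that is expected to run longer than 60 s should also be sent asynchronously, which this size-based rule does not detect.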
Thinking mode
Thinking mode on SageMaker is available through a separate enterprise listing. Thinking mode applies additional inference-time computation on top of TabPFN-3-Plus to push prediction quality further — on the public TabArena benchmark it beats every non-TabPFN model by over 200 Elo overall and by 420 Elo on the largest data subset. To get access for your AWS account, contact sales@priorlabs.ai.
The enterprise listing is used through the same tabpfn-client SDK; the only change is an optional thinking_effort constructor kwarg on TabPFNClassifier / TabPFNRegressor that engages Thinking mode per request. Omitting it falls back to the standard TabPFN-3-Plus single-forward-pass behavior, so a single endpoint serves both modes. See the Thinking mode page for capability details, benchmark numbers, and parameter reference.
Limitations
- Payload caps. Real-time invocations are capped by AWS at 6 MB payload / 60 s processing. Use async inference (see above) for anything bigger or slower.
- GPU-only. The model package only declares GPU instance types as supported. The SageMaker console default (ml.m4.xlarge, CPU) is rejected with a ValidationException; see the Marketplace listing’s Recommended instance types section for the GPU-only allowlist.
Sample notebook
End-to-end deploy → invoke → teardown walkthrough using tabpfn-client.sagemaker.
tabpfn-client repo
Source for the Python SDK that wraps SageMaker endpoints with the scikit-learn surface.
TabPFN-2.5
The earlier TabPFN-2.5 SageMaker listing is still available. New deployments are recommended on TabPFN-3-Plus above.
Subscribe and deploy
- Open the TabPFN-2.5 Marketplace listing and click View Purchase Options → Subscribe.
- In the SageMaker AI console, navigate to AWS Marketplace resources → AWS Marketplace subscriptions → TabPFN-2.5, then Actions → Create endpoint. TabPFN-2.5 requires at least one NVIDIA T4 or P4 GPU instance; see the listing’s Recommended instance types for the full list.
Example code
Step-by-step instructions for running inference with TabPFN-2.5 on SageMaker.
Getting Started Notebook
A guided notebook demonstrating how to use TabPFN-2.5 for inference on SageMaker.
Limitations
TabPFN-2.5 supports two input formats for inference:
- application/json: a JSON-encoded request body.
- multipart/form-data: containing the dataset as Parquet files.
The multipart/form-data option generally allows you to send more rows or features within the same size constraint.