# Amazon SageMaker

> Deploy TabPFN-3-Plus on Amazon SageMaker. Data stays inside your AWS account; the tabpfn-client Python SDK wraps the endpoint with a familiar scikit-learn surface.

TabPFN is available on the AWS SageMaker Marketplace. By subscribing to a listing, you provision and run TabPFN inside your own AWS account — data never leaves your private AWS network.

Two listings are available:

* **TabPFN-3-Plus** (Public, free for non-commercial use) — recommended.
* **TabPFN-2.5** (Public, free for non-commercial use) — still available.

## TabPFN-3-Plus

<Note>
  Using TabPFN-3-Plus on the AWS SageMaker Marketplace is free of charge; you only pay for the underlying AWS compute. Model weights are released under the [TABPFN-3.0 License v1.0](https://huggingface.co/Prior-Labs/tabpfn_3/blob/main/LICENSE), which is permissive for research and internal evaluation.

  For production / commercial use, we offer a *Commercial Enterprise License* that includes dedicated support, integration tooling, and other internal models. Contact [sales@priorlabs.ai](mailto:sales@priorlabs.ai) for commercial licensing.
</Note>

### What's bundled in the listing

TabPFN-3-Plus on SageMaker bundles every base TabPFN-3 capability:

* State-of-the-art classification and regression in a single forward pass; ranks first on the public TabArena benchmark for both tasks.
* Scales to **1M training rows at 200 features**, 100K rows at 2,000 features, or 1K rows at 20,000 features.
* Native handling of mixed feature types — numerical, categorical, **and free-text (string-valued) columns** — plus missing values, outliers, and uninformative columns. No bespoke preprocessing required.
* Up to 20× faster than TabPFN-2.5 at scale; 1M rows fits on a single H100.
* Native many-class classification (up to 160 classes).
* **[Thinking mode](#thinking-mode)** — additional inference-time compute for higher prediction quality — is also available via a separate enterprise listing (see below).

### Subscribe and deploy

Deploying on SageMaker consists of subscribing to the listing, creating a deployable model, and creating an inference endpoint:

1. Open the [TabPFN-3-Plus Marketplace listing](https://aws.amazon.com/marketplace/pp/prodview-b6rjndzxhbqww) and click **View Purchase Options** → **Subscribe**.
2. Open the Amazon SageMaker AI console, switch to the AWS region where you want to deploy, and open **Deployments & inference** → **Deployable models** → **Create model**. Under **Container definition**, choose *Use a model package subscription*, and select **TabPFN-3-Plus**. Create the model.
3. Select the model you just created and choose **Create endpoint**. Pick an endpoint name, then create a new endpoint configuration or pick an existing one. When creating the endpoint configuration, we recommend also configuring **Async invocation config** in the same step. This enables `use_async=True` in the Python SDK, which is required to use the full [TabPFN-3-Plus limits](/models) (see below).
4. Under **Variants**, you can set the instance type (**Actions** → **Edit**). You must use a GPU instance such as `ml.g5.xlarge`. CPU instances, including the console default `ml.m4.xlarge`, will not work.
5. After clicking **Submit**, SageMaker provisions the endpoint. The endpoint is ready when its status transitions to **InService** (typically 6–10 minutes).
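
The wait in the last step can also be scripted. The following is a minimal sketch, not part of the official walkthrough: it polls the SageMaker `DescribeEndpoint` API through a `boto3`-style client until the endpoint reports `InService`. The helper name `wait_until_in_service`, its default timings, and the endpoint name are illustrative assumptions.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30, timeout_seconds=1800):
    """Poll DescribeEndpoint until the endpoint reports InService.

    `sm_client` is expected to behave like boto3.client("sagemaker").
    The helper name and default timings are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name!r} failed to deploy")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Endpoint {endpoint_name!r} is still {status} after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)


# Usage (needs boto3 and AWS credentials; the endpoint name is a placeholder):
#   import boto3
#   sm = boto3.client("sagemaker", region_name="us-east-1")
#   wait_until_in_service(sm, "your-tabpfn-3-plus-endpoint")
```

Taking the client as an argument keeps the helper independent of how credentials and regions are configured. `boto3` also ships a built-in `endpoint_in_service` waiter on the SageMaker client, which may be preferable if you do not need custom timeout handling.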

### Using your endpoint

You can use [tabpfn-client](https://github.com/PriorLabs/tabpfn-client) (recommended), or invoke the endpoint directly, for example using `boto3`:

<Tabs>
  <Tab title="tabpfn-client (recommended)">
    Install the SageMaker extra of `tabpfn-client`:

    ```bash theme={null}
    pip install --upgrade 'tabpfn-client[sagemaker]'
    ```

    The SDK mirrors the scikit-learn surface. AWS credentials are resolved through the standard `boto3` credential chain (env vars, `~/.aws/credentials`, instance profile, SSO).

    ```python theme={null}
    from tabpfn_client.sagemaker import TabPFNClassifier
    # For regression, use:
    # from tabpfn_client.sagemaker import TabPFNRegressor
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score, log_loss

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    clf = TabPFNClassifier(
        endpoint_name="your-tabpfn-3-plus-endpoint",
        region_name="us-east-1",
    )
    clf.fit(X_train, y_train)

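    y_pred = clf.predict(X_test)
    proba = clf.predict_proba(X_test)

    # Evaluate on the held-out split
    print("accuracy:", accuracy_score(y_test, y_pred))
    print("log loss:", log_loss(y_test, proba))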
    ```
  </Tab>

  <Tab title="boto3">
    If you'd rather call the endpoint without the SDK, the request body is JSON. Required top-level fields: `task_config` (with `task`, `tabpfn_config`, `predict_params`), `X_train`, `y_train`, `X_test`. The [Marketplace listing](https://aws.amazon.com/marketplace/pp/prodview-b6rjndzxhbqww)'s *Usage information* section is the authoritative reference for the full request parameter list and response schema.

    ```python theme={null}
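    import json

    import boto3

    # X_train, y_train, X_test are NumPy arrays, e.g. from the
    # train_test_split in the tabpfn-client example.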

    rt = boto3.client("sagemaker-runtime", region_name="us-east-1")
    resp = rt.invoke_endpoint(
        EndpointName="your-tabpfn-3-plus-endpoint",
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps({
            "task_config": {
                "task": "classification",
                "tabpfn_config": {"n_estimators": 4},
                "predict_params": {"output_type": "probas"},
            },
            "X_train": X_train.tolist(),
            "y_train": y_train.tolist(),
            "X_test":  X_test.tolist(),
        }),
    )
    out = json.loads(resp["Body"].read())
    proba = out["prediction"]              # 2D list, shape (n_test, n_classes)
    metadata = out["metadata"]             # echoed task / package_version / dataset shape
    ```
  </Tab>
</Tabs>
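
When calling the endpoint through `boto3`, the probabilities come back as a plain nested list rather than a NumPy array. As a minimal sketch, the helper below (a hypothetical name, not part of any SDK) picks the highest-probability column for each test row. How column indices map back to your original labels is not specified on this page, so treat that mapping as something to confirm against the listing's *Usage information*.

```python
def probas_to_class_indices(proba):
    """Return the index of the highest-probability class for each row.

    `proba` is a 2D list shaped (n_test, n_classes), as in the
    `prediction` field of the response. Ties resolve to the lowest index.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in proba]


# Hypothetical probabilities in the documented shape (n_test, n_classes).
example_proba = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
print(probas_to_class_indices(example_proba))  # [0, 1, 0]
```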

### Async inference for larger datasets

SageMaker real-time invocations cap a single request at **6 MB payload** and **60 s processing time**. To use the full [TabPFN-3-Plus limits](/models) — up to 1M rows × 200 features, 100K × 2,000, or 1K × 20,000 — we recommend deploying the endpoint with `AsyncInferenceConfig` and using the SDK's `use_async=True` switch. Async invocations support up to 1 GB payload and 60 min processing.

```python theme={null}
from tabpfn_client.sagemaker import TabPFNClassifier

clf = TabPFNClassifier(
    endpoint_name="your-tabpfn-3-plus-async-endpoint",
    region_name="us-east-1",
    use_async=True,
    s3_bucket="your-async-io-bucket",        # bucket the endpoint role can read+write
)
# X_train, y_train, X_test as in the earlier tabpfn-client example
clf.fit(X_train, y_train)
proba = clf.predict_proba(X_test)
```

The SDK handles the S3-staging-input / poll-output round-trip transparently; the call shape stays the same. See the [sample notebook](https://github.com/PriorLabs/tabpfn-client/blob/main/examples/sagemaker/tabpfn-3-sample.ipynb) in the `tabpfn-client` repo for the end-to-end deploy → invoke → teardown walkthrough.
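
To judge whether a dataset needs the async path, you can measure the serialized request before sending it. The sketch below is only an estimate under stated assumptions: it builds a JSON body with the field names from the `boto3` example on this page, and it reads "6 MB" as 6 × 1024 × 1024 bytes. The helper names and the exact byte limit are assumptions, and the check says nothing about the 60 s processing cap.

```python
import json

# Assumption: the documented "6 MB" real-time cap is read as 6 MiB.
REALTIME_PAYLOAD_LIMIT_BYTES = 6 * 1024 * 1024


def build_request_body(X_train, y_train, X_test, task="classification"):
    """Serialize a request using the field names from the boto3 example."""
    return json.dumps({
        "task_config": {
            "task": task,
            "tabpfn_config": {"n_estimators": 4},
            "predict_params": {"output_type": "probas"},
        },
        "X_train": X_train,
        "y_train": y_train,
        "X_test": X_test,
    })


def fits_realtime(X_train, y_train, X_test, limit=REALTIME_PAYLOAD_LIMIT_BYTES):
    """Return (fits, size_in_bytes) for the JSON-encoded request."""
    size = len(build_request_body(X_train, y_train, X_test).encode("utf-8"))
    return size <= limit, size


# Inputs are plain Python lists; call .tolist() on NumPy arrays first.
fits, size = fits_realtime([[1.0, 2.0], [3.0, 4.0]], [0, 1], [[5.0, 6.0]])
print(fits, size)
```

If `fits` is false, or the request is likely to run longer than 60 s, the async configuration described above is the safer choice.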

### Thinking mode

<Note>
  **Thinking mode on SageMaker is available through a separate enterprise listing.** [Thinking mode](/capabilities/thinking-mode) applies additional inference-time computation on top of TabPFN-3-Plus to push prediction quality further — on the public TabArena benchmark it beats every non-TabPFN model by over 200 Elo overall and by 420 Elo on the largest data subset. To get access for your AWS account, contact [sales@priorlabs.ai](mailto:sales@priorlabs.ai).
</Note>

Once you've been granted access, the Thinking listing surfaces the same `tabpfn-client` SDK; the only change is an optional `thinking_effort` constructor kwarg on `TabPFNClassifier` / `TabPFNRegressor` that engages Thinking mode per request. Omitting it falls back to the standard TabPFN-3-Plus single-forward-pass behavior, so a single endpoint serves both modes. See the [Thinking mode](/capabilities/thinking-mode) page for capability details, benchmark numbers, and parameter reference.

### Limitations

* **Payload caps.** Real-time invocations are capped by AWS at 6 MB payload / 60 s processing. Use async inference (see above) for larger payloads or longer-running requests.
* **GPU-only.** The model package only declares GPU instance types as supported. The SageMaker console default (`ml.m4.xlarge`, CPU) is rejected with a `ValidationException`; see the [Marketplace listing](https://aws.amazon.com/marketplace/pp/prodview-b6rjndzxhbqww)'s *Recommended instance types* section for the GPU-only allowlist.
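
The GPU-only allowlist can also be read programmatically before you create an endpoint configuration. This is a rough sketch, assuming the subscribed model package exposes an `InferenceSpecification` through the SageMaker `DescribeModelPackage` API; the helper names are hypothetical, and the model package ARN is a placeholder you would copy from your own subscription.

```python
def supported_realtime_instance_types(sm_client, model_package_arn):
    """List the real-time instance types declared by a model package.

    `sm_client` is expected to behave like boto3.client("sagemaker").
    """
    resp = sm_client.describe_model_package(ModelPackageName=model_package_arn)
    return resp["InferenceSpecification"]["SupportedRealtimeInferenceInstanceTypes"]


def check_instance_type(sm_client, model_package_arn, instance_type):
    """Raise ValueError early if the instance type is not on the allowlist."""
    supported = supported_realtime_instance_types(sm_client, model_package_arn)
    if instance_type not in supported:
        raise ValueError(
            f"{instance_type} is not supported; choose one of {sorted(supported)}"
        )
    return instance_type


# Usage (needs boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   sm = boto3.client("sagemaker", region_name="us-east-1")
#   check_instance_type(sm, "<your-model-package-arn>", "ml.g5.xlarge")
```

Checking up front surfaces an unsupported instance type as a local error instead of a `ValidationException` during endpoint creation.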

<CardGroup cols={2}>
  <Card title="Sample notebook" icon="book" href="https://github.com/PriorLabs/tabpfn-client/blob/main/examples/sagemaker/tabpfn-3-sample.ipynb">
    End-to-end deploy → invoke → teardown walkthrough using `tabpfn-client.sagemaker`.
  </Card>

  <Card title="tabpfn-client repo" icon="github" href="https://github.com/PriorLabs/tabpfn-client">
    Source for the Python SDK that wraps SageMaker endpoints with the scikit-learn surface.
  </Card>
</CardGroup>

## TabPFN-2.5

The earlier **TabPFN-2.5** SageMaker listing is still available. For new deployments, we recommend TabPFN-3-Plus (above).

<iframe className="w-full aspect-video rounded-xl" src="https://www.loom.com/embed/7ae0d372899f4088a603786472a17c88" title="TabPFN-2.5 on SageMaker walkthrough" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowFullScreen />

### Subscribe and deploy

1. Open the [TabPFN-2.5 Marketplace listing](https://aws.amazon.com/marketplace/pp/prodview-chfhncrdzlb3s) and click **View Purchase Options** → **Subscribe**.
2. In the SageMaker AI console, navigate to **AWS Marketplace resources** → **AWS Marketplace subscriptions** → **TabPFN-2.5**, then **Actions → Create endpoint**. TabPFN-2.5 requires at least one NVIDIA T4 or P4 GPU instance; see the listing's *Recommended instance types* for the full list.

<CardGroup cols={2}>
  <Card title="Example code" icon="github" href="https://github.com/PriorLabs/TabPFN/blob/main/examples/sagemaker.py">
    Step-by-step instructions for running inference with TabPFN-2.5 on SageMaker.
  </Card>

  <Card title="Getting Started Notebook" icon="book" href="https://colab.research.google.com/drive/1lUocasMAw7jdABwOxivIIl5PnjID_6Qm?usp=sharing">
    A guided notebook demonstrating how to use TabPFN-2.5 for inference on SageMaker.
  </Card>
</CardGroup>

### Limitations

TabPFN-2.5 supports two input formats for inference:

* `application/json` — a JSON-encoded request body.
* `multipart/form-data` — containing the dataset as Parquet files.

Both formats must remain within SageMaker's 6 MB real-time payload limit. Because Parquet is compressed, the `multipart/form-data` option generally allows you to send more rows or features within the same size constraint.
