
# Microsoft Foundry

> Access TabPFN in your secure Azure environment.

Access TabPFN directly from Azure AI Foundry through Azure-native endpoints and authentication. Usage is billed through your Azure subscription: Azure charges you only for the compute resources needed to host TabPFN models.

## Prerequisites

* An active Azure subscription with access to [Azure AI Foundry](https://ai.azure.com/explore/models)
* Azure quota for GPU-enabled VM SKUs
* TabPFN deployed as an endpoint in your Foundry project

<Note>
  For a full list of supported VM SKUs, see the TabPFN Microsoft Foundry Model Card.
</Note>

## Getting Started

1. Navigate to the Azure AI Foundry [Model Catalog](https://ai.azure.com/explore/models)
2. Search for TabPFN and select [TabPFN-3-Plus](https://ai.azure.com/explore/models/TabPFN-3-Plus/version/1/registry/azureml-priorlabs-p)
3. Click **Use this model** and follow the guided setup
4. Once deployed, note your endpoint URL and API key from the deployment details page

<video className="w-full aspect-video rounded-xl" src="https://storage.googleapis.com/prior-labs-tabpfn-public/videos/azure_tabpfn3.mp4#t=2" controls preload="metadata" playsInline />

<Note>
  Microsoft Foundry hosts each TabPFN version as a separate model. When a new TabPFN version is released, it appears as a distinct model in the catalog and must be deployed independently. Existing deployments are not updated automatically.
</Note>

## Azure AI Foundry

If you've deployed TabPFN to an Azure AI Foundry managed online endpoint, you can invoke it through `tabpfn_client.foundry` using the same scikit-learn-style interface. This path does not use a Prior Labs API token: you authenticate against your own Foundry endpoint with its bearer key, and `predict` calls are billed by Azure rather than counted against your TabPFN usage allowance.

Install the client library:

```bash theme={null}
pip install tabpfn-client
```

Point the estimator at your endpoint URL and pass the bearer key:

```python theme={null}
from tabpfn_client.foundry import TabPFNClassifier, TabPFNRegressor
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)

clf = TabPFNClassifier(
    endpoint_url="https://<your-endpoint>.<region>.inference.ml.azure.com/predict",
    api_key="<your-foundry-bearer-token>",
)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)          # predicted class labels
y_proba = clf.predict_proba(X_test)   # per-class probabilities
```

Notes:

* `endpoint_url` is the full Foundry scoring URL, including the `/predict` path. The bearer key is sent as `Authorization: Bearer <api_key>`.
* Requests are sent as `application/json`; the Foundry path does not use multipart, so all data travels JSON-encoded.
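
To make the two notes above concrete, here is a minimal sketch of the request headers they describe, using only the standard library. The helper name `foundry_headers` is hypothetical and not part of `tabpfn-client`; the client sets these headers for you. The JSON body schema is not documented on this page, so it is deliberately left out.

```python
def foundry_headers(api_key: str) -> dict[str, str]:
    """Illustrative only: the headers described in the notes above."""
    return {
        # The bearer key goes in the standard Authorization header.
        "Authorization": f"Bearer {api_key}",
        # Requests are JSON-encoded, not multipart.
        "Content-Type": "application/json",
    }


headers = foundry_headers("<your-foundry-bearer-token>")
print(headers["Content-Type"])
```

This can be handy when debugging a deployment with a generic HTTP tool, since a missing or malformed `Authorization` header is a common cause of rejected requests.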

Set `use_kv_cache=True` if you will call a `predict*` method more than once on the same training data. The first call sends `X_train` / `y_train` to the endpoint, runs the fit there, and returns a `model_id`. The client caches that id, and every subsequent call sends only `X_test` plus the id, so the server **skips the fit and runs inference only**. Follow-up calls are therefore much faster on non-trivial training sets, and the wire payload shrinks from `O(n_train + n_test)` to `O(n_test)`:

```python theme={null}
clf = TabPFNClassifier(
    endpoint_url="https://<your-endpoint>.<region>.inference.ml.azure.com/predict",
    api_key="<your-foundry-bearer-token>",
    use_kv_cache=True,
)
clf.fit(X_train, y_train)
X_test_a, X_test_b = X_test[:50], X_test[50:]  # two batches scored against the same training data
clf.predict(X_test_a)          # first call: fit + predict on the endpoint
clf.predict_proba(X_test_b)    # cache hit: predict only, much faster
```

Leave `use_kv_cache=False` (the default) when each call uses a different training set. In that case the cached fit is never reused, and the cache is dead weight on the endpoint.
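
The caching flow described above can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the actual `tabpfn-client` implementation: `FakeEndpoint`, `CachingClient`, and the payload field names are all hypothetical, and the real wire format may differ. It only shows why the second request is smaller and triggers no second fit.

```python
import json


class FakeEndpoint:
    """Hypothetical stand-in for the Foundry endpoint (no real model)."""

    def __init__(self):
        self.fitted = {}      # model_id -> stored training data
        self.fit_count = 0    # how many times a fit was run

    def handle(self, payload: dict) -> dict:
        if "model_id" in payload:
            # Cache hit: look up the earlier fit and skip fitting.
            model_id = payload["model_id"]
            _ = self.fitted[model_id]
        else:
            # First call: "fit" on the training data and issue an id.
            self.fit_count += 1
            model_id = f"model-{self.fit_count}"
            self.fitted[model_id] = (payload["X_train"], payload["y_train"])
        return {"model_id": model_id, "n_predictions": len(payload["X_test"])}


class CachingClient:
    """Hypothetical client that remembers the model_id after the first call."""

    def __init__(self, endpoint: FakeEndpoint):
        self.endpoint = endpoint
        self.model_id = None
        self.bytes_sent = []  # JSON payload size of each request

    def fit(self, X_train, y_train):
        self.X_train, self.y_train = X_train, y_train
        self.model_id = None  # new training data invalidates the cached id

    def predict(self, X_test):
        if self.model_id is None:
            payload = {"X_train": self.X_train, "y_train": self.y_train, "X_test": X_test}
        else:
            payload = {"model_id": self.model_id, "X_test": X_test}
        self.bytes_sent.append(len(json.dumps(payload)))
        response = self.endpoint.handle(payload)
        self.model_id = response["model_id"]
        return response["n_predictions"]


X_train = [[float(i), float(i % 7)] for i in range(1000)]
y_train = [i % 2 for i in range(1000)]
X_test = [[float(i), 0.0] for i in range(10)]

endpoint = FakeEndpoint()
client = CachingClient(endpoint)
client.fit(X_train, y_train)
client.predict(X_test)   # sends training data + test data
client.predict(X_test)   # sends only the id + test data
print(client.bytes_sent, endpoint.fit_count)
```

In this toy run the second payload is far smaller than the first and the fit counter stays at one, which mirrors the `O(n_train + n_test)` versus `O(n_test)` distinction.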

### Finding your endpoint and key

Authenticate using the **Primary key** from your deployment's page in Azure AI Foundry. To access the model settings, navigate to your TabPFN deployment in Azure AI Foundry:

1. Go to [Azure AI Foundry](https://ai.azure.com/build/deployments/model) and select the Foundry project where TabPFN was deployed.
2. In the left-hand menu, select **My assets** → **Models + Endpoints**.
3. Open the **Model deployments** tab and click on **TabPFN**.

![TabPFN Model Deployment](https://storage.googleapis.com/prior-labs-tabpfn-public/videos/azure_tabpfn3_keys.gif)
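
Once you have the endpoint URL and Primary key, one common pattern is to keep them out of source code by reading them from environment variables. The sketch below assumes two hypothetical variable names, `TABPFN_FOUNDRY_ENDPOINT_URL` and `TABPFN_FOUNDRY_API_KEY`; neither is defined by `tabpfn-client` or Azure, so pick whatever names suit your deployment.

```python
import os


def load_foundry_credentials(env=os.environ) -> tuple[str, str]:
    """Read the endpoint URL and key from a mapping (hypothetical variable names)."""
    endpoint_url = env.get("TABPFN_FOUNDRY_ENDPOINT_URL", "")
    api_key = env.get("TABPFN_FOUNDRY_API_KEY", "")
    missing = [
        name
        for name, value in (
            ("TABPFN_FOUNDRY_ENDPOINT_URL", endpoint_url),
            ("TABPFN_FOUNDRY_API_KEY", api_key),
        )
        if not value
    ]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return endpoint_url, api_key


# Demonstration with an explicit mapping instead of the real environment.
example_env = {
    "TABPFN_FOUNDRY_ENDPOINT_URL": "https://<your-endpoint>.<region>.inference.ml.azure.com/predict",
    "TABPFN_FOUNDRY_API_KEY": "<your-foundry-bearer-token>",
}
endpoint_url, api_key = load_foundry_credentials(example_env)
```

The returned values can then be passed as `endpoint_url` and `api_key` to `TabPFNClassifier` or `TabPFNRegressor`.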

### Thinking mode

**Thinking mode on Azure AI Foundry is available through a separate enterprise listing.** [Thinking mode](/capabilities/thinking-mode) applies additional inference-time computation on top of TabPFN-3-Plus to push prediction quality further. On the public TabArena benchmark, it beats every non-TabPFN model by over 200 Elo overall and by 420 Elo on the largest data subset.

<Tip>
  To request access to Thinking mode on your Azure subscription, reach out to [sales@priorlabs.ai](mailto:sales@priorlabs.ai).
</Tip>

Once you've been granted access, the Thinking listing works with the same `tabpfn-client` SDK. The only change is an optional `thinking_effort` constructor kwarg on `TabPFNClassifier` / `TabPFNRegressor` that engages Thinking mode per request. Omitting it falls back to the standard TabPFN-3-Plus single-forward-pass behavior, so a single endpoint serves both modes. See the [Thinking mode](/capabilities/thinking-mode) page for capability details, benchmark numbers, and the parameter reference.
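
As a rough sketch of how one endpoint can serve both modes, the helper below builds constructor arguments and includes `thinking_effort` only when it is set. The helper name `estimator_kwargs` is hypothetical, and `"<effort-level>"` is a placeholder: the accepted values are listed on the Thinking mode page, not here.

```python
def estimator_kwargs(endpoint_url, api_key, thinking_effort=None) -> dict:
    """Build constructor kwargs; omit thinking_effort to get standard behavior."""
    kwargs = {"endpoint_url": endpoint_url, "api_key": api_key}
    if thinking_effort is not None:
        # Only passed when Thinking mode is wanted for this estimator.
        kwargs["thinking_effort"] = thinking_effort
    return kwargs


URL = "https://<your-endpoint>.<region>.inference.ml.azure.com/predict"
KEY = "<your-foundry-bearer-token>"

standard = estimator_kwargs(URL, KEY)
thinking = estimator_kwargs(URL, KEY, thinking_effort="<effort-level>")

# With tabpfn-client installed and Thinking access granted, these could be used as:
#   from tabpfn_client.foundry import TabPFNClassifier
#   clf = TabPFNClassifier(**thinking)
```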
