# Many Class Classifier

> Learn about classification with TabPFN for a large number of classes.

The [Many Class Classifier Extension](https://github.com/PriorLabs/tabpfn-extensions/tree/main/src/tabpfn_extensions/many_class) allows TabPFN to handle classification problems with more classes than the base checkpoint's native limit. `ManyClassClassifier` auto-detects this limit from the base estimator's `MAX_NUMBER_OF_CLASSES` — currently `10` for TabPFN-2.5 / TabPFN-2.6 and `160` for TabPFN-3.

It works through an error-correcting output code (ECOC) approach that:

* Encodes the multi-class task into multiple binary or small-class subtasks.
* Trains the base `TabPFNClassifier` on these subtasks.
* Decodes the results back into the original class space.

This approach scales to thousands of classes while maintaining accuracy and calibration. On TabPFN-3, a 5,000-class problem needs only \~32 sub-estimator calls thanks to the 160-symbol alphabet, roughly an order of magnitude fewer than the same problem would need on TabPFN-2.x (which is limited to a 10-symbol alphabet).
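As a rough illustration of the encode and decode steps listed above, here is a minimal pure-Python sketch. It is not the extension's implementation: the helper names `build_codebook`, `encode_labels`, and `decode` are hypothetical, the codebook simply writes each class index in base `alphabet_size` (no redundancy, no random sampling), and the sub-estimator outputs are simulated rather than produced by `TabPFNClassifier`.

```python
import math


def build_codebook(n_classes, alphabet_size):
    """Give every class a unique codeword: its index written in base `alphabet_size`."""
    n_columns = 1
    while alphabet_size ** n_columns < n_classes:
        n_columns += 1
    codebook = []
    for class_index in range(n_classes):
        digits, rest = [], class_index
        for _ in range(n_columns):
            digits.append(rest % alphabet_size)
            rest //= alphabet_size
        codebook.append(digits)
    return codebook


def encode_labels(y, codebook, column):
    """Map original labels to the small-alphabet labels of one sub-task."""
    return [codebook[label][column] for label in y]


def decode(symbol_probas, codebook):
    """Combine per-sub-task symbol probabilities into original-class probabilities.

    symbol_probas[column][symbol] is one sample's predicted probability of
    `symbol` from the (hypothetical) sub-estimator trained on `column`.
    """
    scores = [
        sum(math.log(symbol_probas[col][sym] + 1e-12) for col, sym in enumerate(codeword))
        for codeword in codebook
    ]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(score - top) for score in scores]
    total = sum(weights)
    return [weight / total for weight in weights]


# 25 classes with a 5-symbol alphabet need 2 sub-tasks (5 ** 2 = 25).
codebook = build_codebook(n_classes=25, alphabet_size=5)
sub_task_labels = encode_labels([0, 17, 24], codebook, column=0)

# Simulated sub-estimator outputs for one sample whose true class is 17,
# i.e. codeword [2, 3]: each sub-estimator puts 0.9 on the correct symbol.
simulated = [
    [0.9 if symbol == target else 0.025 for symbol in range(5)]
    for target in codebook[17]
]
class_probas = decode(simulated, codebook)
predicted_class = max(range(len(class_probas)), key=class_probas.__getitem__)
```

In this toy setup `predicted_class` is 17, the class whose codeword matches the symbols both simulated sub-estimators favor. A code with no spare columns cannot recover from a single wrong sub-estimator, which is why practical ECOC schemes add redundant sub-tasks.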

## Getting Started

Install the `many_class` extension:

```bash theme={null}
pip install "tabpfn-extensions[many_class]"
```

Then wrap your existing `TabPFNClassifier` in `ManyClassClassifier` to support datasets with a large number of classes.

```python theme={null}
from tabpfn_extensions.many_class import ManyClassClassifier
from tabpfn import TabPFNClassifier

estimator = TabPFNClassifier()
classifier = ManyClassClassifier(
    estimator=estimator,
    # The parameters below are optional and shown here for clarity.
    # Multiplier on the auto-chosen number of sub-estimators; higher = more stable, slower.
    n_estimators_redundancy=4,
    random_state=42,
)

# X_train, y_train, and X_test are assumed to be your own pre-split data.
classifier.fit(X_train, y_train)
predictions = classifier.predict(X_test)
```

### Key parameters

All parameters below are optional — sensible defaults are used if they are not provided.

* **`alphabet_size`** — number of classes each sub-estimator is trained on. Leave unset (the default) so it is inferred from the base estimator's `MAX_NUMBER_OF_CLASSES` (read off `estimator.get_inference_config()`). With this, future TabPFN checkpoints that support more classes per sub-task are picked up automatically. If you set it explicitly, do not exceed the base estimator's limit (10 for TabPFN-2.5 / TabPFN-2.6, 160 for TabPFN-3).
* **`n_estimators_redundancy`** — redundancy multiplier on the minimum number of sub-estimators needed to cover all classes. Higher values improve accuracy and calibration at the cost of runtime. Default is `4`.
* **`n_estimators`** — set this to override the auto-derived number of sub-estimators entirely. Leave as `None` to let it be chosen from `alphabet_size`, the number of classes in the training data, and `n_estimators_redundancy`.
* **`random_state`** — controls the randomness of the sub-task encoding, ensuring reproducible results.
