
The Many Class Classifier Extension allows TabPFN to handle classification problems with more classes than the base checkpoint’s native limit. ManyClassClassifier auto-detects this limit from the base estimator’s MAX_NUMBER_OF_CLASSES — currently 10 for TabPFN-2.5 / TabPFN-2.6 and 160 for TabPFN-3 — so the wrapper transparently picks up future checkpoints with higher limits. It works through an error-correcting output code (ECOC) approach that:
  • Encodes the multi-class task into multiple binary or small-class subtasks.
  • Trains the base TabPFNClassifier on these subtasks.
  • Decodes the results back into the original class space.
This approach scales to thousands of classes efficiently while maintaining accuracy and calibration. On TabPFN-3, a 5,000-class problem fits in only ~32 sub-estimator calls thanks to the 160-symbol alphabet — roughly an order of magnitude fewer fits than the same problem would have needed on TabPFN-2.x (which is limited to a 10-symbol alphabet).
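As a back-of-envelope check on these numbers (an illustrative reading, not the library's exact scheduling formula): if each sub-estimator can separate at most alphabet_size classes, covering all classes takes at least ceil(n_classes / alphabet_size) sub-estimator calls.

```python
import math

def min_subtask_calls(n_classes, alphabet_size):
    # Lower bound only: each sub-estimator can separate at most
    # `alphabet_size` classes, so covering every class needs at least
    # this many calls. Illustrative -- the actual ECOC schedule also
    # depends on n_estimators_redundancy.
    return math.ceil(n_classes / alphabet_size)

print(min_subtask_calls(5000, 160))  # TabPFN-3 alphabet  -> 32
print(min_subtask_calls(5000, 10))   # TabPFN-2.x alphabet -> 500
```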

Getting Started

Install the many_class extension:
pip install "tabpfn-extensions[many_class]"
Then, simply wrap your existing TabPFNClassifier with ManyClassClassifier to enable support for datasets with a large number of classes.
from tabpfn_extensions.many_class import ManyClassClassifier
from tabpfn import TabPFNClassifier

estimator = TabPFNClassifier(device="cuda")
classifier = ManyClassClassifier(
    estimator=estimator,
    # The parameters below are optional — shown here for clarity.
    n_estimators_redundancy=4, # Multiplier on the auto-chosen number of sub-estimators; higher = more stable, slower.
    random_state=42,
)
classifier.fit(X_train, y_train)
predictions = classifier.predict(X_test)

Key Parameters

All parameters below are optional — sensible defaults are used if they are not provided.
  • alphabet_size — number of classes each sub-estimator is trained on. Leave unset (the default) so it is inferred from the base estimator’s MAX_NUMBER_OF_CLASSES (read from estimator.get_inference_config()). That way, future TabPFN checkpoints that support more classes per sub-task are picked up automatically. If you set it explicitly, do not exceed the base estimator’s limit (10 for TabPFN-2.5 / TabPFN-2.6, 160 for TabPFN-3).
  • n_estimators_redundancy — redundancy multiplier on the minimum number of sub-estimators needed to cover all classes. Higher values improve accuracy and calibration at the cost of runtime. Default is 4.
  • n_estimators — set this to override the auto-derived number of sub-estimators entirely. Leave as None to let it be chosen from alphabet_size, the number of classes in the training data, and n_estimators_redundancy.
  • random_state — controls the randomness of the sub-task encoding, ensuring reproducible results.
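To make the encode/decode idea concrete, here is a toy, pure-Python ECOC sketch. It is not the library's implementation (ManyClassClassifier uses its own randomized code construction); it only illustrates how a codebook over a small symbol alphabet, plus redundant codeword positions, lets Hamming decoding recover the original class even when some sub-task predictions are wrong.

```python
def make_codebook(n_classes, alphabet_size, redundancy):
    # Toy code: each class's codeword is its index written in base
    # `alphabet_size`, repeated `redundancy` times for error tolerance.
    # The real library uses randomized codes instead.
    n_digits = 1
    while alphabet_size ** n_digits < n_classes:
        n_digits += 1
    codebook = {}
    for c in range(n_classes):
        x, digits = c, []
        for _ in range(n_digits):
            digits.append(x % alphabet_size)
            x //= alphabet_size
        codebook[c] = digits * redundancy
    return codebook

def decode(symbol_preds, codebook):
    # Hamming decoding: pick the class whose codeword agrees with the
    # predicted symbols in the most positions.
    return max(codebook, key=lambda c: sum(
        p == s for p, s in zip(symbol_preds, codebook[c])))

# 25 classes, 5-symbol alphabet, 2 digits per class, repeated 3 times.
codebook = make_codebook(n_classes=25, alphabet_size=5, redundancy=3)
print(codebook[17])                          # [2, 3, 2, 3, 2, 3]
print(decode(codebook[17], codebook))        # perfect predictions -> 17
print(decode([2, 3, 2, 3, 2, 0], codebook))  # one symbol wrong -> still 17
```

Raising the redundancy lengthens the codewords, which is why higher n_estimators_redundancy trades extra sub-estimator fits for more robust decoding.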