The Many Class Classifier Extension allows TabPFN to handle classification problems with more classes than the base checkpoint’s native limit. ManyClassClassifier auto-detects this limit from the base estimator’s MAX_NUMBER_OF_CLASSES — currently 10 for TabPFN-2.5 / TabPFN-2.6 and 160 for TabPFN-3. It works through an error-correcting output code (ECOC) approach that:
  • Encodes the multi-class task into multiple binary or small-class subtasks.
  • Trains the base TabPFNClassifier on these subtasks.
  • Decodes the results back into the original class space.
This approach scales to thousands of classes efficiently while maintaining accuracy and calibration. On TabPFN-3, a 5,000-class problem fits in only ~32 sub-estimator calls thanks to the 160-symbol alphabet — roughly an order of magnitude fewer fits than the same problem would have needed on TabPFN-2.x (which is limited to a 10-symbol alphabet).
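The encode, train, decode steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the extension's implementation. It assumes a "rest bucket" codebook in which each sub-task gives `alphabet_size - 1` classes their own symbol and maps every other class to one shared symbol. It also replaces the trained TabPFNClassifier sub-estimators with perfect one-hot predictions. The helper names `build_codebook`, `encode`, and `decode` are hypothetical.

```python
import math
import random


def build_codebook(n_classes, alphabet_size, n_estimators, seed=0):
    """Return codebook[e][c]: the symbol sub-estimator e assigns to class c.

    Assumption: symbols 0..alphabet_size-2 each mark one explicitly coded
    class, and symbol alphabet_size-1 is a shared "rest" bucket.
    """
    rng = random.Random(seed)
    rest = alphabet_size - 1
    coded_per_row = alphabet_size - 1
    order = list(range(n_classes))
    rng.shuffle(order)
    codebook = []
    for e in range(n_estimators):
        row = [rest] * n_classes
        # Walk through the shuffled classes so every class gets coded in turn.
        for symbol in range(coded_per_row):
            cls = order[(e * coded_per_row + symbol) % n_classes]
            row[cls] = symbol
        codebook.append(row)
    return codebook


def encode(labels, row):
    """Map original labels to the symbols of one sub-task."""
    return [row[label] for label in labels]


def decode(symbol_probas, codebook):
    """Combine per-sub-task symbol probabilities into class probabilities.

    symbol_probas[e][s] is sub-estimator e's probability for symbol s
    (for a single sample). Each class accumulates the probability of the
    symbol it was assigned in every sub-task, then scores are normalized.
    """
    n_classes = len(codebook[0])
    scores = [0.0] * n_classes
    for row, proba in zip(codebook, symbol_probas):
        for cls in range(n_classes):
            scores[cls] += proba[row[cls]]
    total = sum(scores)
    return [s / total for s in scores]


n_classes, alphabet_size = 25, 5
# Enough sub-tasks for every class to be explicitly coded at least once.
n_estimators = math.ceil(n_classes / (alphabet_size - 1))
codebook = build_codebook(n_classes, alphabet_size, n_estimators, seed=42)

y = [3, 17, 24, 0]
encoded = [encode(y, row) for row in codebook]  # one label vector per sub-task

decoded = []
for i in range(len(y)):
    # Stand-in for trained sub-estimators: a one-hot vector on the true symbol.
    probas = []
    for e in range(n_estimators):
        one_hot = [0.0] * alphabet_size
        one_hot[encoded[e][i]] = 1.0
        probas.append(one_hot)
    class_proba = decode(probas, codebook)
    decoded.append(max(range(n_classes), key=lambda c: class_proba[c]))

print(decoded)
```

With perfect sub-task predictions, the decoded labels match the originals because each class is explicitly coded in at least one sub-task, which separates it from every other class. Real sub-estimators are noisy, which is where extra, redundant sub-tasks help.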

Getting Started

Install the many_class extension:
pip install "tabpfn-extensions[many_class]"
Then wrap your existing TabPFNClassifier with ManyClassClassifier to enable support for datasets with a large number of classes.
from tabpfn_extensions.many_class import ManyClassClassifier
from tabpfn import TabPFNClassifier

estimator = TabPFNClassifier()
classifier = ManyClassClassifier(
    estimator=estimator,
    # The parameters below are optional — shown here for clarity.
    n_estimators_redundancy=4,  # Multiplier on the auto-chosen number of sub-estimators; higher = more stable, slower.
    random_state=42,
)
# X_train, y_train, and X_test are your own feature matrices and labels.
classifier.fit(X_train, y_train)
predictions = classifier.predict(X_test)

Key parameters

All parameters below are optional — sensible defaults are used if they are not provided.
  • alphabet_size — number of classes each sub-estimator is trained on. Leave unset (the default) so it is inferred from the base estimator’s MAX_NUMBER_OF_CLASSES (read off estimator.get_inference_config()). With this, future TabPFN checkpoints that support more classes per sub-task are picked up automatically. If you set it explicitly, do not exceed the base estimator’s limit (10 for TabPFN-2.5 / TabPFN-2.6, 160 for TabPFN-3).
  • n_estimators_redundancy — redundancy multiplier on the minimum number of sub-estimators needed to cover all classes. Higher values improve accuracy and calibration at the cost of runtime. Default is 4.
  • n_estimators — set this to override the auto-derived number of sub-estimators entirely. Leave as None to let it be chosen from alphabet_size, the number of classes in the training data, and n_estimators_redundancy.
  • random_state — controls the randomness of the sub-task encoding, ensuring reproducible results.
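The relationship between alphabet_size, the class count, and n_estimators_redundancy can be made concrete with some arithmetic. The sketch below assumes the minimum cover is `ceil(n_classes / (alphabet_size - 1))`, meaning one symbol per sub-task is reserved for "all other classes". That formula is an assumption, chosen because it reproduces the ~32 figure quoted earlier for 5,000 classes on a 160-symbol alphabet. The library's actual rule may differ, and the helper names `min_cover` and `sketch_n_estimators` are hypothetical.

```python
import math


def min_cover(n_classes, alphabet_size):
    """Assumed minimum number of sub-tasks so every class is coded at least once."""
    if n_classes <= alphabet_size:
        return 1  # the base estimator can handle the problem directly
    return math.ceil(n_classes / (alphabet_size - 1))


def sketch_n_estimators(n_classes, alphabet_size,
                        n_estimators_redundancy=4, n_estimators=None):
    """Hypothetical helper mirroring the parameter descriptions above."""
    if n_estimators is not None:
        return n_estimators  # an explicit value overrides the derived one
    if n_classes <= alphabet_size:
        return 1
    return min_cover(n_classes, alphabet_size) * n_estimators_redundancy


for name, alphabet in [("TabPFN-2.x", 10), ("TabPFN-3", 160)]:
    print(name, min_cover(5000, alphabet), sketch_n_estimators(5000, alphabet))
```

Under this assumption the minimum cover for 5,000 classes is 556 sub-tasks with a 10-symbol alphabet and 32 with a 160-symbol alphabet. The redundancy multiplier then scales those counts, so lowering n_estimators_redundancy is the most direct way to trade stability for runtime.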