Whether you’re just getting started with TabPFN or pushing it into production, this FAQ highlights practical answers to common questions about model limits, performance, reproducibility, and API best practices.
What GPUs does TabPFN require at minimum?
device parameter of TabPFNClassifier and TabPFNRegressor.
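As a minimal sketch of choosing a value for the device parameter mentioned above: the helper name pick_device is hypothetical, and the fallback to "cpu" when PyTorch is not importable is an assumption of this example rather than documented TabPFN behavior.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu".

    Hypothetical helper for illustration; not part of the TabPFN API.
    """
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch we simply report "cpu".
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")

# Usage sketch (requires the tabpfn package to be installed):
# from tabpfn import TabPFNClassifier
# clf = TabPFNClassifier(device=device)
```

The selected string can then be passed as the device argument when constructing TabPFNClassifier or TabPFNRegressor.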
What are TabPFN's practical limits for context size and features?
How are text features handled?
How are date features handled?
How reproducible are TabPFN results across runs and devices?
Does TabPFN handle missing values natively?
pd.NA, without requiring manual imputation.
How should I handle imbalanced datasets?
- Use balance_probabilities=True to optimize for balanced accuracy or balanced loss.
- Use eval_metric="f1" (or other supported metrics) for specific precision-recall tradeoffs.
My run is 'slow' on thousands of rows - what controls the fit/predict trade-off?
predict() step (that’s how it simulates Bayesian posterior inference). That means latency scales roughly with the number of training rows. For fast inference, we can create a tree-based or small MLP-based model that yields almost the same accuracy as TabPFN. Contact sales@priorlabs.ai to access this solution.
How can I speed up predictions on large test sets?
.predict() is called. It is much faster to make predictions for all your test points in a single .predict() call than to call it repeatedly. If you run out of memory, split the test points into batches of 1,000 to 10,000 and call .predict() for each batch.
You can also tune the memory_saving_mode and n_preprocessing_jobs parameters of TabPFNClassifier and TabPFNRegressor for additional speed improvements. See the code documentation for details.
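The batching fallback described above can be sketched as follows. The helper predict_in_batches and its default batch size are hypothetical; the sketch assumes the test set is a NumPy array and that the model exposes a scikit-learn-style .predict() method. A stand-in model is used so the example runs without TabPFN installed.

```python
import numpy as np


def predict_in_batches(model, X, batch_size=5_000):
    """Call model.predict() on consecutive row slices of X and join the results.

    Hypothetical helper: use it only when a single .predict() call on the
    full test set runs out of memory.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    X = np.asarray(X)
    if len(X) == 0:
        return np.asarray(model.predict(X))
    outputs = []
    for start in range(0, len(X), batch_size):
        batch = X[start:start + batch_size]
        outputs.append(np.asarray(model.predict(batch)))
    return np.concatenate(outputs, axis=0)


class RowSumModel:
    """Stand-in for a fitted estimator; predicts the sum of each row."""

    def __init__(self):
        self.calls = 0

    def predict(self, X):
        self.calls += 1
        return X.sum(axis=1)


demo_model = RowSumModel()
X_test = np.arange(50, dtype=float).reshape(25, 2)
predictions = predict_in_batches(demo_model, X_test, batch_size=10)
print(predictions.shape, demo_model.calls)
```

With a fitted TabPFNClassifier or TabPFNRegressor, the same helper would be called as predict_in_batches(clf, X_test, batch_size=5_000). Prefer a single .predict() call whenever the full test set fits in memory.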
What is the fitted-model cache and when should I use it?
.fit(), making subsequent .predict() calls fast by using a KV cache. Enable it by setting the fit_mode parameter of TabPFNClassifier or TabPFNRegressor to fit_with_cache.
However, with this setting, classification models consume approximately 6.1 KB of GPU memory and 48.8 KB of CPU memory per cell in the training dataset (regression models about 25% less), so it is currently only suitable for small training datasets. For larger datasets and CPU-based inference, we recommend the TabPFN-as-MLP/Tree output engine.
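To judge whether a training set is small enough for the cache, the per-cell figures above can be turned into a rough estimate. This is a back-of-the-envelope sketch: the function name is hypothetical, a "cell" is assumed to mean one row-by-feature entry, "about 25% less" is modeled as a factor of 0.75, and 1 GB is taken as 1,000,000 KB.

```python
GPU_KB_PER_CELL = 6.1     # classification figure quoted above
CPU_KB_PER_CELL = 48.8    # classification figure quoted above
REGRESSION_FACTOR = 0.75  # assumption: "about 25% less" for regression
KB_PER_GB = 1_000_000     # assumption: decimal units


def estimate_cache_memory_gb(n_rows, n_features, task="classification"):
    """Roughly estimate fit_with_cache memory use in GB (hypothetical helper)."""
    if task not in ("classification", "regression"):
        raise ValueError("task must be 'classification' or 'regression'")
    cells = n_rows * n_features  # assumption: cells = rows x features
    factor = 1.0 if task == "classification" else REGRESSION_FACTOR
    return {
        "gpu_gb": cells * GPU_KB_PER_CELL * factor / KB_PER_GB,
        "cpu_gb": cells * CPU_KB_PER_CELL * factor / KB_PER_GB,
    }


estimate = estimate_cache_memory_gb(10_000, 50)
print(estimate)
```

Under these assumptions, a classification training set with 10,000 rows and 50 features (500,000 cells) comes to roughly 3.05 GB of GPU memory and 24.4 GB of CPU memory, which illustrates why the cache is recommended only for small training datasets.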
What PyTorch version should I use?
What are the API rate limits?
How do I know which TabPFN model version I am using in the TabPFN API Client?