The Data Generation capability extends TabPFN’s unsupervised modeling system to create realistic synthetic tabular datasets. By modeling feature dependencies and joint probability distributions, TabPFN can generate new samples that follow the same statistical structure as your original data - useful for augmentation, simulation, and masking sensitive data.

Getting Started
Install the unsupervised extension:
Initialize TabPFNUnsupervisedModel with a TabPFN classifier and a TabPFN regressor to generate new data:
How it Works
The data generation process leverages the same probabilistic modeling used in TabPFN’s unsupervised mode:
- Each feature is modeled conditionally on the others.
- The chain rule of probability is used to estimate the full joint distribution.
- New samples are drawn using the learned conditional dependencies, controlled by a temperature parameter (temp) that influences variability and diversity.
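The steps above can be illustrated with a small, self-contained sketch. It is not TabPFN's implementation: the conditionals p(x_i | x_1, ..., x_{i-1}) are estimated here with simple smoothed counts over categorical features instead of a TabPFN model, and the temperature is assumed to act as p ** (1 / temp) followed by renormalization, which is a common convention but is not confirmed by this page. The names conditional_probs, sample_rows, and alpha are hypothetical and exist only for this example.

```python
import random
from collections import Counter


def conditional_probs(rows, i, prefix, temp=1.0, alpha=1.0):
    """Estimate p(x_i | x_<i = prefix) from counts, then apply a temperature.

    Stand-in for a learned conditional model; TabPFN would predict this
    distribution instead of counting. `alpha` is a hypothetical smoothing term.
    """
    values = sorted({r[i] for r in rows})  # every observed value of feature i
    counts = Counter(r[i] for r in rows if r[:i] == prefix)
    # Laplace smoothing keeps every observed value possible.
    weights = [counts[v] + alpha for v in values]
    # Assumed temperature rule: raise to 1/temp, then renormalize.
    # temp < 1 sharpens the distribution, temp > 1 flattens it.
    scaled = [w ** (1.0 / temp) for w in weights]
    total = sum(scaled)
    return values, [s / total for s in scaled]


def sample_rows(rows, n_samples, temp=1.0, seed=0):
    """Draw synthetic rows feature by feature, following the chain rule."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    synthetic = []
    for _ in range(n_samples):
        new_row = ()
        for i in range(n_features):
            # Condition feature i on the values already sampled for this row.
            values, probs = conditional_probs(rows, i, new_row, temp)
            new_row += (rng.choices(values, weights=probs, k=1)[0],)
        synthetic.append(new_row)
    return synthetic


# Toy dataset: rows are tuples of categorical feature values.
data = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
print(sample_rows(data, n_samples=5, temp=0.5))
```

With a low temp the sampled rows stay close to the combinations seen in the data; raising temp spreads probability toward rarer combinations, which is the variability and diversity trade-off described above.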
Use Cases
Synthetic data generation can be applied across a range of research and engineering tasks:
- Data augmentation - expand limited datasets for training or validation.
- Privacy-preserving analytics - create realistic datasets without exposing sensitive information.
Google Colab Example
Check out our Google Colab for a demo.