Extract latent feature representations from TabPFN models.
The **Embeddings** extension enables extraction of latent feature representations (embeddings) from TabPFN models. These embeddings provide a way to analyze, visualize, or reuse the learned representations produced by TabPFN’s transformer architecture, which is useful for downstream tasks such as clustering, feature analysis, search, or meta-learning.

The extension offers two embedding extraction modes:
- Vanilla embeddings: extracted after fitting the model once on the full dataset
- Cross-validated embeddings: extracted via K-fold cross-validation
The embeddings extension is not compatible with the tabpfn-client package since the client does not expose internal model representations yet.
When n_fold=0, the model extracts embeddings after fitting once on the entire dataset.
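```python
# `embedding_extractor` is assumed to be an embeddings extractor that was
# already constructed with n_fold=0; X_train, y_train and X_test are your data.

# Train the model
embedding_extractor.fit(X_train, y_train)

# Extract embeddings for both training and test sets
train_embeddings = embedding_extractor.get_embeddings(
    X_train, y_train, X_test, data_source="train"
)
test_embeddings = embedding_extractor.get_embeddings(
    X_train, y_train, X_test, data_source="test"
)
```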
This produces NumPy arrays containing dense vector representations of each sample.
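As an illustration of the "search" use case, the sketch below ranks stored embeddings by cosine similarity to a query embedding. It is a minimal example, not part of the extension: `cosine_similarity` and `nearest_neighbors` are hypothetical helper names, and it assumes the embeddings have already been reduced to a 2-D layout of shape (n_samples, embedding_dim). It iterates row by row, so it accepts NumPy arrays as well as plain lists.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, candidates, k=3):
    """Return the indices of the k candidate rows most similar to `query`."""
    scored = [(cosine_similarity(query, row), idx) for idx, row in enumerate(candidates)]
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy stand-ins for embedding rows (illustrative values, not real model output).
stored = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = nearest_neighbors([1.0, 0.1], stored, k=2)
```

For more than a few thousand rows, a vectorized or approximate nearest-neighbor index would be a better fit than this linear scan.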
When n_fold > 0, K-fold cross-validation is applied following the method described in “A Closer Look at TabPFN v2: Strength, Limitation, and Extension” (arXiv:2502.17361). Larger values of n_fold generally yield more robust embeddings, at the cost of additional model fits.
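The general out-of-fold pattern behind cross-validated embeddings can be sketched as follows: each training sample is embedded by a model that was fitted only on the other folds, so its own label never enters its representation. This is a conceptual sketch under stated assumptions, not the extension's implementation. `kfold_indices`, `out_of_fold_embeddings`, and the `embed_fn(X_context, y_context, X_query)` callable are hypothetical names, and the folds here are contiguous and unshuffled for simplicity.

```python
def kfold_indices(n_samples, n_fold):
    """Split range(n_samples) into n_fold contiguous folds of near-equal size."""
    if n_fold < 2 or n_fold > n_samples:
        raise ValueError("n_fold must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_fold)
    folds, start = [], 0
    for i in range(n_fold):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def out_of_fold_embeddings(X, y, n_fold, embed_fn):
    """Embed every row of X using a model fitted on the folds that exclude it.

    embed_fn(X_context, y_context, X_query) is a hypothetical callable that
    fits on the context rows and returns one vector per query row.
    """
    result = [None] * len(X)
    for held_out in kfold_indices(len(X), n_fold):
        held_out_set = set(held_out)
        context = [i for i in range(len(X)) if i not in held_out_set]
        vectors = embed_fn(
            [X[i] for i in context],
            [y[i] for i in context],
            [X[i] for i in held_out],
        )
        # Write each vector back to the original position of its sample.
        for i, vec in zip(held_out, vectors):
            result[i] = vec
    return result


def toy_embed(X_context, y_context, X_query):
    """Stand-in for a real model: [mean context label, sum of the query row]."""
    mean_label = sum(y_context) / len(y_context)
    return [[mean_label, float(sum(row))] for row in X_query]


X = [[1], [2], [3], [4]]
y = [0, 0, 1, 1]
oof = out_of_fold_embeddings(X, y, n_fold=2, embed_fn=toy_embed)
```

In the toy run, the first two samples are embedded using only the labels of the last two, and vice versa, which is the property that distinguishes out-of-fold embeddings from the n_fold=0 case.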