# Feature Selection

> Reduce feature count to improve TabPFN's attention efficiency and predictive power.

When your dataset has many features (especially more than 500), feature selection can improve both predictive accuracy and runtime.

## Why It Helps

TabPFN uses transformer attention over all features. Irrelevant or noisy features dilute the model's attention budget and can reduce predictive power, especially as feature count grows.

## Approaches

**Greedy feature selection** - remove one feature at a time, re-evaluate on held-out data, and keep the removal only if performance does not drop. Each step requires refitting the model, so this is most practical on smaller datasets where a refit is cheap.
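
The greedy procedure can be sketched as a backward-elimination loop. This is a minimal illustration, not an official TabPFN utility: `greedy_backward_selection` is a hypothetical helper, the toy data comes from `make_classification`, and `LogisticRegression` is only a cheap stand-in for whichever scikit-learn-compatible estimator you actually use.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def greedy_backward_selection(estimator, X, y, cv=3, tol=0.0):
    """Hypothetical helper: drop one feature at a time while the CV score holds."""
    X = np.asarray(X)
    kept = list(range(X.shape[1]))
    best = cross_val_score(clone(estimator), X[:, kept], y, cv=cv).mean()
    removed_one = True
    while removed_one and len(kept) > 1:
        removed_one = False
        for j in kept:
            trial = [c for c in kept if c != j]
            score = cross_val_score(clone(estimator), X[:, trial], y, cv=cv).mean()
            if score >= best - tol:  # removal did not hurt: accept it
                kept, best, removed_one = trial, score, True
                break  # restart the scan on the reduced feature set
    return kept, best


# Toy data (assumption): 8 features, only 3 of them informative.
X, y = make_classification(n_samples=200, n_features=8, n_informative=3,
                           n_redundant=0, random_state=0)
# LogisticRegression is a stand-in; pass your own estimator here instead.
kept, best = greedy_backward_selection(LogisticRegression(max_iter=1000), X, y)
print(f"kept columns: {kept}, CV accuracy: {best:.3f}")
```

Each candidate removal costs one full cross-validation run, so the number of refits grows roughly quadratically with the feature count. That cost is the reason this approach suits smaller datasets.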

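**Mutual information filtering** - rank features by their estimated mutual information with the target and keep the top k. Use `mutual_info_classif` for classification targets and `mutual_info_regression` for regression targets: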

```python theme={null}
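from sklearn.feature_selection import mutual_info_classif, SelectKBest

# Keep the 50 features with the highest estimated mutual information with y.
# k must not exceed the number of columns in X_train.
selector = SelectKBest(mutual_info_classif, k=50)
# Fit on the training split only, then reuse the fitted selector on the test split.
X_train_selected = selector.fit_transform(X_train, y_train)
X_test_selected = selector.transform(X_test)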
```
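
When the selector is evaluated with cross-validation, one way to keep the selection step inside each fold is to wrap it in a scikit-learn pipeline. The sketch below assumes synthetic data from `make_classification`, an arbitrary `k=20`, and `LogisticRegression` as a placeholder for the final model; none of these choices come from the TabPFN documentation.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic data (assumption): 100 features, 10 of them informative.
X, y = make_classification(n_samples=300, n_features=100, n_informative=10,
                           random_state=0)

# The selector is refit on the training part of every fold, so the
# held-out part never influences which features are kept.
pipe = make_pipeline(
    SelectKBest(mutual_info_classif, k=20),
    LogisticRegression(max_iter=1000),  # placeholder for the final model
)
scores = cross_val_score(pipe, X, y, cv=3)
print(f"mean CV accuracy: {scores.mean():.3f}")
```

Running the selector once on the full dataset before cross-validating would let the held-out folds leak into the feature ranking and tends to give optimistic scores.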

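**PCA / TruncatedSVD** - project the features onto a smaller set of components that retain most of the variance. The resulting columns are linear combinations of the original features, not a subset of them. TruncatedSVD skips PCA's centering step, so it can be applied to sparse matrices without densifying them. With PCA: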

```python theme={null}
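from sklearn.decomposition import PCA

# Project onto the 50 directions of highest variance.
# n_components must not exceed min(n_samples, n_features).
pca = PCA(n_components=50)
# Fit on the training split only so the test split does not influence the components.
X_train_reduced = pca.fit_transform(X_train)
X_test_reduced = pca.transform(X_test)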
```
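
For sparse inputs, a TruncatedSVD version might look like the sketch below. The random sparse matrices are made-up stand-ins for real data, and the component count of 50 simply mirrors the PCA example. The last lines also show PCA with a fractional `n_components`, which asks scikit-learn to keep as many components as are needed to explain that share of the variance.

```python
from scipy import sparse
from sklearn.decomposition import PCA, TruncatedSVD

# Made-up sparse data (assumption): 1000 columns, about 1% non-zero entries.
X_train = sparse.random(200, 1000, density=0.01, format="csr", random_state=0)
X_test = sparse.random(50, 1000, density=0.01, format="csr", random_state=1)

# TruncatedSVD accepts the sparse matrices directly and returns dense arrays.
svd = TruncatedSVD(n_components=50, random_state=0)
X_train_reduced = svd.fit_transform(X_train)
X_test_reduced = svd.transform(X_test)
print(X_train_reduced.shape, X_test_reduced.shape)

# Alternative: let PCA choose the component count from a variance target.
pca = PCA(n_components=0.95)  # keep enough components for 95% of the variance
X_train_pca = pca.fit_transform(X_train.toarray())
print(pca.n_components_, pca.explained_variance_ratio_.sum())
```

A variance target avoids hand-picking the number of components, but the resulting count depends on the data and can still be large when the variance is spread across many directions.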
