3.5. Feature selection

3.5.1. Univariate feature selection

Univariate feature selection works by selecting the best features based on univariate statistical tests. It can be seen as a preprocessing step to an estimator. scikit.learn exposes feature selection routines as objects that implement the transform method. The k-best features can be selected based on:

or by setting a percentile of features to keep using

or using common statistical quantities:

These objects take as input a scoring function that returns univariate p-values.
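As a rough illustration of the workflow described above, the following sketch selects features from synthetic data. It assumes the current `sklearn` package layout, where `SelectKBest`, `SelectPercentile` and `f_classif` live in `sklearn.feature_selection`; the import path in the release this page documents may differ. The data and the choice of `k=2` and `percentile=20` are made up for the example.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, SelectPercentile, f_classif

# Synthetic data (an assumption for illustration): 100 samples, 10 features.
# Only features 0 and 1 are shifted by the class label, so only they are informative.
rng = np.random.RandomState(0)
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 10))
X[:, 0] += 3.0 * y
X[:, 1] -= 3.0 * y

# Keep the 2 features with the highest ANOVA F-scores.
k_best = SelectKBest(score_func=f_classif, k=2)
X_k = k_best.fit_transform(X, y)
selected = sorted(k_best.get_support(indices=True).tolist())

# Alternatively, keep a percentage of the features rather than a fixed count.
top_pct = SelectPercentile(score_func=f_classif, percentile=20)
X_pct = top_pct.fit_transform(X, y)

print(X_k.shape, selected, X_pct.shape)
```

After fitting, the selector stores the per-feature scores and p-values returned by the scoring function in its `scores_` and `pvalues_` attributes, which can be inspected to see why a feature was kept or dropped.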

3.5.1.1. Feature scoring functions

Warning

Be careful not to use a regression scoring function on a classification problem: the resulting scores will not be meaningful.

3.5.1.1.1. For classification

3.5.1.1.2. For regression