noiseceiling

Noise ceiling estimation for machine learning models.

What does it do?

This package contains functionality to estimate a "noise ceiling" for any model that predicts a particular target variable/dependent variable (y) from a set of predictors/independent variables (X). The noise ceiling is (an estimate of) the upper bound of model performance given the consistency (or, inversely, measurement noise) of the target variable (y).

The package uses repeated observations (determined from X) to estimate the "variance" of the dependent variable across those repetitions, which in turn forms the basis of the noise ceiling estimate. If your data (X) does not contain repeated observations, the noise ceiling will be 1 (which is probably not very informative).
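To make the idea concrete, here is a minimal sketch of one way a regression-style noise ceiling could be derived from repeated observations. The function name `sketch_nc_regression` is hypothetical, and using R² against the per-repetition mean is an assumption for illustration; it is not necessarily how this package computes its estimate.

```python
import pandas as pd


def sketch_nc_regression(X: pd.DataFrame, y: pd.Series) -> float:
    """Illustrative only: R^2 of the best possible prediction per repeated row."""
    # Rows of X with identical values in every column count as repetitions.
    keys = [X[col] for col in X.columns]
    # No model can do better (in squared error) than predicting the mean of y
    # within each group of repetitions.
    best_pred = y.groupby(keys).transform("mean")
    ss_res = ((y - best_pred) ** 2).sum()  # variance left within repetitions
    ss_tot = ((y - y.mean()) ** 2).sum()   # total variance of y
    return float(1.0 - ss_res / ss_tot)


# Two distinct observations, each repeated twice with noisy y values.
X_rep = pd.DataFrame({"a": [0, 0, 1, 1], "b": [1, 1, 2, 2]})
y_rep = pd.Series([1.0, 3.0, 5.0, 7.0])
nc_with_repeats = sketch_nc_regression(X_rep, y_rep)

# Same y, but every row of X is unique, so there is nothing to average over.
X_unique = pd.DataFrame({"a": [0, 1, 2, 3], "b": [1, 1, 2, 2]})
nc_without_repeats = sketch_nc_regression(X_unique, y_rep)

print(nc_with_repeats, nc_without_repeats)
```

In this toy data, a fifth of the variance in y lies within repetitions, so the sketch gives a ceiling of 0.8; with no repetitions the residual is zero and the ceiling is 1, matching the behaviour described above.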

API

The noiseceiling package contains two main functions, compute_nc_classification and compute_nc_regression, which (as their names suggest) compute the noise ceiling for classification models (the dependent variable is categorical) and regression models (the dependent variable is continuous), respectively.
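For the categorical case, the same reasoning can be sketched with accuracy as the metric: within each group of repeated observations, the best a model can do is predict the most frequent label. The function name `sketch_nc_classification` and the choice of accuracy are assumptions for illustration; the metric used by compute_nc_classification may differ.

```python
import pandas as pd


def sketch_nc_classification(X: pd.DataFrame, y: pd.Series) -> float:
    """Illustrative only: accuracy of predicting the modal label per repeated row."""
    keys = [X[col] for col in X.columns]
    # Most frequent label within each group of identical X rows
    # (ties are broken by taking the smallest label).
    best_pred = y.groupby(keys).transform(lambda s: s.mode().iloc[0])
    return float((y == best_pred).mean())


# One observation repeated three times (labels disagree), one repeated twice.
X_cls = pd.DataFrame({"a": [0, 0, 0, 1, 1]})
y_cls = pd.Series([1, 1, 0, 0, 0])
nc_accuracy = sketch_nc_classification(X_cls, y_cls)
print(nc_accuracy)
```

Here one of the five labels contradicts the majority label for its observation, so no classifier that sees only X can score above 0.8 accuracy on this data.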

Note that both functions expect two (mandatory) arguments:

X: a pandas DataFrame with predictors/features in columns and observations in rows;
y: a pandas Series with observations in rows
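As a rough sketch of what such inputs might look like, the snippet below builds a small X and y and counts how many rows take part in a repetition. The column names and values are invented, and the commented-out call assumes the function can be imported from the package's top level, which the text above does not confirm.

```python
import pandas as pd

# Predictors: one row per observation, one column per feature (made-up data).
X = pd.DataFrame({
    "stim_a": [0, 0, 1, 1, 2],
    "stim_b": [1, 1, 0, 0, 1],
})
# Target: one value per observation, in the same row order as X.
y = pd.Series([3.0, 3.4, 1.0, 1.2, 2.0], name="rating")

# Rows that share all predictor values with at least one other row.
n_repeated = int(X.duplicated(keep=False).sum())
print(f"{n_repeated} of {len(X)} rows are repeated observations")

# Assumed import path and call, shown for orientation only:
# from noiseceiling import compute_nc_regression
# nc = compute_nc_regression(X, y)
```

Checking the number of repeated rows first is a cheap way to see whether the resulting noise ceiling can be informative at all.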

In addition, you can estimate the variability of the noise ceiling using the run_bootstraps_nc function, which accepts an additional keyword argument, classification (a bool), indicating whether a categorical (True) or continuous (False) noise ceiling should be estimated.
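The general idea behind a bootstrap estimate of variability can be sketched as follows: resample observations with replacement, recompute the ceiling on each resample, and inspect the spread. Everything here (`r2_ceiling`, `sketch_bootstrap_nc`, `n_bootstraps`, `seed`) is hypothetical; the resampling scheme and signature of run_bootstraps_nc may well differ.

```python
import numpy as np
import pandas as pd


def r2_ceiling(X: pd.DataFrame, y: pd.Series) -> float:
    # Illustrative ceiling: R^2 of predicting the mean of y per repeated row.
    keys = [X[col] for col in X.columns]
    best_pred = y.groupby(keys).transform("mean")
    return float(1.0 - ((y - best_pred) ** 2).sum() / ((y - y.mean()) ** 2).sum())


def sketch_bootstrap_nc(X, y, n_bootstraps=200, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    estimates = np.empty(n_bootstraps)
    for i in range(n_bootstraps):
        # Draw n row positions with replacement and rebuild X and y from them.
        idx = rng.integers(0, n, size=n)
        X_b = X.iloc[idx].reset_index(drop=True)
        y_b = y.iloc[idx].reset_index(drop=True)
        estimates[i] = r2_ceiling(X_b, y_b)
    return estimates


# Three distinct observations, each repeated three times (made-up data).
X_boot = pd.DataFrame({"a": [0, 0, 0, 1, 1, 1, 2, 2, 2]})
y_boot = pd.Series([1.0, 1.2, 0.8, 2.0, 2.3, 1.9, 3.1, 2.9, 3.0])

estimates = sketch_bootstrap_nc(X_boot, y_boot)
lower, upper = np.percentile(estimates, [2.5, 97.5])
print(f"95% interval of the sketched ceiling: [{lower:.3f}, {upper:.3f}]")
```

Note that naive row resampling can draw the same observation more than once, which makes it look like a perfectly consistent repetition; a more careful scheme might resample within or across repetition groups instead.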

Notes

Although the functions from this package work for any column type in X (integer, float, string), the noise ceiling functions are about an order of magnitude faster when all columns are of a numeric type (i.e., no strings). Therefore, I'd recommend encoding string-based values (e.g., category levels) as integers.
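One way to do this encoding with pandas is sketched below, using made-up column names. The only property that matters is that the mapping is one-to-one per column, so rows that were identical before encoding stay identical afterwards.

```python
import pandas as pd

X_raw = pd.DataFrame({
    "emotion": ["happy", "sad", "happy", "angry"],  # string-valued category levels
    "intensity": [1, 2, 1, 3],                      # already numeric
})

X_encoded = X_raw.copy()
for col in X_encoded.columns:
    if not pd.api.types.is_numeric_dtype(X_encoded[col]):
        # factorize assigns one integer code per distinct value,
        # in order of first appearance.
        codes, levels = pd.factorize(X_encoded[col])
        X_encoded[col] = codes

print(X_encoded)
```

Keep the `levels` returned by `pd.factorize` if you need to map the integer codes back to the original strings later.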


lukassnoek/noiseceiling
