`plenoptic` is a python library of tools to help researchers better understand their models. We welcome and encourage contributions from everyone!
First, please check out the Code of Conduct and read it before going any further. You may also want to check out the main page of the documentation for a longer overview of the project and how to get everything installed, as well as pointers for further reading, depending on your interests.
If you encounter any issues with `plenoptic`, first search the existing issues and discussions to see if there's already information to help you. If not, please open a new issue! We have a template for bug reports, and following it (so you provide the necessary details) will make solving your problem much easier.
If you'd like to help improve `plenoptic`, there are many ways you can contribute, from improving documentation to writing code. For those not already familiar with the project, it is very helpful for us if you go through the tutorials, README, and documentation and let us know if anything is unclear, what needs more detail (or clearer writing), etc. For those who want to contribute code, we also have many issues that we are working through. If you would like to work on one of those, please give it a try!
In order to submit changes, create a branch or fork of the project, make your changes, add documentation and tests, and submit a Pull Request. See contributing to the code below for more details on this process.
We try to keep all our communication on Github, and we use several channels:
- Discussions is the place to ask usage questions, discuss issues too broad for a single issue, or show off what you've made with plenoptic.
- If you've come across a bug, open an issue.
- If you have an idea for an extension or enhancement, please post in the ideas section of discussions first. We'll discuss it there and, if we decide to pursue it, open an issue to track progress.
`plenoptic` tries to follow SPEC-0: python versions are supported for 3 years following their initial release. This means that we support three python feature versions (e.g., 3.10, 3.11, and 3.12) at any one time, and that we'll transition between versions during the fourth quarter of each year. We run our CPU tests on all three versions, and the GPU tests and documentation build use the middle version.
We welcome contributions to `plenoptic`! In order to contribute, please create your own branch, make sure the tests pass, and open a Pull Request. We follow the GitHub Flow workflow: no one is allowed to push to the `main` branch, all development happens in separate feature branches (each of which, ideally, implements a single feature, addresses a single issue, or fixes a single problem), and these get merged into `main` once we have determined they're ready. Then, after enough changes have accumulated, we put out a new release, adding a new tag which increments the version number, and uploading the new release to PyPI (see releases for more details).
In addition to the information that follows, Github (unsurprisingly) has good information on this workflow, as does the Caiman package (though note that Caiman uses git flow, which involves a separate develop branch in addition to main).
Before we begin: everyone finds `git` confusing the first few (dozen) times they encounter it! And even people with a hard-won understanding frequently look up information on how it works. If you find the following difficult, we're happy to help walk you through the process. Please post on our GitHub discussions page to get help.
You'll need a local installation of `plenoptic` which keeps up-to-date with any changes you make. To do so, you will need to fork and clone `plenoptic`:

- Go to the plenoptic repo and click on the `Fork` button at the top right of the page. This creates a copy of plenoptic in your Github account.
- You should then clone your fork to your local machine and create an editable installation. To do so, follow the instructions for an editable install found in our docs, replacing `git clone https://github.com/plenoptic-org/plenoptic.git` with `git clone https://github.com/<YourUserName>/plenoptic.git`.
- Add the `upstream` remote: `git remote add upstream https://github.com/plenoptic-org/plenoptic.git`. At this point, you have two remotes: `origin` (your fork) and `upstream` (the canonical version). You won't have permission to push to `upstream` (only `origin`), but this makes it easy to keep your `plenoptic` up to date with the canonical version by pulling from upstream: `git pull upstream`.
You should probably also install all the optional dependencies, so that you can run tests, build the documentation, and run the jupyter notebooks locally. To do so, run `pip install -e ".[docs,dev,nb]"` from within the copy of `plenoptic` on your machine (see this page of our documentation for information on how to set up jupyter if you don't want an extra copy of it in this environment).
As discussed above, each feature in `plenoptic` is worked on in a separate branch. This allows us to have multiple people developing multiple features simultaneously, without interfering with each other's work. To create your own branch, run the following from within your `plenoptic` directory:
# switch to the main branch of your fork
git checkout main
# update your local main branch from your fork
git pull origin main
# pull any changes from the canonical version into your local main
git pull upstream main
# update your fork's main branch with those upstream changes
git push origin main
# create and switch to the new branch
git checkout -b my_cool_branch
Then, create new changes on this branch and, when you're ready, add and commit them:
# stage the changes
git add src/plenoptic/the_file_you_changed.py
# commit your changes
git commit -m "A helpful message explaining my changes"
# push to the origin remote
git push origin my_cool_branch
If you aren't comfortable with `git add`, `git commit`, and `git push`, we recommend the Software Carpentry git lesson.
You can make any number of changes on your branch. Once you're happy with your changes, add tests to check that they run correctly and add documentation, then make sure the existing tests all run successfully and that your branch is up-to-date with main, and then open a pull request by clicking on the big `Compare & pull request` button that appears at the top of your fork after pushing to your branch (see here for a tutorial).
Your pull request should include information on what you changed and why, referencing any relevant issues or discussions, and highlighting any portion of your changes where you have lingering questions (e.g., "was this the right way to implement this?") or want reviewers to pay special attention. You can look at previous closed pull requests to see what this looks like.
At this point, we will be notified of the pull request and will read it over. We will try to give an initial response quickly, and then do a longer in-depth review, at which point you will probably need to respond to our comments, making changes as appropriate. We'll then respond again, and proceed in an iterative fashion until everyone is happy with the proposed changes. This process can take a while! (The more focused your pull request, the less time it will take.)
If your changes are integrated, you will be added as a Github contributor and as one of the authors of the package. Thank you for being part of `plenoptic`!
We use Ruff for linting and formatting our Python code to maintain a consistent code style and catch potential errors early. We run ruff as part of our CI and non-compliant code will not be merged!
Ruff is a fast and comprehensive Python formatter and linter that checks for common style and code quality issues. It combines multiple tools (black, Pyflakes, pycodestyle, isort, and other linting rules) into one efficient tool; the rules we use are specified in `pyproject.toml`. Before submitting your code, make sure to run Ruff to catch any issues. See other sections of this document for how to use `nox` and `pre-commit` to simplify this process.
Ruff has two components, a formatter and a linter. Formatters and linters are both static analysis tools, but formatters "quickly check and reformat your code for stylistic consistency without changing the runtime behavior of the code", while linters "detect not just stylistic inconsistency but also potential logical bugs, and often suggest code fixes" (per GitHub's readme project). There are many choices of formatters and linters in python; ruff aims to combine the features of many of them while being very fast.
For both the formatter and the linter, you can run ruff without any additional arguments; our configuration options are stored in the `pyproject.toml` file and so don't need to be specified explicitly.
`ruff format` is the primary entrypoint to the formatter. It accepts a list of files or directories, and formats all discovered Python files:
ruff format # Format all files in the current directory.
ruff format path/to/code/ # Format all files in `path/to/code` (and any subdirectories).
ruff format path/to/file.py # Format a single file.
For the full list of supported options, run `ruff format --help`.
To run Ruff on your code:
ruff check .
It'll then tell you which lines are violating linting rules and may note that some errors are automatically fixable. To automatically fix linting errors, run:

ruff check --fix .

Be careful with unsafe fixes! Safe fixes are symbolized with the tools emoji and are listed here.
In some cases, it may be acceptable to suppress lint errors, for example when too-long lines (code `E501`) are desired because otherwise a URL might not be readable anymore. These ignores will be evaluated on a case-by-case basis. You can suppress an error by adding the following comment to the end of the line:
too_long = "This line is tooooooo long."  # noqa: E501
If you want to suppress an error across an entire file, do this at the top of the file:
# ruff: noqa: E501
# Below is my python script...
# Any line living in this file can now be as long as it wants!
In some cases, you may want to not only suppress the error message a linter throws but actually disable a linting rule. An example might be when the import order matters and running `isort` would mess with it. In these cases, you can introduce an action comment like the following, so that ruff does not sort these packages alphabetically:
import numpy as np # isort: skip
import my_package as mp # isort: skip
For more details, refer to the documentation.
- Longer, descriptive names are preferred (e.g., `x` is not an appropriate name for a variable), especially for anything user-facing, such as methods, attributes, or arguments.
- Any public method or function must have a complete type-annotated docstring (see below for details). Hidden ones do not need to have complete docstrings, but they probably should.
Pre-Commit Hooks: Identifying simple issues before submission to code review (and how to ignore those)
Pre-commit hooks are useful for checking that all the linting and formatting rules (see Ruff above) are honored before committing. That is, when you commit, the pre-commit hooks run and auto-fix issues where possible (e.g., trailing whitespace). You then need to `git add` the affected files again if you want these changes to be included in your commit. If a problem is not automatically fixable, you will need to manually update your code before you are able to commit.
Using pre-commit is optional. We use pre-commit.ci to run pre-commit as part of PRs (auto-fixing wherever possible), but it may simplify your life to integrate pre-commit into your workflow.
In order to use pre-commit, you must install the pre-commit
package into your development environment, and then install the hooks:
pip install pre-commit
pre-commit install
See pre-commit docs for more details.
After installation, should you want to ignore pre-commit hooks for some reason (e.g., because you have to run to a meeting and so don't have time to fix all the linting errors, but still want your changes to be committed), you can add the `--no-verify` flag to your commit command like this:

git commit -m "<my commit message>" --no-verify
In addition to the above, see the documentation for a description of models and synthesis objects. Any new models or synthesis objects will need to meet the requirements outlined in those pages.
We create releases on Github, deploy on / distribute via pypi, and try to follow semantic versioning:
Given a version number MAJOR.MINOR.PATCH, increment the:
- MAJOR version when you make incompatible API changes
- MINOR version when you add functionality in a backward compatible manner
- PATCH version when you make backward compatible bug fixes
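For example, starting from version `1.2.3`: an incompatible API change would take us to `2.0.0`, a new backward-compatible feature to `1.3.0`, and a backward-compatible bug fix to `1.2.4`.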
When doing a new release, the following steps must be taken:
- In a new PR:
  - Update all the binder links, which are of the form `https://mybinder.org/v2/gh/plenoptic-org/plenoptic/X.Y.Z?filepath=examples` and are found in `README.md`, `index.rst`, `examples/README.md`, and some of the tutorial notebooks found in `examples/`. Note that the version tag must match the github tag (specified in the next step) or the link won't work.
- After merging the above PR into the `main` branch, create a Github release with a new tag matching that used in the binder links above: `X.Y.Z`. Creating the release will trigger the deployment to pypi via our `deploy` action (found in `.github/workflows/deploy.yml`). The built version will grab the version tag from the Github release, using setuptools_scm.
Note that we have been unable to find a way to make binder use the latest github release tag directly (or to make binder use a `latest` tag), so make sure the binder links and the github tag match!
Before running tests locally, you'll need ffmpeg installed on your system, as well as the `dev` optional dependencies (i.e., you should run `pip install -e ".[dev]"` from within your local copy of `plenoptic`).
To run all tests, run `pytest tests/` from the main `plenoptic` directory. This will take a while, as we have many tests, broken into categories. There are several choices for how to run a subset of the tests:
- Run tests from one file: `pytest tests/test_mod.py`
- Run tests by keyword expressions: `pytest -k "MyClass and not method"`. This will run tests which contain names that match the given string expression, which can include Python operators that use filenames, class names, and function names as variables. The example above will run `TestMyClass.test_something` but not `TestMyClass.test_method_simple`.
- Run a specific test within a module: `pytest tests/test_mod.py::test_func`
- Another example specifying a test method in the command line: `pytest test_mod.py::TestClass::test_method`
View the pytest documentation for more info.
This section is optional, but describes how to easily run tests in an isolated environment using the nox command-line tool.
`nox` is installed automatically as a `[dev]` dependency of plenoptic.
To run all tests and linters through `nox`, execute the following command from the root folder of the plenoptic package:

nox

`nox` will read the configuration from the `noxfile.py` script.
If you only want to run an individual session (e.g., lint or test), you can first check which sessions are available with the following command:
nox -l
Then you can use `nox -s <your_nox_session>` to run the session of your choice.
Here are some examples:

- To run just the tests: `nox -s tests`
- To run only the linters: `nox -s linters`
- To test only the coverage: `nox -s coverage`
`nox` offers a variety of configuration options; you can learn more about them from the documentation.
Note that nox works particularly well with pyenv, discussed later in this file, which makes it easy to install the multiple python versions used in testing.
Sometimes, before opening a pull request that will trigger the `.github/workflow/ci.yml` continuous integration workflow, you may want to test your changes over all the supported python versions locally.
Handling multiple installed python versions on the same machine can be challenging and confusing.
pyenv is a great tool that really comes to the rescue. Note that `pyenv` just handles python versions --- virtual environments have to be handled separately, using `pyenv-virtualenv`!
This tool doesn't come with the package dependencies and has to be installed separately. Installation instructions are system-specific, but the package readme is very detailed; see here.
Carefully follow the instructions to configure pyenv after installation.
Once you have the package installed and configured, you can install multiple python versions through it. First, get a list of the available versions with the command:
pyenv install -l
Install the python versions you need. For this example, let's assume we want python `3.10.11` and python `3.11.8`:
pyenv install 3.10.11
pyenv install 3.11.8
You can check which python version is currently set as default by typing:
pyenv which python
And you can list all available versions of python with:
pyenv versions
If you want to run `nox` on multiple python versions, all you need to do is:

- Set your desired versions as `global`: `pyenv global 3.11.8 3.10.11`. This will make both versions available, and the default python will be set to the first one listed (`3.11.8` in this case).
- Run nox, specifying the python version as an option:
nox -p 3.10
Note that `noxfile.py` lists the available options as keyword arguments in a session-specific manner.
As mentioned earlier, if you have multiple python versions installed, we recommend you manage your virtual environments through `pyenv` using the `pyenv-virtualenv` extension. This tool works with most environment managers, including `venv` and `conda`.
Creating an environment with it is as simple as calling:
pyenv virtualenv my-python my-environment
Here, `my-python` is the python version (one of those listed by `pyenv versions`), and `my-environment` is your new environment name. If `my-python` has `conda` installed, this will create a conda environment; if not, it will use `venv`.
You can list only the virtual environments with:
pyenv virtualenvs
And you can uninstall an environment with:
pyenv uninstall my-environment
New tests can be added in any of the existing `tests/test_*.py` scripts. Tests should be functions, contained within classes. Each class contains a bunch of related tests (e.g., for metamers or metrics), and each test should ideally be a unit test, only testing one thing. The classes should be named `TestSomething`, while test functions should be named `test_something`, in snakecase.
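For example, a new test might look like the following (the class, test, and its contents here are purely illustrative):

```python
class TestMetrics:
    """Group a bunch of related tests together."""

    def test_something(self):
        # each test should ideally check a single thing
        assert int("3") == 3
```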
If you're adding a substantial bunch of tests that are separate from the existing ones, you can create a new test script. Its name must begin with `test_`, it must have a `.py` extension, and it must be contained within the `tests` directory. Assuming you do that, our github actions will automatically find it and add it to the tests-to-run.
We use `jupyter execute` to test our notebooks and make sure everything runs. You can run it locally to try and debug some errors (though errors that result from environment issues obviously will be harder to figure out locally); `jupyter execute` is part of the standard `jupyter` install as long as you have `nbclient>=0.5.5`.
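For example, to check a single notebook locally, you can run something like `jupyter execute examples/my_tutorial.ipynb` (the notebook path here is illustrative).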
Similar to adding new test scripts, you don't need to explicitly add new tutorials to `ci.yml` to be tested: as long as your notebook is in the `examples/` directory and has an `.ipynb` extension, our github actions will automatically find it and test it.
If your notebook needs additional files to run, you should add a conditional job to download them. If you need to upload them, we recommend uploading a tarball to the Open Science Framework; they can then be downloaded using `wget` and extracted. See `Download TID2013 dataset` in `ci.yml` for an example. We have a single OSF project containing all the files we've uploaded and will probably want to include yours there as well.
If your notebook takes more than ~10 minutes on a github runner, you should find a way to reduce its runtime for the tests. The goal of the tests is only to check that each cell runs successfully. For example, the Portilla-Simoncelli texture model notebook runs several metamer syntheses to completion. This allows the user to better understand how the model works and confirm that we are able to reproduce the paper, as well as serving as a convenient way for the developers to ensure that we maintain this over time. However, the tests are only intended to ensure that everything runs, so we can reduce the number of iterations those metamer instances run for. We do this using papermill, which requires several steps:
- Add a cell to the top of the notebook (under the import cell), add the parameters tag to it (see the papermill documentation), and create a variable for each synthesis duration (e.g., `vgg16_synth_max_iter = 1000`); see the sketch after this list.
- Where synthesis is called later in the notebook, replace the number with the variable (e.g., `metamer.synthesize(max_iter=vgg16_synth_max_iter)`).
- Add a conditional job to `ci.yml` for your notebook which installs papermill and calls it with the syntax `papermill ${{ matrix.notebook }} ${{ matrix.notebook }}_output.ipynb -p PARAM1 VAL1 -p PARAM2 VAL2 -k python3 --cwd examples/`, replacing `PARAM1 VAL1` and `PARAM2 VAL2` as appropriate (e.g., `vgg16_synth_max_iter 10`; note that you need a `-p` for each parameter and you should change nothing else about that line). See the block with `if: ${{ matrix.notebook == 'examples/Demo_Eigendistortion.ipynb' }}` for an example.
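For example, the tagged cell at the top of the notebook might look something like this (the variable name and value here are illustrative):

```python
# This cell is tagged "parameters", so that papermill can override the values
# defined here at test time (e.g., with -p vgg16_synth_max_iter 10).
vgg16_synth_max_iter = 1000
```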
A similar procedure could be used to reduce the size of an image, or for any other change that reduces the total time necessary to run a notebook.
If you have many variants on a test you wish to run, you should probably make use of pytest's `parametrize` mark. There are many examples throughout our existing tests (and see the official pytest docs), but the basic idea is that you write a function that takes an argument and then use the `@pytest.mark.parametrize` decorator to show pytest how to iterate over the arguments. For example, instead of writing:
def test_basic_1():
    assert int('3') == 3

def test_basic_2():
    assert int('5') == 5
You could write:
@pytest.mark.parametrize('a', ['3', '5'])
def test_basic(a):
    if a == '3':
        test_val = 3
    elif a == '5':
        test_val = 5
    assert int(a) == test_val
This starts to become very helpful when you have multiple arguments you wish to iterate over in this manner.
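You can also stack multiple `parametrize` decorators to run a test over the full product of the arguments. A minimal sketch (not taken from our test suite):

```python
import pytest


@pytest.mark.parametrize("a", [3, 5])
@pytest.mark.parametrize("b", [2, 4])
def test_multiplication_commutes(a, b):
    # runs four times, once per (a, b) combination
    assert a * b == b * a
```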
If you are using an object that gets used in multiple tests (such as an image or model), you should make use of fixtures to avoid having to load or initialize the object multiple times. Look at `conftest.py` to see the fixtures available for all tests, or you can write your own (though pay attention to the scope), as in the sketch below.
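Writing your own might look something like the following (this fixture is hypothetical, not one of those in `conftest.py`):

```python
import pytest
import torch


@pytest.fixture(scope="session")
def white_noise_img():
    # created once per test session and shared by every test that requests it
    return torch.rand(1, 1, 256, 256)
```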
For example, `conftest.py` contains several images that you can use for your tests, such as `basic_stimuli`, `curie_img`, or `color_img`. To use them, simply add them as arguments to your function:
def test_img(curie_img):
    img = po.load_images('data/curie.pgm')
    assert torch.allclose(img, curie_img)
WARNING: If you're using fixtures, make sure you don't modify them in your test (or you reset them to their original state at the end of the test). The fixture is a single object that will get reused across tests, so modifying it will lead to unexpected behaviors in other tests depending on which tests were run and their execution order.
You can combine fixtures and parameterization, which is helpful when you want to test multiple models with a synthesis method, for example. This is slightly more complicated and relies on pytest's indirect parametrization (and requires `pytest>=5.1.2` to work properly). For example, `conftest.py` has a fixture, `model`, which accepts a string and returns an instantiated model on the right device. Use it like so:
@pytest.mark.parametrize('model', ['SPyr', 'LNL'], indirect=True)
def test_synth(curie_img, model):
    met = po.synth.Metamer(curie_img, model)
    met.synthesize()
This test will be run twice, once with the steerable pyramid model and once with the Linear-Nonlinear model. See the `get_model` function in `conftest.py` for the available strings. Note that unlike in the simple parametrize example, we add the `indirect=True` argument here. If we did not include that argument, `model` would just be the strings `'SPyr'` and `'LNL'`!
The amount and form of documentation that needs to be added alongside a change depends on the size of the submitted change. For a significant change (a new model or synthesis method), please include a new tutorial notebook that walks through how to use it. For enhancements of existing methods, you can probably just modify the existing tutorials and add documentation. If unsure, ask!
Documentation in `plenoptic` is built using Sphinx on some of Flatiron's Jenkins runners and hosted on GitHub pages. If that means nothing to you, don't worry! Documentation comes in two types: `.rst` files (reST, the markup language used by Sphinx; see here for a primer), which contain only text (including math) and images, and `.ipynb` files (Jupyter notebooks), which also contain code.
Jupyter notebooks are tutorials that show how to use the various functions and classes contained in the package, and they should all be located in the `examples/` directory. If you add or change a substantial amount of code, please add a tutorial showing how to use it. Once you've added a tutorial, see here for how to include it in the Sphinx site.
reST files contain everything else, especially discussions about why you should use some code in the package and the theory behind it, and should all be located in the `docs/` directory. Add each new file to the table of contents in `index.rst` by adding the name of the file (without extension, but with any subfolders) to the `toctree` block. For example, if you add two new files, `docs/my_cool_docs.rst` and `docs/some_folder/even_cooler_docs.rst`, you should edit `docs/index.rst` like so:
.. toctree::
   :maxdepth: 2

   # other existing docs
   my_cool_docs
   some_folder/even_cooler_docs
In order for the table of contents to look good, your `.rst` file must be well structured. Similar to tutorials, it must have a single H1 header (you can have as many sub-headers as you'd like).
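For example, the skeleton of a well-structured file looks something like this (the titles here are illustrative):

```rst
My Cool Docs
============

A Sub-Section
-------------
```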
You should build the docs yourself to ensure it looks correct before pushing.
You can include images in `.rst` files in the documentation as well. Simply place them in the `docs/images/` folder and use the `figure` role, e.g.:
.. figure:: images/path_to_my_image.svg
   :figwidth: 100%
   :alt: Alt-text describing my image.

   Caption describing my image.
To refer to it directly, you may want to use the numref role (which has been enabled for our documentation).
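For example, you might add a label above the figure and then reference it elsewhere in the text (the label name here is illustrative):

```rst
.. _fig-my-image:

.. figure:: images/path_to_my_image.svg
   :figwidth: 100%
   :alt: Alt-text describing my image.

   Caption describing my image.

As shown in :numref:`fig-my-image`, ...
```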
If you have plots or other images generated by code that you wish to include, you can include them in the file directly without either saving the output in `docs/images/` or turning the page into a notebook. This is useful if you want to show something generated by code but the code itself isn't the point. We handle this with the matplotlib plot directive (which has already been enabled). Add a python script to `docs/scripts/` and write a function that creates the matplotlib figure you wish to display. Then, in your documentation, add:
.. plot:: scripts/path_to_my_script.py my_useful_plotting_function

   Caption describing what is in my plot.
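The script itself just needs to define the function named in the directive. A minimal sketch of what such a script might contain:

```python
import matplotlib.pyplot as plt
import numpy as np


def my_useful_plotting_function():
    # create the figure that the plot directive will render
    x = np.linspace(0, 2 * np.pi, 100)
    fig, ax = plt.subplots()
    ax.plot(x, np.sin(x))
    ax.set(xlabel="x", ylabel="sin(x)")
```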
Similar to figures, you can use `numref` to refer to plots as well.
All public-facing functions and classes should have complete docstrings, which start with a one-line short summary of the function, a medium-length description of the function / class and what it does, and a complete description of all arguments and return values. Math should be included in a `Notes` section when necessary to explain what the function is doing, and references to primary literature should be included in a `References` section when appropriate.
Docstrings should be relatively short, providing the information necessary for a
user to use the code. Longer discussions of why you would use one method over
another, an explanation of the theory behind a method, and extended examples
should instead be part of the tutorials or documentation.
Private functions and classes should have sufficient explanation that other developers know what the function / class does and how to use it, but do not need to be as extensive.
We follow the numpydoc conventions for docstring structure.
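For example, a complete type-annotated docstring might look like the following sketch (the function itself is purely illustrative, not part of plenoptic's API):

```python
from torch import Tensor


def rescale(image: Tensor, vmin: float = 0.0, vmax: float = 1.0) -> Tensor:
    r"""Linearly rescale an image to lie within a given range.

    Rescale the values of ``image`` so that their minimum is ``vmin`` and
    their maximum is ``vmax``.

    Parameters
    ----------
    image
        The image to rescale.
    vmin
        The minimum value of the output.
    vmax
        The maximum value of the output.

    Returns
    -------
    Tensor
        The rescaled image.

    Notes
    -----
    The output is computed as
    :math:`v_{min} + (v_{max} - v_{min}) \frac{x - \min(x)}{\max(x) - \min(x)}`.
    """
    rescaled = (image - image.min()) / (image.max() - image.min())
    return vmin + rescaled * (vmax - vmin)
```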
NOTE: If you just want to read the documentation, you do not need to do this; documentation is built automatically, pushed to the plenoptic-documentation github repo and published at http://docs.plenoptic.org/.
However, it can be built locally as well. You would do this if you've made changes locally to the documentation (or the docstrings) that you would like to examine before pushing. The virtual environment required to do so is defined in `docs/environment.yml`, so to create that environment and build the docs, do the following from the project's root directory:
# install sphinx and required packages to build documentation
conda env create -f docs/environment.yml
# activate the environment
conda activate plenoptic_docs
# install plenoptic
pip install -e ".[docs]"
# build documentation
cd docs/
make html
The index page of the documentation will then be located at `docs/_build/html/index.html`; open it in your browser to navigate around.
The `plenoptic_docs` environment you're creating contains the `sphinx` package and several extensions for it that are required to build the documentation. You also need to install `plenoptic` from your local version so that `sphinx` can import the library and grab all of the docstrings (you're installing the local version so you can see all the changes you've made).
And then whenever you want to recreate / update your local documentation, run:
conda activate plenoptic_docs
cd docs/
make html
We build tutorials as Jupyter notebooks so that they can be launched in Binder and people can play with them on their local machine. In order to include them in the built docs, add an `nblink` file to the `docs/tutorials/` directory or one of its sub-directories. We check for this during the tests, so you won't be able to merge your pull request into `main` unless you've done this!
This is a json file that should contain the path to the notebook; for example, `docs/tutorials/my_awesome_tutorial.nblink` would look like:
{
    "path": "../../examples/my_tutorial.ipynb"
}
Note that you cannot have a trailing comma there, because json is very particular. And note that the number of `../` you need will depend on whether the `nblink` file lives in `docs/tutorials/` or one of its sub-directories, as shown below.
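For example, if the `nblink` file instead lives one level deeper, in a sub-directory such as `docs/tutorials/intro/` (the sub-directory name here is illustrative), you need one more `../`:

```
{
    "path": "../../../examples/my_tutorial.ipynb"
}
```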
If you have extra media (such as images) that are rendered in the notebook, you need to specify them as well, otherwise they won't render in the documentation:
{
    "path": "../../examples/my_tutorial.ipynb",
    "extra-media": ["../../examples/assets/my_folder"]
}
Note that `extra-media` must be a list, even with a single path.
See the nbsphinx-link page for more details.
The location of the `.nblink` file (in `docs/tutorials`, `docs/tutorials/intro`, etc.) determines which of the sub-headings it appears under in the table of contents. View the `toctree` directives at the bottom of `index.rst` to see which subfolders correspond to which sub-headings.
NOTE: In order for the `toctree` formatting to work correctly, your notebook should have exactly one H1 title (i.e., one line starting with a single `#`), but you can have as many lower-level titles as you'd like. If you have multiple H1 titles, they'll each show up as different tutorials. If you don't have an H1 title, the notebook won't show up at all.