Add extractor for BioFormats (CXD) #338
Draft
weiglszonja wants to merge 24 commits into main from add_bioformats
+618 −0
Changes from 23 commits
Commits (all by weiglszonja):
d6a7fed add base extractor class for BioFormats
5da3988 add CxdImagingExtractor
871c371 add aicsimageio to requirements-full.txt
69b99d9 add utils
894dfef renamed module
3722f6d move import utils locally
663d4be try add to api docs
eb33c30 rename module to bioformatsimagingextractors
aa7c97f try again
641d871 add BioFormatsImagingExtractor to api docs
224c78d sampling frequency as optional argument
bf1b80d update parsed_metadata with sampling frequency
cb55905 Merge branch 'main' into add_bioformats
45065f8 add first tests
29e3385 Merge branch 'main' into add_bioformats
46c0672 try skip tests when bioformats dependencies are missing
3133de2 fix import
baf9123 remove JAVA_HOME environment variable setting
635cfe1 try add new workflow for testing BioFormats
ff54c9a add run-bioformats-tests.yml to deploy_pr_tests.yaml
48ddfa8 try install dependencies directly in yml first
376a3ef try activate conda env in every step
d3c5d14 try activate conda env in every step
309da64 Update requirements-full.txt
@@ -0,0 +1,69 @@
name: BioFormats Tests

on:
  workflow_dispatch:
  workflow_call:
    secrets:
      CODECOV_TOKEN:
        required: true

jobs:
  test:
    runs-on: ubuntu-latest

    env:
      CONDA_ENV: test

    steps:
      - uses: actions/checkout@v4
      - run: git fetch --prune --unshallow --tags

      - name: Setup Miniconda
        uses: conda-incubator/setup-miniconda@v2
        with:
          auto-activate-base: false
          python-version: 3.9

      - name: Global Setup
        run: |
          conda activate $CONDA_ENV
          pip install -U pip
          pip install pytest-xdist
          git config --global user.email "[email protected]"
          git config --global user.name "CI Almighty"
          pip install wheel==0.41.2 # needed for scanimage

      - name: Install full requirements
        run: |
          conda activate $CONDA_ENV
          pip install .[test]
          pip install -e .[full]

      - name: Install BioFormats requirements
        run: |
          conda activate $CONDA_ENV
          pip install "aicsimageio>=4.14.0" # quoted so the shell does not treat ">" as a redirect
          conda install -c conda-forge bioformats_jar

      - name: Set JAVA_HOME
        run: |
          conda activate $CONDA_ENV
          if [ -z "${JAVA_HOME}" ]; then
            echo "JAVA_HOME=$CONDA_PREFIX" >> $GITHUB_ENV
          fi

      - name: Get ophys_testing_data current head hash
        id: ophys
        run: echo "HASH_OPHYS_DATASET=$(git ls-remote https://gin.g-node.org/CatalystNeuro/ophys_testing_data.git HEAD | cut -f1)" >> $GITHUB_OUTPUT

      - name: Cache ophys dataset - ${{ steps.ophys.outputs.HASH_OPHYS_DATASET }}
        uses: actions/cache@v4
        id: cache-ophys-datasets
        with:
          path: ./ophys_testing_data
          key: ophys-datasets-042023-${{ runner.os }}-${{ steps.ophys.outputs.HASH_OPHYS_DATASET }}

      - name: Run BioFormats tests
        run: |
          conda activate $CONDA_ENV
          pytest tests/test_cxdimagingextractor.py -n auto --dist loadscope
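One of the commits above tries to skip the tests when the Bio-Formats dependencies are missing. A minimal sketch of such a guard follows, using only the standard library. The helper names `missing_bioformats_dependencies` and `java_home_is_set` are hypothetical and not part of this PR; the package names are the ones installed by the workflow.

```python
import importlib.util
import os

# Top-level package names taken from the workflow's install step.
BIOFORMATS_PACKAGES = ("aicsimageio", "bioformats_jar")


def missing_bioformats_dependencies(packages=BIOFORMATS_PACKAGES):
    """Return the names of the packages that cannot be found for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


def java_home_is_set(environ=None):
    """Return True when JAVA_HOME is defined and non-empty."""
    environ = os.environ if environ is None else environ
    return bool(environ.get("JAVA_HOME"))


if __name__ == "__main__":
    missing = missing_bioformats_dependencies()
    if missing:
        print(f"Skipping Bio-Formats tests, missing packages: {missing}")
    elif not java_home_is_set():
        print("JAVA_HOME is not set; the JVM may not be found.")
    else:
        print("Bio-Formats dependencies look available.")
```

In a pytest module, the result of `missing_bioformats_dependencies()` could feed a `pytest.mark.skipif` condition; whether the PR's tests do it this way is not visible here.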
7 changes: 7 additions & 0 deletions
docs/source/api/imaging_extractors/bioformatsimagingextractors.rst
@@ -0,0 +1,7 @@
BioFormatsImagingExtractor
--------------------------
.. automodule:: roiextractors.extractors.bioformatsimagingextractors.bioformatsimagingextractor

CxdImagingExtractor
-------------------
.. automodule:: roiextractors.extractors.bioformatsimagingextractors.cxdimagingextractor
18 changes: 18 additions & 0 deletions
src/roiextractors/extractors/bioformatsimagingextractors/__init__.py
@@ -0,0 +1,18 @@
"""A collection of ImagingExtractors for reading files with Bio-Formats.

Modules
-------
bioformatsimagingextractor
    The base class for Bio-Formats imaging extractors.
cxdimagingextractor
    Specialized extractor for CXD files produced by Hamamatsu Photonics.

Classes
-------
BioFormatsImagingExtractor
    The base ImagingExtractor for Bio-Formats.
CxdImagingExtractor
    Specialized extractor for reading CXD files produced by Hamamatsu Photonics.
"""

from .cxdimagingextractor import CxdImagingExtractor
8 changes: 8 additions & 0 deletions
src/roiextractors/extractors/bioformatsimagingextractors/bioformats_env.yml
@@ -0,0 +1,8 @@
name: bioformats-environment
channels:
  - conda-forge
dependencies:
  - bioformats_jar
  - pip
  - pip:
    - aicsimageio>=4.14.0
87 changes: 87 additions & 0 deletions
src/roiextractors/extractors/bioformatsimagingextractors/bioformats_utils.py
@@ -0,0 +1,87 @@
from pathlib import Path


import numpy as np
import aicsimageio
from aicsimageio.formats import FORMAT_IMPLEMENTATIONS
from ome_types import OME

from ...extraction_tools import PathType


def check_file_format_is_supported(file_path: PathType):
    """
    Check if the file format is supported by BioformatsReader from aicsimageio.

    Raises ValueError if the file format is not supported.

    Parameters
    ----------
    file_path : PathType
        Path to the file.
    """
    bioformats_reader = "aicsimageio.readers.bioformats_reader.BioformatsReader"
    supported_file_suffixes = [
        suffix_name for suffix_name, reader in FORMAT_IMPLEMENTATIONS.items() if bioformats_reader in reader
    ]

    file_suffix = Path(file_path).suffix.replace(".", "")
    if file_suffix not in supported_file_suffixes:
        raise ValueError(f"File '{file_path}' is not supported by BioformatsReader.")


def extract_ome_metadata(
    file_path: PathType,
) -> OME:
    """
    Extract OME metadata from a file using aicsimageio.

    Parameters
    ----------
    file_path : PathType
        Path to the file.
    """
    check_file_format_is_supported(file_path)

    with aicsimageio.readers.bioformats_reader.BioFile(file_path) as reader:
        ome_metadata = reader.ome_metadata

    return ome_metadata


def parse_ome_metadata(metadata: OME) -> dict:
    """
    Parse metadata in OME format and store the relevant information under the standard keys for ImagingExtractors.

    Currently supports:
    - num_frames
    - sampling_frequency
    - num_channels
    - num_planes
    - num_rows (height of the image)
    - num_columns (width of the image)
    - dtype
    - channel_names

    """
    images_metadata = metadata.images[0]
    pixels_metadata = images_metadata.pixels

    sampling_frequency = None
    if pixels_metadata.time_increment is not None:
        sampling_frequency = 1 / pixels_metadata.time_increment

    channel_names = [channel.id for channel in pixels_metadata.channels]

    metadata_parsed = dict(
        num_frames=pixels_metadata.size_t,
        sampling_frequency=sampling_frequency,
        num_channels=pixels_metadata.size_c,
        num_planes=pixels_metadata.size_z,
        num_rows=pixels_metadata.size_y,
        num_columns=pixels_metadata.size_x,
        dtype=np.dtype(pixels_metadata.type.numpy_dtype),
        channel_names=channel_names,
    )

    return metadata_parsed
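To illustrate what `parse_ome_metadata` derives without needing a JVM, here is a sketch that uses stand-in dataclasses. `FakeChannel`, `FakePixels` and `parse_pixels` are hypothetical and only mimic the attribute names read above; they are not `ome_types` classes. The sketch also skips a zero `time_increment`, which the function above does not do, and assumes the increment is expressed in seconds.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeChannel:
    """Stand-in for an OME channel; only the id is used."""

    id: str


@dataclass
class FakePixels:
    """Stand-in exposing the attributes that the parser reads."""

    size_t: int
    size_c: int
    size_z: int
    size_y: int
    size_x: int
    time_increment: Optional[float] = None
    channels: List[FakeChannel] = field(default_factory=list)


def parse_pixels(pixels: FakePixels) -> dict:
    """Map pixel attributes onto the standard ImagingExtractor keys."""
    sampling_frequency = None
    if pixels.time_increment:  # skips both None and 0
        sampling_frequency = 1 / pixels.time_increment

    return dict(
        num_frames=pixels.size_t,
        sampling_frequency=sampling_frequency,
        num_channels=pixels.size_c,
        num_planes=pixels.size_z,
        num_rows=pixels.size_y,
        num_columns=pixels.size_x,
        channel_names=[channel.id for channel in pixels.channels],
    )


pixels = FakePixels(
    size_t=100,
    size_c=1,
    size_z=1,
    size_y=512,
    size_x=256,
    time_increment=0.25,  # assumed to be in seconds
    channels=[FakeChannel(id="Channel:0:0")],
)
parsed = parse_pixels(pixels)
print(parsed["sampling_frequency"], parsed["channel_names"])
```

When `time_increment` is missing, `sampling_frequency` stays `None`, which is presumably why a later commit makes the sampling frequency an optional argument.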
156 changes: 156 additions & 0 deletions
src/roiextractors/extractors/bioformatsimagingextractors/bioformatsimagingextractor.py
@@ -0,0 +1,156 @@
"""ImagingExtractor for reading files supported by Bio-Formats.

Classes
-------
BioFormatsImagingExtractor
    The base ImagingExtractor for Bio-Formats.
"""

from typing import Tuple

import numpy as np

from ...imagingextractor import ImagingExtractor
from ...extraction_tools import PathType, DtypeType


class BioFormatsImagingExtractor(ImagingExtractor):
    """Imaging extractor for files supported by Bio-Formats."""

    extractor_name = "BioFormatsImaging"

    def __init__(
        self,
        file_path: PathType,
        channel_name: str,
        plane_name: str,
        dimension_order: str,
        parsed_metadata: dict,
    ):
        r"""
        Create a BioFormatsImagingExtractor instance from a file supported by Bio-Formats.

        Supported file formats: https://bio-formats.readthedocs.io/en/stable/supported-formats.html

        This extractor requires `bioformats_jar` to be installed in the environment,
        and requires the java executable to be available on the path (or via the JAVA_HOME environment variable),
        along with the mvn executable.

        If you are using conda, you can install with `conda install -c conda-forge bioformats_jar`.
        Note: you may need to reactivate your conda environment after installing.
        If you are still getting a JVMNotFoundException, try:
            # mac and linux:
            `export JAVA_HOME=$CONDA_PREFIX`

            # windows:
            `set JAVA_HOME=%CONDA_PREFIX%\Library`

        Parameters
        ----------
        file_path : PathType
            Path to the file.
        channel_name : str
            The name of the channel for this extractor.
        plane_name : str
            The name of the plane for this extractor.
        dimension_order : str
            The order of dimensions for reading the frames. For .cxd format it is "TCZYX".
            See aicsimageio.dimensions.DimensionNames and aicsimageio.dimensions.Dimensions for more information.
        parsed_metadata: dict
            Metadata dictionary in the form returned by parse_ome_metadata,
            with the same keys.
        """
        from .bioformats_utils import check_file_format_is_supported
        import aicsimageio

        self.file_path = file_path
        super().__init__()

        check_file_format_is_supported(self.file_path)

        self.dimension_order = dimension_order

        self._num_frames = parsed_metadata["num_frames"]
        self._num_channels = parsed_metadata["num_channels"]
        self._num_planes = parsed_metadata["num_planes"]
        self._num_rows = parsed_metadata["num_rows"]
        self._num_columns = parsed_metadata["num_columns"]
        self._dtype = parsed_metadata["dtype"]
        self._sampling_frequency = parsed_metadata["sampling_frequency"]
        self._channel_names = parsed_metadata["channel_names"]
        self._plane_names = [f"{i}" for i in range(self._num_planes)]

        if channel_name not in self._channel_names:
            raise ValueError(
                f"The selected channel '{channel_name}' is not a valid channel name."
                f" The available channel names are: {self._channel_names}."
            )
        self.channel_index = self._channel_names.index(channel_name)

        if plane_name not in self._plane_names:
            raise ValueError(
                f"The selected plane '{plane_name}' is not a valid plane name."
                f" The available plane names are: {self._plane_names}."
            )
        self.plane_index = self._plane_names.index(plane_name)

        with aicsimageio.readers.bioformats_reader.BioFile(self.file_path) as reader:
            self._video = reader.to_dask()

    def get_channel_names(self) -> list:
        return self._channel_names

    def get_dtype(self) -> DtypeType:
        return self._dtype

    def get_image_size(self) -> Tuple[int, int]:
        return self._num_rows, self._num_columns

    def get_num_channels(self) -> int:
        return self._num_channels

    def get_num_frames(self) -> int:
        return self._num_frames

    def get_sampling_frequency(self):
        return self._sampling_frequency

    def check_frame_inputs(self, frame) -> None:
        """Check that the frame index is valid. Raise ValueError if not.

        Parameters
        ----------
        frame : int
            The index of the frame to retrieve.

        Raises
        ------
        ValueError
            If the frame index is invalid.
        """
        if frame is None:
            return
        if frame >= self._num_frames:
            raise ValueError(f"Frame index ({frame}) exceeds number of frames ({self._num_frames}).")
        if frame < 0:
            raise ValueError(f"Frame index ({frame}) must be greater than or equal to 0.")

    def get_video(self, start_frame=None, end_frame=None, channel: int = 0) -> np.ndarray:
        self.check_frame_inputs(start_frame)
        self.check_frame_inputs(end_frame)

        dimension_dict = {
            "T": slice(start_frame, end_frame),
            "C": self.channel_index,
            "Z": self.plane_index,
            "Y": slice(None),
            "X": slice(None),
        }
        slices = [dimension_dict[dimension] for dimension in self.dimension_order]
        video = self._video[tuple(slices)]

        # re-arrange axes so the video is time x height x width (position of T, Y, X among the kept axes)
        kept_dimensions = [dim for dim in self.dimension_order if dim in "TYX"]
        video = video.transpose(tuple(kept_dimensions.index(dim) for dim in "TYX"))

        return video.compute()
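The axis handling in `get_video` can be checked without Bio-Formats by tracking shapes only. The sketch below is an illustration under assumptions, not the PR's code: `time_height_width_axes` and `transposed_shape` are hypothetical helpers, and the sizes are made up. After the channel and plane axes are dropped by integer indexing, a transpose to time x height x width needs, for each of T, Y and X, the position of that dimension among the axes that remain.

```python
def time_height_width_axes(dimension_order: str) -> tuple:
    """Return the axes argument that reorders the kept axes to (T, Y, X)."""
    # Axes that survive after C and Z are indexed with integers.
    kept_dimensions = [dim for dim in dimension_order if dim in "TYX"]
    return tuple(kept_dimensions.index(dim) for dim in "TYX")


def transposed_shape(shape: tuple, axes: tuple) -> tuple:
    """Mimic the shape rule of a transpose: result[i] = shape[axes[i]]."""
    return tuple(shape[axis] for axis in axes)


# Made-up sizes for illustration only.
sizes = {"T": 10, "C": 2, "Z": 3, "Y": 4, "X": 5}
dimension_order = "YCXZT"

# Shape left after selecting one channel and one plane: (Y, X, T).
kept_shape = tuple(sizes[dim] for dim in dimension_order if dim in "TYX")
axes = time_height_width_axes(dimension_order)
result_shape = transposed_shape(kept_shape, axes)

print(kept_shape, axes, result_shape)
```

Looking up each kept dimension in "TYX" instead gives the inverse permutation. For orders such as "TCZYX" the two coincide, but for the order used here it would produce `(1, 2, 0)` and a shape of `(5, 10, 4)`, which is not time x height x width.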
From @h-mayorquin, links worth checking for relevant CI testing setup:
Install wine:
https://github.com/h-mayorquin/spikeinterface/blob/9cd4d127f0db0b354114eda083d892131e070abb/.github/actions/install-wine/action.yml#L1-L22
https://github.com/h-mayorquin/spikeinterface/blob/9cd4d127f0db0b354114eda083d892131e070abb/.github/workflows/full-test.yml#L131-L139
https://github.com/h-mayorquin/spikeinterface/blob/9cd4d127f0db0b354114eda083d8921[…]70abb/src/spikeinterface/extractors/tests/test_neoextractors.py