_check_data_loader() slows down my code by at least one order of magnitude #1250

Open
rsanchezgarc opened this issue Dec 6, 2024 · 6 comments

Comments

@rsanchezgarc

Hi,

I have noticed that in newer versions, _check_data_loader() is called in Image.__init__:
https://github.com/fepegar/torchio/blame/6bb2457fd23ed76156dfd4f4cc84d0cd42e6f081/src/torchio/data/image.py#L177

I have also noticed that this causes a significant slowdown in my code. Would it be possible to reconsider this check or, at least, make it optional?

@fepegar
Owner

fepegar commented Dec 7, 2024

Hi, @rsanchezgarc. Thanks for reporting. Do you have a MWE to prove that the new method caused the slowdown?

@rsanchezgarc rsanchezgarc changed the title _check_data_loader() slow downs my code by at least one order of magnitude _check_data_loader() slows down my code by at least one order of magnitude Dec 11, 2024
@rsanchezgarc
Author

Hi, I just rechecked it, and the slowdown is only around 20%, so please ignore this issue.
I am closing it now.

@fepegar
Owner

fepegar commented Dec 11, 2024

Still bad! But we'd need proof to work on this :D

@rsanchezgarc
Author

Here is my MWE

import time
import torch
import torchio as tio

class FastScalarImage(tio.ScalarImage):
    def _check_data_loader(self):
        return

class FastLabelMap(tio.LabelMap):
    def _check_data_loader(self):
        return

# The nested calls deepen the call stack, which is designed to make inspect slower
def ___create_subject(ScalarImage, LabelMap):
    x = torch.rand(1, 32, 32, 32)
    y = torch.rand(1, 32, 32, 32)
    subject = tio.Subject({
        "input_data": ScalarImage(tensor=x),
        "target_data": LabelMap(tensor=y)
    })
    return subject

def __create_subject(ScalarImage, LabelMap):
    return ___create_subject(ScalarImage, LabelMap)

def _create_subject(ScalarImage, LabelMap):
    return __create_subject(ScalarImage, LabelMap)

def create_subject(ScalarImage, LabelMap):
    return _create_subject(ScalarImage, LabelMap)

# Benchmark function
def run_benchmark(n_iters=10000):
    # Test regular TorchIO classes
    s = time.time()
    for i in range(n_iters):
        create_subject(tio.ScalarImage, tio.LabelMap)
    regular_time = time.time() - s
    print(f"Regular implementation: {regular_time:.3f}s")

    # Test fast classes
    s = time.time()
    for i in range(n_iters):
        create_subject(FastScalarImage, FastLabelMap)
    fast_time = time.time() - s
    print(f"Fast implementation: {fast_time:.3f}s")

    speedup = (regular_time / fast_time - 1) * 100
    print(f"Speedup: {speedup:.1f}%")

if __name__ == "__main__":
    run_benchmark()

Regular implementation: 4.539s
Fast implementation: 3.640s
Speedup: 24.7%

@rsanchezgarc rsanchezgarc reopened this Dec 12, 2024
@romainVala
Contributor

Hi,
Ouch, if I run your code example locally I get:
Regular implementation: 328.803s
Fast implementation: 4.096s
Speedup: 7927.7%

That's weird!

Maybe it comes from different dependencies?

python <(curl -s https://raw.githubusercontent.com/fepegar/torchio/main/print_system.py)
Platform: Linux-4.15.0-213-generic-x86_64-with-glibc2.27
TorchIO: 0.20.1
PyTorch: 2.3.1
SimpleITK: 2.0.0rc2.dev912-g1eec0 (ITK 5.1)
NumPy: 1.26.4
Python: 3.10.14 (main, May 6 2024, 19:42:50) [GCC 11.2.0]

@romainVala
Contributor

romainVala commented Dec 12, 2024

I dug into it a bit, and for me the slowdown comes from the in_class() function (called by in_torch_loader(), which is called by _check_data_loader()), which takes about 0.016 s per call.
I'm not sure what stack = inspect.stack() really contains, but its length is 22 (with the example here).

0.016 s × 10000 iterations × 2 (image and label) = 320 s,
which is close to the time I get.
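
To illustrate why a stack inspection on every image construction can get expensive, here is a minimal, self-contained sketch that compares two ways of checking whether the current call originates from a method of a given class. It does not reproduce TorchIO's actual in_class(); the names Loader, nested, in_class_stack and in_class_frames are hypothetical and exist only for this example. The assumption is that inspect.stack() is the costly part, since by default it also looks up source context for every frame, whereas following frame.f_back only touches the frame objects. Note that sys._getframe is a CPython implementation detail.

```python
import inspect
import sys
import timeit


def nested(fn, depth):
    # Plain recursion to deepen the call stack before running the check.
    if depth > 0:
        return nested(fn, depth - 1)
    return fn((Loader,))


class Loader:
    """Hypothetical stand-in for a data loader class."""

    def run(self, fn, depth=20):
        # The check runs about 20 frames below this method.
        return nested(fn, depth)


def in_class_stack(classes):
    # Builds a FrameInfo for every frame, including source-context lookup.
    for frame_info in inspect.stack():
        instance = frame_info.frame.f_locals.get("self")
        if isinstance(instance, classes):
            return True
    return False


def in_class_frames(classes):
    # Walks the raw frame objects only; no source lines are read.
    frame = sys._getframe(1)
    while frame is not None:
        instance = frame.f_locals.get("self")
        if isinstance(instance, classes):
            return True
        frame = frame.f_back
    return False


if __name__ == "__main__":
    loader = Loader()
    n_calls = 50
    for fn in (in_class_stack, in_class_frames):
        seconds = timeit.timeit(lambda: loader.run(fn), number=n_calls)
        print(f"{fn.__name__}: {seconds / n_calls * 1e3:.3f} ms per call")
```

Both functions should agree on the result; only the per-call cost differs, and the gap depends on stack depth, Python version, and how quickly source files can be read on the machine. If the per-call figures differ as much between machines as the two benchmark runs in this thread, that would point at the source-context lookup rather than at the check itself.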
