Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variability in video output #32

Open
t-sasatani opened this issue Jul 4, 2024 · 5 comments
Open

Variability in video output #32

t-sasatani opened this issue Jul 4, 2024 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@t-sasatani
Copy link
Collaborator

t-sasatani commented Jul 4, 2024

The hashes generated by the hash_video method in miniscope_io.utils still seem variable depending on the environment. You can see this in the test runs based on the following commit:

The tests on GH Mac and Win failed, whereas tests on Ubuntu passed. It also passed in my local Windows environment, so it isn't completely OS-dependent. The hash output is variable, but there seem to be only two variations.

I couldn't figure out why and skipped this error for now by holding multiple reference hashes in the test.

@t-sasatani t-sasatani changed the title adminmbiguity Variability in hash_video method output Jul 4, 2024
@t-sasatani t-sasatani added the bug Something isn't working label Jul 4, 2024
@sneakers-the-rat
Copy link
Collaborator

Heh yes we have gone full circle - I experienced this with the sdcard class too and had a very similar solution.

Ultimately it just came down to the fact that the lossy compression seems intrinsically variable, I think at least some part of the algorithm is stochastic. So what I did is constrain the test to always use the GRAY fourcc code
https://github.com/Aharoni-Lab/miniscope-io/blob/f11d9018e8bde5587664c0575bc51cec2a7e1892/tests/test_io.py#L125

We should just suck it up and just make an ffmpeg-based writer class, where we'll have way more control over video output than with opencv. We can basically borrow the classes from skvideo: https://github.com/scikit-video/scikit-video/blob/master/skvideo/io/ffmpeg.py

scipy is a pretty heavy dependency to take on, so we might want to just rip those classes off since theyre all we need and they are such a thin wrapper around ffmpeg

@t-sasatani
Copy link
Collaborator Author

I'm confused because the Y800 codec, the default in streamDaq, is also lossless. Also the hash_video() method you wrote looks like it shouldn't have writer-dependency as long as the contained video is lossless. So I wasn't sure if switching to something more explicit like ffmpeg would solve this, but it might be worth a try.

@sneakers-the-rat
Copy link
Collaborator

I dont know much about the Y800 codec, but it might have some nondeterministic component too. Since we dont have a lot of visibility into the format we'll probably always be guessing.

These end-to-end tests that test that a video stays unchanged are mostly a placeholder for better, more granular tests anyway - eg. Here we should just be pulling the frames themselves out and hashing those rather than going through a video encoding step, have steps for the intermediate steps too, and test video writing separately, but thats all in due time. The naturalistic binary file as test data is great for testing error handling, but we also should be generating and feeding through known video frames to test as well.

So re: this issue, maybe try it with GRAY, and we can use that as a stopgap until we get something better worked out. It should be possible to pass a fourcc code through the capture method to the writer

@t-sasatani
Copy link
Collaborator Author

Y800 just stores each pixel in each byte, so it should be deterministic. Actually, GREY seems to be a duplicate of Y800 so let's just leave it for now and put those more rigorous tests in the backlog. There is a chance GREY will generate a single hash in our environments but it feels better to keep the problem apparent and exposed at this point.
https://fourcc.org/yuv.php#Y800

@sneakers-the-rat
Copy link
Collaborator

Haha omg exactly yes lets move our video output to something thats like a set of parameters instead of opaque four character aliases. Agreed on leaving this open

@sneakers-the-rat sneakers-the-rat changed the title Variability in hash_video method output Variability in video output Jul 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants