-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Numpy array default hasher #43
Labels
Comments
Sounds great! Please find a way to benchmark the improvement, though! Thank you for suggesting a PR! |
How about converting the list to a |
4 tasks
I think this is resolved with #183; see: import numpy as np
import cachier
count = 0
@cachier.cachier()
def dummy_func(a, b=np.array([2])):
global count
count += 1
return a + b
assert count == 0
dummy_func(np.array([1]))
dummy_func(np.array([1]), np.array([2]))
dummy_func(np.array([1]), b=np.array([2]))
dummy_func(a=np.array([1]), b=np.array([2]))
assert count == 1 |
I think we need to have to things to close this confidently:
Is 1 already in place? |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I cache a function which is called with large numpy arrays. There is an efficient way to hash them with https://github.com/ifduyue/python-xxhash
I think it's a common use case, so I can merge my method to the default hasher in this library.
The only problem is that hash collisions are possible, and the cache would then work incorrectly, but it's very rare, and also the existing functionality isn't broken. I can also add an option to not hash them but just convert to string with
array.tobytes()
, which prevents collisions but is only is suited for small arrays.The same can be done for regular python lists (with xxhash or
str(list)
).If you want, I can make a PR with this.
The text was updated successfully, but these errors were encountered: