Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why single precision by default in Haversine? #216

Closed
juliohm opened this issue Apr 23, 2021 · 9 comments
Closed

Why single precision by default in Haversine? #216

juliohm opened this issue Apr 23, 2021 · 9 comments

Comments

@juliohm
Copy link
Contributor

juliohm commented Apr 23, 2021

Haversine() = Haversine(Float32(6_371_000))

Most lat/lon datasets are double precision so I don't understand why the single precision here. Can we double it?

@nalimilan
Copy link
Member

See #176 (comment). The idea was that if you have Float64 inputs, the result with be promoted to Float64, but if you have lower precision inputs they will keep using Float32. What's your use case?

@juliohm
Copy link
Contributor Author

juliohm commented Apr 28, 2021

lat/lon values are stored as Float64 in most GIS datasets. Computing anything in single precision is pretty rare because lat/lon values differ in the very last digits of the double precision representation. That is why I got confused with Float32.

This internal design decision could be made explicit in the docstring so that end users coming from GIS won't question it again. I did a quick benchmark to see if the promotion from single to double precision changed the performance, but apparently it doesn't.

@juliohm
Copy link
Contributor Author

juliohm commented Apr 28, 2021

If we talk about the most common use case of the Haversine, it will be in GIS applications. And in this case I would just set the default to Float64 to avoid confusion. People interested in Haversine for other exotic uses like GPU for example, could explicitly provide other type parameter.

@nalimilan
Copy link
Member

Do you have a concrete example where this could be confusing?

@mkborregaard @dkarrasch What do you think?

@juliohm
Copy link
Contributor Author

juliohm commented May 2, 2021

Just instantiating the distance will show the type parameter in the REPL and this is confusing for someone expecting double precision by default.

@dkarrasch
Copy link
Member

I believe that having a "minimal" type here is (at the risk of human confusion) the better choice. There is no lack of precision for that number, and for higher-precision floats it will be promoted anyway. To me personally, it feels a bit like the I in LinearAlgebra, whose default type is Bool for the same reason. It would be a different story if performance was suffering, though. Then we should perhaps take care of the most common use case.

@juliohm
Copy link
Contributor Author

juliohm commented May 12, 2021

Thank you, I think it makes sense given that there is no performance penalty.

Perhaps a note in the docstring would be enough.

@nalimilan
Copy link
Member

See #226 (comment): it turns out we can just use an Int value as the default.

@mkborregaard
Copy link
Contributor

great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants