-
-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Reimplement core_cython routines as ufuncs #75
Conversation
@lpsinger - Thanks! I've never seen this, so can't really comment on the proposal. @astrofrog @dstndstn - Thoughts? @lpsinger - So you think this is better because it allows to remove type casting and array shape handling code in our Python wrapper functions, right? Could you make the corresponding changes for one example so that we can see the difference here on a concrete example? This would also mean that this code is then actually executed by the tests, and we can see if it works, especially on Windows. |
I agree with @cdeil that I'd like to see what code can be removed as a result to properly understand the benefits. |
bb48842
to
5d7d23d
Compare
OK, done. Note that I moved the handling of |
ce80dbc
to
4c14bb6
Compare
This eliminates the need for much of the explicit broadasting and type checking. At the moment, the patch just implements `healpix_to_lonlat`. I'd like to ask for feedback on the basic approach before undertaking conversion of the other functions. Fixes astropy#72.
4c14bb6
to
1a12149
Compare
For me, as a typical Astropy coder with a little, but mostly superficial familiarity with Cython, and no familiarity with ufuncs, the ufunc version seems even a little more cryptic that what we have now. I mean things like this:
Probably it's mostly that I have never seen this ufunc stuff before, and once one gets used to it, then for experts like @lpsinger it is the nicer version? Uniformity of Cython code within Astropy core and affiliated packages is one concern for maintainability, i.e. if this alternative way is better, than possibly larger fractions of code should be migrated long-term. (but the process could still start here with astropy-healpix) Given that the goal of this package is to be merged as @astrofrog - Thoughts? |
I've written lots of ufuncs in C, but I'm definitely not a Cython expert. The strided pointer dereferencing (such as the I agree that the ufunc approach is going down a bit closer to the C and Numpy C API level, but the payoff is greater robustness and simplicity for the user by letting Numpy deal with the broadcasting, casting, and type checking for you. In Cython, it may be possible to reduce the boilerplate to create a ufunc by writing a little function decorator. |
@astrofrog, what do you think? |
@lpsinger - in general I am in favor of this approach, but have a few high-level comments:
Thanks for your work on this! |
If that's an option, I'd prefer pure C. Personally, I find it easier to read C than to try to figure out how Cython syntax maps to C. And OpenMP is just as easy in C.
They could, but I think that it might be confusing if it was possible to broadcast over the |
That is an option as far as I'm concerned - the only question then is do we keep Cython just for the high-level declaration of the ufuncs? (since that might avoid some boilerplate code when writing solely a C extension?)
Ok, in that case we should probably have one source file for each mode to make it easy to diff them and make sure they are in sync (and in future we could even make a jinja template to avoid duplication of code in the repo, but let's not worry for now). |
@lpsinger @astrofrog - Can we try to move this discussion along and make a decision within the next week or two? I want to spend some time before and after the meeting July 2/3 on @mhvk as Numpy ufunc expert and proponent (see e.g. this discussion) - maybe you could chime in here as well and make a suggestion how you think this new As for me: I'm reading the HEALPix paper and the code in this repo at the moment and trying to understand the details. It's not clear to me yet why we have 4000 lines of code in cextern/astrometry.net (plus another 200 in interpolation.c and 500 lines of wrapper code in astropy_healpix/core_cython.pyx. Compare this to the license-compatible https://github.com/michitaro/healpix/blob/master/src/index.ts or https://github.com/ziotom78/Healpix.jl/blob/master/src/Healpix.jl that have ~ 500 lines of code and have very similar functionality (except interpolation missing) and could also serve as basis for a rewrite. I'm not saying we should do that, I'm just saying that we could consider a rewrite in whatever form we consider the best long-term solution for Some concrete questions:
All (also @dstndstn): please comment and make a recommendation here, so that we can move |
To be fair, most of the code imported from astrometry.net is "bl" (block-list) stuff -- just a generic C linked-list/array hybrid container type, which is only used in the healpix-rangesearch code to return its results. When wrapped in python, one would be better to perhaps use callbacks and native python list appends. On the licensing front, you probably only need stuff that is 3-clause BSD licensed. If that is a problem, I am willing to re-license. |
@dstndstn - Apologies, I didn't mean to criticise your cextern/astrometry.net, I'm still in the phase of learning what it does and trying to figure out what we want for One thing that would help would be a README there with a few notes, e.g. that one should start reading in The part of the code that I find hardest to understand is the query stuff.
That sounds very slow and memory inefficient. Should we really consider or try to implement that? |
Having thought about it more and seen what @mhvk did with ERFA too, I'm in favor of ufuncs. I'm happy to leave it up to @lpsinger to decide whether to do a mix of Cython + C or just C. In terms of re-writing the C code to shorten it: no objections, but at the same time not really urgent.
I think that's correct.
Yes
Or pure C, without Cython
If we were doing a re-write, then actually writing it as a standalone C library that we wrap could be good in the sense that there would then also be a BSD C implementation. If we care about performance then C will always be faster than Cython.
I think ufuncs |
Can you point to an example, please? |
Yes - Astropy does this in astropy._erfa but that is a bit more complicated because the C file is generated on the fly and is very long! |
@cdeil - there is a nice tutorial on writing your own ufuncs [1], though for generalized ufuncs (which act on arrays), I learned most from looking at the numpy test cases [2]. [1] https://docs.scipy.org/doc/numpy/user/c-info.ufunc-tutorial.html |
I think this discussion has concluded, and surprisingly all agree on how to implement this (right?): using ufuncs in C. @lpsinger - Next Monday / Tuesday we'll have the coding sprint on astropy-healpix, and I think we should try to resolve all open PRs before. OK if we close this one, and then a new one is started using C? @lpsinger - Do you have time to write that, or should I give it a try? @mreineck - To follow up on #90 (comment) : indeed, if you'd be willing to put BSD on the Python interface in https://gitlab.mpcdf.mpg.de/ift/pyHealpix , that would probably be the perfect example to follow for us here. |
If you are only interested in the interface (i.e. the specification which functionality is provided, and in which form), you don't need to worry about licensing at all - interfaces are not copyrightable (even if Oracle apparently disagrees). But the pyHealpix interface is designed around the concept that the Nside parameter and the ordering scheme are fixed within each call, and as far as I understand, this is not how you want to do it. In any case, feel free to take ideas concerning the interface! If you still want a copy of the files with a BSD header instead of GPL, let me know! |
I wanted a good example how to do ufuncs in C; but now I remember that you're using pybind11, so never mind. I'll look at what @mhvk suggested above in #75 (comment) . |
@cdeil - feel free to ping when you really start implementing the ufuncs. Being not much of a C programmer myself, the main thing that tripped me up was that one has to ensure that items passed in to The other part that may need some care is that for arrays your function gets strides passed in, i.e., the axes are not necessarily contiguous and you need to make a copy if the functions that do the actual work expect contiguous input (or do the relevant copy in a python wrapper). |
As a result, broadcasting across nside and ipix is supported automatically. Fixes astropy#72, astropy#75.
As a result, broadcasting across nside and ipix is supported automatically. Fixes astropy#72, closes astropy#75.
As a result, broadcasting across nside and ipix is supported automatically. Fixes astropy#72, closes astropy#75.
As a result, broadcasting across nside and ipix is supported automatically. Fixes astropy#72, closes astropy#75.
As a result, broadcasting across nside and ipix is supported automatically. Fixes astropy#72, closes astropy#75.
As a result, broadcasting across nside and ipix is supported automatically. Fixes astropy#72, closes astropy#75.
As a result, broadcasting across nside and ipix is supported automatically. Fixes astropy#72, closes astropy#75.
As a result, broadcasting across nside and ipix is supported automatically. Fixes astropy#72, closes astropy#75.
This eliminates the need for much of the explicit broadasting and type checking.
At the moment, the patch just implements
healpix_to_lonlat_ring
. I'd like to ask for feedback on the basic approach before undertaking conversion of the other functions.Fixes #72.