It seems like the most critical need right now is improved performance. I see three avenues:
1. More Python/NumPy-level performance improvements. Apart from some micro-optimizations, I don't know that there is much more to be gained here (and said micro-optimizations may make future development more cumbersome).
2. Assisted generation of compiled code via tools like Cython or Numba.
   - I've written these sorts of algorithms in Cython before; the code can be painfully opaque, but the speed-ups are considerable when done right. Unlike with Numba, NumPy's RandomState objects can be used fairly effectively from Cython by periodically regenerating a large array of random numbers (see the sketch after this list).
   - Numba was promising. However, I'm increasingly unconvinced of its ability to accelerate more than a few isolated lines of code, and it doesn't play as nicely with NumPy's random number generation as it ought to.
   - There are other tools (other JIT compilers, high-level modeling languages used in machine learning) that might be useful, but in my experience these are often 1) too restrictive (they can't mix with plain Python code), 2) not broad enough (they can't handle loops or useful constructs like lists), or 3) slower than pure NumPy solutions (my personal experience with Theano).
3. A pure C/C++ implementation, as proposed by @prismofeverything. I can't say much here.
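For reference, here's a minimal sketch of that random-number batching idea in plain NumPy (the `RandomBatch` class, its name, and the batch size are made up for illustration; they aren't part of the codebase):

```python
import numpy as np

class RandomBatch:
    """Draws uniform randoms in large blocks so a tight loop (e.g. in Cython)
    can consume them one at a time without calling back into RandomState."""

    def __init__(self, random_state, batch_size=100000):
        self.random_state = random_state  # an np.random.RandomState instance
        self.batch_size = batch_size
        self._refill()

    def _refill(self):
        # Regenerate a large block of uniform [0, 1) samples in one call
        self.batch = self.random_state.random_sample(self.batch_size)
        self.index = 0

    def draw(self):
        # Hand out one sample; refill when the block is exhausted
        if self.index >= self.batch_size:
            self._refill()
        value = self.batch[self.index]
        self.index += 1
        return value

# Usage: each draw comes from the pre-generated block,
# amortizing the per-call overhead of RandomState.
rng = RandomBatch(np.random.RandomState(0))
u = rng.draw()
```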
Without a run-off of sorts, I don't know which solution would be best. I lean towards whatever is less complicated and more maintainable. As a final reflection I'll note that parts of NumPy, like the random number generation module, are written using a combination of pure C and Cython.
Yeah, I think this is the key. I did try the Cython approach and found some good speedups (see the cython branch). It wasn't sufficient in that state, but that was before the last two non-Cython speed improvements (progressively caching calculations further out the stack), so it may be more viable now. Also, in the cython branch I focused mostly on choose and propensity as a kind of initial beachhead, so we could probably get more wins by extending it to the rest of the code.
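For anyone picking that work up, here's a rough sketch of what a direct-method propensity/choose pair typically looks like in plain NumPy. The function names, signatures, and the simple mass-action assumption are illustrative only and won't match the cython branch exactly:

```python
import numpy as np

def propensities(rates, counts, reactants):
    # rates: (n_reactions,) rate constants
    # counts: (n_species,) current molecule counts
    # reactants: (n_reactions, n_species) consumed-species stoichiometry (0/1)
    # Mass-action propensity: rate times the product of the reactant counts.
    return rates * np.prod(np.where(reactants > 0, counts, 1), axis=1)

def choose(props, random_state):
    # Pick the next reaction with probability proportional to its propensity
    total = props.sum()
    cumulative = np.cumsum(props)
    u = random_state.random_sample() * total
    return np.searchsorted(cumulative, u)
```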
As for the C implementation, I am passing values between Python and C and have also set up an interface to a pure C gillespie function (so we can separate the wrapper/communication code from the actual implementation of the algorithm). I'm learning a lot about the Python/C interface, which could be useful in the future as well. I will continue down this road and hopefully have some results soon, so we can get a sense of how much improvement to expect without spending too much time on a fully optimized implementation.
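Just to illustrate the wrapper/communication split (not the approach I'm actually using, which goes through the Python/C API), here's what the same shape looks like with ctypes against a hypothetical `libgillespie.so` with a made-up `evolve` signature:

```python
import ctypes
import numpy as np

# Hypothetical shared library and signature; the real C interface differs.
lib = ctypes.CDLL("./libgillespie.so")
lib.evolve.restype = ctypes.c_int
lib.evolve.argtypes = [
    ctypes.c_int,                              # number of species
    np.ctypeslib.ndpointer(dtype=np.float64),  # rate constants
    np.ctypeslib.ndpointer(dtype=np.int64),    # molecule counts (updated in place)
    ctypes.c_double,                           # simulation duration
]

def evolve(rates, counts, duration):
    # Thin Python wrapper: marshal NumPy arrays into the pure C implementation
    rates = np.ascontiguousarray(rates, dtype=np.float64)
    counts = np.ascontiguousarray(counts, dtype=np.int64)
    status = lib.evolve(len(counts), rates, counts, duration)
    if status != 0:
        raise RuntimeError("gillespie step failed")
    return counts
```

The point of the split is the same either way: the wrapper only handles conversion and error reporting, while the algorithm itself lives entirely in C.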
@jmason42 If you have experience with Cython, maybe you want to pick up the cython branch I started (or start a new one) and pull in our latest improvements? I think I am going to focus on the C. Maybe between the two of us we can get this thing to the performance we need : )