Question about Stan syntax and math library coverage #4
My original goal was to satisfy my curiosity: to figure out which Julia AD system to use, and whether it would be worth using Stan from Julia to compute the gradients of (non-logdensity) loss functions. My previous impression was that Enzyme kept crashing or failing to compute the gradients I cared about, Tapir/Mooncake was not ready yet either, and ReverseDiff was the "best" option, but it still felt super slow and restrictive compared to Stan's AD. OTOH, coding my loss functions up in Stan also sounded super annoying... The answer to my original question seems to be: use Enzyme if it works, otherwise (or maybe also generally, in the near future) use Mooncake, and live with the potential performance penalty compared to Stan (see the sketch at the end of this comment). Based on the PosteriorDB performance comparison, I think one can hope to be faster than Stan, but must expect to sometimes be slower (by less than one order of magnitude).

Having recently learned about https://github.com/JasonPekos/TuringPosteriorDB.jl (via TuringLang/DynamicPPL.jl#346), I think it might actually be interesting to see whether the Stan-like syntax allows StanBlocks to be faster than the "natural" Turing implementation, as there are some optimizations which are easy to apply with Stan's syntax, but more difficult otherwise. Though I don't see myself doing that any time soon.

Which finally brings us, I think, to your original two questions 😅 I have not registered StanBlocks, because I have not been intending to extend its functionality any further than is needed for the PosteriorDB models. I also don't think that it makes sense to duplicate the work done on Turing. However, I've found myself using StanBlocks to code up some quick models, if only because I understand it much better than Turing's internals. On top of that, adding functionality beyond what Stan or Turing provide would be much easier for me in StanBlocks than in Turing (or Stan). As such, the answers:
Hope this helps!
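To make the backend recommendation above concrete, here is a minimal sketch of the "Enzyme if it works, otherwise Mooncake" pattern, written against DifferentiationInterface.jl. The loss function and the smoke-test-then-fallback logic are illustrative assumptions, not anything StanBlocks itself provides:

```julia
# Minimal sketch: try Enzyme on the loss once, fall back to Mooncake if it
# throws. `loss` is a hypothetical (non-logdensity) loss function.
using DifferentiationInterface   # generic gradient(f, backend, x) front-end
import Enzyme, Mooncake          # load the AD packages behind the backends

loss(x) = sum(abs2, x .- 1)      # stand-in for a real loss function
x = randn(100)

backend = try
    b = AutoEnzyme()
    gradient(loss, b, x)             # smoke test: does Enzyme handle this loss?
    b
catch
    AutoMooncake(; config=nothing)   # fall back to Mooncake
end

g = gradient(loss, backend, x)
```

For repeated gradient evaluations one would normally call `prepare_gradient` once and reuse the preparation object, which is where most of DifferentiationInterface's performance comes from.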
Thanks, @nsiccha -- super helpful! You explored using Julia as a backend language (cf. C++) very well in StanBlocks.jl. The current performance bottlenecks (e.g. autograd or math library) in StanBlocks are nothing fundamental and can be addressed with time. The advantage of Julia is its ecosystem for profiling, debugging and optimising numerical code (together with many other features), which is often a joy to work with. I think it is a fruitful idea to have a modelling syntax (aka a new macro-based DSL similar to
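To illustrate the tooling mentioned above, a minimal sketch of the usual benchmark-then-profile loop, using BenchmarkTools.jl and the Profile standard library; the toy log-density `lp` is a stand-in for a real model:

```julia
# Toy example of Julia's benchmarking/profiling workflow; `lp` stands in
# for a real log-density or loss function.
using BenchmarkTools   # registered package
using Profile          # standard library

lp(x) = -0.5 * sum(abs2, x)   # toy standard-normal log-density
x = randn(1_000)

@btime lp($x)                 # interpolate `x` to avoid global-variable overhead

Profile.clear()
@profile for _ in 1:100_000   # collect samples over many calls
    lp(x)
end
Profile.print()               # report where the time was spent
```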
Thanks, I agree! Sounds fun, let's do that. Have a nice rest of the year!
@nsiccha, can you provide some details on