You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
It's not clear whether this library is subject to the same exponential blowups as other RE implementations. Adding a note in the README would be very helpful.
This seems particularly relevant after the recent CloudFlare outage which was partly caused by a bad RE that led to crippling CPU usage.
The text was updated successfully, but these errors were encountered:
There shouldn't be any exponential blowup scenarios here.
The precise complexity depends on how you use the library. If it's just matching, then I think it would be O(n) time and O(1) memory for any given regex.
For recognition, I'd predict O(n) memory because of the bookkeeping of which branch was taken at each step. Hopefully the runtime stays O(n), but I'm not sure.
If you're up for confirming these empirically and then preparing a README PR or filing issues (depending on the outcome), I'd gladly accept them.
I was hoping for more than empirical evidence, since you can never be sure there are no expressions that cause problems. I would be much happier with a logical argument based on the nature of the internal algorithm.
AFAICT from the haddocks, there's no way to do backtracking, so that eliminates an entire class of problems. However, is it possible to (inadvertently) create a pathological matcher/recognizer by including the equivalent of many empty in it?
I was hoping for more than empirical evidence, since you can never be sure there are no expressions that cause problems.
So empirical evidence wouldn't convince you yet a note in the README would? :)
Indeed, you can never be sure, period.
The code is sufficiently complex that a manual analysis/proof would require a lot of work and ingenuity. Besides, its runtime behaviour is in part determined by the way the compiler compiles higher-order functions.
However, is it possible to (inadvertently) create a pathological matcher/recognizer by including the equivalent of many empty in it?
It shouldn't be, but the easiest way to check is to try. (Or try to prove if you're so inclined.)
It's not clear whether this library is subject to the same exponential blowups as other RE implementations. Adding a note in the README would be very helpful.
This seems particularly relevant after the recent CloudFlare outage which was partly caused by a bad RE that led to crippling CPU usage.
The text was updated successfully, but these errors were encountered: