-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Buffered parsing: add parse() method, and return Tokens from take_while() and take() #13
Conversation
I'm still feeling somewhat on the fence about this:
So a part of me is wondering about moving the |
I went and renamed buffered to parse, and removed all but the 3 "core" methods (now named |
I was still a bit unhappy that things didn't play quite as well together as I wanted, but I think I am now happy with this next reworking. Now:
Together, we can get back the prev functionality; we can now do: // Equivalent to tokens.parse_n
tokens.take(3).parse::<u16, String>();
// Equivalent to tokens.parse_while
tokens.take_while(|t| t.is_alphabetic()).parse::<u8, String>();
// Equivalent to tokens.parse_remaining
tokens.parse::<Foo, String>(); I've kept Maybe helper functions like |
Replying in order:
It is not strictly better. I made it this way is so
See above
Good point. That's better.
I changed it for two main reasons. First, it matches ArrayString. Second, its defining feature isn't really that it's on the stack. It's that it's inline (e.g. I think it's reasonable to expect the library user to import the type they are interested in from another crate and only use
It looks like you agree.
I thought I did that. If I didn't it was an oversight. Regardless, that makes sense.
Agree.
Yeah, this method was basically the only reason I added them at all.
That was a good idea. I have to test it some to check, but that solution looks much nicer.
Agree. Totally makes sense to decide what, if any, to add later after the core methods are added. The associated buffer method does allow less copying. I'd have to benchmark to see if it mattered. This proposed method is certainly elegant. The other idea I'd have is to implement a |
The main issue I see with the |
Thankyou for your detailed responses! I liked your This ticks off the main place I'd want to avoid allocations personally; the case where you use yap to parse some stringish tokens, but then need to go and call some external parsing function to turn that string into the desired type (eg a number). I'm still open to thinking about adding a
Ok, yeah I have been quite on the fence about keeping it around anyway, so I'll just ditch it and leave it to the user to bring their own buffer! |
Aside from fixing a couple of bits, I hid
The way it's implemented at the moment, doing
|
Yeah, that's what I was describing. For accessing the buffer directly, I think if I still found it necessary I could impl a new trait on the tokens so it doesn't need to happen here. You might want to add a benchmark ( |
Aah, at the time it didn't quite click that having all of the parse functions iterate once to find the start and end location, and then likely again in the default With this, I do agree that benchmarks would be nice to add anyway, but I won't add them here (probably an issue would be good) since I think nothing wasteful is happening now. Good point about being able to define a trait which mentions a buffer type, if it's useful to be able to access it for other reasons :) If there are no objections I'll probably merge this in the next day or so and put a release out! (I'm also running out of time I want to dedicate to this for now, too; too many other things to do!) |
|
Brill, thanks for spotting these; I added all of the obvious nested parse optimisations inc the ones you suggested (there are more that could be added but they aren't definitely optimal any more), and addressed your other points! |
Is there a reason that |
I'll have a look later; it was quite late when I did the above and partly I just didn't want to think about it! Edit: I had a look, and offhand it's a bit tricky because for |
Sure. I think the |
Maybe, but i had used that several times already In different places :) |
I wouldn't keep |
See rust-lang/rust#64117 for discussion. |
Thanks for the link! It's a little more useful of an alias here because of the additional
(I guess this PR has become a little broad but hey ho) |
@Easyoakland re
|
If the iterator is fused it doesn't matter. If the iterator is not fused (e.g.
Sure. Put it where you want (new crate, new module, etc). Maybe this is how you start filling a |
I'll just leave it as-is; it's only a small thing, and if people don't want to consume then
I moved it to a new |
Two more things that can come after merging this increasingly large PR
|
Do you mean, making
Heh, yeah potentially! |
And yup, probably time to merge this; it's turned into a general "work on yap for a bit" PR! |
That or a blanket |
Ooh, alas Rust wouldn't allow such a blanket |
Anyway, thankyou for all of your time and comments on this; they are much appreciated! I think this parse stuff has worked out quite well in the end, and it's lovely to be able to avoid the "allocate intermediate buffer" in the common cases. |
your |
Argh, it'll be this laptop touch pad; I do that far too often... Fixed on master :) |
This takes the [
BufferedTokens
] stuff from @Easyoakland's PR (#11) and makes a few tweaks with the following rationales:Buffer
associated type for now, because I'm not sure how I feel about it:StrTokens
directly (perhaps we could allow one to easily take slices of this given some locations? Does exposing the buffer type sortof go against the "iterator centric" style of this library.push
fn which accepts chars and not bytes. My goal here was to limit the ways that it can panic (now, only if it runs out of space it will panic, which seems reasonable).@Easyoakland what do you think?