Attributes parsed more efficiently by Read operations #138
I’m concerned that this could break applications that relied on this check, possibly in a security-sensitive way. What about using a map instead?
A map would break the existing API, since Attr is currently an exposed field in Element. (It would also be less space efficient and possibly less time efficient in many cases.) In retrospect it was a bad idea to default to deduplicating attributes, since it's slow when done with slices of attributes. Also, encoding/xml doesn't do deduplication, and many people switched to this package as an alternative to that one. I am sensitive to the possibility that this change could break existing applications that depend on deduplication, however. Perhaps this entire change will need to wait until a 2.0 release.
An
I filed golang/go#68295 for Go not rejecting duplicate attributes.
This could be done, although I'd still want the processed element to maintain the original attribute order after stripping duplicates. Doing anything else might break existing users of this package. I haven't yet come up with a way to do sorting-based deduplication without the overhead of making extra (temporary) copies of the original attribute slices. Moreover, it would still be O(n log n), which is not ideal (even though it's better than O(n^2)).
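To make the trade-off in the comment above concrete, here is a minimal sketch of order-preserving, sorting-based deduplication. It is not code from this package: `Attr` and `dedupSorted` are hypothetical names, and the sketch assumes the first occurrence of each key is the one kept. Note that it needs two temporary slices (an index slice and a drop mask), which is the kind of extra allocation mentioned above.

```go
package main

import (
	"fmt"
	"sort"
)

// Attr is a hypothetical stand-in for an attribute, not the package's real type.
type Attr struct {
	Key, Value string
}

// dedupSorted removes attributes with repeated keys in O(n log n) time while
// keeping the surviving attributes in their original order. Assumption: the
// first occurrence of each key is kept.
func dedupSorted(attrs []Attr) []Attr {
	// Temporary copy #1: indices sorted by key. The stable sort keeps equal
	// keys in their original relative order.
	idx := make([]int, len(attrs))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool {
		return attrs[idx[a]].Key < attrs[idx[b]].Key
	})

	// Temporary copy #2: mark every occurrence after the first in each run
	// of equal keys.
	drop := make([]bool, len(attrs))
	for i := 1; i < len(idx); i++ {
		if attrs[idx[i]].Key == attrs[idx[i-1]].Key {
			drop[idx[i]] = true
		}
	}

	// Compact in original order.
	out := make([]Attr, 0, len(attrs))
	for i, a := range attrs {
		if !drop[i] {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	in := []Attr{{"b", "1"}, {"a", "2"}, {"b", "3"}, {"a", "4"}}
	fmt.Println(dedupSorted(in)) // prints [{b 1} {a 2}]
}
```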
7756312 to 52ce29e
OK, I modified the branch to use a temporary map to eliminate duplicate attributes. This implementation produces exactly the same result as the current main branch.
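For readers who want to see the general idea, here is a minimal sketch of map-based deduplication that preserves attribute order. It is not the code from this branch: `Attr` and `dedupAttrs` are hypothetical names, and the sketch assumes that a repeated key keeps the position of its first occurrence and takes the value of its last occurrence, mirroring the "value is replaced" behavior of CreateAttr.

```go
package main

import "fmt"

// Attr is a hypothetical stand-in for an attribute, not the package's real type.
type Attr struct {
	Key, Value string
}

// dedupAttrs removes attributes with repeated keys in O(n) expected time.
// Assumption: a repeated key keeps the position of its first occurrence and
// the value of its last occurrence.
func dedupAttrs(attrs []Attr) []Attr {
	// The temporary map records where each key was first stored in out.
	index := make(map[string]int, len(attrs))
	out := make([]Attr, 0, len(attrs))
	for _, a := range attrs {
		if i, seen := index[a.Key]; seen {
			out[i].Value = a.Value // replace the earlier value in place
			continue
		}
		index[a.Key] = len(out)
		out = append(out, a)
	}
	return out
}

func main() {
	in := []Attr{{"a", "1"}, {"b", "2"}, {"a", "3"}}
	fmt.Println(dedupAttrs(in)) // prints [{a 3} {b 2}]
}
```

The map exists only for the duration of the call, so the element's attributes can still be stored and exposed as a slice.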
055a38c to 97b5e18
When reading an XML document, this package uses a more time-efficient technique to detect and remove attributes with duplicated names (within each element).
Branch merged into main.
When parsing an XML element, this package no longer checks whether attribute names appear more than once in the same element. Instead, duplicate attribute names are allowed, just as with the encoding/xml package.
CreateAttr continues to behave as before ("If an attribute with same key already exists on this element, then its value is replaced").
Deprecated the ReadSettings.PreserveDuplicateAttrs setting, since preserving duplicates is now standard behavior and the setting no longer performs any function.