Improve PrefixMap.shrink() - shrink to longest-matching namespace #13
Addresses issue #9
I believe this is a better fix than the currently proposed patch.
I changed shrink() to match against a list of namespace IRIs, sorted
by length (longest first). If a substring match is found, I then do a
reverse lookup from namespace IRI to prefix.
The sorted list is only generated during shrink() if it does not
already exist: it is cached between calls. Calls to set() or
remove() nullify the cached list.
Two new properties have been added to PrefixMap to support this: `_rev`,
which holds a reverse lookup from namespace IRI to prefix, and
`_nscache`, which holds the list of namespace IRIs, sorted by length.
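To make the flow concrete, here is a minimal JavaScript sketch of the idea (illustrative only, not the actual patch): it assumes prefixes are stored as plain properties on the map object (e.g. `map['foaf'] = '<namespace IRI>'`) and that shrink() returns the input IRI unchanged when nothing matches.

```javascript
// Illustrative sketch only, not the actual patch. Assumes prefixes are
// stored as plain properties on the map object, e.g. map['foaf'] = <IRI>.
PrefixMap.prototype.shrink = function (iri) {
  // Lazily rebuild the caches if set()/remove() nullified them.
  if (!this._nscache) {
    this._rev = {};      // reverse lookup: namespace IRI -> prefix
    this._nscache = [];  // namespace IRIs, to be sorted by length
    for (var prefix in this) {
      if (this.hasOwnProperty(prefix) && prefix.charAt(0) !== '_') {
        this._rev[this[prefix]] = prefix;
        this._nscache.push(this[prefix]);
      }
    }
    // Longest namespaces first, so the longest match always wins.
    this._nscache.sort(function (a, b) { return b.length - a.length; });
  }
  for (var i = 0; i < this._nscache.length; i++) {
    var ns = this._nscache[i];
    if (iri.substring(0, ns.length) === ns) {
      return this._rev[ns] + ':' + iri.substring(ns.length);
    }
  }
  return iri; // no namespace matched; hand back the IRI unchanged
};
```

Sorting longest-first is what ensures the longest matching namespace wins even when one registered namespace IRI is a prefix of another.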
Additionally, set() now silently ignores prefixes beginning with an
underscore (`_`), to prevent accidental clobbering of the new properties.
Although this is a subtle behaviour change (and one I'm not too keen on),
it shouldn't matter in practice: in Turtle, SPARQL, et al., underscore is
not a valid start character for a prefix name.