Improve PrefixMap.shrink() - shrink to longest-matching namespace #13

Open
wants to merge 1 commit into
base: master

@jimsmart jimsmart commented Dec 6, 2016

Addresses issue #9

I believe this is a better fix than the currently proposed patch.

I changed shrink() to match against a list of namespace IRIs, sorted
by length, so that the longest matching namespace is found first. If a
substring match is found, I then do a reverse-lookup from IRI to prefix.
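
To illustrate the idea, here is a minimal JavaScript sketch of longest-match shrinking. It is not the code from this patch: `shrinkLongest` and the `prefixes` argument are hypothetical names, and it assumes the match is a starts-with test and that the list is sorted longest first.

```javascript
// Hypothetical helper, not the patch's actual code.
// `prefixes` maps prefix -> namespace IRI, e.g. { ex: 'http://example.com/' }.
function shrinkLongest(prefixes, iri) {
  // Reverse lookup: namespace IRI -> prefix.
  const rev = new Map();
  for (const prefix of Object.keys(prefixes)) {
    rev.set(prefixes[prefix], prefix);
  }
  // Namespace IRIs, longest first, so the most specific namespace is tried first.
  const namespaces = [...rev.keys()].sort((a, b) => b.length - a.length);
  for (const ns of namespaces) {
    // Assumption: a "match" means the IRI starts with the namespace.
    if (iri.startsWith(ns)) {
      return rev.get(ns) + ':' + iri.slice(ns.length);
    }
  }
  return iri; // no namespace matched; leave the IRI unchanged
}
```

With `ex` bound to `http://example.com/` and `v` bound to `http://example.com/vocab/`, shrinking `http://example.com/vocab/a` yields `v:a` rather than `ex:vocab/a`, regardless of the order in which the prefixes were defined.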

The sorted list is only generated during shrink() if it does not
already exist: it is cached between calls. Calls to set() or
remove() nullify the cached list.

Two new properties have been added to PrefixMap to support this: _rev,
which holds a reverse-lookup from namespace IRI to prefix, and
_nscache, which holds the list of namespace IRIs, sorted by length.
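
The caching behaviour could look roughly like the sketch below. Only the names `_rev` and `_nscache` come from this description; the class name `PrefixMapSketch`, the separate `_map` store, and the method bodies are assumptions made for illustration and differ from the real PrefixMap.

```javascript
// Hypothetical sketch; illustrative only, not the patch itself.
class PrefixMapSketch {
  constructor() {
    this._map = new Map();  // prefix -> namespace IRI (assumed storage)
    this._rev = new Map();  // namespace IRI -> prefix (reverse lookup)
    this._nscache = null;   // namespace IRIs sorted longest first, built lazily
  }

  set(prefix, iri) {
    this.remove(prefix);    // drop any stale reverse entry for this prefix
    this._map.set(prefix, iri);
    this._rev.set(iri, prefix);
    this._nscache = null;   // invalidate the cached sorted list
  }

  remove(prefix) {
    const iri = this._map.get(prefix);
    if (iri !== undefined && this._rev.get(iri) === prefix) {
      this._rev.delete(iri);
    }
    this._map.delete(prefix);
    this._nscache = null;   // invalidate the cached sorted list
  }

  shrink(iri) {
    // Build the sorted list only if it does not already exist.
    if (this._nscache === null) {
      this._nscache = [...this._rev.keys()].sort((a, b) => b.length - a.length);
    }
    for (const ns of this._nscache) {
      if (iri.startsWith(ns)) {
        return this._rev.get(ns) + ':' + iri.slice(ns.length);
      }
    }
    return iri;
  }
}
```

The sort cost is paid once per change to the map rather than once per shrink() call, which matters when many IRIs are shrunk against a stable set of prefixes.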

Additionally, in set() I now silently ignore prefixes beginning
with '_' (an underscore), to prevent accidentally clobbering the
new properties. This is a subtle behaviour change, which I'm not
too keen on, but it shouldn't matter because in Turtle, SPARQL,
et al., underscore is not a valid start character for a prefix name.
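
A rough sketch of that guard follows. It assumes prefixes are stored as properties directly on the map object, which is why a prefix named `_rev` or `_nscache` could overwrite the bookkeeping; `PrefixStore` is a hypothetical stand-in, not the real PrefixMap.

```javascript
// Hypothetical sketch; assumes prefixes live as properties on the object itself.
function PrefixStore() {
  this._rev = Object.create(null); // namespace IRI -> prefix
  this._nscache = null;            // cached sorted namespace list
}

PrefixStore.prototype.set = function (prefix, iri) {
  // Silently ignore prefixes starting with '_' so they cannot
  // overwrite the internal _rev and _nscache properties.
  if (prefix.charAt(0) === '_') {
    return;
  }
  this[prefix] = iri;
  this._rev[iri] = prefix;
  this._nscache = null;
};
```

Calling `set('_rev', ...)` on this sketch is a no-op, so the reverse lookup survives, while ordinary prefixes such as `ex` are stored as before.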


jimsmart commented Dec 6, 2016

Just noticed there are whitespace issues with this patch. Happy to fix; please advise re: tabs/spaces/tab size for the project.
