-
-
Notifications
You must be signed in to change notification settings - Fork 582
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes the currently failing json-schema-test-suite tests #370
Conversation
- Add custom type validator for draft6 (a la draft3), in order to fix the change in floats without fractional component (now considered ints) - Add a 'scope_key' property to validators, to handle the difference in id vs $id.
- uritemplate package latest version is 3.0.0, updated extra_requires to >= - Fixed jsonpointer test where instance is not str
Hey! Awesome, thanks so much for trying to move this along, these 3 things (draft 6, ref v2, and perf) have all hit little walls which has left them hanging there :/, so definitely definitely appreciate some help moving them to completion. Gonna have to leave some detailed comments, for now just doing a first pass. |
@@ -155,6 +155,7 @@ for more information. | |||
### PostgreSQL ### | |||
|
|||
* [postgres-json-schema](https://github.com/gavinwahl/postgres-json-schema) | |||
* [is_jsonb_valid](https://github.com/furstenheim/is_jsonb_valid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm gonna just merge this immediately (the squash of the upstream changes)
@@ -56,7 +56,7 @@ def _validates(cls): | |||
return _validates | |||
|
|||
|
|||
def create(meta_schema, validators=(), version=None, default_types=None): | |||
def create(meta_schema, validators=(), version=None, default_types=None, scope_key="id"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this makes me a bit uneasy (which is why I didn't do it immediately myself, I was trying to convince myself what the right thing was...).
I think to be safe (future-proof) we need to make the interface a bit wider -- i.e., rather than just specifying the name of a key to use for scoping, we should provide something like a function here that the validator will use to detect scope (and yeah to make the current implementations just be lookups). I'm worried about the potential for some more complicated process, and once we already need an argument to do it, might as well have it be a generic one. Lemme know if that makes sense.
We also need to have direct tests for this (i.e. separate from any specific draft support).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lemme know if that makes sense.
You mean specifying a validator for id
or $id
to handle the scope change?
I did consider that, however I wasn't sure the corresponding architectural change would be nice and wondered if a simpler solution is better at least until such time as there are future differences. But I see your point.
We also need to have direct tests for this (i.e. separate from any specific draft support).
Agreed on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be slightly narrower than a validator -- basically it is a function to replace the calls just to someschema.get("id")
-- so, scope_of(schema)
, say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And really, yeah this is tightly related to #346 unfortunately, which is to say, I think the even better thing is to come up with the right interface for that, provide the RefResolver class you want to use to this function, and have a function on that interface that is "scope of". I just can't decide whether it's obvious enough how that looks quite yet, and how much work it is to change it :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
provide the RefResolver class you want to use to this function
That was kinda where my initial thought was with this too. Unless I'm misunderstood, my concern is that a user may specify a RefResolver
instance when calling .validate()
-- this would mean a user is then specifying the draft number twice, i.e. Draft4Validator
and Draft4RefResovler
, which seemed unpleasant? I'm not up to speed enough to determine if that's a narrow enough issue though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right so I think the idea there would be to stash what is passed in here as the resolver callable onto the validator interface -- i.e., the newer way to construct resolvers would become DraftSomethingValidator(resolver=DraftSomethingValidator.Resolver(...))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright well I think I've got a bit of a better picture of the direction you'd like this to go in. I'll try and take a look at the related issues and then see if I can work towards something. Of course I shan't be offended if you're actively working on these things and prefer effort elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideas I have, time, much less so :(, especially given that the ideas are often only partially baked and need refinement or experimentation to see whether they fully fit -- so definitely definitely would appreciate the help!
I think the one thing that seems most clear at the moment is the stuff I mentioned about type
, so I might recommend that one as the first self-contained area to poke at -- widen the type interface, get that merged, and then come back here and wiggle this onto the new one.
But there are many many paths forward and yeah I super appreciate any help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've taken a look at the type
stuff [1] - it's WIP and not yet ready for a PR, but if you have the time to give some feedback before I take it further it'd be really appreciated.
In particular:
- Rather than the single
is_type
injected function discussed in WIP: Support namedtuple validation #365, I've taken the approach of a function per type. I figure this is more composable for extensions - I presume a need to maintain backwards compatibility with the old
default_types
and per-instancetypes
properties. I believe this does so (tests using this pass) - but ideas appreciated - I feel this works well for 'simple' types, for example handling the change in integer in draft6 and e.g. a hypothetical scenario where strings encoding ints are valid under
integer
[2]. However, some difficulty remains for something like Support namedtuple #364 to extendobject
to allownamedtuple
- a user would have to provide custom validators for allobject
-based checks, e.g. in my test case [3]. Wether or not this is an acceptable level of work for a custom type or we should look to facilitate this I'm not sure.
[1] draft6...bsmithers:types
[2] https://github.com/bsmithers/jsonschema/blob/types/jsonschema/tests/test_types.py#L56
[3] https://github.com/bsmithers/jsonschema/blob/types/jsonschema/tests/test_types.py#L95
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So! I think that looks like a great start -- if I can inject one idea there (which is only 80% baked I think, so lemme know if I go off a rail) -- I think the right thing here is to take a cue from what's happening on the ref resolver and format checker sides -- to introduce a TypeChecker
class, one that you associate an instance of with a validator class in create and extend.
The one major mistake I'd love to correct would be to not have mutable APIs for that -- so, to talk about the specific namedtuple
case, the API I'd expect would look roughly like:
def object_or_namedtuple(value):
return Draft4Validator.type_checker.is_type("object") or getattr(value, "_replace", None) is not None
say, and then Draft4Validator.type_checker.redefine(object=object_or_namedtuple)
which should return a new instance (which you'd use via Draft4Validator(type_checker=that)
where the appropriate redefinition has happened (attr.evolve
+ a pyrsistent.pmap
of checkers are the pieces we'd use to make that happen).
You can then also have type_checker.redefine(**types)
to convert from the old format, and I think the above does seem like a reasonable amount of work to do the namedtuple case.
How does that sound to you (and as usual apologies for spitting information out slowly here after you've already gotten started :), as usual with these things what seems right only starts to come out and evolve after progress has begun).
if schema is True: | ||
resolver = RefResolver.from_schema({}) | ||
_schema = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here is that now you will mis-report schemas.
One major gap in draft 6 support is writing some tests for the new schema coersion behavior (of bools) -- specifically, this will misreport what the schema was in ValidationError.schema
.
@@ -391,6 +391,15 @@ def type(validator, types, instance, schema): | |||
yield ValidationError(_utils.types_msg(instance, types)) | |||
|
|||
|
|||
def type_draft6(validator, types, instance, schema): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too I think draft 6 taught us we need to widen the type interface. I put some notes in #364, maybe have a look there if you don't mind?
@@ -356,7 +358,7 @@ def __init__( | |||
self._remote_cache = remote_cache | |||
|
|||
@classmethod | |||
def from_schema(cls, schema, *args, **kwargs): | |||
def from_schema(cls, schema, scope_key="id", *args, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this also makes me uneasy -- up until now, ref resolvers were completely separate from drafts (because the behavior was the same across them).
I don't love this idea -- now a caller needs to know implementation details of the draft they're using.
I'd much rather encapsulate that, even if that means that we now need to have separate ref resolvers per draft (and we'd also need to preserve exactly the old interface for a class called just RefResolver
). Annoying, but I think the whole change is annoying :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this also makes me uneasy -- up until now, ref resolvers were completely separate from drafts (because the behavior was the same across them).
The problem is that they're not, because that from_schema
method needs to understand id. My initial thought was actually to drop that method completely, but of course that's extremely unpleasant for users in the wild.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right but what I mean is -- as far as RefResolver
is concerned, when presented with any schema, there is an algorithm you can use to unambiguously know this (but now this introduces the possibility of error).
Even now, you have this algorithm:
- Look for a
$schema
key - If you find one, you (RefResolver) know what the correct scope key is
- If not, you use the "package" default (which I'm not too happy with how we're managing currently :/, but it's a thing)
You can take that algorithm, and layer it on top of an "explicit" one -- say a Draft3Draft4RefResolver
and a Draft6RefResolver
alongside a resolver_for_schema
that calls the right thing as above.
(I know I'm being terse still, lemme know if that's unclear)
@@ -28,7 +28,7 @@ | |||
"jsonpointer>1.13", | |||
"rfc3987", | |||
"strict-rfc3339", | |||
"uritemplate>3.0.0", | |||
"uritemplate>=3.0.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that the > was failing here, but that's because the patch with this wasn't merged yet (and in fact 3.0.0 doesn't have what we need).
So I think it was correct as-is [and probably even needs some more work once the upstream PR is merged].
Maybe you can chime in upstream with any comments? The PR is python-hyper/uritemplate#36
@@ -296,6 +296,8 @@ def is_css3_color(instance): | |||
draft6="json-pointer", raises=jsonpointer.JsonPointerException, | |||
) | |||
def is_json_pointer(instance): | |||
if not isinstance(instance, str_types): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a test :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It already does - the test fail was what caused me to fix :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah cool. Hm, I thought I'd ran these after updating. Thanks for fixing.
@Julian Hi, could this be merged? It should help to resolve other pull requests, I reckon. |
@manycoding will have another look now but I'm not even sure anything in this PR is relevant anymore, all the tests now pass already and a lot of the specific fixes here went in separately. What specifically from this one are you looking at/for? |
817b724 add perl implementation and test suite to the user list ca14e01 Merge branch 'pull/382' 3dabf55 move non-format tests out of the format directory, while keeping all ECMA regex tests together 4121aa5 move format tests to their own directory 4bae8aa Add more idn-hostname tests to draft-2019-09 and draft-07 6d91158 [325] Add some more hostname tests e593057 Merge pull request #389 from ssilverman/readme-drafts fb3766d README: Improve issue/PR links 79bef22 README: Update language around drafts ade47e4 README: Add Snow to the list of supporting Java validators fc0c14e README: Update simple JSON example 1167669 README: Update structure, consistency, spacing, and wrapping 9514122 Merge pull request #388 from json-schema-org/ether/maxProperties=0 7646490 test that maxProperties = 0 means the object is empty c3f4319 Merge pull request #384 from ChALkeR/chalker/unique 7766f48 Improve uniqueItems validation 7555d41 Add unnormalized $id tests 11f70eb [300] Add tests for valid use of empty fragments in "$id" b106ff0 [299] Add tests for invalid use of fragments in "$id" 4a74d45 Fix "tilde" spelling 3eca41b Merge pull request #379 from json-schema-org/ether/remove-wrapped-refs d61bae8 remove wrapped refs 536ec07 [359] Add unevaluatedProperties/unevaluatedItems cousin tests ac63eb7 Small README update that introduces the concept of directories 697944e Merge pull request #374 from karenetheridge/ether/allOf-anyOf-oneOf 33f8549 test all the *Of keywords together 44b99ed Merge pull request #373 from karenetheridge/ether/items-and-contains 4a2b52f some tests of items + contains 7f00cc8 add test that was present for other drafts but not 2019-09 a3f9e2e Merge pull request #371 from karenetheridge/ether/if-then-else-boolean aeeaecc some tests with if/then/else and boolean schemas b8a083c Merge pull request #372 from nomnoms12/unique-false-and-zero 85728f1 Add tests for uniqueness [1] and [true] fd01a60 Add tests for uniqueness [1] and [true] 0a8823c Merge pull request #370 from karenetheridge/ether/nul-char fa6f4dd add tests for the NUL character in strings 8bf2f7f Merge pull request #369 from ssilverman/data-desc 2ba7a76 Add data description 4f66078 Merge pull request #367 from karenetheridge/ether/additionalItems 283da7c some more tests for additionalItems 7ba95f3 add tests for keywords maxContains, minContains 2f2e7cf Merge pull request #365 from karenetheridge/ether/move-ecma-regex 8388f27 move ECMA regex tests under format/ git-subtree-dir: json git-subtree-split: 817b724b7a64d7c18a8232aa32b5f1cc1d6dd153
Currently using jsonschema at work and interested in getting draft6 support finalised. To jump in, I've provided a couple of quick fixes for the previously failing tests.
Before:
FAILED (skips=47, failures=3, errors=10, successes=1209)
After:
PASSED (skips=47, successes=1254)
Note that I pulled in the latest tests from the JSON schema test suit repo.
I was hesitant to have
RefResolver
needing to know about id vs $id, but I could see no other way without removingRefResolver.from_schema()
from the interface. I also briefly looked at the tests being skipped, which appear to requireRefResolver
to be able to parse the id attribute, but I'll comment on that separately.