Types_File

Andy Lowry edited this page Jun 4, 2018 · 1 revision

Defining the DSL - The Types File

The primary job of anyone wishing to create a JsonOverlay-based parser is to create a types file, which is a YAML file that defines all the Java source files to be generated for the parser. The generator is then executed in order to create or update Java sources that define whatever interfaces, implementation classes, and enums are called for.

Each of the generated classes will be a subclass of JsonOverlay<?>, which is the base class for all objects created by the parser to represent the parsed source document. In particular, every class generated for a non-enumeration type will be a subclass of PropertiesOverlay<?>, which is a base class for types that represent JSON objects with fixed properties. Classes generated for enumeration types will extend EnumOverlay<?>, which knows how to parse JSON strings into corresponding enum constants.

Top Level Object

The types file defines a single JSON object with the following overall structure:

types:
  - <type-decl>
  ...

At its heart, the types file is just a single JSON object with a types property whose value is a list of named type declarations. There are a handful of other top-level properties that may be used. Those are described in Top-Level Properties. In the meantime we’ll get right to the type declarations themselves.

Type Declarations

Here’s a simple type declaration for a Java type that has a string property, and an integer property.

name: Person
fields:
  name:
    type: String
    structure: scalar
  birthYear:
    type: Integer

The name property of a type declares the name of the Java interface that will be generated for this type. The generated implementation class will have a name that incorporates the type name and adds a suffix (e.g. PersonImpl).

The generated parser uses type declarations to interpret JSON objects it encounters in a source document. A typical parser will expect one of the declared types to appear at the top-level of a source document, in the form of a JSON object. That object will give rise to an instance of the Java class generated for the type, which will be the value returned by the parser. The properties of the top-level object will be used to obtain values for the fields of the returned object. Objects that appear elsewhere in the source document will likewise give rise to instances of other generated classes, based on other declared types.

The fields object defines the properties that will be made visible for this type via its generated methods.

Additional properties may be used in type declarations but are less commonly required. They are described in Type Properties.
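
To see how a declaration like this sits inside the overall file, here is a minimal sketch of a complete types file. It simply places the Person declaration from above as one entry of the top-level types list; nothing else is assumed.

```yaml
# Minimal types file: a top-level object whose types list holds the declarations
types:
  - name: Person          # name of the generated interface
    fields:
      name:
        type: String
        structure: scalar
      birthYear:
        type: Integer
```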

Field Name and Parent Path

The name of each property in this object would appear to be the name of a corresponding field in the Java class, and that is, indeed, typically true.[1] In fact, a property name in the fields object is the parent path of the field within a source document to be parsed by the generated parser. This is how the types file defines not only the types to be generated, but the structure of the JSON/YAML files that are to be parsed.

Here’s what a YAML file containing a Person object instance might look like:

members:
  - name: John Wright
    birthYear: 1983
  - name: Alice Fleming
    birthYear: 1971

Here we see two people that are members of an organization. They appear in a YAML array that is the value of the members property of the top-level document. The name and birthYear properties of each person object are what the generated PersonImpl class will look for in order to obtain values for its name and birthYear fields.[2]

So the properties in the fields object correspond to property names that will be recognized in the source document presented to the parser. More generally, each one is a path through the source document. For example, we might have defined the birthYear field like this:

dateOfBirth/year:
  name: BirthYear
  type: Integer

Now, the path to the field value within the source document is dateOfBirth/year - that is, the parser will expect a dateOfBirth property whose value is an object with a year property - like this:

members:
  - name: John Wright
    dateOfBirth:
      year: 1983
  - name: Alice Fleming
    dateOfBirth:
      year: 1971

Now we’ve explicitly specified the field name, using the name property in the field declaration. The reason we didn’t need to do that previously is that the name property in a field declaration defaults to the last component of the field’s parent path, with its first character in upper case.
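
As an illustration of the defaulting rule, the sketch below annotates each field with the name it should end up with. The address/postalCode path is a hypothetical addition, not part of the running example.

```yaml
name: Person
fields:
  name:                  # parent path "name": field name defaults to Name
    type: String
  address/postalCode:    # hypothetical path: field name defaults to PostalCode
    type: String
  dateOfBirth/year:      # default would be Year, so the name is given explicitly
    name: BirthYear
    type: Integer
```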

Field Structure

The name field declaration in our example specifies structure: scalar, while the birthYear declaration does not. In fact, they are both declared as scalars, because scalar is the default value for the structure property in a field declaration.

In general, the structure property indicates whether a given field is expected to appear as a single value (scalar), a list of values (collection - a JSON array), or a map of named values (map - a JSON object).[3]

Important
The field name default rule mentioned above is slightly different for lists and maps, since the parent path for such fields will typically end in a plural noun, and the field name should be singular. Therefore, for list and map fields, if the parent path ends in s, that s is omitted from the default name. Of course, whenever these rules don’t work correctly, the correct name can be explicitly given.

Field structure determines not only the basic type of the JSON structure expected by the parser when parsing this field; it also affects the set of methods that will be generated for the field within its generated type. For a scalar field there will be basic get and set methods, while for a list field there will also be methods for getting the number of elements and for getting/adding/removing elements at a particular position. Similarly, a map field will generate methods for getting/setting values under specific names.
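
For comparison, here is a sketch of one type that declares all three structures. The Club type appears elsewhere on this page; the officers field is a hypothetical addition used only to show a map.

```yaml
name: Club
fields:
  name:                      # scalar (the default): a single JSON string
    type: String
  members:                   # collection: a JSON array of Person objects
    type: Person             # field name defaults to Member (trailing s dropped)
    structure: collection
  officers:                  # hypothetical map: a JSON object whose property values are Person objects
    type: Person             # field name defaults to Officer
    structure: map
```

A matching source document would carry officers as an object whose property names (for example president or treasurer) become the map keys.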

Field Type

Each field is declared with a type property. This can be either a primitive type or a declared type.

Primitive types are:

| Type Name | Description |
| --- | --- |
| String | A JSON string, represented as a Java String in the parsed representation |
| Integer | A JSON number that has no fractional part, represented as a Java Integer in the parsed representation |
| Number | Any JSON number, represented by a Java Number (the superclass of all the Java numeric types, including BigDecimal and BigInteger; the appropriate representation will be used based on the parsed value) |
| Boolean | A JSON boolean, represented as a Java Boolean in the parsed representation |
| Object | Any JSON value at all, represented in the parsed value by a Java List<?> for an array, a Map<String,?> for an object, and an appropriate Java type for scalars.[4] |
| Primitive | Any primitive JSON value, encompassing all of the above except Object |
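
A small sketch of a declaration that uses each primitive type. The Sample type and its field names are hypothetical and exist only to show the type names in context.

```yaml
name: Sample             # hypothetical type
fields:
  label:
    type: String         # JSON string
  count:
    type: Integer        # JSON number without a fractional part
  ratio:
    type: Number         # any JSON number
  verified:
    type: Boolean        # JSON boolean
  id:
    type: Primitive      # any JSON string, number, or boolean
  metadata:
    type: Object         # any JSON value, including arrays and objects
```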

A field may also specify a type defined within the types file. For example, in our example we might have a type named Club defined like this:

name: Club
fields:
  name: {}
  members:
    type: Person
    structure: collection

When parsing a Club, the parser will expect a JSON object with a name property and a members property. A string value will be expected for the name property,[5] and for members the parser will expect a JSON array, each of whose elements will be parsed as a Person.
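
For illustration, a source document that this Club declaration should accept might look like the following. It assumes the first Person declaration on this page (with a plain birthYear field); the club name is made up, and the member data is reused from the earlier example.

```yaml
name: Chess Club           # parsed as a String (the default type)
members:                   # JSON array, each element parsed as a Person
  - name: John Wright
    birthYear: 1983
  - name: Alice Fleming
    birthYear: 1971
```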

Specifying a Plural Name

The field name is used in the names of generated methods within the generated type. For list and map fields, some of the generated methods operate on the field as a whole, rather than individual values contained within it, and for these methods we need a plural form for the name.

By default, the plural form is obtained by appending s to the field name. Where this does not work, the plural can be specified explicitly as the value of the plural property in the field declaration.

For example, if we wanted to extend our Person object with a list of children, we might do something like this:

name: Person
fields:
  name: {}
  dateOfBirth/year:
    name: BirthYear
    type: Integer
  children:
    name: Child
    plural: Children
    type: Person
    structure: collection
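
A source document matching this declaration might look like the sketch below. The child entries are invented sample data. Note that the document still uses the parent path children; Child and Children only affect the names of generated methods.

```yaml
name: Alice Fleming
dateOfBirth:
  year: 1971
children:                  # parent path from the field declaration
  - name: Sam Fleming      # each element is parsed as a Person
    dateOfBirth:
      year: 1999
  - name: Kim Fleming
    dateOfBirth:
      year: 2002
```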

There are a handful of additional properties that can be specified in field declarations. See Field Properties for the complete list.

Comingling Maps In a Typed Object

As mentioned at the outset, a non-enumeration type declaration gives rise to a generated class that extends PropertiesOverlay<?>, which knows how to interpret JSON objects using the fixed set of fields supplied in the type declaration. It is also possible to extract other properties from the same JSON object or a sub-object and collect them into one or more map fields appearing in that type.

There are two unusual features of the field declaration for such comingled maps:

  • The field should be supplied with a parentPath property, with a value that indicates the path through the JSON structure from the current object to the object whose properties are to be collected into the map. When you wish to include properties from the object that contains the declared field values for this type, set the parentPath to the empty string "".

  • You should supply a keyPattern property with a string value. The value will be interpreted as a Java regular expression, and it will be used to select the properties that will be included in the map, by matching the property names to the regex. It is important that each property in a parsed JSON object appear in at most one parsed object - generally as the value of a field in a declared type, or as an entry in a map field. For this reason, it is important that the keyPattern exclude any properties that might otherwise be incorporated into other parsed values.

Good examples of comingling come from the OpenAPI v3 Specification. The KaiZen OpenApi Parser is a JsonOverlay-based parser for OpenAPI v3.

An example of a map comingled with a fixed-field object is the Path object. This object contains a few "normal" fields such as summary and description, as well as a collection of fixed fields that are all of type Operation, defining various operations that can be performed on a web resource. The operation properties are get, put, post, delete, options, head, patch, and trace.

Rather than defining individual fields for each of these properties, the KaiZen parser defines a map field named operations that collects all the operations together. Here’s what this looks like:

- name: Path
  fields:
    summary: {}
    description: {}
    operations:
      structure: map
      parentPath: ""
      keyPattern: get|put|post|delete|options|head|patch|trace
    get:
      type: Operation
      noImpl: true
    put:
      type: Operation
      noImpl: true
    ...

Here, the operations object is declared as a map field consuming properties from the current object (parentPath: "") and including only those named in the supplied keyPattern. The type declaration also supplies noImpl field declarations for the individual operations, as a convenience. The generator will not generate methods for these fields, but methods have been manually added to the generated class, implemented with the operations map.

It is also possible to create multiple maps consuming the same JSON object, whether or not that object is also parsed as a fixed-field type. An example that runs throughout OpenAPI v3 comes from its use of "extension" properties. These are properties that can have arbitrary values and that accompany the fixed fields of most objects defined in the specification. Extension properties must all have names that begin with "x-". In many cases, extensions accompany properties that are collected into some primary map object.

For example, the paths property in the top-level object of an OpenAPI v3 model contains properties that name URL paths for an API, and the values of those properties are the Path objects used in the operations example above. However, the paths object may also contain extension properties. This is how it looks in the KaiZen parser types file:

fields:
  ...
  paths:
    structure: map
    keyPattern: "/.*"
  pathsExtension:
    name: PathsExtension
    type: Object
    structure: map
    keyPattern: "x-.+"
    parentPath: "paths"
  ...

The paths field is completely normal, and its parentPath is paths, defaulting to the property name of the field declaration itself. The pathsExtension field specifies parentPath: paths so that it ends up consuming properties from the same JSON object during the parse. However, both of these map fields are equipped with mutually exclusive key patterns, so that no parsed property value will be incorrectly consumed by two different parsed objects.
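
To make the split concrete, here is a sketch of a fragment of an OpenAPI v3 document, with comments showing which map each property under paths should land in. The path names and the x-owner extension are hypothetical sample data.

```yaml
paths:
  /pets:                   # matches "/.*": entry in the paths map
    summary: Operations on the pet collection
  /pets/{id}:              # matches "/.*": entry in the paths map
    summary: Operations on a single pet
  x-owner: platform-team   # matches "x-.+": entry in the pathsExtension map
```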

Enumerations

A type declaration can describe an enum type to be generated, rather than an interface/class combination. To do this, provide an enumValues property in the type declaration. In this case, properties other than name and enumValues will be ignored.

The enumValues property should be a list of the enum member constants. For example:

- name: Color
  enumValues: [RED, ORANGE, YELLOW, GREEN, BLUE, INDIGO, VIOLET]

The enum definition will appear in a Java source file in the same directory as all the generated interfaces. A Java class will also be generated with all the other generated classes. The class in this case will be an extension of EnumOverlay<?>, which is a subtype of JsonOverlay<?> that knows how to parse JSON string values into the corresponding constants in the generated enum.
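
As a sketch of how an enum type is then used, the hypothetical Crayon type below declares a field of type Color. The Crayon type and its field are assumptions made for illustration.

```yaml
- name: Color
  enumValues: [RED, ORANGE, YELLOW, GREEN, BLUE, INDIGO, VIOLET]
- name: Crayon             # hypothetical type
  fields:
    color:
      type: Color          # value is a JSON string naming one of the enum constants
```

A source document for a Crayon would then presumably contain a line such as color: RED.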

YAML Aliases

If you find yourself repeating values in your types file, you should know that the types file parser understands YAML aliases and references (in the terms of the YAML specification: anchors, written &name, and aliases, written *name), and you can use these to DRY up your specification. For example, you might find yourself using the same complicated keyPattern in several map field declarations. Rather than specifying it separately in each use and running the risk of getting it wrong in one or more of them, you can create a named alias for it at its first appearance and then use references elsewhere.

For manageability, you may want to consider collecting all your alias definitions at the top of your types file, e.g. in a top-level property named decls. (This property and any other unknown properties in the types file will be silently ignored by the code generator.) All aliases must precede their first reference within the YAML file, so this decls property should be at or near the top.

Here’s an example of what this might look like, taken from the KaiZen parser’s types file:

decls:
  - extPat: &extPat "x-.+"
  - noextPat: &noextPat "(?!x-).*"
  - namePat: &namePat "[a-zA-Z0-9\\._-]+"
  - noextNamePat: &noextNamePat "(?!x-)[a-zA-Z0-9\\._-]+"
  - pathPat: &pathPat "/.*"
  - extName: &extName extension
  - extDef: &extDef
      name: Extension
      type: Object
      parentPath: ""
      structure: map
      keyPattern: *extPat

There are aliases for several key patterns that are used elsewhere in the types file, as well as an alias named extDef for an entire object. The latter is used throughout the file to define extension map fields.

Simple key pattern aliases are used elsewhere in the types file like this:

   keyPattern: *namePat

An object alias like extDef can be "merged" into another object like this:

pathsExtension:
  name: PathsExtension
  <<: *extDef
  parentPath: paths

This is really how the KaiZen parser declares its pathsExtension field; in the example earlier, the values from the extDef alias were shown explicitly. Note how explicitly provided property values override values from a merged object alias.

See the YAML specification (or search for gentler introductions) to learn more about aliases.

Types File Reference

Top-Level Properties

| Property Name | Default Value | Description |
| --- | --- | --- |
| types | none | This is the list of type declarations to be generated. |
| imports | none | The generator can usually create all necessary imports in its generated interfaces and classes. However, if you manually add members to the generated sources, those members may require additional imports. Regeneration will copy the manually added members, but it will not recreate the required imports. Specifying them here will allow that to occur. The value of this property should be a JSON object whose keys are simple class names and whose values are the corresponding fully-qualified class names, e.g. ExternalClass: com.example.library.ExternalClass. |
| modelType | none | If this property is provided, it should be the name of what you would consider the root type of your set of types. This will cause every generated interface to extend IModelPart, and it will cause Overlay#getModel() to become a more convenient version of Overlay#getRoot(), returning the root object as an instance of the model type, rather than as a generic JsonOverlay<?>. |
| defaultExtendInterfaces | none | This is a list of interfaces that all generated interfaces will extend. |
| discriminator | none | This establishes a default discriminator for any supertypes in your types collection. A type is a supertype if there is any other type that names it in extensionOf. |
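
Putting several of these together, a types file might begin like the sketch below. The imported class is the example name from the imports description; the /kind pointer and the choice of Club as root type are assumptions made for illustration.

```yaml
imports:
  ExternalClass: com.example.library.ExternalClass   # simple name mapped to fully-qualified name
modelType: Club            # root type of the model
discriminator: /kind       # hypothetical default JSON Pointer for supertypes
types:
  - name: Club
    fields:
      name: {}
```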

Type Properties

| Property Name | Default Value | Description |
| --- | --- | --- |
| name | none | Name of the generated interface type. The generated implementation class type will be created by adding a suffix (specified as an option to the generator) to this name. The name should be a valid Java identifier. Collisions with commonly used types (e.g. String) will probably cause bizarre errors. |
| fields | none | This is a JSON object that defines the fields declared for this type. See Field Properties. |
| extendInterfaces | none | If the generated interface for this type should extend one or more other interfaces, they can be listed here. You can use simple names if you provide the fully qualified names in the top-level imports map. |
| imports | none | If this appears, its value is a JSON object in which three properties are recognized: intf, impl, and both. The value of each should be a list of types that need to be imported in the generated sources for this type. The generated interface will import types specified in either intf or both, while the generated implementation class will import types specified in either impl or both. You can use simple names as long as you provide fully-qualified names in the top-level imports map. |
| noGen | false | If this is true, the generator will not generate any Java files for this type. Why would that be useful? Perhaps you have manually created the files and want the type recognized as a field type in other declared types. |
| extensionOf | none | If given, this should be another declared type. This type’s generated implementation class will then extend that type’s implementation class, so that instances of this type will be accepted where the supertype is expected. |
| abstract | false | If true, then this type will not be instantiable. An implementation class will be generated, defining its field methods, and it will be provisioned with a factory that, when invoked, will use the declared discriminator to determine which actual type is needed, and delegate to that type’s factory. |
| discriminator | top-level discriminator value | The discriminator is a JSON Pointer that is used during parsing to figure out what sort of object is actually required when a supertype is being parsed. The pointer is applied to the JSON structure being parsed for this value, and it should yield a string value. That value will be compared to the discriminator values specified for the various sub-types, to decide which factory to delegate to. Note that when an instance is created in a non-parsing mode (e.g. when copying an existing object), discrimination is done using the presented object’s type. |
| discriminatorValue | this type’s simple name | Works in conjunction with the discriminator value configured for this type’s supertype (or super-supertype or whatever). The supertype’s factory will extract a value from the presented JSON using its discriminator, and if that value matches this type’s discriminatorValue, the supertype factory will delegate to this type’s factory. There is no logic to detect colliding values and other possible anomalies. |
| renames | none | If you’re not happy with any of the generated method names, you can adjust them here. For example, suppose you have a boolean field named hasChildren. By default you will end up with a method named isHasChildren, which is unfortunate. You could fix that by adding isHasChildren: hasChildren to the renames object for the type. |
| enumValues | none | The presence of this property causes the type to be generated as an enum rather than a class. The property value should be a list of strings, which will become the names of the enum instance members in the generated enum type. |
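
The inheritance-related properties are easiest to follow together. The sketch below uses hypothetical Shape, Circle, and Square types and a hypothetical /kind pointer; the property is spelled discriminatorValue, as in the description text above. The last entry shows the renames example from the renames description.

```yaml
- name: Shape                  # hypothetical supertype
  abstract: true               # its factory delegates to a subtype's factory
  discriminator: /kind         # JSON Pointer evaluated against the object being parsed
  fields:
    label: {}
- name: Circle
  extensionOf: Shape
  discriminatorValue: circle   # chosen when /kind yields "circle"
  fields:
    radius:
      type: Number
- name: Square
  extensionOf: Shape           # no discriminatorValue: defaults to the simple name, Square
  fields:
    side:
      type: Number
- name: Person
  renames:
    isHasChildren: hasChildren # replaces the default method name isHasChildren
  fields:
    hasChildren:
      type: Boolean
```

With these declarations, an object whose kind property is circle, presented where a Shape is expected, would presumably be handed to the Circle factory.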

Field Properties

| Property Name | Default Value | Description |
| --- | --- | --- |
| name | last component of field declaration property name [6] | The name of this field. It should begin with a capital letter and consist of characters that are valid for Java identifiers. This name will be used in the generated code as a type name, and it may also appear with its initial letter downcased as a method parameter. Avoid names that would cause collisions or that would end up as reserved words when down-cased (e.g. "Interface"). The generator currently does not detect such problems, and bizarre errors may result. |
| type | String or declared type | This is the type of the field. If the field name corresponds to the name of a declared type (including the type containing this field), then this field’s type defaults to that declared type. Otherwise it defaults to String. |
| structure | scalar | This must be one of scalar, collection, or map. This controls whether the parser expects to see a single value of this field’s type, a JSON array of such values, or a JSON object whose values are of that type, in the source document. |
| plural | name with "s" appended | The plural form of this name, used in generated method signatures. |
| keyName | "name" | When generating methods for map fields, this name will be used for parameters in the method signature corresponding to the map key. E.g. by default a map field named Foo of type Integer will have a method whose declaration is public Integer getFoo(String name). Specifying keyName: sigil will result in public Integer getFoo(String sigil) instead. |
| noImpl | false | This will prevent the generator from generating any code for this field in the implementation class. However, methods will still be declared in the interface. This implies that the missing methods will need to be manually added by the developer. This can be used to declare methods for "virtual" fields - fields that really represent values that are contained within other values reachable from the containing type. Imagine, for example, that our Person type were extended with a Parents field that is a list of Persons. One might declare noImpl fields named Father and Mother to ensure that corresponding convenience methods are available in the generated API, but the parser developer would need to add implementations, making use of the existing Parents-related methods. |
| boolDefault | false | This is the value assumed for a field of type Boolean when the corresponding value in the source document is missing. The generated is… method for that field will return this value, while get… will return null. |
| parentPath | property name of field declaration | Normally, as described above, the parent path of a field is the field declaration’s property name within its containing type’s fields map. Probably the only reason to override this is when the parent path should be empty, i.e. the parent object itself is the source of this field’s value. This probably sounds loony, but it is essential when one wishes to parse a JSON object into a declared type and a map for additional properties beyond the fixed fields declared for the type. See Comingling Maps In a Typed Object. |
| keyPattern | none | If specified for a map field, this regular expression will be used to filter properties from the parsed JSON object, so that only properties whose names match the pattern will appear in the map. Again, see Comingling Maps In a Typed Object for the most common use of this. |
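
A sketch combining several of these field properties on the Person type. The parents, father, and mother fields follow the noImpl description above; the nicknames and active fields and the keyName value are hypothetical.

```yaml
name: Person
fields:
  name: {}
  parents:
    type: Person               # field name defaults to Parent, plural Parents
    structure: collection
  father:
    type: Person
    noImpl: true               # declared in the interface; implementation written by hand
  mother:
    type: Person
    noImpl: true
  nicknames:                   # hypothetical map field of Strings
    structure: map
    keyName: context           # map-key parameter is named context instead of name
  active:                      # hypothetical Boolean field
    type: Boolean
    boolDefault: true          # value the is... method returns when the property is absent
```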


1. Well, technically not. In the current implementation, properties are not really even declared for fields in the generated Java class; instead, field values are maintained in a private map declared in the PropertiesOverlay type, which all the generated type implementations extend.
2. The example top-level document, when parsed, would give rise to a Java object of some other type that would also be declared in the types file, e.g. a type named Club, with a field named members.
3. Map values use LinkedHashMap for their representation, so property order is preserved.
4. We use the Jackson parser to consume JSON and YAML documents, and when parsing an Object field, we use Jackson’s ObjectMapper#convertValue(JsonNode, Class<?>) method, specifying Object.class as the target of the conversion.
5. String is the default type when a field’s name is not recognized as a declared type, so it is omitted here, leaving an empty declaration, for which YAML syntax requires an explicit empty object: {}
6. The first character is upper-cased, and for list and map fields, a trailing s character is dropped.