This page documents the API of Ohm/JS, a JavaScript library for working with grammars written in the Ohm language. For documentation on the Ohm language, see the syntax reference.
ohm.grammar(source: string, optNamespace?: object) → Grammar
Instantiate the Grammar defined by source
. If specified, optNamespace
is the Namespace to use when resolving external references in the grammar. For more information, see the documentation on Namespace objects below.
ohm.grammarFromScriptElement(optNode?: Node, optNamespace?: object) → Grammar
Convenience method for creating a Grammar instance from the contents of a <script>
tag. optNode
, if specified, is a script tag with the attribute type="text/ohm-js"
. If it is not specified, the result of document.querySelector('script[type="text/ohm-js"]')
will be used instead. optNamespace
has the same meaning as in ohm.grammar
.
ohm.grammars(source: string, optNamespace?: object) → Namespace
Create a new Namespace containing Grammar instances for all of the grammars defined in source
. If optNamespace
is specified, it will be the prototype of the new Namespace.
ohm.grammarsFromScriptElements(optNodeList?: NodeList, optNamespace?: object) → Namespace
Create a new Namespace containing Grammar instances for all of the grammars defined in the <script>
tags in optNodeList
. If optNodeList
is not specified, the result of document.querySelectorAll('script[type="text/ohm-js"]')
will be used. optNamespace
has the same meaning as in ohm.grammars
.
When instantiating a grammar that refers to another grammar -- e.g. MyJava <: Java { keyword += "async" }
-- the supergrammar name ('Java') is resolved to a grammar by looking up the name in a Namespace. In Ohm/JS, Namespaces are plain old JavaScript objects, and an object literal like {Java: myJavaGrammar}
can be passed to any API that expects a Namespace. For convenience, Ohm also has the following methods for working with namespaces:
ohm.namespace(optProps?: object)
Create a new namespace. If optProps
is specified, all of its properties will be copied to the new namespace.
ohm.extendNamespace(namespace: object, optProps?: object)
Create a new namespace which inherits from namespace
. If optProps
is specified, all of its properties will be copied to the new namespace.
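Because a Namespace is a plain object, the behaviour described for these two helpers can be modelled with ordinary prototype inheritance. The following is a minimal sketch of that idea, not Ohm's implementation: copyProps, makeNamespace and extendNamespaceSketch are hypothetical names, and placeholder strings stand in for Grammar instances.

```javascript
// Copy the own enumerable properties of optProps (if given) onto target.
function copyProps(target, optProps) {
  if (optProps) {
    Object.keys(optProps).forEach(function(key) {
      target[key] = optProps[key];
    });
  }
  return target;
}

// Hypothetical stand-in for ohm.namespace: a new object with optProps copied in.
function makeNamespace(optProps) {
  return copyProps({}, optProps);
}

// Hypothetical stand-in for ohm.extendNamespace: the new object inherits from
// `namespace`, so names it does not define itself are found on the parent.
function extendNamespaceSketch(namespace, optProps) {
  return copyProps(Object.create(namespace), optProps);
}

// Placeholder strings stand in for Grammar instances.
var base = makeNamespace({Java: 'javaGrammar'});
var derived = extendNamespaceSketch(base, {MyJava: 'myJavaGrammar'});

console.log(derived.Java);         // found through the prototype chain
console.log(Object.keys(derived)); // only the names added to derived itself
console.log('MyJava' in base);     // false: the parent namespace is unchanged
```

With this model, a grammar added to the derived namespace never leaks into the namespace it inherits from.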
A Grammar instance g
has the following methods:
g.match(obj: string|object, optStartRule?: string) → MatchResult
Try to match obj
against g
, returning a MatchResult. If optStartRule
is given, it specifies the rule on which to start matching. By default, the start rule is inherited from the supergrammar, or if there is no supergrammar specified, it is the first rule in g
's definition.
g.trace(obj: string|object, optStartRule?: string) → Trace
Try to match obj
against g
, returning a Trace object. optStartRule
has the same meaning as in g.match
. Trace objects have a toString()
method, which returns a string that summarizes each parsing step (useful for debugging).
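One way to combine the two methods is to match first and request a trace only when the match fails. In this sketch, explainFailure is a hypothetical helper, and g is assumed to be any object that provides the match and trace methods documented above.

```javascript
// Hypothetical helper: returns null when the input matches, otherwise the
// parsing-step summary produced by the Trace object's toString().
function explainFailure(g, input, optStartRule) {
  var result = g.match(input, optStartRule);
  if (result.succeeded()) {
    return null;
  }
  // Re-run the match with tracing, using the same start rule.
  return g.trace(input, optStartRule).toString();
}
```

Tracing only on failure keeps the common (successful) path free of the extra work of recording each parsing step.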
g.semantics() → Semantics
Create a new Semantics object for g
.
g.extendSemantics(superSemantics: Semantics) → Semantics
Create a new Semantics object for g
that inherits all of the operations and attributes in superSemantics
. g
must be a descendant of the grammar associated with superSemantics
.
Internally, a successful MatchResult contains a parse tree, which is made up of parse nodes. Parse trees are not directly exposed -- instead, they are inspected indirectly through operations and attributes, which are described in the next section.
A MatchResult instance r
has the following methods:
r.succeeded() → boolean
Return true
if the match succeeded, otherwise false
.
r.failed() → boolean
Return true
if the match failed, otherwise false
.
When r.failed()
is true
, r
has the following additional properties:
r.message: string
Contains a message indicating where and why the match failed. This message is suitable for end users of a language (i.e., people who do not have access to the grammar source).
r.shortMessage: string
Contains an abbreviated version of r.message
that does not include an excerpt from the invalid input.
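The methods and properties above can be folded into a single report object. This is a minimal sketch: reportMatch and the shape of the object it returns are hypothetical, and r is assumed to be a MatchResult as documented here.

```javascript
// Hypothetical helper: turn a MatchResult into a plain report object.
// message and shortMessage are read only after failed() returns true,
// because they are documented as properties of failed results.
function reportMatch(r) {
  if (r.failed()) {
    return {ok: false, summary: r.shortMessage, detail: r.message};
  }
  return {ok: true, summary: null, detail: null};
}
```

A caller might show summary in a status line and detail in a longer error view, since only the latter includes an excerpt of the invalid input.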
An Operation represents a function that can be applied to a successful match result. Like a Visitor, an operation is evaluated by recursively walking the parse tree, and at each node, invoking the matching semantic action from its action dictionary.
An Attribute is an Operation whose result is memoized, i.e., it is evaluated at most once for any given node.
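The difference between the two can be illustrated with a per-node cache. The sketch below only models the idea of memoization and is not how Ohm implements attributes; memoizePerNode, countLetters and the sample node object are hypothetical.

```javascript
// Wrap a per-node function so that its result is computed once per node
// and served from a cache afterwards.
function memoizePerNode(compute) {
  var cache = new WeakMap();
  return function(node) {
    if (!cache.has(node)) {
      cache.set(node, compute(node));
    }
    return cache.get(node);
  };
}

var evaluations = 0;
function countLetters(node) {
  evaluations += 1;
  return node.text.length;
}

var asOperation = countLetters;                 // re-evaluated on every use
var asAttribute = memoizePerNode(countLetters); // evaluated at most once per node

var sampleNode = {text: 'ohm'}; // placeholder object, not a real parse node
asOperation(sampleNode);
asOperation(sampleNode); // evaluations is now 2
asAttribute(sampleNode);
asAttribute(sampleNode); // evaluations is now 3, the second call hit the cache
```

Memoization pays off when a value is needed many times for the same node, and it is only appropriate when the computation has no side effects that must be repeated.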
A Semantics is a family of operations and/or attributes for a given grammar. A grammar may have any number of Semantics instances associated with it -- this means that the clients of a grammar (even in the same program) never have to worry about operation/attribute name clashes.
Operations and attributes are accessed by applying a semantics instance to a MatchResult.
This returns a parse node, whose properties correspond to the operations and attributes of the semantics. For example, to invoke an operation named 'prettyPrint': mySemantics(matchResult).prettyPrint()
. Attributes are accessed using property syntax -- e.g., for an attribute named 'value': mySemantics(matchResult).value
.
A Semantics instance s
has the following methods, which all return this
so they can be chained:
s.addOperation(name: string, actionDict: object) → Semantics
Add a new Operation named name
to this Semantics, using the semantic actions contained in actionDict
. It is an error if there is already an operation or attribute called name
in this semantics.
s.addAttribute(name: string, actionDict: object) → Semantics
Exactly like s.addOperation
, except it will add an Attribute to the semantics rather than an Operation.
s.extendOperation(name: string, actionDict: object) → Semantics
Extend the Operation named name
with the semantic actions contained in actionDict
. name
must be the name of an operation in the super semantics.
s.extendAttribute(name: string, actionDict: object) → Semantics
Exactly like s.extendOperation
, except it will extend an Attribute of the super semantics rather than an Operation.
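Since each of these methods returns the Semantics it was called on, several definitions can be chained in one expression. The sketch below assumes a Semantics for a grammar with rules named FullName and name (the Name grammar used later on this page). The helper installNameSemantics, the operation name 'display' and the attribute name 'initials' are hypothetical, and the sketch further assumes that an Interval exposes the consumed text as a contents property.

```javascript
// Hypothetical helper: adds one operation and one attribute to `s` and
// returns `s`, relying on the documented "returns this" behaviour.
function installNameSemantics(s) {
  return s
      .addOperation('display', {
        // Operations are invoked as methods on the child nodes.
        FullName: function(firstName, lastName) {
          return lastName.display() + ', ' + firstName.display();
        },
        name: function(parts) {
          return this.interval.contents; // assumption: Interval has `contents`
        }
      })
      .addAttribute('initials', {
        // Attributes are read as properties on the child nodes.
        FullName: function(firstName, lastName) {
          return firstName.initials + lastName.initials;
        },
        name: function(parts) {
          return this.interval.contents.charAt(0);
        }
      });
}
```

Given a semantics produced by g.semantics(), the values would then be read as installNameSemantics(s)(matchResult).display() and installNameSemantics(s)(matchResult).initials, following the access rules described above.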
A semantic action is a function that computes the value of an operation or attribute for a specific type of node in the parse tree. There are three different types of parse nodes:
- Rule application, or non-terminal nodes, which correspond to rule application expressions
- Terminal nodes, for string and number literals, and keyword expressions
- Iteration nodes, which are associated with expressions inside a repetition operator (*, +, and ?)
Generally, you write a semantic action for each rule in your grammar, and store them together in an action dictionary. For example, given the following grammar:
Name {
FullName = name name
name = (letter | "-" | ".")+
}
A set of semantic actions for this grammar might look like this:
var actions = {
FullName: function(firstName, lastName) { ... },
name: function(parts) { ... }
};
The value of an operation or attribute for a node is the result of invoking the node's matching semantic action. In the grammar above, the body of the FullName
rule produces two values -- one for each application of the name
rule. The values are represented as parse nodes, which are passed as arguments when the semantic action is invoked. An error is thrown if the function arity does not match the number of values produced by the expression.
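The arity rule can be pictured as a comparison between a function's declared parameter count and the number of values its rule body produces. The check below is a hypothetical sketch, not Ohm's own code, and the wording of the error message is invented for illustration.

```javascript
// Hypothetical check: compare an action's declared parameter count
// (Function.prototype.length) with the number of values its rule produces.
function checkArity(ruleName, action, expectedArity) {
  if (action.length !== expectedArity) {
    throw new Error(
        'Semantic action ' + ruleName + ' has arity ' + action.length +
        ', expected ' + expectedArity);
  }
}

// FullName = name name  produces two values, so its action takes two parameters.
checkArity('FullName', function(firstName, lastName) {}, 2);

// name = (letter | "-" | ".")+  produces one value (a single iteration node).
checkArity('name', function(parts) {}, 1);

// A one-parameter action for FullName does not match and is rejected.
var arityError = null;
try {
  checkArity('FullName', function(onlyOne) {}, 2);
} catch (e) {
  arityError = e.message;
}
```

Note that a repetition such as (...)+ counts as one value regardless of how many times it matched, because all of its matches are collected under a single iteration node.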
The matching semantic action for a particular node is chosen as follows:
- On a rule application node, first look for a semantic action with the same name as the rule (e.g., 'FullName'). If the action dictionary does not have a property with that name, use the action named '_nonterminal', if it exists. If not, the default action is used, which returns the result of applying the operation or attribute to the node's only child. There is no default action for non-terminal nodes that have no children, or more than one child.
- On a terminal node (e.g., a node produced by the parsing expression "-"), use the semantic action named '_terminal'. If the action dictionary does not have a property with that name, the default action returns the node's primitive value.
- On an iteration node (e.g., a node produced by the parsing expression (letter | "-" | ".")+), use the semantic action named '_iter'. If the action dictionary does not have a property with that name, the default action returns an array containing the results of applying the operation or attribute to each child node.
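The lookup rules above can be summarized as a small decision function. This sketch only models the selection step and does not invoke any action. chooseAction, fakeNode and the 'default:...' labels are hypothetical, and the nodes are plain objects that mimic the documented ctorName, numChildren, isTerminal() and isIteration() members.

```javascript
// Returns the name of the semantic action that would be used for `node`,
// or a 'default:...' label when one of the default actions applies.
function chooseAction(node, actionDict) {
  function has(name) {
    return Object.prototype.hasOwnProperty.call(actionDict, name);
  }
  if (node.isTerminal()) {
    return has('_terminal') ? '_terminal' : 'default:primitiveValue';
  }
  if (node.isIteration()) {
    return has('_iter') ? '_iter' : 'default:mapChildren';
  }
  // Rule application node: rule name first, then '_nonterminal'.
  if (has(node.ctorName)) {
    return node.ctorName;
  }
  if (has('_nonterminal')) {
    return '_nonterminal';
  }
  // The default action exists only for nodes with exactly one child.
  if (node.numChildren === 1) {
    return 'default:onlyChild';
  }
  throw new Error('No semantic action for ' + node.ctorName);
}

// Plain-object stand-in for a parse node; kind is 'terminal', 'iteration',
// or omitted for a rule application node.
function fakeNode(ctorName, numChildren, kind) {
  return {
    ctorName: ctorName,
    numChildren: numChildren,
    isTerminal: function() { return kind === 'terminal'; },
    isIteration: function() { return kind === 'iteration'; }
  };
}

var partialActions = {FullName: function(firstName, lastName) {}};
var forFullName = chooseAction(fakeNode('FullName', 2), partialActions); // 'FullName'
var forName = chooseAction(fakeNode('name', 1), partialActions); // 'default:onlyChild'
```

With partialActions, a FullName node uses its own action, while a name node falls through to the single-child default because no 'name' or '_nonterminal' action is present.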
Each parse node is associated with a particular parsing expression (a fragment of an Ohm grammar), and the node captures any input that was successfully parsed by that expression. Unlike many parsing frameworks, Ohm does not have a syntax for binding/capturing -- every parsing expression captures all the input it consumes, and produces a fixed number of values.
A node n
has the following methods and properties:
n.child(idx: number) → Node
Get the child at index idx
.
n.isTerminal() → boolean
true
if the node is a terminal node, otherwise false
.
n.isIteration() → boolean
true
if the node is an iteration node (i.e., if it is associated with a repetition operator in the grammar), otherwise false
.
n.children: Array
An array containing the node's children.
n.ctorName: string
The name of the grammar rule that created the node.
n.interval: Interval
Captures the portion of the input that was consumed by the node.
n.numChildren: number
The number of child nodes that the node has.
n.isOptional() → boolean
true
if the node is an iteration node produced by the ? operator (i.e., it has at most one child), otherwise false
.
n.primitiveValue: number|string|...
For a terminal node, the raw value that was consumed from the input stream.
In addition to the properties listed above, within a given semantics, every node also has a method/property corresponding to each operation/attribute in the semantics. For example, in a semantics that has an operation named 'prettyPrint' and an attribute named 'freeVars', every node has a prettyPrint()
method and a freeVars
property.
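The members listed above are enough to walk a subtree by hand, for example from inside a semantic action. The helper below is a hypothetical sketch that uses only isTerminal(), isIteration(), ctorName, primitiveValue, numChildren and child(); the name outline and the output layout are illustrative choices.

```javascript
// Hypothetical helper: produce an indented, one-line-per-node outline of the
// subtree rooted at `n`.
function outline(n, optDepth) {
  var depth = optDepth || 0;
  var indent = '  '.repeat(depth);
  if (n.isTerminal()) {
    // Terminal nodes are shown by the raw value they consumed.
    return indent + JSON.stringify(n.primitiveValue) + '\n';
  }
  var label = n.isIteration() ? '(iteration)' : n.ctorName;
  var text = indent + label + '\n';
  for (var i = 0; i < n.numChildren; i++) {
    text += outline(n.child(i), depth + 1);
  }
  return text;
}
```

Because parse trees are not exposed directly, a function like this would typically receive its node from a semantic action rather than from a MatchResult.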