WebIDL parser for Guile
[EXPERIMENTAL]
Code for parsing WebIDL specs in Guile Scheme.
This project is at an early stage of development, and the API may change often.
I wrote this to help explore the possibility of using WebIDL to generate bindings for Scheme/Lisp implementations that compile to WASM (particularly Guile Hoot).
See the LICENSE file for the specific terms that apply to this project. Note that for any copyright year range specified as YYYY-ZZZZ in this package, the range specifies every single year in that closed interval.
Current state
The code was tested with reference IDL files available at .
Currently only two files fail, due to a deviation from the spec (see the notes below).
API
Here is an overview of the most important procedures for using the parser. For more details, please consult the documentation in the corresponding modules.
(webidl parse)
(parse-webidl-file <PATH> #:simplify? [default #t] #:output-parse-tree? [default #f])
(parse-webidl-string <string> #:simplify? [default #t] #:output-parse-tree? [default #t])
Setting #:output-parse-tree? to #t returns a parse tree instead of a list
of model definitions (see (webidl model)).
To get a raw parse tree corresponding to WebIDL's grammar, set #:simplify? to #f.
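As a minimal sketch of the parse-tree mode, the snippet below requests a parse tree for a one-line typedef and pretty-prints it. It only uses names mentioned in this README, plus Guile's (ice-9 pretty-print). Wrapping the call in a fresh model table mirrors the example call further down and is an assumption about what the parser needs, not a documented requirement.

```scheme
(use-modules (webidl parse)
             (webidl model)
             (ice-9 pretty-print))

;; Ask for a parse tree instead of model definitions.
;; Assumption: a fresh model table is set up as in the README's example call.
(define tree
  (parameterize ((current-model-table (make-model-table)))
    (parse-webidl-string "typedef Promise<(DOMString or Blob)> data;"
                         #:output-parse-tree? #t)))

;; Print the s-expression in a readable layout.
(pretty-print tree)
```

Passing #:simplify? #f as well should show the unsimplified tree for comparison.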
(webidl model)
Provides constructors and accessors for IDL datatypes such as interfaces and dictionaries. See the procedures exported in the module definition.
Initialization
- current-model-table [parameter]: collection of models found by the parse procedures.
- (make-model-table): create a new model table. Use this to initialize current-model-table.
Finding and iteration
Procedures for finding and iterating over definitions. The keyword argument
kind (one of dictionary, interface, namespace, typedef, or #f) can be
used to restrict the search to a specific kind of definition.
See the docstrings for more information.
(find-definition name #:key (kind #f))
(fold-definitions proc init #:key (kind #f))
(for-each-definition proc #:key (kind #f))
Example call:
;; Modules named in this README: the parser and the model accessors.
(use-modules (webidl parse)
             (webidl model))

(parameterize ((current-model-table (make-model-table)))
  (parse-webidl-string
   "typedef Promise<(DOMString or Blob)> data;
    [Exposed=(Window,Worker),
     Serializable]
    interface DOMRect : DOMRectReadOnly {
      constructor(optional unrestricted double x = 0,
                  optional unrestricted double y = 0,
                  optional unrestricted double width = 0,
                  optional unrestricted double height = 0);
      [NewObject, Random] static DOMRect fromRect(optional DOMRectInit other = {});
      inherit attribute unrestricted double x;
      inherit attribute unrestricted double y;
      inherit attribute unrestricted double width;
      inherit attribute unrestricted double height;
    };")
  ;; Look up the parsed interface in the model table and print two fields.
  (let ((intf (find-definition "DOMRect" #:kind 'interface)))
    (format #t "interface name: ~s~%interface-parent: ~s~%"
            (interface-name intf)
            (interface-parent intf))))
> interface name: "DOMRect"
> interface-parent: "DOMRectReadOnly"
Understanding resulting tree
Following a strategy similar to Guile's built-in PEG parsing library,
the resulting tree keeps optional parts of the grammar as symbols instead
of sublists. For example, a definitions tree consists of the tag followed
by a list of entries, where each entry consists of an
extended-attribute-list and a definition. Since extended-attribute-list is
optional, it may be either a bare symbol or a list tagged with
extended-attribute-list.
(definitions
 ((extended-attribute-list
   (extended-attribute-no-args (identifier . "SecureContext")))
  (definition
    ...))
 (extended-attribute-list
  (definition
    ...))
 (extended-attribute-list
  (definition
    ...))
 ((extended-attribute-list
   (extended-attribute-no-args (identifier . "SecureContext"))
   (extended-attribute-ident-list
    (identifier . "Exposed")
    (identifier-list (identifier . "Window") (identifier . "Worker"))))
  (definition
    ...)))
Here four definitions were found. The first and fourth have
extended-attribute-lists; the second and third do not. Keeping the symbols
in place makes it easier to write code that processes this tree, since every
entry has the same number of elements.
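Because every entry has two elements, a consumer can tell the two cases apart with simple pattern matching. The following is a minimal sketch using Guile's (ice-9 match) on a hand-written tree shaped like the one above. The names sample-tree, entry-attributes, and attribute-counts are hypothetical helpers for illustration and are not part of this library, and the definition bodies are placeholders.

```scheme
(use-modules (ice-9 match))

;; Hypothetical sample tree shaped like the output shown above;
;; the definition bodies are stand-ins.
(define sample-tree
  '(definitions
     ((extended-attribute-list
       (extended-attribute-no-args (identifier . "SecureContext")))
      (definition interface-a))
     (extended-attribute-list
      (definition interface-b))))

;; Return the extended attributes of an entry, or '() when the optional
;; part was left as the bare symbol extended-attribute-list.
(define (entry-attributes entry)
  (match entry
    (('extended-attribute-list _) '())
    ((('extended-attribute-list attrs ...) _) attrs)))

;; Number of extended attributes per entry, in source order.
(define attribute-counts
  (map (lambda (entry) (length (entry-attributes entry)))
       (cdr sample-tree)))
```

With this sample, attribute-counts evaluates to (1 0): the first entry carries one attribute, and the second carries only the placeholder symbol.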
Development
Most of the parsing code is generated from the WebIDL grammar.
The script scripts/generate-webidl-parser.scm reads the grammar
grammars/webidl.txt (copied and slightly modified from
) and, with help of
scripts/parse-webidl-grammar.scm, generates the file
webidl/parse-generated.scm.
The resulting code is a recursive descent parser for IDL files. Why
not use the PEG library in Guile, as parse-webidl-grammar does? I
tried that first, but had some difficulties dealing with the results and
some corner cases. Besides, I wanted to have fun exploring ways of
writing an LL(1) parser in Scheme.
Debugging
You can get debugging info by setting the parameter debug-parser to #t
(from (webidl util)). This will show rules that get applied, their results
and also tokens read, so one can see, for example, where the parser stopped
in the token stream.
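A minimal sketch of turning on the debug output for a single parse, using only names mentioned in this README. Setting up a fresh model table mirrors the example call in the API section and is an assumption about what the parser needs.

```scheme
(use-modules (webidl parse)
             (webidl model)
             (webidl util))

;; debug-parser is a parameter, so the setting only applies inside
;; the dynamic extent of this parameterize form.
(parameterize ((current-model-table (make-model-table))
               (debug-parser #t))
  (parse-webidl-string "typedef Promise<(DOMString or Blob)> data;"))
```

Using parameterize rather than setting the parameter globally keeps the debug output limited to the input being investigated.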
Modules
Modules for external use:
- (webidl parse) has procedures that consume the token stream and produce a model, or a parse tree as an s-expression.
- (webidl model) provides data structures to model an IDL specification.
Modules for internal use:
- (webidl lex) reads an input port and produces a token stream tailored for IDL specs.
- (webidl simplify) does some simplifications of the resulting parse tree. For instance, a tree of the form (argument-list (arguments (argument ...) (arguments (argument ...) arguments))) is transformed into the tree (argument-list (argument ...) (argument ...) ...).
- (webidl parse-to-model) defines tree->model, which converts a parse tree into a model.
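To make the argument-list simplification concrete, here is a small self-contained sketch of that kind of flattening. It is not the code in (webidl simplify). The procedures flatten-arguments and flatten-argument-list are hypothetical names written for this illustration, and the argument payloads are placeholders.

```scheme
(use-modules (ice-9 match))

;; Collect the (argument ...) nodes from a right-nested `arguments` chain.
;; The bare symbol `arguments` marks the empty tail of the chain.
(define (flatten-arguments node)
  (match node
    ('arguments '())
    (('arguments argument rest)
     (cons argument (flatten-arguments rest)))))

;; Rewrite (argument-list <nested arguments>) into a flat list of arguments.
(define (flatten-argument-list tree)
  (match tree
    (('argument-list nested)
     (cons 'argument-list (flatten-arguments nested)))))

(define flattened
  (flatten-argument-list
   '(argument-list
     (arguments (argument "x")
                (arguments (argument "y") arguments)))))
```

Here flattened evaluates to (argument-list (argument "x") (argument "y")), which has the flat shape described in the list above.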
Grammar problems
I found some issues while developing the parser.
ExtendedAttribute specific parsing (not LL1)
As noted in the original WebIDL spec, ExtendedAttribute is handled specially in practice, which conflicts with the grammar in the spec. That part therefore gets a custom parser.
Empty interface declaration
Some IDL files I found in the wild provide bodyless interface declarations. I changed the WebIDL grammar slightly to accept them, though I may revert that in the future.
stringifier without attribute
Also found in the wild (Selection.webidl from gecko) are stringifiers
without the attribute keyword. I changed the grammar to accept them, and
may revert that in the future.
#ifdef
Many IDL files (particularly from Gecko) contain #ifdef directives. Currently
I don't provide special handling for them, and parsing those files
produces an incomplete parse tree.
non 1-lookahead property:
PartialInterfaceMember/Operation and PartialInterfaceMember/AsyncIterable
break the 1-lookahead property. I solved the parsing problems I saw by trying
AsyncIterable first, but that may not be good enough.
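The idea of trying one alternative before another can be sketched as an ordered choice over parser procedures. This is a toy illustration, not the generated parser. The names try-in-order, parse-async-iterable, parse-operation, and parse-member, the token representation (a list of strings), and the simplified rules are all assumptions made for this example.

```scheme
(use-modules (ice-9 match))

;; A parser here is a procedure from a token list to either #f (no match)
;; or a pair (result . remaining-tokens). Token lists are immutable, so
;; falling back to the next alternative needs no explicit rewinding.
(define (try-in-order . parsers)
  (lambda (tokens)
    (let loop ((ps parsers))
      (and (pair? ps)
           (or ((car ps) tokens)
               (loop (cdr ps)))))))

;; Toy stand-in: matches only when the stream starts with "async" "iterable".
(define (parse-async-iterable tokens)
  (match tokens
    (("async" "iterable" rest ...) (cons 'async-iterable rest))
    (_ #f)))

;; Toy stand-in: treats any two leading tokens as a return type and a name.
(define (parse-operation tokens)
  (match tokens
    ((type name rest ...) (cons (list 'operation type name) rest))
    (_ #f)))

;; The more specific alternative goes first, since the second would also
;; accept its input.
(define parse-member
  (try-in-order parse-async-iterable parse-operation))
```

In this sketch (parse-member '("async" "iterable" "<" "long" ">")) yields (async-iterable "<" "long" ">"), while (parse-member '("long" "getValue" "(" ")")) falls through to the operation rule and yields ((operation "long" "getValue") "(" ")"). The order matters because the operation stand-in would otherwise consume the first input as well.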
Constructor inside partial interface
According to the grammar, partial interfaces don't have constructors. Unfortunately some files in the wild have that. I decided to stick to the spec for now and disallow them.