3:2

json-parsing: JSON Parsing, Folding, and Conversion

Neil Van Dyke

License: LGPLv3 Web: http://www.neilvandyke.org/racket/json-parsing/

1 Introduction

The json-parsing package for Racket provides JSON parsing and format conversion using a streaming tree fold. This tree fold approach permits processing JSON input of arbitrary size in relatively small space for some applications, unlike the common approach of parsing the entire input to an AST before processing the AST.
The supported JSON format is as specified on http://json.org/, as viewed on 2010-12-25.
The format converters in this package include a converter to the SJSON s-expression format. SJSON was made to be fully compatible with the jsexpr of Dave Herman's PLaneT package dherman/json:3:=0.
The parser does not consume any characters that do not belong to the JSON value, so it can be used to read multiple JSON values, or be intermixed with other kinds of reading, from the same input.
The tree fold approach of this package's parser was inspired and informed by Oleg Kiselyov's SSAX XML parsing work.
Implementing the json-parsing package was originally intended as an exercise for getting more experience with SSAX-like folding, before undertaking some new XML packages, but the JSON work has turned out useful in its own right. A future version of this package might also implement alternative tree fold approaches.

2 Exceptions

When the parser encounters invalid JSON, it raises an exn:fail:invalid-json exception. While this exception will be caught by handlers such as exn:fail?, the distinct exception type permits JSON-parsing errors to be handled separately from other errors, and it also includes some location information.

procedure

(exn:fail:invalid-json? x) → boolean?

  x : any/c
Type predicate.

procedure

(exn:fail:invalid-json-location exn)
 → (list/c (or/c exact-positive-integer? #f)
           (or/c exact-nonnegative-integer? #f)
           (or/c exact-positive-integer? #f))

  exn : exn:fail:invalid-json?
Gets information on the location of the error within the input stream. Currently, this is a list of three elements, holding the three values returned by Racket's port-next-location procedure.
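As an illustration of handling JSON errors separately from other errors, here is a minimal sketch. It assumes the library is loaded with (require json-parsing), and parse-or-report is a hypothetical helper name, not part of the package. The sketch uses the json->sjson procedure documented in the Conversion section.

```racket
#lang racket/base
(require json-parsing) ; assumed module path for this package

;; parse-or-report is a hypothetical helper, not part of json-parsing.
;; It returns the parsed SJSON on success, or a list describing the
;; failure when the input is not valid JSON.
(define (parse-or-report str)
  (with-handlers ((exn:fail:invalid-json?
                   (lambda (e)
                     (list 'invalid-json
                           (exn-message e)
                           (exn:fail:invalid-json-location e)))))
    (json->sjson str)))

;; "oops" is not a JSON value, so this call takes the handler branch.
(parse-or-report "[1, 2, oops]")
```

Because the handler tests only exn:fail:invalid-json?, any other exception (for example, one raised by a port) still propagates to the caller.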

3 Parse Fold

syntax

(json-fold-lambda
 #:error-name         error-name-symbol
 #:visit-object-start visit-object-start-proc
 #:visit-object-end   visit-object-end-proc
 #:visit-member-start visit-member-start-proc
 #:visit-member-end   visit-member-end-proc
 #:visit-array-start  visit-array-start-proc
 #:visit-array-end    visit-array-end-proc
 #:visit-string       visit-string-proc
 #:visit-number       visit-number-proc
 #:visit-constant     visit-constant-proc)
Special syntax that expands to a JSON parser procedure. Normally you would use this to define a new kind of processing for the parser to perform while it parses JSON. The procedure produced by this syntax takes the arguments:

  (in seed exhaust?)

where in is an input port or string, seed is a seed value, and exhaust? is whether to consume all remaining input and ensure that it contains nothing other than JSON whitespace.
json-fold-lambda has many arguments, all of which must be present. Here is an example of how you might define a my-json-to-sjson procedure using json-fold-lambda:
(define my-json-to-sjson
  (json-fold-lambda
   #:error-name         'my-json-to-sjson
   #:visit-object-start (lambda (seed)
                          (make-hasheq))
   #:visit-object-end   (lambda (seed parent-seed)
                          `(,seed ,@parent-seed))
   #:visit-member-start (lambda (name seed)
                          '())
   #:visit-member-end   (lambda (name seed parent-seed)
                          (hash-set! parent-seed
                                     (string->symbol name)
                                     (car seed))
                          parent-seed)
   #:visit-array-start  (lambda (seed)
                          '())
   #:visit-array-end    (lambda (seed parent-seed)
                          `(,(reverse seed) ,@parent-seed))
   #:visit-string       (lambda (str seed)
                          `(,str ,@seed))
   #:visit-number       (lambda (num seed)
                          `(,num ,@seed))
   #:visit-constant     (lambda (name seed)
                          `(,(case name
                               ((true)  #t)
                               ((false) #f)
                               ((null)  #\nul)
                               (else    (error 'my-json-to-sjson
                                               "invalid constant ~S"
                                               name)))
                            ,@seed))))
As you can see, the arguments provide a set of procedures that are applied at various states in the parsing. Each of these callback procedures accepts at least one seed value from its preceding sibling and/or parent, and it produces a seed value for the next sibling, child, or parent.
The concepts object, member, and array are non-leaf nodes in the tree. The start callback for each non-leaf node receives a seed from its preceding sibling, and the value it produces is the seed for its first child. The end callback receives both the seed from the last child, and the parent seed (the sibling predecessor seed of the node; the same seed received by the corresponding start).
The leaf nodes each simply receive a seed from the sibling predecessor callback (or, if the first sibling, from the parent start; or, if the first callback, from the seed provided to the parser call), and provide one to the sibling successor (or, if the last sibling, to the parent end; or, if the last callback, to the result of the parser call).
Note that two different techniques are used above to build collections of objects during processing, using seeds. The first is to use a hash that is passed in the seed, which in this case is used because SJSON requires a hash as part of its format. The second, and more common, is to construct lists by incrementally consing onto the front of the list, so that the list is ordered backwards, and waiting until the list is finished to put it in the correct order using the reverse procedure.
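The list-building technique can be traced by hand, without running the parser. The following sketch uses only racket/base. It defines stand-in procedures shaped like the array and number callbacks of the my-json-to-sjson example, and applies them in the order the parser would be expected to for the input [1, 2, 3]. The names s0 through s3, parent-seed, and result are illustrative only.

```racket
#lang racket/base

;; Stand-ins with the same bodies as the example's callbacks.
(define (visit-array-start seed) '())
(define (visit-number num seed) (cons num seed))
(define (visit-array-end seed parent-seed)
  (cons (reverse seed) parent-seed))

;; Hand-simulated callback sequence for the JSON text [1, 2, 3],
;; starting from an empty parent seed.
(define parent-seed '())
(define s0 (visit-array-start parent-seed)) ; '()
(define s1 (visit-number 1 s0))             ; '(1)
(define s2 (visit-number 2 s1))             ; '(2 1)
(define s3 (visit-number 3 s2))             ; '(3 2 1), backwards

;; The end callback reverses the children once, then conses the
;; finished array onto the seed the array itself was started with.
(define result (visit-array-end s3 parent-seed)) ; '((1 2 3))
```

Each element costs one cons, and the single reverse at the end restores document order.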
The parser procedure returns either the value of the last callback, or, if the end of the input is reached without a JSON value, the eof-object.
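A fold does not have to build a tree at all. As a sketch of the small-space processing mentioned in the introduction, the following parser keeps only a running total as its seed and adds every JSON number it visits. It assumes the library is loaded with (require json-parsing), and sum-json-numbers is a hypothetical name.

```racket
#lang racket/base
(require json-parsing) ; assumed module path for this package

;; sum-json-numbers is a hypothetical name. The seed is a running
;; total: non-leaf callbacks pass it through unchanged, and only
;; visit-number changes it.
(define sum-json-numbers
  (json-fold-lambda
   #:error-name         'sum-json-numbers
   #:visit-object-start (lambda (seed) seed)
   #:visit-object-end   (lambda (seed parent-seed) seed)
   #:visit-member-start (lambda (name seed) seed)
   #:visit-member-end   (lambda (name seed parent-seed) seed)
   #:visit-array-start  (lambda (seed) seed)
   #:visit-array-end    (lambda (seed parent-seed) seed)
   #:visit-string       (lambda (str seed) seed)
   #:visit-number       (lambda (num seed) (+ num seed))
   #:visit-constant     (lambda (name seed) seed)))

;; Arguments are (in seed exhaust?): the input, the initial total,
;; and whether to require that nothing but whitespace follows.
(define total
  (sum-json-numbers "{\"a\": [1, 2, {\"b\": 3}], \"c\": 4}" 0 #t))
```

Here each end callback returns the seed from its last child rather than the parent seed, so the total accumulated inside a nested object or array carries forward to the next sibling.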

procedure

(make-json-fold [#:error-name error-name]
                #:visit-object-start visit-object-start
                #:visit-object-end visit-object-end
                #:visit-member-start visit-member-start
                #:visit-member-end visit-member-end
                #:visit-array-start visit-array-start
                #:visit-array-end visit-array-end
                #:visit-string visit-string
                #:visit-number visit-number
                #:visit-constant visit-constant)
 → (->* ((or/c input-port? string?)
         any/c)
        (#:exhaust? boolean?)
        any)

  error-name : symbol? = '<make-json-fold>
  visit-object-start : (-> any/c any/c)
  visit-object-end : (-> any/c any/c any/c)
  visit-member-start : (-> symbol? any/c any/c)
  visit-member-end : (-> symbol? any/c any/c any/c)
  visit-array-start : (-> any/c any/c)
  visit-array-end : (-> any/c any/c any/c)
  visit-string : (-> string? any/c any/c)
  visit-number : (-> number? any/c any/c)
  visit-constant : (-> symbol? any/c any/c)
This is like json-fold-lambda, except it is a procedure, rather than syntax. make-json-fold can be used in the less-common case that you need to define a new parser dynamically.
Note that, in the produced procedure, the exhaust? argument is optional (defaulting to #t).
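As a sketch of defining a parser dynamically, the following hypothetical helper make-number-collector builds a new parser for whatever predicate it is given at run time. It assumes the library is loaded with (require json-parsing). The names make-number-collector, keep?, and collect-large are illustrative only.

```racket
#lang racket/base
(require json-parsing) ; assumed module path for this package

;; make-number-collector is a hypothetical helper. It returns a parser
;; that conses every JSON number satisfying keep? onto the seed and
;; ignores everything else.
(define (make-number-collector keep?)
  (make-json-fold
   #:error-name         'number-collector
   #:visit-object-start (lambda (seed) seed)
   #:visit-object-end   (lambda (seed parent-seed) seed)
   #:visit-member-start (lambda (name seed) seed)
   #:visit-member-end   (lambda (name seed parent-seed) seed)
   #:visit-array-start  (lambda (seed) seed)
   #:visit-array-end    (lambda (seed parent-seed) seed)
   #:visit-string       (lambda (str seed) seed)
   #:visit-number       (lambda (num seed)
                          (if (keep? num) (cons num seed) seed))
   #:visit-constant     (lambda (name seed) seed)))

(define collect-large (make-number-collector (lambda (n) (> n 2))))

;; In the produced procedure, exhaust? is a keyword argument.
;; The numbers are consed in reverse, so reverse them once at the end.
(define large-numbers
  (reverse (collect-large "[1, 5, {\"x\": 3}, 2]" '() #:exhaust? #t)))
```

When the set of callbacks is fixed at compile time, json-fold-lambda remains the more usual choice.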

4 Conversion

4.1 Conversion to SJSON

procedure

(json-to-sjson-visit-object-start seed) → any/c

  seed : any/c

(json-to-sjson-visit-object-end seed parent-seed) → any/c

  seed : any/c
  parent-seed : any/c

(json-to-sjson-visit-member-start name seed) → any/c

  name : symbol?
  seed : any/c

(json-to-sjson-visit-member-end name seed parent-seed) → any/c

  name : symbol?
  seed : any/c
  parent-seed : any/c

(json-to-sjson-visit-array-start seed) → any/c

  seed : any/c

(json-to-sjson-visit-array-end seed parent-seed) → any/c

  seed : any/c
  parent-seed : any/c

(json-to-sjson-visit-string str seed) → any/c

  str : string?
  seed : any/c

(json-to-sjson-visit-number num seed) → any/c

  num : number?
  seed : any/c

(json-to-sjson-visit-constant name seed) → any/c

  name : symbol?
  seed : any/c
Fold visitor procedures used by json->sjson. May also be used by other fold definitions.

procedure

(json->sjson in [#:exhaust? exhaust?]) → sjson?

  in : (or/c input-port? string?)
  exhaust? : boolean? = #t
Parse a JSON value from input port or string in, and return an SJSON parsed representation.
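As a sketch of reading more than one JSON value from a single port, the following passes #:exhaust? #f so that each call stops after one value. It assumes the library is loaded with (require json-parsing), and that SJSON represents objects as hashes with symbol keys and arrays as lists, as in the my-json-to-sjson example. The names in, first-value, second-value, and rest-of-input are illustrative only.

```racket
#lang racket/base
(require json-parsing) ; assumed module path for this package

;; One port holding two JSON values followed by non-JSON text.
(define in (open-input-string "{\"a\": 1} [2, 3] and then plain text"))

;; With #:exhaust? #f, each call parses one JSON value and does not
;; insist that the rest of the input be empty.
(define first-value  (json->sjson in #:exhaust? #f))
(define second-value (json->sjson in #:exhaust? #f))

;; Objects are assumed to come back as hashes keyed by symbols,
;; and arrays as lists.
(define a-member (hash-ref first-value 'a))

;; Whatever the parser did not consume is still available to
;; ordinary port operations.
(define rest-of-input (read-line in))
```

With the default #:exhaust? #t, the first call would instead treat the text after the first value as an error.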

4.2 Conversion to SXML

procedure

(json->sxml in [#:exhaust? exhaust?]) → sxml/xexp?

  in : (or/c input-port? string?)
  exhaust? : boolean? = #t
Parse the JSON input from input port or string in, and return in a contrived XML data format that can be processed with various SXML tools.

4.3 Conversion to XML

procedure

(write-json-as-xml in
                   [#:exhaust? exhaust?
                    #:out out]) → void?

  in : (or/c input-port? string?)
  exhaust? : boolean? = #t
  out : output-port? = (current-output-port)
Parse the JSON input from input port or string in, and write it in a contrived XML data format to output port out (which defaults to the value of the current-output-port parameter). This is mainly a demonstration of "streaming" processing that can scale to arbitrary JSON input sizes.

procedure

(json->xml in [#:exhaust? exhaust?]) → string?

  in : (or/c input-port? string?)
  exhaust? : boolean? = #t
This is like write-json-as-xml, but instead of writing to a port, it returns the XML as a string. Most people would not choose to do this.
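As a sketch of how the two procedures relate, the following captures the output of write-json-as-xml in a string port and compares it with the result of json->xml on the same input. It assumes the library is loaded with (require json-parsing). The names json-text, out, written, and returned are illustrative only, and the exact XML produced is not shown because this document does not specify it.

```racket
#lang racket/base
(require json-parsing) ; assumed module path for this package

(define json-text "{\"name\": \"x\", \"tags\": [1, 2]}")

;; Capture what write-json-as-xml writes by passing a string port
;; as the #:out argument.
(define out (open-output-string))
(write-json-as-xml json-text #:out out)
(define written (get-output-string out))

;; json->xml is described as returning the same XML as a string.
(define returned (json->xml json-text))

;; If the two procedures behave as described, the strings match.
(define same-output? (equal? written returned))
```

For large inputs, writing directly to the destination port with write-json-as-xml avoids holding the whole XML text in memory.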

5 History

  • Version 3:2 (2016-03-02)
    • Tweaked info.rkt, filenames.

  • Version 3:1 (2016-02-25)
    • Fixed deps.

  • Version 3:0 (2016-02-21)
    • Moving from PLaneT to new package system.
    • Moved unit tests to main source file.

  • Version 2:0 (2012-06-13)
    • Converted to McFly and Overeasy.

  • Version 0.3, Version 1:2 (2011-08-22)
    • Added json-to-sjson-visit- procedures.
    • Documentation fix.

  • Version 0.2, Version 1:1 (2010-12-27)
    • Added missing export.

  • Version 0.1, Version 1:0 (2010-12-26)
    • Initial release.

6 Legal

Copyright 2010–2012, 2016 Neil Van Dyke. This program is Free Software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See http://www.gnu.org/licenses/ for details. For other licenses and consulting, please contact the author.
