JSONStream – A Node.js Module You Should Know About

Last updated 3 weeks ago

Hello everyone! This is the thirteenth post in the node.js modules you should know about article series.

The first post was about dnode - the freestyle RPC library for node, the second was about optimist - the lightweight options parser for node, the third was about lazy - lazy lists for node, the fourth was about request - the swiss army knife of HTTP streaming, the fifth was about hashish - a hash combinators library, the sixth was about read - easy reading from stdin, the seventh was about ntwitter - the Twitter API for node, the eighth was about socket.io - the module that makes websockets and realtime possible in all browsers, the ninth was about redis - the best Redis client API library for node, the tenth was about express - an insanely small and fast web framework for node, the eleventh was about semver - a node module that takes care of versioning, and the twelfth was about cradle - a high-level, caching CouchDB client for node.

This time I'll introduce you to a very awesome module called JSONStream. JSONStream is written by Dominic Tarr and it parses streaming JSON.

Here is an example. Suppose you have a CouchDB view like this:

{"total_rows":129,"offset":0,"rows":[
 { "id":"change1_0.6995461115147918"
 , "key":"change1_0.6995461115147918"
 , "value":{"rev":"1-e240bae28c7bb3667f02760f6398d508"}
 , "doc":{
 "_id": "change1_0.6995461115147918"
 , "_rev": "1-e240bae28c7bb3667f02760f6398d508","hello":1}
 },
 { "id":"change2_0.6995461115147918"
 , "key":"change2_0.6995461115147918"
 , "value":{"rev":"1-13677d36b98c0c075145bb8975105153"}
 , "doc":{
 "_id":"change2_0.6995461115147918"
 , "_rev":"1-13677d36b98c0c075145bb8975105153"
 , "hello":2
 }
 },
 ...
]}

And you want to extract only the doc values from the rows. You can do it easily with JSONStream this way:

var JSONStream = require('JSONStream');
var parser = JSONStream.parse(['rows', /./, 'doc']);

This creates a stream that parses out rows.*.doc.
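
To make the rows.*.doc pattern more concrete, here is a minimal sketch of how a path pattern like this can be checked against the position of a value in the JSON tree. It is an illustration only and not JSONStream's actual implementation: matchesPattern is a hypothetical helper, and it assumes that string entries must equal the key exactly while regular expressions are tested against the key.

```javascript
// Simplified illustration of path-pattern matching.
// NOT JSONStream's real code; matchesPattern is a hypothetical helper.
function matchesPattern(pattern, path) {
  // A value matches only if it sits at exactly the depth the pattern describes.
  if (pattern.length !== path.length) return false;

  return pattern.every(function (expected, i) {
    // Array indices are numbers, so normalize every key to a string first.
    var key = String(path[i]);
    if (expected instanceof RegExp) return expected.test(key);
    return expected === key;
  });
}

var pattern = ['rows', /./, 'doc'];

console.log(matchesPattern(pattern, ['rows', 0, 'doc']));   // true
console.log(matchesPattern(pattern, ['rows', 1, 'doc']));   // true
console.log(matchesPattern(pattern, ['rows', 0, 'value'])); // false
console.log(matchesPattern(pattern, ['total_rows']));       // false
```

The regular expression /./ accepts any key that is at least one character long, which is why it behaves like a wildcard for the array indices under rows.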

Since it's a stream you have to feed it data and then have it output the data somewhere. You can do it very nicely and idiomatically in node this way:

req.pipe(parser).pipe(process.stdout);

Here is the output:

{
 _id: 'change1_0.6995461115147918',
 _rev: '1-e240bae28c7bb3667f02760f6398d508',
 hello: 1
}
{
 _id: 'change2_0.6995461115147918',
 _rev: '1-13677d36b98c0c075145bb8975105153',
 hello: 2
}

Here req is the request to the CouchDB view and parser is the JSONStream parser, and everything gets piped to process.stdout. The output, as you can see, contains only rows.*.doc. That was a really easy way to parse a JSON stream without reading the whole JSON document into memory.
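
One thing to keep in mind is that the parser emits JavaScript objects, while on recent versions of node process.stdout only accepts strings and buffers. If you hit that, a small serializing step between the two should help. Below is a minimal sketch using only node's built-in stream module. The names formatDoc and createPrinter are hypothetical helpers, and the docs stream is just a stand-in for what the parser would emit, so that the example runs without a CouchDB server.

```javascript
var stream = require('stream');

// Hypothetical helper: turn one parsed document into one line of JSON text.
function formatDoc(doc) {
  return JSON.stringify(doc) + '\n';
}

// Hypothetical helper: a Transform that accepts objects on its writable
// side and emits text on its readable side, so it can feed process.stdout.
function createPrinter() {
  return new stream.Transform({
    writableObjectMode: true,
    transform: function (doc, encoding, callback) {
      callback(null, formatDoc(doc));
    }
  });
}

// Stand-in for the parser output (assumption: one object per matched doc).
var docs = stream.Readable.from([
  { _id: 'change1_0.6995461115147918', hello: 1 },
  { _id: 'change2_0.6995461115147918', hello: 2 }
]);

docs.pipe(createPrinter()).pipe(process.stdout);
```

With a real request the chain would presumably look like req.pipe(parser).pipe(createPrinter()).pipe(process.stdout), with the rest of the code unchanged.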

You can install JSONStream through npm as always:

npm install JSONStream

JSONStream on GitHub: https://github.com/dominictarr/JSONStream.
