json-glib

This morning I described libxml2, which Odysseus uses to parse webfeeds through an abstraction layer (needed to handle RSS’s slight messiness and Atom’s power) that binds it to Odysseus’s templating language.

This evening I’ll cover the other parser I’ve implemented datamodel bindings for: json-glib. Mostly I’m just using it to test that templating language, but outside of templating I also use it to parse DuckDuckGo’s online autocompletions.

The first thing json-glib does is convert any passed GIO GInputStreams into a string and synchronize asynchronous methods (leaving me to remark how much easier it’d be to implement this in Vala than in C).

Then it feeds that text into a “JsonScanner” and emits an event to announce that parsing has started. It emits another once parsing has finished.

This scanner “lexes” the text into a sequence of leaf values and punctuation largely in one method/switch statement, with a wrapper method doing requested post-processing including skipping tokens. The raw value is stored in a C union provided by GLib.

On both ends (the text input and the tokens output) it tracks the next and current values, so it (or the parser) can “peek” at that next value.

It also provides a utility for reporting invalid tokens.
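
To make the “peek” idea concrete, here’s a minimal Python sketch of a scanner that always holds one token of lookahead. It illustrates the technique only and is not json-glib’s code: the names PeekingScanner, TOKEN_RE, peek and advance are my own assumptions, it tracks current/next on the token side only, and it leaves string escapes undecoded.

```python
import re

# Hypothetical token pattern: punctuation, strings, numbers and bare words.
TOKEN_RE = re.compile(r'''
    \s*(?:
        (?P<punct>[{}\[\]:,])
      | "(?P<string>(?:[^"\\]|\\.)*)"
      | (?P<number>-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)
      | (?P<word>[A-Za-z_]+)
    )''', re.VERBOSE)


class PeekingScanner:
    """Produces (kind, text) tokens while keeping one token of lookahead."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.current = None       # the token most recently consumed
        self.next = self._lex()   # the token that will be consumed next

    def _lex(self):
        # Only whitespace left: report end of input.
        if self.text[self.pos:].strip() == "":
            self.pos = len(self.text)
            return ("eof", None)
        match = TOKEN_RE.match(self.text, self.pos)
        if match is None:
            # Stand-in for the scanner's invalid-token reporting.
            raise ValueError(f"invalid token at offset {self.pos}")
        self.pos = match.end()
        kind = match.lastgroup
        return (kind, match.group(kind))

    def peek(self):
        """Look at the upcoming token without consuming it."""
        return self.next

    def advance(self):
        """Consume the upcoming token and lex the one after it."""
        self.current, self.next = self.next, self._lex()
        return self.current
```

A parser built on this can call peek() to decide which rule applies (say, “{” means an object follows) before committing with advance().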

From there what’s left to parse are the “objects” and “arrays”, though json-glib can also parse a containing var statement. Each of those has a dedicated method for parsing it, and together these comprise the core of the JsonParser.

These keep the outer “nodes” on the callstack whilst constructing a new one, and emit signals on starting, on ending, and for each item in that collection.

The nodes are marked immutable once they’ve been parsed.
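
As a rough illustration of that structure, here’s a small recursive-descent sketch in Python. It’s an assumption-laden toy, not json-glib’s implementation: EventParser, its method names and the on_event callback are hypothetical, the event names are only loosely modelled on the signals described above, and read-only Python containers stand in for marking nodes immutable.

```python
import re
from types import MappingProxyType

# Simplified tokenizer: findall silently skips anything it can't match.
TOKEN_RE = re.compile(
    r'[{}\[\]:,]|"(?:[^"\\]|\\.)*"|-?\d+(?:\.\d+)?|true|false|null')
LITERALS = {"true": True, "false": False, "null": None}


class EventParser:
    def __init__(self, text, on_event=lambda *args: None):
        self.tokens = TOKEN_RE.findall(text)
        self.pos = 0
        self.on_event = on_event   # hypothetical stand-in for signals

    def _peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def _take(self, expected=None):
        if self.pos >= len(self.tokens):
            raise ValueError("unexpected end of input")
        token = self.tokens[self.pos]
        if expected is not None and token != expected:
            raise ValueError(f"expected {expected!r}, got {token!r}")
        self.pos += 1
        return token

    def parse_value(self):
        token = self._peek()
        if token == "{":
            return self.parse_object()
        if token == "[":
            return self.parse_array()
        token = self._take()
        if token in LITERALS:
            return LITERALS[token]
        if token.startswith('"'):
            return token[1:-1]     # escapes are left undecoded here
        return float(token) if "." in token else int(token)

    def parse_object(self):
        self._take("{")
        self.on_event("object-start")
        members = {}               # lives on the callstack during recursion
        while self._peek() != "}":
            if members:
                self._take(",")
            key = self._take()
            if not key.startswith('"'):
                raise ValueError(f"expected a member name, got {key!r}")
            self._take(":")
            members[key[1:-1]] = self.parse_value()
            self.on_event("object-member", key[1:-1])
        self._take("}")
        self.on_event("object-end")
        return MappingProxyType(members)   # read-only once parsed

    def parse_array(self):
        self._take("[")
        self.on_event("array-start")
        elements = []              # lives on the callstack during recursion
        while self._peek() != "]":
            if elements:
                self._take(",")
            elements.append(self.parse_value())
            self.on_event("array-element", len(elements) - 1)
        self._take("]")
        self.on_event("array-end")
        return tuple(elements)     # read-only once parsed
```

Calling EventParser(text, callback).parse_value() returns the root value. Each nested collection is built in a local variable of its own parse_object or parse_array call, so the half-built outer collections simply wait on the callstack.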

The output from all this is a lightweight dynamically typed data model of json-glib’s own creation, wrapping GLib’s GHashTable/GQueue (the latter for traversing entries in source order) and GPtrArray, with numbers, booleans, nulls, and strings having an extra layer of indirection for the parser’s convenience.
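
To show why an object would pair a hash table with a queue, here’s a hedged Python sketch. OrderedObject and ScalarNode are hypothetical names of mine, with a dict standing in for the hash table and a deque for the queue. Python’s dict already preserves insertion order, so the separate queue exists purely to mirror the structure described above.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any


@dataclass
class ScalarNode:
    """Hypothetical box adding a layer of indirection around a leaf value."""
    value: Any


class OrderedObject:
    """Members live in a hash table; a queue remembers their source order."""

    def __init__(self):
        self._members = {}      # name -> node lookup (the hash table's role)
        self._order = deque()   # member names in first-seen order (the queue's role)

    def set_member(self, name, node):
        # Only a brand-new name is queued; replacing keeps the old position.
        if name not in self._members:
            self._order.append(name)
        self._members[name] = node

    def get_member(self, name):
        return self._members[name]

    def remove_member(self, name):
        del self._members[name]
        self._order.remove(name)

    def member_names(self):
        """Names in the order they first appeared."""
        return list(self._order)
```

Lookups by name go through the table, while iteration walks the queue, so serializing such an object back out can reproduce the order of the original text.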

Sidenote: GLib implements the same collections as libgee, but libgee could offer a nicer API due to generics.

More utilities are provided to more easily read, construct, and serialize these