WebKit JavaScript

NOTE: This page is a recollection of what I've written previously. I do not guarantee its accuracy.

JavaScript is (unfortunately, in my opinion) a central component of the modern Web. But here I’ll drop a whole bunch of details and just focus on what happens when JavaScriptCore receives “scrollBy(?, ?)” (? stands for arbitrary numbers) from Odysseus’s UI outside the sandbox via WebCore. Feel free to suggest another interesting angle to explore via the Fediverse.

In JavaScriptCore this code goes through several passes in order to figure out what it should be doing:

  1. A lookup table is used to classify the first character of each token after stripping whitespace, translating the above to: CharacterIdentifierStart, CharacterOpenParen, CharacterNumber, CharacterComma, CharacterNumber, CharacterCloseParen.
  2. Separate logic for each of those classifications is used to expand these tokens, whilst distinguishing them from similar tokens. That same lookup table is used further here to help expand upon Identifier tokens: IDENT, OPENPAREN, DOUBLE, COMMA, DOUBLE, CLOSEPAREN.
  3. The parser traverses a sophisticated call graph corresponding to JavaScript’s syntax in order to call the ASTBuilder methods: createResolve, createDoubleExpr, createArgumentsList, createDoubleExpr, createArgumentsList, createArguments, makeFunctionCallNode
  4. That builder merges some of those AST nodes together in order to give the AST: FunctionCallResolveNode( ArgumentsNode( ArgumentsList( DoubleNode, ArgumentsList(DoubleNode) ) ) )
  5. That AST is traversed by a visitor informed by variable scope captured by the parser to yield the bytecode: OpMove (for this), OpGetFromScope (for “scrollBy”), OpMove (from a constant table), OpMove (from a constant table), OpCall
  6. That code is cached and “linked”, indicating it should be run by the interpreter.
  7. The bytecode interpreter is written in a custom assembly language with a Lambda Calculus-based macro system in order to declare which machine code is run for each bytecode, and possibly which instruction it interprets next (though that can be implicit).
  8. The first time many instructions, notably OpGetFromScope and OpCall, are run, they call a “slow path” and cache details from it inline so similar subsequent runs will be fast.
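
Steps 1 and 2 above can be sketched in a few lines of Python. This is only an illustration of the table-driven technique, not JavaScriptCore's actual lexer: the CharClass names, the CHAR_TABLE layout and the tokenize function are assumptions made up for this example, and it only understands identifiers, unsigned numbers, parentheses and commas.

```python
from enum import Enum, auto


class CharClass(Enum):
    IDENTIFIER_START = auto()
    NUMBER = auto()
    OPEN_PAREN = auto()
    CLOSE_PAREN = auto()
    COMMA = auto()
    WHITESPACE = auto()
    INVALID = auto()


# Step 1: one table entry per ASCII code point, filled in once up front.
CHAR_TABLE = [CharClass.INVALID] * 128
for c in "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_$":
    CHAR_TABLE[ord(c)] = CharClass.IDENTIFIER_START
for c in "0123456789":
    CHAR_TABLE[ord(c)] = CharClass.NUMBER
for c in " \t\r\n":
    CHAR_TABLE[ord(c)] = CharClass.WHITESPACE
CHAR_TABLE[ord("(")] = CharClass.OPEN_PAREN
CHAR_TABLE[ord(")")] = CharClass.CLOSE_PAREN
CHAR_TABLE[ord(",")] = CharClass.COMMA

# Single-character tokens map straight from their class to a token name.
PUNCTUATION = {
    CharClass.OPEN_PAREN: "OPENPAREN",
    CharClass.CLOSE_PAREN: "CLOSEPAREN",
    CharClass.COMMA: "COMMA",
}


def classify(ch):
    code = ord(ch)
    return CHAR_TABLE[code] if code < 128 else CharClass.INVALID


def tokenize(source):
    """Step 2: expand each classified first character into a full token."""
    tokens = []
    i = 0
    while i < len(source):
        cls = classify(source[i])
        if cls is CharClass.WHITESPACE:
            i += 1
        elif cls is CharClass.IDENTIFIER_START:
            start = i
            i += 1
            # The same table decides which characters continue the identifier.
            while i < len(source) and classify(source[i]) in (
                CharClass.IDENTIFIER_START,
                CharClass.NUMBER,
            ):
                i += 1
            tokens.append(("IDENT", source[start:i]))
        elif cls is CharClass.NUMBER:
            start = i
            seen_dot = False
            while i < len(source):
                ch = source[i]
                if classify(ch) is CharClass.NUMBER:
                    i += 1
                elif ch == "." and not seen_dot:
                    seen_dot = True
                    i += 1
                else:
                    break
            tokens.append(("DOUBLE", float(source[start:i])))
        elif cls in PUNCTUATION:
            tokens.append((PUNCTUATION[cls], source[i]))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {source[i]!r} at offset {i}")
    return tokens
```

Calling tokenize("scrollBy(0, 100)") yields the IDENT, OPENPAREN, DOUBLE, COMMA, DOUBLE, CLOSEPAREN sequence from step 2, with each token carrying its text or numeric value.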

OpCall distinguishes between three different types of functions: those defined in JavaScript, those defined in C++, and some concept of “Builtin” functions. OpGetFromScope and related bytecodes meanwhile rely on objects separating their keys into shared “structures” (V8 calls them “hidden classes”), so there is something useful to cache after doing a hashtable lookup.
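
To make the caching idea concrete, here is a minimal Python sketch of shared structures plus an inline cache at a single property-read site. The names Structure, Obj and GetSite are hypothetical and invented for this example; the real engine does this in generated machine code and handles many more cases (prototype chains, deleted keys, polymorphic sites).

```python
class Structure:
    """Shared description of which keys an object has, and in which slot."""

    def __init__(self, offsets=None):
        self.offsets = dict(offsets or {})
        self.transitions = {}  # key -> Structure reached by adding that key

    def with_key(self, key):
        # Objects that add the same keys in the same order end up sharing
        # the very same Structure instance.
        if key not in self.transitions:
            new_offsets = dict(self.offsets)
            new_offsets[key] = len(new_offsets)
            self.transitions[key] = Structure(new_offsets)
        return self.transitions[key]


ROOT = Structure()


class Obj:
    def __init__(self):
        self.structure = ROOT
        self.slots = []  # values only; the keys live in the structure

    def put(self, key, value):
        offset = self.structure.offsets.get(key)
        if offset is None:
            self.structure = self.structure.with_key(key)
            self.slots.append(value)
        else:
            self.slots[offset] = value


class GetSite:
    """One property-read site in the bytecode, with its inline cache."""

    def __init__(self, key):
        self.key = key
        self.cached_structure = None
        self.cached_offset = None
        self.slow_path_calls = 0

    def get(self, obj):
        # Fast path: a single identity comparison, then an indexed load.
        if obj.structure is self.cached_structure:
            return obj.slots[self.cached_offset]
        # Slow path: do the hashtable lookup and remember the result.
        self.slow_path_calls += 1
        offset = obj.structure.offsets[self.key]
        self.cached_structure = obj.structure
        self.cached_offset = offset
        return obj.slots[offset]
```

The point of the design is that the cache key is the structure rather than the object, so a second object built the same way hits the fast path even though the site has never seen it before.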

DOM Bindings

To expose the extensive DOM APIs like window.scrollBy to JavaScript, WebCore uses a few Perl programs, which it calls from its build scripts, to compile a custom interface language into C++.

One Perl script drives the other two, the first of which preprocesses the interface files in order to generate their build dependencies for the sake of the build system.

The latter actually compiles to C++, calling separate library files to do the parsing and compiling. The parser uses Perl regular expressions for lexing before assembling those tokens into an AST; as it turns out, Perl isn’t a bad language for this. After any dependencies have been pulled in, the code generator restructures the abstract syntax in order to generate its code, separating the JavaScriptCore-specific bits into another module with useless yet harmless loose coupling. A separate pass is then used to write the C++ to disk so the #includes can be written to the right place.
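
The regex-lexer, AST, code-generator pipeline described above can be illustrated with a toy version in Python (standing in for Perl). Everything here is an assumption for the sake of the sketch: the interface syntax is a heavily simplified subset, and the emitted C++ (the Call type, argument<T>() and receiver<T>()) is hypothetical rather than what WebKit's generator actually produces.

```python
import re

# Regex-based lexing: skip whitespace, then match an identifier or punctuation.
TOKEN_RE = re.compile(r"\s*(?:(?P<ident>[A-Za-z_]\w*)|(?P<punct>[{}();,]))")


def lex(source):
    tokens = []
    pos = 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group("ident") or match.group("punct"))
        pos = match.end()
    return tokens


def parse_interface(tokens):
    """Assemble the tokens of one interface into a small dict-based AST."""
    it = iter(tokens)

    def expect(value):
        tok = next(it)
        if tok != value:
            raise SyntaxError(f"expected {value!r}, got {tok!r}")

    expect("interface")
    name = next(it)
    expect("{")
    operations = []
    tok = next(it)
    while tok != "}":
        return_type = tok
        op_name = next(it)
        expect("(")
        args = []
        tok = next(it)
        while tok != ")":
            args.append((tok, next(it)))  # (type, name)
            tok = next(it)
            if tok == ",":
                tok = next(it)
        expect(";")
        operations.append({"name": op_name, "returns": return_type, "args": args})
        tok = next(it)
    expect(";")
    return {"interface": name, "operations": operations}


def generate_cpp(ast):
    """Emit one hypothetical C++ wrapper per operation in the AST."""
    lines = []
    for op in ast["operations"]:
        lines.append(f"// Hypothetical binding stub for {ast['interface']}.{op['name']}")
        lines.append(f"static void js{ast['interface']}_{op['name']}(Call& call)")
        lines.append("{")
        for index, (arg_type, arg_name) in enumerate(op["args"]):
            lines.append(f"    {arg_type} {arg_name} = call.argument<{arg_type}>({index});")
        names = ", ".join(arg_name for _, arg_name in op["args"])
        lines.append(f"    call.receiver<{ast['interface']}>().{op['name']}({names});")
        lines.append("}")
    return "\n".join(lines)
```

Feeding it "interface Window { void scrollBy(double x, double y); };" produces an AST with one operation and a C++ stub that unpacks two double arguments before forwarding them to the receiver.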

The remaining code generation is handled via C++ templates and other shallow wrappers around JavaScriptCore.