GSK (GTK4 Scenegraph Kit)

To start this new decade (2020), I wish to describe GTK4’s upcoming GPU-based renderer, “GSK”, which implements the CSS box model on top of OpenGL’s depth buffering.

This is in contrast to movies &, increasingly, games, which have been switching to Recursive Ray Tracing.

GSK’s job is to efficiently combine images (from GDK Pixbuf), glyphs (from FreeType via Pango), layout (from GTK & Pango), & widget styles (from GTK’s CSS engine) into an output image to be sent to the window manager.

Side note: Text is not directly sent to GSK for rendering. Instead GTK4 implements a new “backend” for Pango to send GSK the glyphs rendered by FreeType, which 1) interprets bytecode from the font file & 2) converts those vector outlines into raw pixels using the Bentley-Ottmann algorithm.

And to be clear, I do not mean to imply GTK4 is coming out soon, just that it hasn’t come out yet. I don’t think anyone knows when it’s coming out.

Public API/Data Model

With GSK’s public API you can build a render tree to pass to a Renderer subclass’s .render() method (I’ll describe those subclasses tomorrow). Most of the actual logic here appears to be for debugging, whether that’s performance profiling or highlighting diffs (computed by a Divide & Conquer algorithm), though the renderer (especially Cairo) might benefit from those diffs.

The renderer to use is determined by an environment variable or a display property.


GSK is centered around the lightweight (compared to most GObject classes) superclass GskRenderNode. All render nodes have a bounding box & can (de)serialize, diff, & render (using Cairo vector graphics) themselves. The Renderer subclass for Cairo mostly just calls the latter method, which looks a little ugly to me.
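To make that concrete, here’s a rough sketch of what using this API looks like; the function names come from the in-development GTK4 headers as I understand them, so take the details with a grain of salt:

```c
/* A rough sketch (based on the in-development GTK4 API; names & details
 * may change before release) of building a tiny render tree & handing it
 * to whichever GskRenderer GDK picked for the surface. */
#include <gtk/gtk.h>

static void
render_two_rects (GdkSurface *surface)
{
  GdkRGBA red  = { 1, 0, 0, 1 };
  GdkRGBA blue = { 0, 0, 1, 0.5 };
  graphene_rect_t r1 = GRAPHENE_RECT_INIT (0, 0, 100, 100);
  graphene_rect_t r2 = GRAPHENE_RECT_INIT (50, 50, 100, 100);
  GskRenderNode *children[2];
  GskRenderNode *root;
  GskRenderer *renderer;

  /* Leaf nodes: two solid-colour rectangles, each with its bounding box. */
  children[0] = gsk_color_node_new (&red, &r1);
  children[1] = gsk_color_node_new (&blue, &r2);

  /* An internal node grouping them into one tree. */
  root = gsk_container_node_new (children, 2);

  /* GDK chooses the renderer (GL, Vulkan, Cairo, Broadway, ...); the
   * GSK_RENDERER environment variable can override that choice. */
  renderer = gsk_renderer_new_for_surface (surface);
  gsk_renderer_render (renderer, root, NULL);

  gsk_render_node_unref (children[0]);
  gsk_render_node_unref (children[1]);
  gsk_render_node_unref (root);
  g_object_unref (renderer);
}
```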

There’s GskRenderNode subclasses for:


There’s also routines to aid (de)serialization, with parsing help from GTK’s CSS engine, and lightweight classes for:

Renderers

I’ll now discuss how the data model I described gets sent to the GPU. This is done by the GskGLRenderer or GskVulkanRenderer; alternatively you can use the GskBroadwayRenderer (which straightforwardly serializes the tree to be rendered by JavaScript & web browsers) or GskCairoRenderer (which asks the render tree to take over its job).

OpenGL backend

The GskGLRenderer starts by initializing OpenGL (the GDK output, debugging utils, & the output target, stored in a GskRenderer property) and by checking the bounding box. And when it’s done it tidies this all up, alongside any memory it has allocated in the process of rendering.

The actual logic is also wrapped in calls to begin/end a frame via the GskGLDriver (for additional GL setup/teardown), which is also responsible for (de)allocating texture memory (tracked via a hashtable).


Between that setup & teardown (in .do_render()) it sets up the OpenGL camera/transformation matrices, texture caches (dropping the old ones if they’re too fragmented), & maybe (for debugging) a profiler. After which it queues up operations (compacting similar ones as it goes, and flattening the tree using a stack & “current” values).

It queues operations to set the OpenGL projection, viewport, modelview matrix, & clip before iterating over the tree; afterwards it tidies up, sets up GL state, & interprets the queued operations.


Flattening that render tree is mostly quite straightforward, however:

Minor optimizations are applied throughout that process thanks to being able to refer to the previous op, but once it’s all queued up it needs to be interpreted. But there’s not much to comment on as the ops all match naturally to OpenGL calls or parameters exposed by GSK’s shaders (which I’ll cover tomorrow).
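To illustrate the general idea (this is my own toy C, not GSK’s actual types or logic): flattening walks the tree keeping track of “current” values, only queues a state-change op when a value actually changes, & can compact adjacent state changes into one:

```c
/* Illustrative sketch of flattening a render tree into a linear op list
 * while tracking one piece of "current" state (opacity); bounds checks
 * omitted for brevity. */
#include <stddef.h>

typedef struct { float r, g, b, a; } Color;

typedef enum { OP_SET_OPACITY, OP_DRAW_COLOR } OpKind;

typedef struct {
  OpKind kind;
  float  opacity;   /* for OP_SET_OPACITY */
  Color  color;     /* for OP_DRAW_COLOR  */
} Op;

typedef enum { NODE_COLOR, NODE_OPACITY, NODE_CONTAINER } NodeKind;

typedef struct Node {
  NodeKind      kind;
  Color         color;      /* NODE_COLOR   */
  float         opacity;    /* NODE_OPACITY */
  struct Node **children;
  size_t        n_children;
} Node;

typedef struct {
  Op     ops[256];
  size_t n_ops;
  float  current_opacity;   /* the "current" value; start it at 1.0f */
} Queue;

static void
queue_set_opacity (Queue *q, float opacity)
{
  if (opacity == q->current_opacity)
    return;   /* nothing changed, so no op at all */

  /* Compaction: two consecutive opacity changes collapse into one. */
  if (q->n_ops > 0 && q->ops[q->n_ops - 1].kind == OP_SET_OPACITY)
    q->n_ops--;

  q->ops[q->n_ops++] = (Op) { .kind = OP_SET_OPACITY, .opacity = opacity };
  q->current_opacity = opacity;
}

static void
flatten (Queue *q, const Node *node)
{
  switch (node->kind)
    {
    case NODE_COLOR:
      q->ops[q->n_ops++] = (Op) { .kind = OP_DRAW_COLOR, .color = node->color };
      break;

    case NODE_OPACITY:
      {
        /* Save the current value; the call stack plays the role of the
         * explicit stack, restoring it when we return. */
        float saved = q->current_opacity;
        queue_set_opacity (q, saved * node->opacity);
        for (size_t i = 0; i < node->n_children; i++)
          flatten (q, node->children[i]);
        queue_set_opacity (q, saved);
        break;
      }

    case NODE_CONTAINER:
      for (size_t i = 0; i < node->n_children; i++)
        flatten (q, node->children[i]);
      break;
    }
}
```

GSK tracks considerably more state than this (projection, viewport, modelview, clip, ...), but the shape is the same.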

Texture Atlases

A texture atlas stores multiple images in a single OpenGL texture, with new regions getting allocated via what the code calls a “skyline bin packing algorithm”. A websearch didn’t turn up much more explanation of how this allocator works.

But it appears to work (simplified explanation) by iterating over a sorted linked list, thereby scanning the X then Y axis for locations where the rect will fit. Larger rects are allocated 1st.
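Here’s my simplified reading of that as a sketch (an array rather than a linked list, & ignoring the sort-by-size step): the skyline is the top contour of everything placed so far, & each rect goes at the lowest, then leftmost, point along it where it fits.

```c
/* Simplified skyline allocator sketch; not GSK's actual code. */
#include <stdbool.h>
#include <limits.h>

#define ATLAS_W 1024
#define ATLAS_H 1024
#define MAX_SEGMENTS 256

typedef struct { int x, y, w; } Segment;   /* one horizontal span of the skyline */

static Segment skyline[MAX_SEGMENTS] = { { 0, 0, ATLAS_W } };
static int n_segments = 1;

/* Height a rect of width w would sit at if placed starting at segment i,
 * or -1 if it would stick out of the atlas. */
static int
fit_height (int i, int w)
{
  int width_left = w, y = 0;

  if (skyline[i].x + w > ATLAS_W)
    return -1;

  for (int j = i; width_left > 0; j++)
    {
      if (j == n_segments)
        return -1;
      if (skyline[j].y > y)
        y = skyline[j].y;          /* the rect must clear every span under it */
      width_left -= skyline[j].w;
    }
  return y;
}

static bool
skyline_alloc (int w, int h, int *out_x, int *out_y)
{
  int best = -1, best_y = INT_MAX;

  if (n_segments + 1 > MAX_SEGMENTS)
    return false;

  /* Scan along X for the lowest (then leftmost) Y where the rect fits. */
  for (int i = 0; i < n_segments; i++)
    {
      int y = fit_height (i, w);
      if (y >= 0 && y + h <= ATLAS_H && y < best_y)
        {
          best = i;
          best_y = y;
        }
    }
  if (best < 0)
    return false;

  *out_x = skyline[best].x;
  *out_y = best_y;

  /* The new rect's top edge becomes a segment; the spans it covered are
   * dropped or trimmed. */
  Segment placed = { skyline[best].x, best_y + h, w };
  int end = placed.x + placed.w;
  int i = best;

  while (i < n_segments && skyline[i].x < end)
    {
      if (skyline[i].x + skyline[i].w <= end)
        {
          for (int j = i; j < n_segments - 1; j++)   /* fully covered: drop */
            skyline[j] = skyline[j + 1];
          n_segments--;
        }
      else
        {
          skyline[i].w -= end - skyline[i].x;        /* partially covered: trim */
          skyline[i].x = end;
          break;
        }
    }

  for (int j = n_segments; j > best; j--)            /* insert the new segment */
    skyline[j] = skyline[j - 1];
  skyline[best] = placed;
  n_segments++;

  return true;
}
```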

Vulkan backend

The Vulkan renderer, in contrast, creates/runs “pipelines” corresponding to the draw ops the OpenGL backend generates, to take advantage of Vulkan’s parallelisable CommandBuffers. This involves more files, but not much more code.

Clipping is handled on the CPU, and there’s less need for texture atlases because Vulkan can handle arbitrary image sizes. Though its glyph cache still uses an atlas, with a simple left-to-right texture allocator.
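That glyph-cache allocator is simple enough to sketch from its description (the names here are mine, not GSK’s): pack glyphs left-to-right along a row, & start a new row when the current one is full.

```c
/* Sketch of a simple left-to-right row ("shelf") allocator. */
#include <stdbool.h>

typedef struct {
  int width, height;   /* atlas size */
  int x, y;            /* current packing position */
  int row_height;      /* tallest glyph placed in the current row */
} GlyphAtlas;

static bool
glyph_atlas_alloc (GlyphAtlas *atlas, int w, int h, int *out_x, int *out_y)
{
  /* Start a new row when the glyph no longer fits to the right. */
  if (atlas->x + w > atlas->width)
    {
      atlas->x = 0;
      atlas->y += atlas->row_height;
      atlas->row_height = 0;
    }

  if (atlas->y + h > atlas->height)
    return false;             /* atlas full: caller starts a new one */

  *out_x = atlas->x;
  *out_y = atlas->y;

  atlas->x += w;
  if (h > atlas->row_height)
    atlas->row_height = h;
  return true;
}
```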

Vulkan does require its GSK backend to run a separate pass pre-rendering any textures that need to be processed in a later pass.

And finally the shaders (written in GLSL for the sake of OpenGL) do need to be precompiled for Vulkan.

Shaders

I’ll now describe the (GLSL) code running on the GPU to do the actual rendering.

This starts with a handful of common input variables & utilities to output a (clipped) colour, hittest a rounded rectangle, or look up a point in a texture. That hittest first involves hittesting the unrounded rectangle, then each corner, then combining those tests. Different variants are implemented for ES2, GL3, & GL2.

The formula for hittesting an ellipse is: (dotproduct(p/r, p/r) - 1.0) / length(2 * (p/r) / r), where dotproduct(a, b) = a.x*b.x + a.y*b.y & length(a) = sqrt(dotproduct(a, a)).

Feel free to explain this formula to me, I haven’t studied much geometry.
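Though for what it’s worth, here’s my best guess at what’s going on: the numerator is the ellipse’s implicit equation & the denominator is the length of its gradient, which makes the whole expression a first-order (one Newton step) approximation of the signed distance to the curve:

$$f(p) = \frac{p_x^2}{r_x^2} + \frac{p_y^2}{r_y^2} - 1 = \mathrm{dot}(p/r,\, p/r) - 1, \qquad \nabla f(p) = \left(\frac{2 p_x}{r_x^2},\ \frac{2 p_y}{r_y^2}\right) = 2\,(p/r)/r, \qquad \text{distance} \approx \frac{f(p)}{\lVert \nabla f(p) \rVert}$$

So it comes out negative inside the ellipse, positive outside, & close to the true distance near the boundary, which is all a hittest needs.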


To colour-blend between two surfaces, it looks up the appropriate colours in both textures & mixes them according to the specified one of numerous blend formulas.

To copy (“blit”) an image onto the output you simply look up that pixel & apply the opacity. The texture position from which to read is computed in the “vertex shader” & interpolated before reaching this “fragment shader”.

To apply a blur to a texture, it computes an incrementalGaussian x/y/z, pixels-per-side, & pixel step based on the blur radius. From there it computes a sum & coefficientSum & updates the incrementalGaussian.

Then it iterates from 1 to the computed pixels-per-side (inclusive), using each number to update the sum, coefficientSum, & incrementalGaussian based on the pixel step, the provided blurDir, & the incrementalGaussian.

The output colour is sum/coefficientSum.
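That “incrementalGaussian” trick appears to be the one from GPU Gems 3: rather than calling exp() for every tap, each Gaussian coefficient is derived from the previous one with two multiplies. A sketch of just the weight computation, in C rather than GLSL (variable names here are mine):

```c
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Print the normalised blur weight for each tap, deriving each Gaussian
 * coefficient incrementally from the previous one (no exp() per tap).
 * Assumes pixels_per_side < 64. */
static void
gaussian_weights (double sigma, int pixels_per_side)
{
  double g_x = 1.0 / (sqrt (2.0 * M_PI) * sigma);  /* current coefficient   */
  double g_y = exp (-0.5 / (sigma * sigma));       /* ratio to the next one */
  double g_z = g_y * g_y;                          /* decay of that ratio   */

  double coefficient_sum = g_x;                    /* centre tap            */
  double weight[64] = { g_x };

  g_x *= g_y;
  g_y *= g_z;

  for (int i = 1; i <= pixels_per_side; i++)
    {
      weight[i] = g_x;               /* used for both taps at +/- i        */
      coefficient_sum += 2.0 * g_x;
      g_x *= g_y;
      g_y *= g_z;
    }

  for (int i = 0; i <= pixels_per_side; i++)
    printf ("tap +/-%d: %f\n", i, weight[i] / coefficient_sum);
}

int
main (void)
{
  gaussian_weights (3.0, 6);
  return 0;
}
```

In the shader, the texture lookups scaled by these weights accumulate into sum, & the final colour is sum/coefficientSum as described above.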

To render a single-coloured border, it checks if the point is in an outer rounded rect but not an inner rounded rect, and outputs the specified colour if so.

To render a solid fill it simply outputs the provided colour for each fragment pixel. Both for this & the previous shaders, premultiplied-alpha (a technique for making colours behave like vectors via the statement color.rgb *= color.a) is computed on the GPU.
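For the curious, here’s why premultiplying helps (a quick sketch of the standard maths, not GSK’s code): once colours carry their alpha, source-over compositing becomes one linear blend, which the GPU’s fixed-function blending can do directly.

```c
typedef struct { float r, g, b, a; } Color;

static Color
premultiply (Color c)
{
  /* The color.rgb *= color.a mentioned above. */
  c.r *= c.a;  c.g *= c.a;  c.b *= c.a;
  return c;
}

/* With premultiplied src & dst, source-over is out = src + dst * (1 - src.a),
 * i.e. glBlendFunc (GL_ONE, GL_ONE_MINUS_SRC_ALPHA) in a premultiplied
 * pipeline (the standard setup, not something I've verified in GSK). */
static Color
source_over (Color src, Color dst)
{
  Color out;
  out.r = src.r + dst.r * (1.0f - src.a);
  out.g = src.g + dst.g * (1.0f - src.a);
  out.b = src.b + dst.b * (1.0f - src.a);
  out.a = src.a + dst.a * (1.0f - src.a);
  return out;
}
```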

Another shader allows you to multiply a texture pixel by a matrix.

To recolour an image, it multiplies the desired colour at each output pixel by the alpha looked up from the texture.

To cross-fade two textures, it uses the provided progress to compute the opacity to apply to the pixels looked up from each texture, applies those opacities, & adds the two colours.

To render an unblurred inset shadow, it computes the desired outer & inner rounded rects and fills the difference, much like for borders.

To apply a linear gradient it iterates over the colour stops to find the relevant ones & uses the builtin mix() function to interpolate between them.
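In C rather than GLSL (& with an illustrative stop struct of my own), that stop-walking logic looks roughly like:

```c
typedef struct { float r, g, b, a; } Color;
typedef struct { float offset; Color color; } ColorStop;

/* Linear interpolation, like GLSL's mix(). */
static Color
mix_color (Color a, Color b, float t)
{
  Color out;
  out.r = a.r + (b.r - a.r) * t;
  out.g = a.g + (b.g - a.g) * t;
  out.b = a.b + (b.b - a.b) * t;
  out.a = a.a + (b.a - a.a) * t;
  return out;
}

static Color
gradient_color (const ColorStop *stops, int n_stops, float offset)
{
  if (offset <= stops[0].offset)
    return stops[0].color;

  /* Walk the stops to find the pair bracketing this offset. */
  for (int i = 1; i < n_stops; i++)
    {
      if (offset <= stops[i].offset)
        {
          float t = (offset - stops[i - 1].offset)
                  / (stops[i].offset - stops[i - 1].offset);
          return mix_color (stops[i - 1].color, stops[i].color, t);
        }
    }
  return stops[n_stops - 1].color;
}
```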

An unblurred outset shadow computes a single rounded rect to fill.

Repeating texture fills simply use the mod() function to bring the lookup coordinate into range.
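Sketched in C (noting that GLSL’s mod(), unlike C’s fmod(), never returns a negative result):

```c
#include <math.h>

/* Wrap an arbitrary coordinate back into the [0, size) range of the
 * child texture, matching GLSL's mod(coord, size). */
static float
wrap (float coord, float size)
{
  float m = fmodf (coord, size);
  return m < 0 ? m + size : m;
}
```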

And I don’t see the difference between the outset_shadow & unblurred_outset_shadow fragment shaders. Barely any vertex shaders are used.

Geometry support

This is provided by Graphene, which is all pretty much straight out of a maths textbook.