Text Layout & Rendering

Any text GTK renders is sent through Pango & its transitive dependencies to handle richtext, linewrapping, internationalization, & more. Pango directly addresses the first 2, and I will be reimplementing Pango in Haskell. Or ideally paying someone else to do so if I can manage a decent income stream from my hobby…

If you ever wrote some XML markup to style your text in GTK, that is Pango! Btw unlike HTML there’s no inline-styling engine here, that’s the difference.

The public API mostly consists of a Pango Context, a GObject class.

It tracks & wraps a base direction, base & resolved “gravity”, a gravity hint, serial number to invalidate iterators, language (2 properties), fontmap, whether to round glyph positions, a font description, & a matrix transform. The main entrypoint is its get_metrics method!

After normalizing its parameters, get_metrics considers returning its cached value. Otherwise it loads the fonts from the map, iterates over them retrieving each’s metrics (keeping the last), & retrieves a language-specific sample string to render (usually a localized “I can eat glass and it doesn’t hurt me”, or something or other about a fox & a lazy dog) to pass to the true main entrypoints before cleaning up.

I’ll discuss pango_itemize_with_font later! The result of which is iterated over…

For each item it attempts to look up the font & its metrics, conditionally copying some attributes over to the overall metrics & informing some postprocessing via HarfBuzz (with results tweaked, & extra abstraction). A running sum of the widths is computed.

An attribute list is a sorted array of subclassable slices, all implicitly of the same string, each representing a different styling option.

There are textual formats from/to which attribute lists & their corresponding strings can be parsed or serialized. One’s seemingly for debugging, the other reuses the GMarkup lightweight XML parser bundled with GNOME’s GLib to build something convenient to use, where various tagnames are parsed specially as shorthands.

GMarkup requires manually tracking relevant aspects of the tagstack. Attributes are multi-buffered before being emitted, including in that tagstack. The parser can also autounderline accelerators.

A Pango FontDescription holds a familyname, style, variant, weight, stretch, gravity, textual variations, mask, size, & some bitflags for whether the family or variations are static or the size is absolute. Font descriptions can be merged, heuristic differences can be computed to determine the best match, & there are more typical comparison methods & various accessors. It includes a fairly sophisticated parser, used in Attributes parsing.
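To make the merging concrete, here is a minimal sketch in Python rather than Pango’s C. The `FontDesc` type & `merge` function are hypothetical names of mine, only four fields are modelled, & `None` stands in for “this field’s bit is not set in the mask”.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class FontDesc:
    # None plays the role of an unset bit in the description's mask.
    family: Optional[str] = None
    style: Optional[str] = None
    weight: Optional[int] = None
    size: Optional[int] = None


def merge(base: FontDesc, other: FontDesc, replace_existing: bool) -> FontDesc:
    """Copy every field `other` has set into `base`.

    Fields `base` already has are only overwritten when replace_existing is True.
    """
    updates = {}
    for field in fields(FontDesc):
        ours = getattr(base, field.name)
        theirs = getattr(other, field.name)
        if theirs is not None and (replace_existing or ours is None):
            updates[field.name] = theirs
    return replace(base, **updates)
```

Merging `FontDesc(family="Serif", weight=700)` into `FontDesc(family="Sans", size=12)` keeps “Sans” unless `replace_existing` is set, & picks up the weight either way.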

Fonts I’ll discuss later…

“Gravity” refers to an optional right-angle rotation, which can be converted to radians or matrix transforms & can be looked up for the Unicode “script” (roughly, alphabet) being used. A gravity hint defines which relevant gravity to prefer, mostly relevant when mixing scripts.
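As a rough illustration of gravity as a right-angle rotation, here’s a small Python sketch. The `Gravity` enum & both functions are hypothetical, & the sign convention (which of east/west counts as a positive quarter turn) is an assumption of mine rather than something checked against Pango.

```python
import math
from enum import Enum


class Gravity(Enum):
    SOUTH = "south"  # upright glyphs, i.e. no rotation
    EAST = "east"
    NORTH = "north"
    WEST = "west"


# Assumed convention: east is a negative quarter turn, west a positive one.
ROTATION = {
    Gravity.SOUTH: 0.0,
    Gravity.EAST: -math.pi / 2,
    Gravity.NORTH: math.pi,
    Gravity.WEST: math.pi / 2,
}


def gravity_to_rotation(gravity: Gravity) -> float:
    """Rotation in radians for a gravity."""
    return ROTATION[gravity]


def gravity_to_matrix(gravity: Gravity):
    """(xx, xy, yx, yy) of the plain 2D rotation matrix for a gravity.

    Rounded, since right angles should give exact 0 & ±1 entries.
    """
    angle = gravity_to_rotation(gravity)
    c, s = round(math.cos(angle)), round(math.sin(angle))
    return (c, -s, s, c)
```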

Pango defines its own (partial) matrix multiplication implementation with conversion from geometric transforms.
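I read “partial” as meaning a 2D affine transform (a 2×2 linear part plus a translation) rather than a general matrix. Under that assumption, a sketch of the multiplication might look like the following. The `Matrix` type & helper names are mine, not Pango’s API.

```python
from typing import NamedTuple


class Matrix(NamedTuple):
    xx: float
    xy: float
    yx: float
    yy: float
    x0: float
    y0: float


def apply(m: Matrix, x: float, y: float):
    """Transform the point (x, y) by m."""
    return (m.xx * x + m.xy * y + m.x0, m.yx * x + m.yy * y + m.y0)


def concat(m: Matrix, n: Matrix) -> Matrix:
    """The transform which applies n first, then m."""
    return Matrix(
        m.xx * n.xx + m.xy * n.yx,
        m.xx * n.xy + m.xy * n.yy,
        m.yx * n.xx + m.yy * n.yx,
        m.yx * n.xy + m.yy * n.yy,
        m.xx * n.x0 + m.xy * n.y0 + m.x0,
        m.yx * n.x0 + m.yy * n.y0 + m.y0,
    )


def translate(tx: float, ty: float) -> Matrix:
    return Matrix(1, 0, 0, 1, tx, ty)


def scale(sx: float, sy: float) -> Matrix:
    return Matrix(sx, 0, 0, sy, 0, 0)
```

Order matters: `concat(translate(10, 0), scale(2, 2))` scales then shifts, whereas swapping the arguments also scales the shift.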

There’s logic for negotiating & inferring ISO language codes, parsing preference lists, & finding sample text.


To init its itemization Pango captures the given context, text, & attr iterator whilst computing/saving end, run_start, changed, embedding levels (via FriBidi, later topic!), embedding end, gravity-related data (gravity, centered baseline, hint, resolved value, & font desc’s), script iterator with its range, width iterator, & emoji iterator properties. It nulls out result, item, embedding end offset, emoji font description, derived lang, current fonts, cache, basefont, first space, & font pos.

Upon both initialization & iteration range invariants are enforced. Finalization frees several of these properties. Iterating to the next item involves advancing the appropriate iterator.

Processing each resulting run involves checking which aspect has changed, possibly computing a new gravity, derived language, or current font. It then processes each non-whitespace character, allocating a new linkedlist “item” for output & storing results there, handling the final item specially.

For postprocessing it reverses that linkedlist & computes a running sum.

Pango’s “itemization” process is split into several iterators which are unioned together: embedding levels (precomputed), richtext attributes, Unicode scripts, emojis, & widths.
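Here’s a sketch of what “unioning” those iterators amounts to, in Python. It assumes each iterator has already been flattened into a list of `(start, end, value)` runs covering the whole text, whereas Pango advances the iterators lazily side by side. `itemize` is a hypothetical name.

```python
def itemize(length, *run_lists):
    """Split [0, length) at every boundary reported by any of the run lists.

    Each run list is [(start, end, value), ...] & must cover the whole text.
    Returns [(start, end, (value_from_list_0, value_from_list_1, ...)), ...].
    """
    cuts = {0, length}
    for runs in run_lists:
        for start, end, _value in runs:
            cuts.update((start, end))
    edges = sorted(cuts)

    items = []
    for start, end in zip(edges, edges[1:]):
        # Look up which value each list assigns to this slice.
        values = tuple(
            next(v for s, e, v in runs if s <= start and end <= e)
            for runs in run_lists
        )
        items.append((start, end, values))
    return items
```

With script runs splitting at offset 5 & attribute runs splitting at offset 3, the result has three items: 0–3, 3–5, & 5 to the end.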

Some languages are left-to-right, others are right-to-left. Some are vertical (though those can usually be written horizontally too), & some are even diagonal (though no computer system I know of supports those)! Embedding levels determine which to use.

Precomputing embedding levels involves (after converting from Pango types to FriBidi types) computing the number of UTF-8 characters, allocating 3 per-char sidetables of which it returns one (the others are temporary), iterating over each char once to look up & record its BiDi type (with special handling for brackets) whilst bitwise-or’ing & maybe (if flagged “strong”) and’ing these bitflags together, fastpathing text of unmixed direction, otherwise deferring to FriBidi, & converting back to Pango types.
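The fastpath idea can be sketched with Python’s `unicodedata` module. This is deliberately conservative & uses sets of BiDi class names where Pango or’s bitflags together. The two “safe” sets are my own choice, & `embedding_levels` returning `None` stands in for “defer to the full algorithm”.

```python
import unicodedata

# Classes which all resolve to level 0 in a left-to-right paragraph
# when no right-to-left character is present.
LTR_SAFE = {"L", "EN", "ES", "ET", "CS", "NSM", "BN", "B", "S", "WS", "ON"}
# Classes which all resolve to level 1 in a right-to-left paragraph
# (numbers are left out, since they would be raised to level 2).
RTL_SAFE = {"R", "AL", "NSM", "BN", "B", "S", "WS", "ON"}


def embedding_levels(text, base_rtl=False):
    """One embedding level per character, or None if the text is mixed."""
    seen = {unicodedata.bidirectional(ch) for ch in text}
    if not base_rtl and seen <= LTR_SAFE:
        return [0] * len(text)
    if base_rtl and seen <= RTL_SAFE:
        return [1] * len(text)
    return None  # mixed directions: hand over to a real UAX #9 implementation
```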

Pango attributes are stored in a sidearray from the text itself to make them trivial to iterate over! Though a stack is required to yield the end of all the attributes it has previously yielded the start of. Furthermore this stack is consulted to extract the styling for this run of text.
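A sketch of such an iterator in Python, assuming attributes are `(start, end, name, value)` tuples. The tuple layout & the `attribute_runs` name are mine. The stack holds attributes whose start has been passed but whose end hasn’t.

```python
def attribute_runs(length, attrs):
    """Yield (start, end, styles) for each stretch of constant styling."""
    pending = sorted(attrs, key=lambda a: a[0])  # ordered by start offset
    stack = []  # attributes already started but not yet ended
    pos, i = 0, 0
    while pos < length:
        # Open every attribute which starts at or before this position.
        while i < len(pending) and pending[i][0] <= pos:
            stack.append(pending[i])
            i += 1
        # Close the ones which have ended.
        stack = [a for a in stack if a[1] > pos]

        # The run lasts until the nearest upcoming start or end.
        next_start = pending[i][0] if i < len(pending) else length
        next_end = min((a[1] for a in stack), default=length)
        end = min(next_start, next_end, length)

        # Later attributes override earlier ones of the same name.
        styles = {name: value for _start, _end, name, value in stack}
        yield pos, end, styles
        pos = end
```

Bold over 0–10 with italic over 4–7 yields four runs: bold, bold+italic, bold, & unstyled.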

As stated previously Pango’s “attributes” are what’s parsed out of the XML (via GMarkup) you hand to it or GTK. They represent Pango’s richtext support!

To split the text into runs requiring different “scripts” (approximately a.k.a. “alphabets”) Pango iterates over each UTF-8 character. For each char Pango looks up the script The Unicode Consortium catalogued for it, for the “common” script looks up the corresponding charcode it pairs with, maintains a size-capped stack to balance those paired chars, & either fixes up any previously unknown scripts including in that stack or yields a script boundary.
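Below is a simplified Python sketch of that loop. Python’s standard library has no Unicode Script lookup, so `script_of` is a crude stand-in which guesses from the character’s name, & only three bracket pairs are handled. All of these names & the stack cap are assumptions of mine.

```python
import unicodedata

KNOWN = {"LATIN", "GREEK", "CYRILLIC", "HEBREW", "ARABIC"}
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())
MAX_DEPTH = 128  # arbitrary cap on the bracket stack


def script_of(ch):
    """Crude stand-in for a real Script property lookup."""
    first_word = unicodedata.name(ch, "").split(" ")[0]
    return first_word if first_word in KNOWN else "COMMON"


def script_runs(text):
    """[(start, end, script)], with common chars absorbed into neighbours."""
    runs, stack = [], []  # stack entries: [opening char, script of its run]
    start, run_script = 0, "COMMON"
    for i, ch in enumerate(text):
        sc = script_of(ch)
        if ch in OPENERS and len(stack) < MAX_DEPTH:
            stack.append([ch, run_script])
        elif ch in PAIRS:
            # Unwind to the matching opener & adopt the script it was seen in.
            while stack and stack[-1][0] != PAIRS[ch]:
                stack.pop()
            if stack:
                sc = stack.pop()[1]
        if sc == "COMMON" or sc == run_script:
            continue
        if run_script == "COMMON":
            # First real script: claim the run & any brackets opened in it.
            run_script = sc
            for entry in stack:
                if entry[1] == "COMMON":
                    entry[1] = sc
            continue
        runs.append((start, i, run_script))  # script boundary
        start, run_script = i, sc
    if text:
        runs.append((start, len(text), run_script))
    return runs
```

The point of the stack is that a closing bracket lands in the same run as its opener: in “abc (αβγ) def” the “)” rejoins the Latin text rather than trailing the Greek.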

To determine whether a char is an emoji Pango uses a lexer contributed by Chromium written in Ragel. The iterator checks whether the current char is an emoji or not & scans all subsequent chars in the same classification.

Similarly the width iterator classifies (with some special cases) chars by horizontal or vertical writing directions according to a builtin lookup table.
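A sketch of the same run-grouping idea in Python, using the East Asian Width property from `unicodedata` as a stand-in for Pango’s builtin table. That substitution & the function names are assumptions of mine, & the special cases are ignored.

```python
import unicodedata
from itertools import groupby


def is_wide(ch):
    # Stand-in classification: East Asian Wide or Fullwidth characters.
    return unicodedata.east_asian_width(ch) in ("W", "F")


def width_runs(text):
    """[(start, end, wide)] for each maximal run of one classification."""
    runs, pos = [], 0
    for wide, group in groupby(text, key=is_wide):
        count = len(list(group))
        runs.append((pos, pos + count, wide))
        pos += count
    return runs
```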

Pango Layout

The PangoLayout GObject class tracks its PangoContext, richtext attributes, font description, tab indents (sized array of indents with boolean for units), the plain text, serial number for itself & its context, number of bytes, number of chars, layout width & height, initial indent, line spacing, justification/alignment, whether to ignore newlines, whether to autodetect textdirection, whether to wrap & whether it has, whether to ellipsize & whether it has, count of unknown glyphs, cached layout rects with flags, cached tabwidth, a decimal mode, resulting logical attributes, list of resulting lines, & a linecount. There is bitpacking involved, & a couple of fields denote which fields (the bulk of them) should be memcpy’d when duplicating.

Has standard GObject methods, & plenty of accessors.

A couple of these accessor methods wrap the XML parser for richtext markup. The serial number is used to detect changes invalidating layout computation, freeing the computed lines & resetting various properties whenever the input fields mutate via the accessors.

Upon accessing output properties (upon which some interesting logic is implemented) the PangoLayout lazily recomputes them, ensuring all needed inputs are set. If flagged to infer textdir it consults FriBidi per-char, or else its context.
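The invalidate-on-write, recompute-on-read pattern looks roughly like this in Python. The `Layout` class here is a toy of mine: splitting on newlines stands in for the real layout algorithm, & `layout_count` exists only to show how often that work runs.

```python
class Layout:
    def __init__(self, text=""):
        self._text = text
        self._serial = 1
        self._lines = None  # cached output; None means stale
        self.layout_count = 0  # how many times layout actually ran

    @property
    def serial(self):
        return self._serial

    def set_text(self, text):
        self._text = text
        self._changed()

    def _changed(self):
        # Any input mutation bumps the serial & drops the computed lines.
        self._serial += 1
        self._lines = None

    def get_lines(self):
        # Output accessors recompute lazily, only when the cache is stale.
        if self._lines is None:
            self.layout_count += 1
            self._lines = self._text.split("\n")  # stand-in for real layout
        return self._lines
```

Anything holding on to an old serial (an iterator, say) can compare it against the current one to notice its data has gone stale.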

After clearing its output fields & retrieving initial extents/height from the initial font, Pango’s highlevel layout algorithm involves repeating the following steps: optionally looking for paragraph boundaries, optionally determining the base direction (filling in gaps from the previous value), determining whether this is the final iteration with the last segment, running the itemization algorithm I described above, copying attributes over to the results whilst locating the correct slices thereof, optionally updating some per-item flags & attributes utilizing relatively-complex Unicode processing I don’t understand (some of which is language-specific), applying some itemization postprocessing I’ll describe later, either repeatedly splitting the items into lines (again I’ll describe later) or constructing a single line, checking whether we’ve surpassed the height limit, & computing the next start index in preparation for the next iteration, if any. Then it cleans up & aligns text!

PangoLayout has a method for iterating over the “runs” in a given line to locate a UTF-8 offset in the appropriate run, corrected to avoid landing in the middle of a cluster, before iterating over the HarfBuzz-computed glyphs taking into account the FriBidi-computed text direction.

There’s a couple of methods for computing the appropriate line from the linkedlist for an index, one of them computing extents. And there are methods combining these.

Another method computes new indexes moving up or down a line.

There’s a method which iterates over the lines locating the one which contains a given y-coordinate, then defers to the method for computing the x-coordinate. And a method which similarly does the reverse.
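A toy version of both directions in Python, assuming left-to-right text with one glyph per character so that clusters & bidi reordering can be ignored. The `Line` type & function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Line:
    top: int
    height: int
    start_index: int
    advances: List[int]  # per-character widths, left to right


def x_to_index(line, x):
    """Index of the character under x, clamped to the line's ends."""
    pos = 0
    for offset, advance in enumerate(line.advances):
        if x < pos + advance:
            return line.start_index + offset
        pos += advance
    return line.start_index + max(len(line.advances) - 1, 0)


def xy_to_index(lines, x, y):
    """Find the line containing y, then defer to x_to_index."""
    for line in lines:
        if y < line.top + line.height:
            return x_to_index(line, x)
    return x_to_index(lines[-1], x)  # below everything: use the last line


def index_to_xy(lines, index):
    """The reverse: the top-left corner of the character at index."""
    for line in lines:
        offset = index - line.start_index
        if 0 <= offset < len(line.advances):
            return sum(line.advances[:offset]), line.top
    raise IndexError(index)
```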

There are further wrappers around these methods which return properties of their results. A relatively complex one computes “strong & weak” rectangles to depict bidirectional text-cursors. And a further wrapper incorporates HarfBuzz positions.

There are methods for refining the chosen alignment & computing the x-offset to apply that alignment. This gets incorporated into the extents upon requesting them, & this info is retrieved at the end of the core layout algorithm to ensure it is applied.
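The x-offset arithmetic itself is simple. Here’s a sketch in Python where the “refining” step is modelled as a `flip` flag swapping left & right, which is my guess at the sort of refinement involved (a line running against the base direction, say). The function name & string constants are mine.

```python
def x_offset(alignment, layout_width, line_width, flip=False):
    """Horizontal shift needed to apply an alignment to one line."""
    if flip and alignment != "center":
        # Refine the requested alignment: left & right swap places.
        alignment = "right" if alignment == "left" else "left"
    if alignment == "left":
        return 0
    if alignment == "right":
        return layout_width - line_width
    return (layout_width - line_width) // 2  # center
```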

There’s yet more methods locating & processing appropriate runs. There’s an iterator over the computed lines or runs computing line-extents as it goes, with plenty of its own getters.

Pango Itemization Postprocessing

Text layout/rendering is full of little nuances which must be handled for internationalization’s sake. Here I’ll describe such nuances (selecting “variants” & tweaking fontsize) which can’t fully be captured by intersecting several iterators to split the text up into “items”.

For variants it iterates over each item checking if a valid font-variant has been selected for it. If not it’ll split upper & lowercase letters to approximate them via fontsize & text transformation.
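For small caps specifically, that fallback could be sketched like so in Python. The 80% shrink factor & the `fake_small_caps` name are assumptions of mine, & sizes are plain integers.

```python
from itertools import groupby


def fake_small_caps(text, size, percent=80):
    """Split text into (chunk, size) pieces approximating small caps.

    Runs of lowercase letters are uppercased & shrunk;
    everything else keeps the requested size.
    """
    pieces = []
    for lower, group in groupby(text, key=str.islower):
        chunk = "".join(group)
        if lower:
            pieces.append((chunk.upper(), size * percent // 100))
        else:
            pieces.append((chunk, size))
    return pieces
```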

Another pass allows font designers to handle the difference between visual & geometric sizes, by iterating over every item. For each it analyzes the actual size of the selected font to compute how much it needs to be scaled to achieve the desired size. Results are saved into the item's attributes. This size adjustment is tweaked further for superscript, subscript, & smallcaps text.

Pango Line-Splitting

Pango’s main job is arguably to split text up into lines (or is it to split text more generally, around richtext, writing systems, etc?), so how does that work?

Until there are no more items (if there were none to begin with it adds an empty line) Pango allocates a lightweight line gobject, initializes several properties on it (possibly with minor computation), then iterates over the previously-split “items”.

For each previously-split “item” Pango gathers styling properties & runs HarfBuzz. If the trailing char is a newline it inserts the item into the line's linkedlist, incrementing its length by what HarfBuzz computed whilst adjusting positioning to take tabs into account, before returning to the loop indicating this case. If the item fits entirely on the line it does likewise.

Otherwise it may have to split the item. For which it first sums the width & specialcases tabs as all-fit.

Then Pango checks if it can split at the end of the item & whether to add a hyphen. This tweaks the following fastpath check which checks if that width fits.

Having exhausted those options it computes per-char widths & iterates over the chars looking for valid breakpoints (taking into account hyphens) until it passes the max linelength. For each it might trial-split the text to consider the hyphens or tabs before updating the max-valid run.

If this fails it might try again with looser constraints.
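Stripped of hyphens, tabs, & trial-splitting, the breakpoint search plus its looser retry could be sketched as below in Python. The function names are mine, widths are taken as given rather than shaped, & the “looser constraints” are modelled as allowing a break between any two chars.

```python
def chars_that_fit(widths, can_break_before, max_width):
    """How many leading chars fit, breaking only at allowed points.

    widths[i] is char i's advance; can_break_before[i] says whether
    a line may end just before char i.
    """
    total, best = 0, 0
    for i, width in enumerate(widths):
        if i > 0 and can_break_before[i]:
            best = i  # everything before char i is known to fit
        total += width
        if total > max_width:
            return best
    return len(widths)


def split_point(widths, can_break_before, max_width):
    count = chars_that_fit(widths, can_break_before, max_width)
    if count == 0:
        # Retry with looser constraints: break between any two chars.
        count = chars_that_fit(widths, [True] * len(widths), max_width)
    return count
```

For “hello world” at one unit per char, a width of 8 splits after the space (6 chars), whilst a width of 3 finds no word boundary & falls back to splitting mid-word.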

If Pango has successfully split the item to fit (or has a force flag set) it either applies the split & returns that the first half all fits, or it indicates that nothing fits. Or splits as best it can indicating some of the item fits.

Otherwise it indicates nothing fits. HarfBuzz data is freed since linesplitting invalidates it.

Upon all-fit the outerloop checks for tabs & removes the item from its todolist. Upon empty-fit or some-fit it sets a flag & exits. Upon none-fit it backs up over previously-selected runs to see if they have valid splitpoints & reruns those checks for that tweak. For newlines it removes the item from its todolist & exits. Upon exiting it updates some counts & appends the line to the computed output.

Having computed a line Pango inserts missing hyphens, truncates whitespace, reverses runs, normalizes the text baseline from HarfBuzz, considers ellipsizing the line, converts logical to visual order, redistributes whitespace, optionally “justifies” words in a separate pass, & updates some flags.