Haskell Graphics

For Haphaestus (my TV & ereader browser engine) I’ll need to “composite” visual output onto the monitor. This page will document the I/O libraries I’m pulling in to do so!

OpenGL Bindings “gl”

I plan to use hardware-accelerated rendering to avoid the need for more complex optimizations elsewhere. For that I’ll need to use, directly or indirectly, OpenGL in Haskell.

The OpenGL bindings I’ll use are the “gl” hackage, though Haskell has others.

I’ve written about OpenGL before.

There’s also the lower-level superseding “Vulkan” standardised API, but that’s too lowlevel for my needs… The Depth Buffer Algorithm is just what I need!

As part of their OpenGL standard, Khronos Group publishes an authoritative XML file platform-independently defining the API GPU drivers should expose to applications (e.g. videogames), as well as a couple of text files listing the function names & links to documentation.

The “gl” hackage is generated from these files!


The XML file is parsed via HXT (which my “Amphiarao” webpage debugger uses for XPath support). After parsing the whitespace-stripped XML it locates the <registry> elements to parse all contained <groups>, <enums>, <extensions>, <commands>, & <feature> elements collecting their results into a Registry record.
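
To give a flavour of those HXT queries, here’s a minimal sketch of my own (not gl’s actual generator code) pulling every <enum> name & value out of gl.xml:

    import Text.XML.HXT.Core

    -- Collect (name, value) pairs from <registry>/<enums>/<enum> elements.
    enumPairs :: FilePath -> IO [(String, String)]
    enumPairs path = runX $
        readDocument [withValidate no, withRemoveWS yes] path
            >>> deep (isElem >>> hasName "registry")
            >>> deep (isElem >>> hasName "enums")
            >>> getChildren >>> isElem >>> hasName "enum"
            >>> (getAttrValue "name" &&& getAttrValue "value")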

<groups> contains <group name=?>, & in turn <enum name>, elements. <enums> contains <enum name value> elements.

<extensions> contains <extension name supported?>, & in turn <required>, elements. These <required> elements, where present, may contain text-containing <comment>, <enum name>, & <command name> elements. And maybe a ‘profile’ attribute.

<commands> contain <command> elements. <command> contains <proto> (and in turn text-containing <name> & <ptype> elements), similarly <param group? len?>s, maybe <vecequiv name>, & maybe <alias name>.

<feature name> elements contain <require comment profile> & similarly-structured <remove> elements. These in turn contain <enum name> & <command name> elements.

All these described elements are converted into pure-Haskell values of types provided by the Registry submodule, with a little postprocessing.


If the source directory already exists it’ll recursively delete it before regenerating it.

From here the parsed XML is restructured via richer datastructures into a Module, rewriting names & type signatures from C syntax to Haskell in the process, largely relating to pointers (as defined by the Haskell FFI), void, argument lists, & enums.

It then generates a Graphics.GL.Internal.FFI module containing relevant imports & (do I have this right?) callback function signatures; Graphics.GL.Internal.Shared for common datatypes; a module per version; & a module per extension + one.

Finally there’s a minimal library for serializing the reformatted modules generated as per above.

The generated FFI function signatures go on to call into your GPU driver, or Mesa3D, at runtime. Other hackages define OpenGL’s more unique datatypes. For example Fixed represents floating point numbers by conceptually dividing CInts by 2¹⁶, with Half defined similarly.
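
For concreteness, 16.16 fixed-point (as described) stores 1.5 as round(1.5 × 2¹⁶) = 98304. A quick sketch of my own, not the fixed hackage’s actual definitions:

    -- Store a real number as a scaled integer; divide to recover it.
    toFixed :: Double -> Int
    toFixed x = round (x * 65536)       -- 65536 = 2^16

    fromFixed :: Int -> Double
    fromFixed n = fromIntegral n / 65536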

Linear Algebra “linear”

As GPUs have been becoming more general-purpose, OpenGL (now Vulkan) has become more lowlevel, & GPU drivers (apart from the opensource Mesa3D) are dropping the highlevel abstractions. So it is becoming increasingly vital to reimplement the same datatypes used on the GPU (via GL Shading Language) on the CPU such that they can be readily processed there. This is precisely what operator overloading is for!

NOTE: Mesa3D implements this internally in C without operator overloading.

For Haskell I’ve already mentioned the “half” & “fixed” hackages for alternative floating point representations possibly used on GPUs. Today I’m discussing the higher-level Linear Algebra as implemented by the “linear” hackage.


This provides Algebra & Coalgebra typeclasses (“traits” in Rust, roughly) inheriting from the builtin Num typeclass, which it implements for its own types, Void & nil.

It provides trivial FFI helper functions that can be used to hand these to OpenGL.

There’s a Conjugate typeclass which negates the imaginary component, if present, of a number.

There’s a Covector type for efficiently concatenating mathematical formulas, expecting to hold a value implementing its Coalgebra typeclass.

There’s an Epsilon typeclass for testing whether a floating point value is as close to zero as the computer can represent.


Matrices (formulas on vectors) are defined for several sizes as vectors of vectors, with some newly-defined operators. (Yes, Haskell allows adding new operators to the language - not only overriding existing ones)
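
For instance (a usage sketch against linear’s documented operators):

    import Linear

    -- A 2x2 matrix is literally a V2 of V2s; M22 is linear's synonym for that.
    rotate90 :: M22 Double
    rotate90 = V2 (V2 0 (-1))
                  (V2 1   0 )

    demo :: IO ()
    demo = do
        print (rotate90 !* V2 1 0)      -- matrix-vector: V2 0.0 1.0
        print (rotate90 !*! rotate90)   -- matrix-matrix: a 180° rotation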

The Foldable, Functor, Lens, & maths typeclasses are used to define these matrix operations generically. Specifically, the formula (e.g. 2D) matrices represent on vectors is x' = Ax + By & y' = Cx + Dy, storing the uppercase parameters. Usually with an extra dimension for geometric translation. Quaternion (rotation) conversion is implemented, as is conversion from row-major to column-major form.

Conversion between different matrices is implemented, as is extracting the translation column & computing “matrix determinants”, “inverses”, & “transpositions”. A matrix inverse reverses the geometric transform it applies, & isn’t cheap to compute for an arbitrary matrix (better to build it up as you construct the non-inverse matrix).

There’s functions (in a separate file) specifically for constructing matrices to represent a camera in OpenGL, e.g. to place the camera at the center of the virtual universe. Or apply perspective effects.
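
E.g. composing linear’s documented camera constructors (the parameter values here are arbitrary):

    import Linear                       -- V3, M44, (!*!)
    import Linear.Projection (lookAt, perspective)

    viewProjection :: M44 Float
    viewProjection = projM !*! viewM
      where
        viewM = lookAt (V3 0 0 5)   -- eye: where the camera sits
                       (V3 0 0 0)   -- the point it looks at
                       (V3 0 1 0)   -- which way is up
        projM = perspective (pi / 4) (16 / 9) 0.1 100
                -- field-of-view, aspect ratio, near & far clip planes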


Quaternions represent rotations in 3D space, which get converted into trigonometric formulas within matrices; stored as a V3 plus an extra number. It implements numerous typeclasses, including ones for binary serialization that can be handed to OpenGL.

There’s a typeclass for computing the “trace” of various mathematical types.

5 files implement vectors of sizes ranging from 0 to 4 inclusive, implementing maths typeclasses on fixed-size arrays of numbers that can represent e.g. points, directions, & colours.

Additional typeclasses & freestanding functions are defined for any of those vectors (based on the typeclasses they implement directly) in a separate file.

There’s implementations of “Plücker” & “affine” coordinate spaces; never heard of those before!

Typograffiti uses these in its slightly higherlevel abstractions around OpenGL which I’ll probably reuse for Mondrian. Also “linear” could be useful for implementing layout…

Text-Rendering “Typograffiti”

The most fundamental thing a web browser does is, ofcourse, rendering text! In Haphaestus I plan to use Typograffiti (an Auckland project) for this, which uses OpenGL 3.3+ hardware acceleration & FreeType2.

I’ve contributed to Typograffiti, making sure it still builds against the latest dependencies & exposing a lower-level API I can hook up to some FontConfig language bindings I’ll need to write. Then bind to CSS…


The first step of using Typograffiti is to initialize a font store. The bulk of the work involved is in initializing the OpenGL pipeline for rendering each word (no, not computer jargon; the usual meaning of the word “word”) using its own internal OpenGL abstraction I’m tempted to reuse in “Mondrian” box rendering. Other than that, it wraps that callback with some empty collections in a transactional-mutable TextRenderingData & in turn FontStore.

To construct that OpenGL pipeline it GLSL-compiles (glCreateShader, glShaderSource, glGetShaderiv(COMPILE_STATUS), & error reporting) a vertex shader (2D matrix transform & copies over UV coordinates) & fragment shader (blends specified colour with that from texture). Then “links” the two into a single “program” specifying default parameters (glCreateProgram, glAttachShader repeated, glBindAttribLocation repeated, glGetProgramiv(LINK_STATUS), & error reporting or glDeleteShaders).
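
The compile-&-check half of that sequence looks roughly like this (a sketch of my own against the “gl” bindings; real code would also read the info log on failure):

    import Graphics.GL
    import Foreign (alloca, nullPtr, peek, with)
    import Foreign.C.String (withCString)

    compileShader :: GLenum -> String -> IO (Either String GLuint)
    compileShader shaderType source = do
        shader <- glCreateShader shaderType  -- e.g. GL_VERTEX_SHADER
        withCString source $ \src ->
            with src $ \srcPtr ->
                -- NULL lengths == treat strings as NUL-terminated.
                glShaderSource shader 1 srcPtr nullPtr
        glCompileShader shader
        ok <- alloca $ \ptr -> do
            glGetShaderiv shader GL_COMPILE_STATUS ptr
            peek ptr
        pure $ if ok == 0 then Left "shader failed to compile"
                          else Right shader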

Once it has that program it enables it, enables blend with mode GL_SRC_ALPHA & GL_ONE_MINUS_SRC_ALPHA, & retrieves the indices (glGetUniformLocation) for the projection, modelview, mult_colour, & tex parameters to those GLSL programs.

Inside the callback for rendering a human-word it allocates & binds an OpenGL VertexArrayObject (glGenVertexArrays & glBindVertexArray) & 2 buffers (glGenBuffers). Populating them from information in the “atlas” (discussed later).

Upon success it buffers that computed data (glBindBuffer, glBufferData, glEnableVertexAttribArray, glVertexAttribPointer, & a single glBindVertexArray) before returning two additional callbacks for drawing & releasing as an AllocatedRendering struct.

To draw it combines the given transforms into a single matrix (unless they’re specified as a matrix), switches to the correct GLSL program (glUseProgram), retrieves screensize via callback, computes an orthographic projection matrix, sets the program’s parameters, temporarily “binds” the given textures (glActiveTexture & glBindTexture, glBindTexture cleanup), & triggers the OpenGL rendering (glUseProgram, glBindVertexArray, & glDrawArrays(GL_TRIANGLES)).

Cleanup involves repeated glDeleteBuffers & glDeleteVertexArrays calls.


The other major component of its API is getTextRendering, which uses the given FontStore (holding that OpenGL pipeline) to render the given string at the given fontsize with the specified fontfile.

To do so Typograffiti first unwraps the fontstore state to determine whether it still needs to load the font in. If so it allocates an Atlas & inserts it into the binary-search-tree cache.

To allocate an Atlas (in game programming “atlas” refers to a single imagefile/texture containing several of the images, laid out in 2D space, you actually want to display), with FreeType2 initialized Typograffiti reads the fontfile via FreeType2 (ft_New_Face), configures the fontsize (ft_Set_Pixel_Sizes or ft_Set_Char_Size), & computes how much space is required for the initial charset (repeated ft_Load_Char(FT_LOAD_RENDER) summing/512px-wrapping the glyph.bitmap.width property).

With that sizing information & char-to-bbox binary-search-tree mapping it allocates a texture for them (glGenTextures, glActiveTexture, & glBindTexture), configures its alignment (glPixelStorei(GL_UNPACK_ALIGNMENT)), & clears the image (glTexImage2D).

Then for each char of the initial charset it copies the chars into place (glTexSubImage2D) & constructs a richer char-bbox mapping.

It configures the texture some more, adding a “mipmap” for scaling, binds it, & returns results.

NOTE: The fixed charset of atlases is concerning, but trivial enough to work around in a browser engine…


Once you have an “atlas” texture for a given fontfile (constructed as described yesterday; btw this is the piece I hacked on to contribute a lower-level API…), whether or not it was already cached, it constructs then atomically-caches (Binary Search Tree Data.Map.alter) a draw callback using its global OpenGL pipeline (described the previous day) with computed text size. These are returned in a RenderedText struct, for the caller to call the drawRenderedText callback each frame.

To construct the render callback it first iterates over each word in the input string (as determined by the standard prelude) calling said OpenGL pipeline for any missing words. (Hmmm, another necessary fix: allow dealing in terms of lower-level “glyphs” instead of strings. I’ll wait until I have a library to use that with…)

The core rendering logic for this involves iterating over the (single-word) string, converting each char into a FreeType index, checking for FreeType adjustments, retrieving the coordinates out of the texture, & appending to the data for OpenGL to render.

Now that we’ve ensured all words in the string are cached, Typograffiti computes how much x&y space to leave for spaces (between words) from the space glyph.

The draw function it returns for the caller to call iterates over the chars of the given string, handling newlines & spaces (missing tab handling…) specially to update positioning. Otherwise it consults the cache for a render callback to call with the current position, & adds its width to the next position.

The returned size is computed very similarly to the render callback, just without calling the in-cache render callbacks. Postprocessed to ensure the space height is considered.

The updated wordcache is passed back to the public function for it to insert into the FontStore’s textRenderingDataFontMap.

OpenGL Abstractions

Typograffiti uses its own minimal abstractions over the OpenGL bindings generated from Khronos’s XML files, so today I want to check what those abstractions offer me to use when rendering boxes.

In large part it allows us to use more Haskelly datatypes rather than the raw FFI types those raw bindings expect. But it also abstracts away some OpenGL boilerplate where certain functions always appear in sequence, which is especially a problem when compiling shaders!


There’s functions for allocating, activating, & binding as relevant textures, vertex array objects, buffers, buffer geometries, & bound textures. There’s functions for cleaning up those allocations after a callback was run.

There’s a function for outputting any OpenGL errors which were encountered. There’s a function for converting float vecs into GLfloat vecs. One wrapping glDrawArrays with a given program & vertex array. Ones for compiling OpenGL shaders & programs, reporting any errors.


glGetUniformLocation has a type conversion wrapper. And there’s a typeclass to call the appropriate OpenGL function for setting uniforms (with conversion between Haskell/Linear types & OpenGL’s) reporting any errors. This could be extremely handy for me when rendering CSS boxes in “Mondrian”!

There’s a couple of further Matrix abstractions, and computation of a bounding box from an array of vertices, which is used when caching the rendering callback for a word.

FreeType2 Bindings

A pivotal step of rendering text is to decode font files, which amounts to a usually-fancy (there’s excellent anti-colonialist reasons for this fanciness) lookup table from characters (or rather “glyphs”) to images. Whether those images are “raster” or “vector”.

Definitions: Raster images store a grid of pixels, vector images describe how to draw the raster image. Any such vector images need to be “rasterized”.

For all this Typograffiti uses the excellent & popular FreeType2 C library.

Here I’m describing Haskell’s FreeType2 language bindings as implemented by Jason Dagit. Though Jason appears to be offering slightly more Haskelly APIs than Typograffiti expected; I had to correct for this to get Typograffiti to build.


A copy of the FreeType2 source code is bundled here, and as far as I can tell the Haskell bindings are handwritten with the aid of helper functions for memory allocation & error reformatting.

bracket is used to pair up setup/cleanup functions for the caller to specify a callback to run between them, regardless of any errors thrown.

There’s record types implementing Storable (a typeclass from Haskell’s Foreign Function Interface) to convert via the HSC preprocessor (splices in C’s preprocessor by generating & running a C program) to convert to/from the corresponding C struct.

Where relevant the raw FFI bindings are exposed in “Internal” modules with ' suffix.
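
The general shape of those helpers (a sketch; I’m assuming wrapped signatures like ft_Init_FreeType :: IO FT_Library & ft_Done_FreeType :: FT_Library -> IO (), so treat the names as illustrative):

    import Control.Exception (bracket)

    -- Setup & cleanup are paired so cleanup runs even if the callback throws.
    withFreeType :: (FT_Library -> IO a) -> IO a
    withFreeType = bracket ft_Init_FreeType ft_Done_FreeType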




Found the error-reporting utils! There’s ofcourse one, ftError, which converts non-zero return values into FtError exceptions. And there’s AutoError (largely serves to remove the need for parentheses with ftError) & AutoAllocaError typeclasses (also adds alloca calls where appropriate).

These frequently reduce the need to handwrite format conversions, but not always. Sometimes alloca still needs to be called manually.

Still very raw bindings…

Colours

A major component of visually-styled webpages are colours. Text can be coloured. And an element’s background or border can be coloured. This is simple enough to parse if you’re specifying RGBA colours (especially hexcodes), but what about e.g. hsla() colours? What about enforcing colour contrast? That’s what I’ll be using the “colour” hackage for.

Not that this is complex code to rewrite, I’m just happy not to be debugging it personally! From people who better understand this arithmetic/biology.


The central types (which implement the Show, Read, & Eq typeclasses) in Colour are Colour & AlphaColour, themselves wrapping the Chan type. Chan in turn wraps a numeric type (with their 0, 1, multiply, add, subtract from 1, multiplysum, typeconvert, & sum ops) compiletime-annotated with whether it represents Red, Green, Blue, Alpha, etc.

There’s also a more trivial RGB type.

There’s a trivial non-typesafe matrix API implemented upon 2D linked-lists of numbers.

RGB has a function for converting into HSLSV as a 5-tuple.

There’s a RGBGamut type (implements Eq, Show, & Read) which combines the RGB type with the CIE Chromaticity type, including an extra field for the “whitepoint”. It has APIs for generating matrices to convert back & forth between the RGBGamut & RGB types. Chromaticity in turn exposes X, Y, & computed Z fields.

A separate submodule validates RGBGamut & converts between the other types via different “transfer functions”.


RGBSpace.HSL submodule implements accessors for the different fields returned by hslsv, and converts from hue-saturation-lightness back to RGB. RGBSpace.HSV submodule does similar.

sRGB submodule implements conversion between common bit representations.

And finally the International Commission on Illumination (CIE) has standardized their own colourspace based on our biology.

CIE’s colourspace is converted to/from RGB mostly via matrix multiplies. Though there is a function which divides the colour’s CIE coordinates by a given “whitepoint” with appropriate exponents & multiplicands added, whilst subtracting the resulting Y from X & resulting Z from Y. Lightness arithmetic is slightly different.

I’ve already discussed their concept of Chromaticity, and the CIE.Illuminant submodule exposes a bunch of standard ones.

Names submodule exposes colour constants, with a corresponding colourname parser.


The formula for converting grayscale to HSLSV is to return the shade as the L & V components. Otherwise converting RGB to HSLSV involves computing the min, max, & average of the RGB channels. L is that average. S is the diff over the (lightness-conditionally subtracted from 2) sum. S0 is the diff over max.

As for Hue from RGB it roughly sorts the channels, divides lower diff by the range & multiplies by 60°, adding 120° multiplied by whether it’s in the red, green, or blue quadrant, & ensures the result is in range 0-360°.

To convert back from HSL to RGB it starts by splitting the hue circle into thirds for initial red, green, & blue values. Then recomputes L & S into “p” & “q” to factor them & their diff into each channel based on the channel’s previous range.

To convert HSV to RGB compute f = (h/60°) mod 1, p = v(1-s), q = v(1-fs), & t = v(1-(1-f)s). Which sixth of the circle the hue is in determines which permutation of v, p, q, & t is selected as the red, green, & blue channels. Yes, one var’s dropped.
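
Transcribed into Haskell (my own transcription of the standard algorithm, not the colour hackage’s code):

    -- h in degrees [0, 360); s & v in [0, 1].
    hsvToRGB :: Double -> Double -> Double -> (Double, Double, Double)
    hsvToRGB h s v = case floor (h / 60) `mod` 6 :: Int of
        0 -> (v, t, p)
        1 -> (q, v, p)
        2 -> (p, v, t)
        3 -> (p, q, v)
        4 -> (t, p, v)
        _ -> (v, p, q)                  -- one of v/p/q/t dropped per sixth
      where
        f = h / 60 - fromIntegral (floor (h / 60) :: Int)
        p = v * (1 - s)
        q = v * (1 - f * s)
        t = v * (1 - (1 - f) * s)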

HSL & HSV are coarser approximations of how humans perceive colour compared to CIE. RGB represents how we sense colours (without duplicating data for our eyes’ “rods”).

Image Decoding “Juicy Pixels”

As a visual web browser Haphaestus will support displaying images, either as part of the HTML or as CSS decoration. To decode these images into raw pixels for me to composite onscreen (via OpenGL) I’ll use JuicyPixels which implements this all in pure Haskell!

Today will serve as an introduction, with the following days describing each image format it can load & save.


JuicyPixels can encode/decode Images to/from ByteStrings or specified files.

The functions for parsing/saving a file wrap those for ByteStrings, allowing most of the code to be “pure” with no (explicit) I/O or sideeffects. There’s also variants of these functions for attaching metadata and/or applying colour palettes.

When reading you can use a function which (with the aid of helper utils) attempts each image decoder before returning the concatenated error strings or a successfully parsed image.
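
Typical usage of that entrypoint (a sketch against Codec.Picture’s documented API):

    import Codec.Picture (Image, PixelRGBA8, convertRGBA8, decodeImage)
    import qualified Data.ByteString as BS

    -- decodeImage tries each format's decoder in turn;
    -- convertRGBA8 normalizes whichever pixel type it found.
    loadRGBA8 :: FilePath -> IO (Either String (Image PixelRGBA8))
    loadRGBA8 path = do
        bytes <- BS.readFile path
        pure (convertRGBA8 <$> decodeImage bytes)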

There’s typeclasses for converting between colour representations.


As for how it reads files into bytestrings, there’s a build flag for switching between mmap (faster, but adds a dependency) & “lazily” reading in the file in chunks as-needed. Personally I won’t need this, I have bytestrings already!

The internal Image type shared by all the encoders & decoders specifies a width, height, & generic array for the raw pixel values; and there’s a mutable variant. A typeclass determines how to interpret each entry of that array, & the bitsize of each entry.


There’s a wrapping DynamicImage tagged enum which determines at runtime which pixel representation to use. And a sized-array to represent limited colour palettes for image formats which use that for compression. This comes with its own PalettedImage type, which falls back to DynamicImage.

The typeclass for pixel representations also handles colour blending & mutating specified pixels in an image.

There’s fold & map implementations for most of these types.

And there’s a couple of algorithms implementing “colour quantization” to compute a suitable palette, improving compression ratios when saving to one of its several supported formats.

PNG

One of the most common fileformats today is PNG (Portable Network Graphics). Which is quite effective at, usually losslessly, compressing most images. Particularly stylized ones. Certainly Haphaestus will encounter them quite a bit!

Compressing images is important because specifying each individual pixel as a full, say, 32bit word can get fairly expensive! Still not as bad as audio or video!


It starts using the binary hackage to aid decoding a PngRawImage from the bytestring input. A PngRawImage consists of a header specifying its width, height, bitdepth, colour type, compression, filter (zeroed), & interlacing; followed by a sequence of sized, typed, & error-checked sections. It implements binary’s typeclasses on these types manually, whilst computing/checking each section’s CRC.

To parse embedded metadata it retrieves any pHYs, gAMA, tEXt, & zTXt sections & lightweight-parses them into a common format.


Once all the IDAT sections are concatenated & size-validated they’re decompressed via the ZLib C library (already an indirect dependency of mine thanks to the HTTP client I chose, and is on my Linux From Scratch study schedule), whilst checking for a PLTE (palette) section to parse, which is corrected for the given bitdepth/imagetype.

Before finally deinterlacing (determined via header) the image & converting into the common types. The common code is where colours are decoded, possibly via the parsed palette.


The PNG header may specify use of scanline (a.k.a. no interlacing) or adam7 deinterlacing. Scanline deinterlacing minorly parameterizes the filtering function. Adam7 deinterlacing rearranges the data according to a standard matrix to improve ZLib compression ratios. There’s several different strategies for decoding individual pixels, selected based on other header fields.

Filtering involves asking each run which filtering, if any, it wants applied between sub, average, up, & paeth before iterating to the next.
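
The paeth filter’s predictor is the cleverest of those; transcribed from the PNG spec (not Juicy Pixels’ own code):

    -- Predict a byte from its left, above, & upper-left neighbours.
    paeth :: Int -> Int -> Int -> Int
    paeth a b c
        | pa <= pb && pa <= pc = a
        | pb <= pc             = b
        | otherwise            = c
      where
        p  = a + b - c          -- initial estimate
        pa = abs (p - a)        -- distance to each neighbour
        pb = abs (p - b)
        pc = abs (p - c)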


As for encoding PNGs, that’s partially handed off to a typeclass over the common types to determine the appropriate header fields. All deferring to common code (one of two functions) which assembles all the image data into the previously-described Abstract Syntax Tree & encodes it to a ByteString via binary. Not bothering with any compression beyond ZLib & applying prescribed palettes.

Please use a better encoder like fhanau’s for anything you publish online!

Bitmaps

Today’s topic of study is quite simple: the uncompressed Bitmap image format! O.k., not entirely uncompressed…

These aren’t used much anymore (why have uncompressed images when you can have losslessly-compressed images?), but it was an important place to start from. Both to give ourselves a baseline to improve upon, and to figure out how to just get an image onscreen!


It starts by parsing & validating a fileheader & bitmap header off the input bytestring.

It then branches over the bits-per-pixel, number of planes, & compression algorithm to determine how to parse the main datablock (which is at a header-specified offset & size) into the common image representations. Offloading much of the effort onto them (they can convert between pixel representations), they’re all pretty trivial.

Another datasection specified by the bitmapheader specifies a “profile” to incorporate into the returned metadata, alongside those extracted from the headers.


As for encoding bitmaps, it gathers all the data to pack into the header (or fills with constants) including byte sizing/positioning data & serializes them into a single bytestring followed by the palette, image itself (encoded according to a typeclass), & ICC profile metadata.

Whether encoding or decoding the headers are temporarily stored in records with the aid of, once again, the binary hackage. Fileheaders consist of a magic number, filesize, reserved bytes, & offset.

The bitmap header, of which all or some fields may actually be meaningful, consists of a headersize, pixel width, pixel height, number of colour planes, number of bits per pixel, which compression to use, the bytesize, x & y resolution, count of colours in the palette, & how many of them are important. Added in version 2 there’s now red, green, & blue bitmasks, & (in v3) an alpha bitmask. In v4 there’s two fields identifying the colourspace. And in v5 there’s now 3 fields for the ICC profile.

JPeG

JPeG was specifically designed for storing & transferring photos, which unlike most other imagery (with dominant exceptions), focuses on realism more than visual clarity. As such JPeGs require more extensive compression (possibly discarding finer details we won’t notice), and it’ll take me more effort to cover it all!


A JPG image consists of a sequence of frames surrounded by 0xFFD8 & 0xFFD9.

Each of these JPG frames may be:

* an AppFrame (byte-tagged bytestring),
* Adobe APP14 (version number, bitflags, & colourspace),
* JFIF (unit, x & y resolution, & embedded RGB8 thumbnail),
* Exif (TIFF-like metadata sequence),
* extensions (byte-tagged bytestring),
* quantization tables (sequence of precision, destination, & 16bit-pixel array),
* Huffman compression dictionaries (type (DC or AC), destsize, vector of bitsizes, vector of codes, & vector representing the packed tree),
* scan blobs (length, component count, list of component selectors & single bytes for both DC & AC coding tables, 2 bytes for “spectral selection”, approx high byte, approx low byte, & a trailing bytestring),
* scan headers (tag, header length, sample precision, height, width, component count, & a list of components themselves consisting of tags, horizontal & vertical sampling factors, & quantization destination),
* & 16bit interval restarts.


If it’s successfully decoded all those JPG frames it’ll seek out all scan headers to determine the “image kind” - baseline or progressive. After decoding these it gathers all the metadata from the headers to possibly return alongside the image whilst determining which colourspace that image is in.

Progressive decoding requires a little more effort, but in either case it runs a shared state machine, gathers results into an image & huffman-decompresses it whilst applying a quantization matrix.


That state machine involves converting every JPG frame into state-mutating actions, the state consisting of DC & AC decoding tables; quantization matrices; restart interval; frame; index mapping; app14 & app0JFif markers; a progressive-decoding flag; horizontal & vertical max resolution; & how many blobs were seen.

AdobeAPP14, Exif, & JFIF frames set the appropriate markers to the embedded data. App & extension frames are ignored. Scan frames perform minimal reformatting, setting the state’s fields.

ScanBlob JPG frames decode, concatenate, & yield each contained “scan” whilst incrementing the blobs counter. Decoding each scan involves looking up the specified component from the state, retrieving the DC & AC Huffman tables to look up the value in them. Then it retrieves several other statefields, and if the current frame is set (errors if not) it retrieves a couple of its fields to finish populating each “unpacker parameter” it yields for the square.

Interval Restart JPG frames set the corresponding statefield. Huffman table frames should choose whether to set the AC or DC Huffman decoding tables, but as of now (in Juicy Pixels) this appears to be a noop. Similar story (without the branch) for quantization tables. Does Juicy Pixels need help too?


Progressive decoding, if enabled, starts by allocating then iterating over all height &, in an inner loop (with pre-selected logic), width “lines” which’ll be postprocessed a few times. Looks like diff compression.

Otherwise it just applies the Huffman decoding & quantization table in a non-trivial wrapping loop. Rest of decompression is left to colourspace transformation.

There’s default Huffman tables.

Encoding involves gathering JPG frames representing the image & all attached metadata. With the aid of a new typeclass. Sizes of these frames are computed. Memory is alloc’d to compress the pixels, iterating over them splitting into square blocks to serialize with DCT transformation & quantization.

GIF

GIF, the animated image format using LZW compression, is largely historical at this point, with browser vendors encouraging use of more efficient formats explicitly designed for video instead. Even if it has made a lasting impact on internet slang! Though if you see any spinners online those are still probably GIFs.

I don’t think I’ll bother having Haphaestus animate GIFs…

Juicy Pixels, in addition to its normal static image APIs, offers special GIF APIs. Each doing firstpass decoding via the binary hackage.


GIFs consist of a header specifying whether it’s v87a or 89a, its width, height, background index, whether it has a global map, the colour resolution, whether the palette’s sorted, the size of that palette, & an optional palette; followed by a sequence of individual images (each headed by sizing/positioning, encoding bitflags, colourtable size, a local palette, & LZW rootsize; & optionally a disposal method, user input & transparency flags, delay, & transparent colour); & looping behaviour.


Once that’s all decoded, optionally stripped of all trailing frames, and verified, it checks whether transparency was requested.

If so it decodes the back image (transparent colour given by header index into palette), first image (raw data, wrapped in the appropriate common type), & palette (raw data) before iterating over the remaining image frames. If not it decodes the back image (with a different transparent colour) & palette, adds alpha fields to colours, & iterates over all frames as per before.

Decoding non-base images requires LZW decompression & optional deinterlacing. Deinterlacing means rearranging the data for better compression.

Decoding trailing frames also involves decoding their new colour palettes, & iterating over all image pixels hittesting the frame’s bounding box, copying its opaque pixels over the previous frame (or selected bg).


LZW decompression (patent expired) involves initializing decompression tables to be numeric sequences before entering a loop, exiting upon range checks or certain (endofinfo, 257 in TIFF-variants) opcodes, which looks up a decoded value from the decompression tables to copy the specified number of times into the destination, followed by some additional expansions I’m not following. There’s an escaping “clear” (256 in TIFF-variants) opcode.

Then updates indices, reads next byte, & recurses.


Encoding images (with caller-specified delays & palettes) into a GIF involves, after validation, LZW-compressing the raw images & assembling them into the previously-described AST to be serialized into a bytestring with the aid of the binary hackage.

LZW compression involves consulting a trie for the bits to write out via a Juicy Pixels-internal BitWriter abstraction, which is useful for writing any compressor!
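
The core trick of any such bit writer, as a toy sketch of my own (not Juicy Pixels’ internal BitWriter): GIF’s LZW packs codes LSB-first into bytes.

    import Data.Bits ((.&.), (.|.), shiftL, shiftR)
    import Data.Word (Word8, Word32)

    -- Accumulator, how many bits it holds, & output bytes (in reverse order).
    data BitState = BitState !Word32 !Int [Word8]

    writeBits :: Int -> Word32 -> BitState -> BitState
    writeBits n v (BitState acc k out) =
        flush (BitState (acc .|. (v `shiftL` k)) (k + n) out)
      where
        flush s@(BitState a k' o)
            | k' >= 8   = flush (BitState (a `shiftR` 8) (k' - 8)
                                          (fromIntegral (a .&. 0xFF) : o))
            | otherwise = s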

HDR

If you have a nice monitor you might have come across “high dynamic range” image formats like Radiance (.hdr, .pic) which allow you to see full detail in both the bright & dark spots of the photo. Then again most people don’t have HDR displays; as much as TV manufacturers want to insist otherwise it’s not that big of a feature. Unless maybe if you’re a photographer touching up such photos.

So I doubt Haphaestus will encounter many of these on the web!


Decoding a HDR image starts by reading textual & possibly-commented key-value pairs from the header, extracting the FORMAT entry, parsing a textual size, & extracting trailing data to wrap it all in a RadianceHeader record.

That metadata is converted into a common representation & the raw floating point subpixel values are run-length decoded (one of two variants, with size validation) to wrap in an ImageRGBF (via some hacky type juggling) common type.


Encoding a Radiance image converts the RGBF pixel representation into RGBE with or usually without two-pass runlength compression, before assembling the header record & serializing it.

Juicy Pixels doesn’t appear to have any Radiance serializers which can embed metadata.

The Radiance photo format doesn’t generally appear to be a good choice for the web. What little compression it has appears very naive, theoretically making its filesize not worth the new features not everyone has the hardware to view.

Targa

Targa is an image format designed for hardware decoding on early graphical computers. It appears to be well-suited to “pixel art”. Juicy Pixels supports it, and so in turn will Haphaestus!

As always Juicy Pixels’ first step is to decode the raw abstract syntax tree. Which for TGA involves a header (itself consisting of header ID length, whether or not there’s a palette, the colourspace, start, length, depth, x & y offset, width, height, pixel depth, x & y origin, & bitflags, each 8 or 16 bits), file ID, palette, & raw image.


Then it branches over the image type then pixel depth to determine which pixel encoding to return with or without a palette, recurses to decode that palette if specified, flips the pixels horizontally or vertically as-needed, and possibly performs bounds-checked run-length decompression as determined by bit 7. There isn’t much metadata to return from a TGA file.

Encoding TGAs involves reconstructing that AST deferring to a typeclass to ensure the raw pixel data is in a valid encoding.

TIFF

TIFF (Tagged Image File Format) has long been a widely supported image format supporting embedded metadata, even if JPG & PNG are more widely used. Seemingly for good reason, though with TIFF’s versatility I can certainly see why hardware sensors would like it!


As usual for Juicy Pixels it first decodes the given bytestring to an abstract syntax tree. In this case TIFF images consist of:


To further decode the image from this AST (after correcting for rows-per-strip = 0) it branches over the various fields to determine what pixel encoding it should return & what decompression it needs to apply as a preprocessing step.

In most cases it’ll branch over the plane configuration after allocating temp memory, followed by an optional de-diffing step.

If there’s planar config there’s more memory-layout logic, but in either case it locates the data to decompress & merge it into the temp memory.


Depending on the specified compression algorithm, it may copy bytes straight to the vector (no compression), perform run-length decompression with multibyte escaping, or call into the GIF (not PNG) code to apply its own minor variant of LZW decompression.

A separate pass copies the decompressed data into the array in a supported encoding as determined by a typeclass. Which is where YCbCr subsampling is decompressed, if present.

Trivial per-pixel postprocessing is applied in a few cases.


Encoding is even simpler! Once it outputs the correct fileheader (as determined via a typeclass) it can directly save any of its pixel encodings. Though it doesn’t appear to actually save metadata for some reason?

SVG

SVG is a W3C image format which, rather than storing a compressed 2D array of colours, represents the image by storing XML instructions for tracing shapes & how they should be stroked and/or filled. It’s an important image format I’ll need to support in Haphaestus, and I’ll need to rewrite any HTML I parse to ensure any inline SVG markup I encounter is converted into something resembling an <img> with appropriate alt text, so that I can use Rasterific SVG’s codepath rather than my own HTML rendering. Because rendering SVG is very different from rendering styled HTML! (yes, traditionally webpages were rendered by lowering to vector graphics like SVG, but Chrome & Firefox no longer do so, and to avoid more complex optimizations I will follow suit & use GPU-compositing)

Due to the need for additional dependencies for parsing, rasterization, & font rasterization (which will be discussed separately) this decoder is packaged separately from Juicy Pixels as “Rasterific SVG”, but still converts into the same types. It’s implemented in pure Haskell, including its rasterizing & parsing dependencies! But basically Rasterific SVG itself serves to convert from the SVG parsing types to the SVG rasterization types.

The first step of rendering a parsed SVG document to a Juicy Pixels image is to tell that seperate module to resolve use attributes & apply CSS rules.


To convert the parsed, use-resolved, & styled SVG document to drawing operations which another dependency (again, I’ll study this after Wasabee) converts into a JuicyPixels image or PDF at the caller-or-document-specified pixel size, it iterates over each element rendering its tree into a document-sized page which it then combines & geometrically transforms to the desired size/location.

To render each element it initializes some default parameters & recurses over the tree.


Rendering any text element is split into preparation & (with dynamically computed positioning info) rendering steps. This preparation involves flattening the tree, merging attributes, & loading & caching the font files via yet another dependency. Each of those resulting spans is converted into a TextRange drawing operation wrapped in printTextRanges & drawOrdersOfDrawing calls & positioned as instructed.

Have to verify that actually includes shaping…


To render an image it locates & decodes said image via Juicy Pixels (I’ll need to hook up HURL here…); failing to do so it recurses into child elements. If successful it converts the attributes, generates fill (drawImageAtSize) & possibly stroke paths, & emits those draw ops with appropriate matrix transforms & opacity.

To render a “use tree” (after more general preprocessing) it recurses, then resizes/repositions to fit the desired bbox & as specified.

Symbol tree SVG ops are rewritten into Group tree ops. Group tree ops in turn recurse over all the subtrees before applying specified matrix transforms & opacity.


To render a rectangle it converts its fill & stroke attributes into Rasterific ops & hands them a rect or rounded rect to render. As always adding any matrix transforms & opacities. Circles, ellipses, & lines work basically the same.

PolyLines are rewritten into unfilled paths of a MoveTo & LineTos. Polygons are converted into Path ops of a MoveTo & LineTos.

Mesh gradient ops require significant conversion efforts into “gradient meshes” before being rendered via Rasterific as a “mesh patch”. Not clear what’s happening there.


Paths involve a couple of preprocessing passes. The first, normalization, pass resolves the meaning of multiple args to a drawing op, usually duplicating that op. Or for MoveTo, translates the tail to LineTo.

The second pass over the path removes lone ops which are meaningless alone, tracks the first/curve-control/last points since the last MoveTo, resolves horizontal & vertical lines, translates curves into cubic beziers, converts relative offsets into absolute points, & optionally ends by adding a line from start to end.

There’s a second variation of that outerpass used during text rendering. And a simpler variant converting all to bezier curves is used in gradient meshes.


For any of these strokes it may look up a start/mid/end marker to add to the paths at a computed location, rendering that looked-up marker via recursing back into the tree traversal. Beyond that Rasterific provides 2 conversions from strokes to fills, dashed or not.

Converting gradients/textures from the SVG AST to Rasterific requires some significant yet straightforward type conversion. Arcs in Paths require some decent trigonometry.

Parsing SVG

tl;dr: Rasterific SVG’s “SVG Tree” dependency parses its input into an internal AST & back using XML Light & Attoparsec, with trivial postprocessing for dereferencing & styling.

Before SVG can be rendered it needs to be parsed, dereferenced, & styled; which it starts by having XML Light parse the XML.

To parse this XML AST (from XML Light) to its own SVG AST & vice versa it uses pattern matching with a handful of utils & a couple of typeclasses. One of which defines lists of callbacks around Lens-generated accessors to aid parsing attributes via another couple of utils.

This SVG AST consists of:

* indicates the type has a corresponding typeclass.

Draw attrs implement the Semigroup & Monoid typeclasses; most of their properties are also monoids. And there’s typeclasses for CSS, as well as legacy typeclasses.


Once SVG Tree (this subcomponent of Rasterific SVG) has parsed SVG to its internal AST using pattern matching via XML Light as an intermediary, references still need to be resolved & styles applied before it can be converted into vector graphics operations as I described above.

To resolve references it iterates over the doc’s elements rewriting any “uses” to hold the value looked up in the doc’s definitions (expecting a geometry) under the use’s info’s name.

As for styling…

SVG Tree extracts the doc’s parsed style rules & elements. For each element it queries for matching CSS declarations & repurposes the XML attr parsing infrastructure (including Lens, a later topic after SVG) to apply those looked-up styles. Looking up those declarations involves iterating over all style rules & their selectors to interpret those selectors in two layers: recursively for relational selectors, & via all for per-element tests.

Supports just about as much CSS as is relevant! Without any significant optimizations, which probably aren’t needed given how little CSS is actually used in realworld SVG.


At which point it all comes back to parsing! How’s the CSS parsed? What’s its AST?

It defines:

Parsed via attoparsec without a lexer.

Thanks to Attoparsec this CSS parser strongly resembles its BNF. As does the parser for the SVG path, transform, etc. attributes, which also comes with serializers. Similar for colour & texture attributes, though those are augmented with a hardcoded mapping from colournames to Juicy Pixels values.
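
A taste of that lexer-less style (my own toy grammar, not SVG Tree’s): parsing a CSS-ish rgb() function straight off the input text.

    {-# LANGUAGE OverloadedStrings #-}
    import Data.Attoparsec.Text
    import Data.Word (Word8)

    rgb :: Parser (Word8, Word8, Word8)
    rgb = do
        _ <- string "rgb("
        [r, g, b] <- decimal `sepBy1` (skipSpace *> char ',' <* skipSpace)
        _ <- char ')'
        pure (r, g, b)

    -- parseOnly rgb "rgb(255, 0, 127)" == Right (255,0,127)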

Rasterizing

Monitors are incapable of displaying vector graphics like SVG, only raster, so before it can be displayed any vector graphics must be “rasterized”. For that Rasterific converts the parsed/dereferenced/styled SVG to (with minor lowering & layout computation) Rasterific drawing operations.

O.K., early on some videogames manually controlled the positioning of the electron beam in their cathode-ray-tube displays to create very sharp linedrawings for their time. Which might be partially why wireframes are so iconic of early 3D graphics.

But we can’t do that now, and software rasterizers have always been more capable than hardware ones.


Rasterific’s public API uses a Drawing monad (a parameterized synonym for Free Monads’ F type) to build a tree of “draw commands”.

These draw commands (with a linkedlist inlined) include:

Textures can be a solid Juicy Pixels colour or:

Implements a debugging serialization & monadic typeclasses for these types.


PDF is a… I’d say “paginated vector graphics image format” is a better descriptor, but it tends to be used as a document format. Bureaucracies public & private love it since it hardly asks for any changes from them (hence why we’re all familiar with the format), but it leaves me complaining about the lack of text reflow to fit the size of my window/screen. Since the text positioning is all hardcoded.

It’s frequently used to send documents to a printer so they can be viewed on paper. In either case Rasterific supports rendering its draw commands to PDFs, with intermediate “producers” & maybe “draw orders” stages.

To convert draw commands into PDF “producers” Rasterific computes the initial transformation matrix, unwraps those draw commands, & iterates over them. Each CustomRender it skips. Each MeshPatchRender it converts into filling a rect with a new texture & lowers that. For each Fill it converts the path (grouped into polygons) into PDF syntax and appends a fill operation.

For each Stroke it writes the parameters to the PDF file, converts the paths to PDF syntax ensuring each polygon is closed, & outputs an “S” opcode. For each DashedStroke it wraps such a Stroke operation (recurses to lower it) wrapped with opcodes configuring the dash state.

For each opaque WithGlobalOpacity it concatenates the rendered subtree with the tail. For each semi-transparent WithGlobalOpacity it uses a helper function for the nontrivial PDF output.

For each WithImageEffect it flattens the tree, ignoring actually applying the effect. For each WithTransform it’ll recurse altering the transformation parameter & if flagged to do so (initially not) it outputs the inverse transform to the PDF. For each SetTexture it converts the given texture if possible to PDF syntax surrounding or before the rendered subtree (during which it also alters the texture parameter during recursion).

For each WithCliping it recurses to render the clipping using “W n” rather than “f” opcodes for fills, & the normal subtree possibly surrounded by “q” & “Q” opcodes to preserve state. For each TextFill it lowers the text to draw orders (yuck! does that mean text can’t be selected?) & lowers those by converting each path, texture, & fill command to PDF syntax. For each WithPathOrientation it lowers to draw orders much as per preprocessing & recurses with positioned draw commands.

To convert these “PDF producers” into the PDF output it runs some infrastructure for assigning IDs to PDF objects & appends some trailing code to pull it all together.

There’s a typeclass helping with serializing Rasterific types to PDF syntax.

Whilst lowering draw commands to PDF producers it might selectively share a preprocessing stage with rasterization.


Rasterific has a preprocessing stage, shared with the actual rasterization routine, lowering the “draw commands” (linkedlist inlined) by flattening into a list of “draw orders”. This involves an iteration over the embedded linked-list/tree with some context state & a list to prepend its results to, handling each draw command differently:

The clipping often gets rasterized to an image. The context’s matrix-transforms are applied when retrieving textures or geometries.

Mesh/“Coons” patches are rasterized by first extracting/computing the parameters at each coordinate, recomputing it into a “tensor patch”, & repeatedly (a power-of-two times) interpolating the coefficients & points to hand off to cubic bezier rasterization. Which in turn retrieves a mutable monadic canvas, does some more computation including its own power-of-two iteration count, & interpolates the colours filling in the pixels. There’s a fastpath without per-pixel boundschecks.

When lowering paths, there’s a pass which removes invalid path components holding NaNs or infinite numbers, then results are matrix-transformed via a typeclass & converted into a list.

To convert a stroke path to a fill path it first lowers the given geometry to sanitized lists split by polygon, & for each of these polygons iterates over the path twice (forward & reverse orders) applying its width (considering normals, maybe recursive) & line joins (via geometric maths).

To dashize a line to be stroked it transforms its input pattern into an infinite & offset list & co-recurses between “taker” & “dropper” iterators over the path & that pattern. There’s a subiterator used here & elsewhere to estimate the length of each “primitive”, breaking them where need be, yielding two lists where the first is of the desired length (basic geometry).


Rasterizing SVG into a JuicyPixels image in Rasterific involves 3 passes: lowering the saved “draw commands” into “draw orders” of shapes to fill & callbacks, evaluating those draw orders, & some underlying infrastructure - all run within a ST monad.

A draw order holds a list of polygon paths, a fill texture, a fill method, maybe a clip texture, & a callback. To fill a draworder (there’s a separate fastpath for no clip texture) it fills each polygon path with the clip texture if any & retrieves the image to evaluate the callback within an altered monad.

For each polygon path it retrieves the mutable image & its size, clips the path to the bounding box, converts the path to spans, & converts the fill texture to a callback it can evaluate over each valid resulting span. Mutating the texture to include the given clip, if any.

Converting a texture into a callback involves returning fastpaths for solid fills (especially opaque ones, using RLE) & prerendering meshes via a given callback, before entering the slowpath & wrapping its result with an iterator over the pixels to be filled, whilst propagating “samplers” & matrix transforms. The slowpath recursively lowers the texture tree (also propagating matrices & samplers) to return a callback which computes each individual pixel, possibly involving interpolation.

Clipping ofcourse involves geometry.

And the conversion to run-length-encoded pixels to fill involves decomposing its input into “edge samples” (via geometric maths) consisting of an x/y position, alpha, & height, converting into a sorted list, & iterating over them converting each pair of edges on the same scanline into a “coverage span”. Upon new values of y it closes the previous line & starts appending the next. Involves a callback determining whether to fill each run.


Beneath Rasterific’s rasterization logic I described above, it allocates (involving additional checks in debug builds, & calling a couple of typeclass methods) & fills a mutable “vector”/array (involving type coercion around the vector typeclass) & wraps it with a MutableImage JuicyPixels type to render into, & when done coerces into an immutable Image type, promising JuicyPixels we’re not aliasing any memory. Along with (via typeclass method) its underlying vector.

All of Rasterific’s rasterization of its draw commands operates within a ST-monad, which is a standard Haskell library type for establishing monads around some temporarily-mutable value like vectors. I’ll revisit the “free monads” used to construct those draw commands later.
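
The general ST pattern it relies on (standard library, nothing Rasterific-specific):

    import Control.Monad.ST (runST)
    import Data.Word (Word8)
    import qualified Data.Vector.Unboxed as V
    import qualified Data.Vector.Unboxed.Mutable as MV

    blankCanvas :: Int -> Int -> V.Vector Word8
    blankCanvas w h = runST $ do
        buf <- MV.replicate (w * h * 4) 0   -- RGBA8 scratch buffer
        MV.write buf 3 255                  -- e.g. make the 1st pixel opaque
        V.unsafeFreeze buf                  -- promise: no more aliasing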

Rasterific provides modules lowering higher-level geometries to the draw commands & paths I described earlier, and for lowering geometric transforms to matrices. It reimplements what linear algebra it needs itself.

Rasterific implements a Lens API around its path components, and implements a subset of Lens itself.

Cubic & quadratic bezier curves each has additional support code.

That about covers the rest of Rasterific’s codebase. Not near as heavily optimized as Cairo Vector Graphics, but that’s a tall ask! These two libraries do have slightly different featuresets. Should be suitable for rendering the occasional SVG if I don’t rely on it for compositing a full webpage!

Free Monads

In Haskell it can be useful to repurpose the do syntactic sugar to aid construction of complex trees. I’ve seen this technique used before to aid constructing HTML (Blaze hackage, which contributed concatenation optimizations upstream) in a serverside webapp. But its also used for gathering “render commands” to interpret.

Rasterific uses a separate hackage for this, which implements Monad and other typeclasses around a callback taking 2 callbacks.

The two callbacks represent pass & failure codepaths. So that Monads’ >>= operator (which the do syntax is syntactic sugar upon) converts the right-value into a success callback passed to the left-value, wrapping the result to be ready for the next >>= operator.

The same hackage implements numerous variations upon this, supporting typeclasses for each. e.g. a linkedlist threaded through a given type.
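
A toy render-command DSL in that style (using the free hackage’s simpler Free type rather than the Church-encoded F; all names here are mine):

    {-# LANGUAGE DeriveFunctor #-}
    import Control.Monad.Free

    data DrawCmd next = Line Double Double next
                      | SetColour Int next
        deriving Functor

    line :: Double -> Double -> Free DrawCmd ()
    line x y = liftF (Line x y ())

    setColour :: Int -> Free DrawCmd ()
    setColour c = liftF (SetColour c ())

    -- do-notation now builds a command tree for an interpreter to walk:
    picture :: Free DrawCmd ()
    picture = do
        setColour 0xFF0000
        line 0 0
        line 10 10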

Difference Lists

To aid stroking & clipping in vector graphics Rasterific uses a “difference list” collection type.

Difference lists (DList) are an abstraction around functions that take a linkedlist to append to the linkedlist it returns, thus making concatenation instantaneous thanks to Haskell’s laziness. And nonempty-difference lists (DNonEmpty) wrap that type, extracting its head into a separate field to ensure it’s always present.

Haskell’s a “lazy” language; that means it puts off computation until it knows the user actually wants an answer! When you declare a variable (rather, constant) it’ll save not the computed value but rather a closure representing how to compute it.

In the case of difflists this amounts to storing the tail to concatenate next to the head, or if GHC can optimize your program well enough, a linkedlist with both head & tail pointers!
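
The whole trick fits in a few lines (a hand-rolled sketch; the dlist hackage wraps the same idea in a newtype with typeclass instances):

    type DList a = [a] -> [a]

    fromList :: [a] -> DList a
    fromList = (++)

    append :: DList a -> DList a -> DList a
    append = (.)            -- O(1) concatenation!

    toList :: DList a -> [a]
    toList f = f []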

Primitive Monads

For writing the raw pixels to RAM, Rasterific pulls in a dependency “primitive” which extracts GHC’s guts & reassembles them into something presumably nicer. I don’t see that much to comment on unless I jump headfirst into the rabbithole.

Haskell tends to prefer working on linkedlists (unary-trees) rather than arrays, but if you need to write to a fixed-size canvas for it to be sent (indirectly) to the monitor… Something like this is needed!

Array Sorting

One of the most fundamental tasks in computing is to take a list of items & rearrange them (by some definition) into sorted order, for which there are several known “algorithms”. An extra wrinkle for Rasterific is that Haskell likes linkedlists (each item holds a reference to the next item) whereas for its purposes arrays (stored in contiguous memory) are more efficient. So Rasterific pulls in “vector-algorithms” which implements these upon “vectors”/arrays.


Going in alphabetical order the first sorting algorithm is American Flag! Which sorts character-by-character (or whatever) of the sorting key. It starts by allocating 2 arrays before entering a count-loop then “flag”-loop.

The countloop counts the number of entries under each char. The flagloop repeats for the maximum number of chars, insertion-sorting any slices <= 25 entries or performing separate accumulate/counting, permute/rearranging, & (with its own initial counts) recursion iterations.


Heapsort is usually half-done to maintain an efficient collection where it’s always cheap to retrieve the smallest (or largest) item.

In heapsort as implemented here, fastpaths are taken for lists of <= 4 items. “Heapifying” (organizing into a binary tree with parent vals < children) an array involves iterating up the tree (halve index) swapping the parent with the max-child val to reestablish this invariant.

Sorting a heap involves swapping each head with the tail reestablishing invariants each time.


InsertionSort (which is what I used in designing hypothetical hardware to implement webbrowsers upon) involves iterating over each item & comparing it to every item in the already-sorted tail to see where it fits.
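
Over plain lists that reads as follows (vector-algorithms works on mutable vectors instead, but the idea is identical):

    insertionSort :: Ord a => [a] -> [a]
    insertionSort = foldr insert []
      where
        insert x []       = [x]         -- an empty list is sorted
        insert x (y : ys)               -- walk until x fits
            | x <= y    = x : y : ys
            | otherwise = y : insert x ys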

Videos visualizing the algorithms discussed so far:


Introspective Sort is a QuickSort variant combined with HeapSort when recursion gets too deep & Insertion Sort for postprocessing. Leaving small slices unsorted for said postprocessing to catch.

QuickSort selects a “pivot” value (median here) to rearrange the array into slices of values greater & less than the pivot, both to be sorted via recursion (or here, possibly HeapSort).

Rasterific uses the above to sort the polygon-edges between which it should fill!


MergeSort is fast at combining sorted lists at the possible cost of allocating extra tempspace, though since lists of 1 item are trivially sorted… Here InsertionSort or (shared by other sorts) handcoded fastpaths are used for slices <= 25.

MergeSort splits its input in half & sorts each half (usually via recursion) so that it can iterate over both sorted lists simultaneously, copying the lower head to the front of the combined sorted list.
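
The classic linkedlist rendition (again a sketch; the real implementation merges between a vector & a temporary buffer):

    mergeSort :: Ord a => [a] -> [a]
    mergeSort []  = []
    mergeSort [x] = [x]
    mergeSort xs  = merge (mergeSort front) (mergeSort back)
      where
        (front, back) = splitAt (length xs `div` 2) xs
        -- Repeatedly copy the lower head to the front of the output.
        merge [] bs = bs
        merge as [] = as
        merge (a:as) (b:bs)
          | a <= b    = a : merge as (b:bs)
          | otherwise = b : merge (a:as) bs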


RadixSort (of which American Flag is a variant) allocates 2 tempspace arrays before repeatedly (swapping src & dest arrays each iteration) counting, computing running sums, & rearranging entries into the allocated buckets.

Since this doesn’t rely on comparisons it can have linear performance (as opposed to linearithmic), at the cost of memory.


There’s a module for finding appropriate insertion points via binary search or “galloping”.


TimSort is often what your standard library implements, having been originally designed for Python. Tim Peters did a lot to make Python perform decently, & he wrote The Zen of Python! This is a combined Merge/Insertion Sort optimized to take advantage of pre-sorted slices.

It uses Insertion Sort where presorted runs (either ascending or descending) aren’t long enough, or where the entire list isn’t long enough. Preexisting runs (after descending ones are possibly reversed) are repeatedly merged.

Text Rendering

When rendering SVGs you may want to embed text. To determine the vectors to fill/stroke to represent said text in the given OpenType font, Rasterific & Rasterific SVG use FontyFruity!

FontyFruity in turn mostly wraps “binary” (as described earlier for Juicy Pixels) configured to parse OpenType files into an AST FontyFruity defines. It then follows indirections to the specified table rows for more “binary” decoding.

As for retrieving curves from the file…


The 1st step is to iterate over the font-size-str triples, looking up each char in language-compatible charmaps & concatenating results. It’s a very naive algorithm; I’m sure it has internationalization issues. Hopefully they’d be willing to make an exception to their pure-Haskell implementation here…

The 2nd step is to iterate over those results to compute running-sum pixel positions whilst looking up & tweaking the curves to render.

This returns the repositioned curves read from the fontfile.
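
From the caller’s side that looks roughly like this (a hedged sketch; check Graphics.Text.TrueType for the exact signatures in your FontyFruity version, & the font path is of course illustrative):

    import Graphics.Text.TrueType (loadFontFile, getStringCurveAtPoint, PointSize(..))

    main :: IO ()
    main = do
        result <- loadFontFile "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf"
        case result of
            Left err -> putStrLn ("Failed to parse font: " ++ err)
            Right font ->
                -- 96dpi, starting at the origin, one font-size-str triple.
                print (getStringCurveAtPoint 96 (0, 0) [(font, PointSize 12, "Hello")])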

Atlas

When I upload the images Juicy Pixels decodes to the GPU in Haphaestus, it would be beneficial to combine multiple images into the same “texture”, thus making better use of GPU memory. For this I’ll use “atlas”, which appears to be a Haskell wrapper around some public domain C (which I’ll focus on today). That C wasn’t designed to be wrapped in language bindings, judging by all the Template Haskell magic they imported.


The main function takes a “context” consisting of a size, alignment, mode, heuristic, counted coordinates linkedlist, & a dedicated freelist; & an array of ID’d size (input) + position (output) “rectangles” with a flag for successful packing, internally reused to store the original index before temp-sorting by height-then-width.

For each non-empty rect it then attempts each valid x whilst iterating over positioned nodes, roughly hit-testing against positioned rects for the minimum y, returning that minimum y with or without considering wasted space.


After that main pass, if you’re willing to spend twice as much CPU-time to address edgecases, you may configure it to iterate over the positioned nodes again to find the least wasteful x-position within the selected gap.

On success it allocates a new node out of the linked-list to record the result in, and, with the aid of the main pass to select the insertion point, frees x-overlapping nodes.

If a DEBUG compiletime-flag is set it validates the sizes of the linkedlists.


These results are populated into the rectangles array.

After unsorting those rectangles back to their original indexes it reencodes pack failures, NORing all results together to indicate via the return value whether packing was fully successful.

There’s a constructor which configures default context properties & initializes the dedicated freelist from externally-allocated memory.

Atlas Haskell Bindings

The Tetris AI-like algorithm I described yesterday is written in C, whereas I want to call it from Haskell. These language bindings are written with the aid of lots of Haskell language extensions, roughly allowing C & Haskell code to be mixed in a single file.

There’s a couple of constructors which wrap the C “context” initializer & mallocForeignPtrBytes, defaulting to allow one image for each pixel the atlas is wide.


Calling the “pack rects” core logic involves temporarily allocating a results array, copying the input data into it partially via a given callback, &, depending on whether it was successful, iterating over & decoding results calling one given selector or another. This language binding has a couple of trivial wrappers.

A separate file handles datatype language bindings, mostly using GCC’s C preprocessor, but also providing some data for the main file to allow embedding slightly more complex C code there.

Indirect Dependencies

These graphics dependencies may use additional dependencies of their own, which (excluding the extensive typeclasses linear implements, since that code isn’t actually run) are documented here.

Software Transactional Memory

Software Transactional Memory refers to a memory model where you can apply database-style transactions to mutations of your in-RAM data. Haskell makes it relatively easy to implement this by allowing STM code to be constrained so that the only sideeffects are the STM mutations, allowing the code to be safely rerun upon failure. Though it ultimately calls down to the C runtime to enqueue (in a linkedlist) & apply these mutations (which I wrote about previously).


Today I’m studying the “stm” hackage which, on supporting versions of GHC, extends Haskell’s standard library’s IORef support. It is used by Typograffiti for managing its atlases, though Typograffiti could make better use of it (I found that to be a very intensive refactor I’m not comfortable with yet).

Otherwise it defines a TVar type itself, wrapping IORefs, and an STM monad within which to mutate them, tracking the concatenated callback to use for rollbacks/caught exceptions.
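
A minimal usage sketch of that public API:

    import Control.Concurrent (forkIO, threadDelay)
    import Control.Concurrent.STM
    import Control.Monad (replicateM_)

    main :: IO ()
    main = do
        counter <- newTVarIO (0 :: Int)
        let increment = atomically (modifyTVar' counter (+ 1))
        -- Two threads, each incrementing the shared counter 1000 times;
        -- STM guarantees no increments are lost.
        replicateM_ 2 (forkIO (replicateM_ 1000 increment))
        threadDelay 100000  -- crude wait, for illustration only
        readTVarIO counter >>= print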


There’s several abstractions around readTVar/writeTVar for mutating a TVar via a callback function.

TSem wraps integral-counter TVars.

TQueue wraps a pair of linkedlist TVars, moving the write linkedlist over to the read linkedlist once that read linkedlist’s been emptied. It also allows converting to a single non-TVar linkedlist.

TMVars wraps a Maybe within a TVar to allow exclusive access to the innermost value.


TChan merges this TVar concept with linkedlists with head & tail pointers.

TBQueue resembles TQueue but with configurable max lengths which are enforced. For this it pulls in the (legacy?) “nats” hackage, since superseded, wrapping the standardlib Integer type.

And TArray wraps an array of TVars, implementing the same typeclasses as other arrays. For this it pulls in a hackage to wrap Haskell’s FFI standard library with datatypes, IO-typeclasses, & optimization rules.

Arrays/”Vectors”

In discussing Software Transactional Memory the other day I mentioned an array implementation wrapping Haskell’s FFI libraries, though usually lazy linkedlists are used. GHC also supports its own “primitive” Array# type, which is wrapped by the “vector” hackage with slice indices & all the typical list functions & typeclasses.

Bounds checks appear to always be performed at this layer, at compiletime if not at runtime.

It also teaches GHC several optimization rules.


Array# is complemented by a MutableArray# type which doesn’t require duplicating the data to modify it, as long as you’re modifying it within a monad. Bounds checks on these mutable vectors decide when to enlarge their memory allocation.
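
In practice that mutation looks like (a small usage sketch):

    import Control.Monad.ST (runST)
    import qualified Data.Vector as V
    import qualified Data.Vector.Mutable as MV

    -- Build a vector of squares by mutating in ST, then freezing.
    squares :: Int -> V.Vector Int
    squares n = runST $ do
        mv <- MV.new n
        mapM_ (\i -> MV.write mv i (i * i)) [0 .. n - 1]
        V.freeze mv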

Both the mutable & immutable wrappers are given a new typeclass, with a special implementation for storing data implementing FFI typeclasses; everything boils down to (relatively, as indicated by the unsafe prefix) bounds-unchecked reads/writes.


The raw read & write functions are defined on a typeclass which may be implemented specially for primitive or FFI types, & via various relatively-manual implementations through C-style macros which apparently should cover everything else.

For immutable array transformations which can’t readily be implemented via slicing, it implements its own monadic lazy-linkedlist specifically to be converted back into Vectors via MVectors, aided by size hints & abstracted into an internal Bundle type.

Binary serialization & deserialization

Juicy Pixels’ first step in decoding basically any image format is to decode the binary data stored in a bytestring (I’ve previously discussed bytestrings) into an abstract syntax tree to decompress & convert into a relatively-common format. And vice versa for encoding, not that it does much compression at all.

Converting between the bytestring & format-specific abstract syntax trees is done using the “binary” hackage. Which resembles a (simplistic) parser combinator library like Parsec.


There’s a scanner which tells the caller when more data is required, wrapped with an implementation of the Monad & Alternative typeclasses for do syntactic-sugar. The caller is told whether or not the monadic parser succeeded, or if it needs more input.

There’s utils for manipulating these datastructures, & further wrappers which decode fixed-bitsize numbers or abstract reading in files, lazy bytestrings, or whatever.


To encode data, on the other hand, “binary” uses a double-layered yet trivial monadic wrapper around “bytestring”’s Builder abstraction.

These Get & Put wrappers are pulled together into a Binary typeclass (and some utils around it), which is implemented for several Standard Prelude types, and can be automatically derived via the DeriveGeneric language extension (or you could use a script that outputs the sourcecode). Though Juicy Pixels manually implements it.
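
A hedged sketch of those Get & Put wrappers in use, decoding a hypothetical 2-field big-endian header (the Header type here is mine, not Juicy Pixels’):

    import Data.Binary.Get (Get, getWord32be)
    import Data.Binary.Put (Put, putWord32be)
    import Data.Word (Word32)

    data Header = Header { magic :: Word32, payloadLen :: Word32 }

    getHeader :: Get Header
    getHeader = Header <$> getWord32be <*> getWord32be

    putHeader :: Header -> Put
    putHeader (Header m l) = putWord32be m >> putWord32be l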

Inline C in Haskell

For whatever reason “atlas” in turn utilizes “inline-c” to mix C & Haskell code to reuse a trivial public-domain C library. Here I’ll describe how the “inline-c” hackage allows mixing Haskell & C code in the same file, with the aid of numerous GHC language extensions.

This serves to generate a C library to compile & link into your Haskell code, without exposing any gory details.


Beneath the public API (which also includes Ptr utils) there’s: a Context storing a types table, “antiquoter” parsers, an optional output-postprocessor callback, the output language, & whether to support C++; a global MVar around this plus a counter for identifier allocation & C code, with a Template Haskell-registered callback to output to Template Haskell for compilation on exit; & functions which Parsec-parse Template Haskell-provided input &, via a given callback, enqueue C code to be written whilst generating AST to import & call it.


An alternative callback to the core logic captures function pointers.

Further beneath that there’s an approximate C(++) parser; a datamodel representing the C type system with utils & parsers; functions to convert between arbitrary C & Haskell types; & utils to mangle Haskell identifiers to C identifiers.
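
The canonical flavour of usage (a sketch adapted from inline-c’s own documentation, not atlas’s code):

    {-# LANGUAGE QuasiQuotes #-}
    {-# LANGUAGE TemplateHaskell #-}
    import qualified Language.C.Inline as C
    import Foreign.C.Types (CDouble)

    C.include "<math.h>"

    main :: IO ()
    main = do
        let x = 2.0 :: CDouble
        -- Antiquotation splices the Haskell x into the embedded C expression.
        y <- [C.exp| double { cos($(double x)) } |]
        print y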

ZLib Bindings

I’m using ZLib indirectly in Rhapsode/Haphaestus/etc via http-client & Juicy Pixels.

Behind its public API it has records with default constructors representing compression & decompression parameters. And a datastructure facilitating coroutines yielding or requesting data before ending with trailing data or an error.

Coroutines for compression vs decompression are separate types, and both have “fold” functions for converting into an output type via given callbacks.

Upon this it wraps a “stream” API switching between a state where it fills the input buffer or enlarges the output buffer as-needed & a state where it compresses (“deflates”) from the input buffer to the output buffer. And a similar function for decompression/“inflation”.

And further wrappers to run these in an IO or ST monad.

Beneath that, its “Streams” API mutates ZLib’s (not GZip? Was I confused? Yes) C API via a dedicated monad. And of course there’s lightweight wrappers around several API functions FFI-imported from ZLib.
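
Callers like Juicy Pixels mostly see the high-level lazy-bytestring API, along the lines of:

    import qualified Codec.Compression.Zlib as Zlib
    import qualified Data.ByteString.Lazy.Char8 as L

    main :: IO ()
    main = do
        let original   = L.pack "hello hello hello hello"
            compressed = Zlib.compress original
        print (Zlib.decompress compressed == original)  -- True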

“Lens” Accesses & Traversal

Haskell can automatically generate getter functions for you via the “record” syntax, though the corresponding setter syntactic-sugar can leave things to be desired, like when parsing CSS properties into record fields.

These fieldnames are amongst the first things GHC lowers.

To improve upon these accessors (via the Template Haskell language extension) we have the Lens framework, support for which is implemented by Linear, SVG-Tree, & Rasterific.
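
To illustrate the point (the Style record here is hypothetical, not from any of those hackages):

    {-# LANGUAGE TemplateHaskell #-}
    import Control.Lens

    data Style = Style { _color :: String, _fontSize :: Int }
    makeLenses ''Style  -- Template Haskell generates color & fontSize lenses

    -- Plain record-update sugar vs the lens equivalent:
    bigger :: Style -> Style
    bigger s = s { _fontSize = _fontSize s + 2 }

    biggerLens :: Style -> Style
    biggerLens = fontSize +~ 2  -- the same, but composable with other optics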


Reading over this hackage one file at a time: Control.Lens.At implements typeclasses for checking whether an index is in a collection (+ converting from arrays), & another 2 (with different return types) for looking up a maybe-value at that index (sets hold ()); these are implemented for several common types from the Prelude & elsewhere.

Control.Lens.Cons implements operators & pattern-matching syntactic sugar wrapping 2 typeclasses for appending/prepending items to various list implementations.

Control.Lens.Each provides a typeclass which mutates every item in a collection, supporting numerous collection types built-in.

Control.Lens.Empty provides a typeclass & wrapping pattern-matching syntactic-sugar to test whether a collection is empty.

Control.Lens.Equality does some typesystem magic to assert that given types are equivalent.

Control.Lens.Fold provides numerous common utilities around standard collection typeclasses to integrate Lens’ types.

Control.Lens.Indexed makes some typeclasses public & adds wrapping utility functions.

Control.Lens.Iso uses some tricks to implement function isomorphism so Lens works better with other libraries which don’t explicitly target it.

Control.Lens.Level provides utils for breadth-first traversal over tree-like structures.

Control.Lens.Plate provides a typeclass for monadically-iterating over a collection without returning a mutated version, with several wrapper functions.

Control.Lens.Prism makes a Prism typeclass public & (alongside other submodules) provides conversion & other utility functions wrapping it. Adds Prefixed & Suffixed typeclasses for various text types.

Control.Lens.Profunctor provides an OpticP synonym for conversion functions from one collection to another, and several conversion functions.

Control.Lens.Reified provides wrapper types so it can implement numerous typeclasses on them.

Control.Lens.Review is a more constrained prism variant.

Control.Lens.Traversal provides several more type synonyms so it can provide trivial wrapper functions & typeclass implementations.

Control.Lens.Unsound provides heavily requested utility functions that are hard to typecheck.

Control.Lens.Wrapped does some typesystem magic I’m not following.

And Control.Lens.Zoom provides typeclasses for converting between more & less specific types.


In the core type-declarations module an accessor (a “Lens”) is a synonym for a function which takes a single-arg “functor” callback & returns another; the source & target types may or may not be the same. There’s variants compatible with indexed traversal.
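
For reference, that synonym (as I read it from the lens source; it requires the RankNTypes extension):

    type Lens s t a b = forall f. Functor f => (a -> f b) -> s -> f t

    -- The common case where the types stay the same:
    type Lens' s a = Lens s s a a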

Equivalent to Lenses there’s “applicative” traversals. Functors & Applicatives are builtin typeclasses, many Haskell devs get really into these but I advise newbies to not worry about them. I don’t.

Same for setters, but their return type implements Lens’s own Settable typeclass. Getters implement Contravariant & Functor.

Isomorphism (profunctor transform holding functors), review (optic-variants), & equality types are defined here. Prisms are choice transforms holding applicatives.

Folds are akin to lenses, getters, & setters but their return type implements Contravariant & Applicative.

There’s a typesynonym to simplify generic declarations. And there’s more generic types like “optics” which don’t require any particular typeclasses to be implemented.


The Control.Lens.Getter module provides various utility functions for combining setters & getters to mutate fields as easily as you can in imperative programming, whether or not they’re indexed.

Control.Lens.Getter also abstracts monadic types in ways useful to Lens.

Control.Lens.Setter defines several more type synonyms, & numerous utility functions & custom operators around setters to make them comfy to use. I’m not entirely clear on the more specific intention behind this code; it’s quite magical, leaning heavily on the typesystem.

Some allow combining getters & setters, including variants of the numeric operators.

Control.Lens.Lens looks like more of the same.

Defining accessors for all fields of a tuple between 1 & 19 fields-long takes some special effort since each different length of tuple is treated as a unique (generic) type by GHC. So typeclasses are used so the same functionnames can be used on the different tuple lengths. Non-lazy variants of these accessors are defined.
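
E.g. the real _1 & _2 accessors in action (the example functions are my own):

    import Control.Lens

    swapInFirst :: (String, Int)
    swapInFirst = ("hello", 42) & _1 .~ "world"   -- yields ("world", 42)

    grabSecond :: Int
    grabSecond = ("hello", 42) ^. _2              -- yields 42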

And finally there’s compiletime code run via Template Haskell (similarly to a JIT) to compile a referenced Record declaration into AST for nicer accessors.

There’s several variants of the core logic in the public API, some differing (e.g. adding syntactic preprocessing) more than others. Accessors for the intermediate recordtypes are defined manually.

The logic dealing directly with generating Lens-accessors for the fields is implemented elsewhere, though it has several utils for AST manipulation.

After gathering/normalizing the fields, this corelogic generates the accessors, with or without wrapping them in a typeclass instance, etc, by populating the appropriate template.


For the internal modules…

There’s Bazaar types whose purpose I don’t understand.

There’s ByteString traversal utils, including conversions to various set types with compiler optimizations.

There’s various Indexed... typeclasses implemented by a new Context type wrapping a 1-arg function & arbitrary value, and by a Pretext type wrapping a 1-arg function with a more complex typesignature.

There’s a double ended queue implementation (following Banker’s Deque pure-functional algorithms).

There’s an exceptions system which can be used with prisms (yes, there’s a public module for this too…), defining a typeclass for handling those exceptions.

There’s internal modules for generating Haskell AST for record-fields, prisms, & more (all separate modules) as directed by the Template Haskell APIs.

There’s several internal types (Folding, Traversed, TraversedF, Sequenced, NonEmptyDList) wrapping a generic type with typeclass implementations. In the same module are implemented Leftmost & Rightmost types, which roughly resemble linkedlists except the tail might also hold a value.

The internal Getter module defines an AlongsideLeft type wrapping a generic type around a generic pair, with typeclass implementations.

There’s an internal Conjoined typeclass & Indexable subclass with Indexed & Indexing[64] typesynonyms; with various typeclass implementations.

There’s an Exchange type wrapping 2 single-arg functions, & Reversing typeclass with reversing :: t -> t method, both with various typeclass implementations.

There’s Level & Deepening datatypes aiding tree traversals.

There’s a handful of list utils.

There’s Magma, Molten, Mafic, & TakingWhile types for converting traversal operations into computable data.

There’s an internal Market type wrapping two generic single-arg functions one of which returns either one of 2 types, with typeclass implementations, used for implementing Prisms.

There’s a WrappedPafb type wrapping a generic type which takes another generic type; it implements Profunctor & Choice.

There’s a Reviewable subclass of Profunctor & Bifunctor.

There’s a Settable typeclass implemented by accessors via Template Haskell.

And for the sake of Zooming there’s Focusing & FocusingWith wrapping generic types taking tuples (2 or 3 items respectively). There’s FocusingPlus whose wrappees also take another generic type, & FocusingOn which takes a generic type in place of a tuple. May wraps Maybe & has a corresponding FocusingMay type, as per Err with Either & Freed with free monads. Effect wraps a generic type. EffectRWS wraps a function returning a generic type taking a triple. All of these are defined in the submodule & implement a handful of typeclasses each.


Lens implements its typeclasses for: