Wayland is the protocol by which apps on your freedesktop communicate with your window manager for their main I/O. Weston was its reference implementation, but I believe they’ve decided that title is now meaningless.
To be clear, nothing I’ll be describing here is special to Wayland (beyond not requiring a separate daemon from your window manager), as X11 had implemented these features before Wayland started superseding it.
Core Protocol
At its core Wayland is a rudimentary RPC protocol over a UNIX domain socket, without any concept of return values (thereby encouraging efficient/non-blocking usage). This UNIX domain socket is specified by environment variables.
The first (32bit) word of a message identifies the receiving object, the next 16 bits identify the message’s size, and the other 16 bits identify the method/event.
The rest of the message is decoded according to a mnemonic type signature for the method/event.
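As a rough illustration of that header layout, here is a minimal Python sketch. The function names are hypothetical, and it assumes (from my reading of the wire format, not from the text above) that the size occupies the upper half of the second word, counts the 8-byte header, and that words are in host byte order.

```python
import struct

def encode_message(object_id, opcode, payload=b""):
    """Pack the two header words, then append an already-encoded payload."""
    if len(payload) % 4:
        raise ValueError("payload must be padded to whole 32-bit words")
    size = 8 + len(payload)  # assumption: the size field includes the header
    return struct.pack("=II", object_id, (size << 16) | opcode) + payload

def decode_header(data):
    """Return (object_id, size, opcode) from the first two words."""
    object_id, word = struct.unpack_from("=II", data)
    return object_id, word >> 16, word & 0xFFFF

# One 32-bit argument sent to method/event 1 on object 3 (arbitrary numbers).
message = encode_message(3, 1, struct.pack("=I", 42))
header = decode_header(message)
print(header)
```

The payload itself would still need decoding according to the type signature for that method/event.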
In the reference (C) parser this is implemented around four ring buffers feeding calls to sendmsg/recvmsg, so it can also send/receive auxiliary file descriptors (for cross-app IPC or efficient image sharing).
These messages are decoded to/from Closure objects (mostly by directly copying the words over) according to a lightly-parsed type signature string, and in turn (maybe later) variadic function arguments (on the receive side this is done via libffi).
This code is mostly in connection.c.
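To show what sending an auxiliary file descriptor alongside a message looks like, here is a small sketch using Python's wrappers around sendmsg/recvmsg (socket.send_fds and socket.recv_fds, available from Python 3.9 on Unix). The socket pair and the pipe are stand-ins of my own choosing, not anything libwayland does.

```python
import os
import socket

# A connected pair of UNIX domain sockets standing in for client & server.
client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Create a pipe and hand its read end across the socket as ancillary data.
read_end, write_end = os.pipe()
socket.send_fds(client, [b"fd attached"], [read_end])

# The receiver gets the message bytes plus a new descriptor for the same pipe.
data, fds, _flags, _addr = socket.recv_fds(server, 1024, 1)
received_fd = fds[0]

# Whatever is written into the pipe can be read via the received descriptor.
os.write(write_end, b"pixels")
shared = os.read(received_fd, 6)
print(data, shared)

for fd in (read_end, write_end, received_fd):
    os.close(fd)
client.close()
server.close()
```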
Of particular interest are the ‘o’bject & ‘n’ew types, which (alongside ‘s’tring, ‘a’rray, & ‘h’ file) may be augmented with ‘?’ to turn off null checks. ‘?’ is the sole reason for parsing type signature strings. A second type table indicates the expected interface for the object to implement.
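A lightly-parsed signature string could be handled roughly as follows. This is a sketch rather than libwayland's code: parse_signature is a hypothetical name, the type letters other than those named above ('i', 'u', 'f') are my assumption, and so is the leading version number that I recall generated signatures may carry.

```python
ARG_TYPES = set("iufsonah")  # 's', 'o', 'n', 'a', 'h' come from the text above

def parse_signature(signature):
    """Return a list of (type_char, nullable) pairs for one method/event."""
    args = []
    nullable = False
    # Assumption: skip an optional leading "since version" number.
    for ch in signature.lstrip("0123456789"):
        if ch == "?":
            nullable = True  # applies to the next argument only
        elif ch in ARG_TYPES:
            args.append((ch, nullable))
            nullable = False
        else:
            raise ValueError("unknown type character: %r" % ch)
    return args

parsed = parse_signature("2u?os")
print(parsed)
```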
The object IDs sent over the UNIX socket are mapped to objects in separate arrays for client- and server-allocated objects, with bitflags (free & zombie/legacy) in the lower 2 bits of the stored pointers.
These objects in turn hold (in addition to application-specific data) their ID to be sent over the socket, a method table to call via libffi, & an “interface” describing its name, version, and methods/events, in turn with their names, type signatures, & expected interfaces for any objects.
The map is a free list to use for allocating new object IDs.
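The idea of an ID map doubling as a free list can be sketched like this. It is a simplification under my own assumptions: the IdMap class is hypothetical, and where the C code tags the low bits of stored pointers, this version uses tagged tuples since Python has no raw pointers.

```python
class IdMap:
    """Each slot holds either ("used", object) or ("free", next_free_index)."""

    def __init__(self):
        self.slots = []
        self.free_head = None  # head of the free list threaded through slots

    def insert(self, obj):
        if self.free_head is not None:
            index = self.free_head              # reuse a released ID first
            self.free_head = self.slots[index][1]
        else:
            index = len(self.slots)             # otherwise grow the array
            self.slots.append(None)
        self.slots[index] = ("used", obj)
        return index

    def remove(self, index):
        # The freed slot remembers the previous head of the free list.
        self.slots[index] = ("free", self.free_head)
        self.free_head = index

    def lookup(self, index):
        kind, value = self.slots[index]
        return value if kind == "used" else None

ids = IdMap()
first = ids.insert("surface")
second = ids.insert("buffer")
ids.remove(first)
reused = ids.insert("callback")
print(first, second, reused)
```

Reusing freed IDs keeps the array dense, which is what makes a plain array lookup viable for every incoming message.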
Public API Generation
Wayland has a standard API exposed over the RPC protocol described above. This is specified in an XML format, which can be parsed (via Expat), verified (with the optional help of libxml & a DTD), and compiled to a public well-documented C API around the standard parser for both the client & server.
The compiled C code includes:
- Types by which to refer to objects of given interfaces.
- Any enumerations specified by the XML input (which requires each “entry” to have an explicit integer encoding). These are not type-checked against at either runtime or compile time.
- A subset of the parsed data (in a corresponding .c file) for use by the de/en-coding logic, with the expected “interfaces” of any objects being compacted into a single array from which slices are taken.
- Method table structures to which decoded incoming messages (“methods” for servers, “signals” for clients) should be dispatched via libffi, stored as type void* internally.
- Integer opcodes for each method or event, as C macros.
- Interface version numbers, as C macros.
- Relatively type-safe inlined functions wrapping an internal variadic function to write outgoing calls to the UNIX domain socket.
The client-side header file also includes type-safe inlined functions around internal functions which:
- Get or set the untyped user_data property.
- Get the version property.
- Set the method table for a given object.
- Within a lock & unless marked as a “wrapper”: remove an object from an object ID map, flag it, decrement a refcount, & maybe free its memory.
The clientside routines for sending a method call to the server can allocate a single object, inserting it into both the object ID map & arguments array to that method call. More logic is needed in this code generator to expose this functionality through its type-safe wrappers.
Error handling
On either the client or server debugging messages can be written to stderr if an environment variable is set, though on the server “protocol loggers” can be registered into a linked-list.
On the server actual errors are usually reported to the client which triggered them via the wl_display (#1).error event. And the clientside will usually store aside these or other errors in wl_display properties.
Object ID de-allocation
On the clientside there’s two parts to this: When the server indicates it has freed an object via the wl_display (#1).delete_id event, it is looked up in the object ID map to see if it’s a zombie. If it is, it’s freed & removed; if it isn’t, it’s flagged as DELETED; & if it’s absent a warning is logged.
When the client destroys an object (typically as part of a “destructor” method call): if it’s flagged as DELETED it’s removed from the object ID map, if it’s client-allocated that object’s marked as a zombie, and if it’s server-allocated it’s NULLed. After which the object’s refcount is decremented and it is possibly freed.
When the server receives that method, the caller of the wayland-server library calls wl_resource_destroy to deallocate it (in part via a caller-provided destructor), remove it from the object ID map associated with the client, and send the client a signal that it has done so.
Connection setup/teardown
The UNIX domain socket to open (flagged close-on-exec) is specified by environment variables, and is initialized alongside a wl_display object to hold it. On the server this socket is added to a simple epoll() mainloop alongside any client connections received via accept(), so it can handle multiple clients. From there the connection between the server & a client is wrapped in a wl_connection structure as described above, and the wl_display is given object ID 1.
Tearing down the connections on either the client or server involves closing all associated (in-transit) file descriptors & freeing all associated memory.
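For the environment-variable lookup mentioned in this section, a client-side sketch might look as follows. The variable names WAYLAND_DISPLAY and XDG_RUNTIME_DIR and the "wayland-0" default are what I understand libwayland-client to use, but treat them (and the hypothetical wayland_socket_path helper) as assumptions rather than a transcription of its code.

```python
import os

def wayland_socket_path(env):
    """Work out which UNIX domain socket a client would connect to."""
    name = env.get("WAYLAND_DISPLAY", "wayland-0")
    if os.path.isabs(name):
        return name  # an absolute path is used as-is
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    if not runtime_dir:
        raise RuntimeError("XDG_RUNTIME_DIR is not set")
    return os.path.join(runtime_dir, name)

default_path = wayland_socket_path({"XDG_RUNTIME_DIR": "/run/user/1000"})
custom_path = wayland_socket_path({"WAYLAND_DISPLAY": "/tmp/my-compositor"})
print(default_path, custom_path)
```

Passing the environment in as a dictionary (os.environ works too) keeps the lookup easy to exercise without touching the real session.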
Clientside event features
- Allocation & refcounting of incoming objects.
- Autoclosing received file descriptors for zombie objects, for which a special replacement object might be allocated upon object de-allocation.
- Parsed events are queued on a linked-list associated with the object (inherited from the object which constructed it) before dispatching/invoking them.
- There’s a readlock, including an atomic condition, count, and surrounding mutex.
- Uses the poll() syscall to determine when’s a good time to read or write.
Special “wrapper” objects can be allocated to cause different queues to be inherited by new objects, and this is used by a blocking wrapper around the wl_display.sync method.
Serverside signals
The server library may notify its callers whenever a client, wl_display, or other object is created or destroyed using linked-list “signals”.
Serverside “global” objects
Global objects can be added to and removed from a linked list, to be immediately advertised to all existing wl_registry objects, and immediately upon creation to any new wl_registry objects created by the wl_display object.
There’s also an implementation of the wl_shm/wl_shm_pool global included with libwayland-server (but in a separate .c file), for wrapping mmap()-related syscalls and constructing wl_shm_buffer objects. This allows clients to write pixel data directly into shared memory without burdening the UNIX domain socket. The supported pixel formats are specified by the caller, and advertised via the wl_shm.format event.
Kernel-side
The socket() system call dynamically dispatches to an initializer based on the specified type, wrapping the result with a filedescriptor that dynamically dispatches system calls (whether they’re socket-specific or not) to the socket’s dispatch table. PF_UNIX + SOCK_STREAM sockets wrap a linked-queue to/from which a userspace/process-specific cmsg structure is converted, reading/writing directly into Wayland’s ringbuffers. The server sockets accept()ing new client connections are listed in a locked closed hashmap.
Memory Mapping
Linux manages a process’s “virtual memory” (the process by which a program is given the illusion it’s alone on the CPU) in an “Interval Tree”, which is basically just a Red-Black Tree. Using this Linux provides syscalls by which programs can “map” a file into memory, possibly sharing it across multiple processes, which Wayland finds useful for zero-copy transferring of image data from apps to the window manager. Wayland coordinates this via the wl_shm global.
NOTE The real performance win here comes from passing these files to be shared over the UNIX domain socket described above. MMap is just a useful way to handle it.
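To make the sharing concrete, here is a minimal sketch in which two mappings of one file stand in for the app & the window manager. It uses a temporary file and a single process purely for illustration; a real client would pass the file descriptor over the socket, and the 4-byte "pixel" layout is an arbitrary assumption.

```python
import mmap
import tempfile

SIZE = 4096  # one page is plenty for a demonstration

with tempfile.TemporaryFile() as backing:
    backing.truncate(SIZE)  # a mapping needs the file to have that length

    # Two independent shared mappings of the same underlying file.
    client_view = mmap.mmap(backing.fileno(), SIZE)
    server_view = mmap.mmap(backing.fileno(), SIZE)

    # The "client" writes one pixel; no read()/write() or socket is involved.
    client_view[0:4] = bytes([255, 0, 0, 255])

    # The "server" sees the bytes through its own mapping.
    seen = bytes(server_view[0:4])

    client_view.close()
    server_view.close()

print(seen)
```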
The mmap() syscall (exposed by Wayland’s wl_shm.create_pool() RPC method) optionally allocates (in part via a configured callback) a target address, verifies that address & other input, updates the interval tree, & most importantly calls the specified file’s .mmap() method with a new Interval Tree node which it’ll then insert. Other Linux subsystems may also impact mmap() if enabled.
The mremap() syscall (exposed by Wayland’s wl_shm_pool.resize() RPC method) resizes or (if permitted by the caller) moves the specified Interval Tree node, which may also involve CPU and/or file specific logic.
The munmap() syscall (exposed by Wayland’s wl_shm_pool.destroy() RPC method, and possibly called by the above two syscalls) calls CPU-specific logic to remove that mapping from the hardware tables, after which it makes sure the Interval Tree nodes are removed & all necessary subsystems are reloaded.
Weston window manager
This project includes several sample clients, a couple sample window managers, a common library to do most of the logic (which I’m covering here), and a second one for supporting desktop-like windows.
Combining a doubly-linked-tree/linked-list of Wayland buffers (amongst several other things) with a dispatch table for outputs & multiplexing “launchers” with routines for feeding input events into Weston (where subsystems may temporarily redirect a device’s input through a dispatch table), gives us a convenient base on which to build the rest of Wayland’s standard RPC API!
Rendering that buffer tree involves:
- Flattening the tree (it’s usually very flat).
- Some “content protection” logic I don’t like…
- Planes are assigned to the views.
- Aggregate post-rendering callbacks.
- Aggregate the damage.
- Update the output transformation matrix.
- Tell the output to repaint.
- Updates the pointer grabs.
- Updates the frametime & calls any client or animation callbacks.
Underlying this base we can swap/mix in our own choice of output targets (o), input sources (i), compositors (c), & (for controlling who has access to the screen) launchers (l):
- Linux’s Direct Rendering Manager oc
- Framebuffers o + Pixman c
- Headless oci - Headless input or compositing requires exactly no code.
- Remote Desktop Protocol oi via FreeRDP - Seemingly to address people complaining that Wayland isn’t network-transparent
- Wayland oicl
- X11 oic
- OpenGL c
- Direct l
- logind l
- libinput/udev i
Input Device Discovery
When you plug in a new mouse, keyboard, etc your OS needs to figure it out and, if appropriate, inform your window manager about it. There’s a couple different components to this step.
udevd
UDev is a daemon for normalizing the “device files” Linux’s drivers expose to userspace. The kernel used to take responsibility for keeping this interface consistent, but found it was taking more-and-more effort so they split it out into a separate project managed by systemd.
At its heart UDev is a simple conditional “configuration” Domain-Specific & Dynamic language interpreted over file descriptors provided by Linux over an AF_NETLINK socket, in what to me looks like an overcomplicated API. After which it configures device files according to the data computed by that DSL, running any queued commands on that file once it’s in place.
Device files are received by a “manager” to be sent to a “worker” thread, both of which are event-driven.
That DSL is also conditional not unlike CSS (as I’ve implemented in Rhapsode), allowing you to say “if these tests match assign these outputs”.
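The “if these tests match assign these outputs” shape can be sketched in a few lines. This is a toy model with a hypothetical apply_rules function and made-up rules; real UDev rules have more operators than plain equality, and the keys here are only illustrative.

```python
def apply_rules(rules, device):
    """Apply every rule whose tests all match, merging in its assignments."""
    result = dict(device)
    for tests, assignments in rules:
        if all(result.get(key) == value for key, value in tests.items()):
            result.update(assignments)
    return result

# Each rule pairs a set of tests with the properties to assign on a match.
rules = [
    ({"SUBSYSTEM": "input"}, {"GROUP": "input"}),
    ({"SUBSYSTEM": "input", "KERNEL": "event0"}, {"SYMLINK": "input/keyboard"}),
    ({"SUBSYSTEM": "block"}, {"GROUP": "disk"}),
]

device = {"SUBSYSTEM": "input", "KERNEL": "event0"}
configured = apply_rules(rules, device)
print(configured)
```

As with CSS, later matching rules can build upon or override what earlier ones assigned.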
There’s a directory-based database called “hwdb” in UDev, & one of its several built-in commands (in the exact same way “cd” is a built-in command), that configuration can refer to for this normalization.
There’s plenty of logging, some of which is specifically targeted to systemd.
libudev
Other daemons like the Weston window manager might then connect to the NETLINK socket just like udevd does to parse in any devices, attaching a Bloom-filter, register-based bytecode program to that socket to limit incoming messages. And they’ll probably traverse the existing device files using udevd’s intermediate “hwdb” database as an index.
libudev is a shallow wrapper around libsystemd/sd_device.
AF_NETLINK sockets
The kernel tracks a hashmap of which NETLINK sockets are connected to which sources, and when other kernel subsystems or sendmsg calls broadcast or unicast a message it’ll consider whether the specified NETLINK sockets are configured to receive or ignore that message. If they do wish to receive it the message is enqueued on their linked-list to be read into userspace.
NETLINK sockets are very limited in the syscalls they support.
Input Device Drivers (libinput)
The first step is to allocate a libinput/udev context including an epoll list & method tables, storing the given libudev context. One method table implements enable/disable/destroy/seat-changes for libudev, adding necessary files to the epoll list & using UDev configuration to assign seats & displays.
The other method table is provided by Weston to integrate the configured “launcher”.
Next Weston extracts the epoll list from libinput to add to Wayland-Server’s epoll()-based mainloop. When that’s triggered libinput runs what amounts to its own mainloop, and Weston iterates over libinput’s events ringbuffer, which it splits nicely into events arising ultimately from libudev or libevdev, & maps to Wayland RPC objects.
The dispatch of libevdev-originating events to the Wayland client is handled via a dispatch table (alongside the idle inhibitor, button bindings, & maybe positioning the surface for the mouse cursor), which by default traverses the surface list to determine which clients should receive the event. Though that default can be temporarily swapped out to implement gestures like drag&drop, and touch input requires more interpretation.
Key encodings (libxkb)
Here I’d like to study libxkb, which Weston uses to determine the encoding of the keyboard input - forwarding that info on to clients. This info is determined by the Weston compositor backend (or its configuration) either directly or by name.
If it’s by name (leaving aside some currently useless dynamic dispatch) libxkb allocs/inits an object to hold this info, fills in default rules, and then does the lookup. That actual logic works by first looking up the file in a configured location, splitting the parameters by comma, & interpreting that conditional DSL (merging the lex/parse/execute steps), after which it extracts the DSL’s output, validates, frees memory, & closes files. Following that is logic to parse & combine said response from the DSL.
So really not all that different from UDev?
After which Weston calls xkb_keymap_mod_get_index for libxkb to lookup the modifier keys Weston needs in a parallel array. The strings are deduped through a global hashmap (making them “atoms”) for fast equality. xkb_keymap_led_get_index works similarly.
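The atom trick, and the lookup of a name's position in a parallel array, can be sketched like so. The AtomTable class, the mod_get_index helper, and the three modifier names are hypothetical stand-ins; this is not libxkb's actual data layout.

```python
class AtomTable:
    """Dedupe strings so that equal names share one small integer."""

    def __init__(self):
        self._ids = {}
        self._names = []

    def intern(self, name):
        atom = self._ids.get(name)
        if atom is None:
            atom = len(self._names)
            self._ids[name] = atom
            self._names.append(name)
        return atom

    def lookup(self, name):
        return self._ids.get(name)  # None if the name was never interned

atoms = AtomTable()
# A keymap's modifiers, stored as atoms in index order.
mod_atoms = [atoms.intern(name) for name in ("Shift", "Lock", "Control")]

def mod_get_index(name):
    atom = atoms.lookup(name)
    if atom is None or atom not in mod_atoms:
        return None
    return mod_atoms.index(atom)  # comparing integers, not strings

print(mod_get_index("Control"), mod_get_index("Hyper"))
```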
And serializing it for clients to parse naturally involves traversing the object tree with error-handled calls to vsnprintf().
libevdev
Here I’ll describe the userspace drivers for input devices like mice, keyboards, touchscreens, etc as implemented in libevdev. Starting with the evdev_device_create constructor called by libinput when libudev discovers a new supported device.
First it checks with libudev to do some final filtering (including a libinput-specific flag) and opens the device file via a Weston-provided callback from its launcher backend.
From there libevdev checks it got the device file it expected, then allocs/inits memory for it (as part of which it calls the real libevdev library, I thought it was embedded in libinput but no), reading necessary information from libudev’s parsing & linearly scanning a libinput-specific flatfile database for “device quirks” to apply to libevdev. It then discards all currently-pending events, calls the ioctl to set the clock ID, integrates libinput’s logging into libevdev, & consults the “tags” libudev looked up (based on the configuration files provided by libinput) to determine which specific constructor & “pointer accelerator” it should call. I’ll have to push those off to the rest of the week.
Then it adds it to the mainloop & UDev-specified “group” & “seat”, & notifies any callers that a new device has been added.
When event(s) come in on that device file, libevdev first tries reading events from a ringbuffer, normalizing them & tracking device state. Once that ringbuffer is emptied it’ll read in more to that ringbuffer from the device file, returning the first event it read in live to be dispatched to libinput’s userspace driver.
So it appears that libinput’s “evdev” wrapper provides the userspace wrappers, and its underlying “libevdev” knows how to read input from the kernel drivers.
Userspace Drivers (Dell Totem Canvas)
This starts by validating that the kernel driver exports the necessary information, before alloc/initing the driver, allocing the “slots” based on libevdev’s count/selection, & setting up dispatch tables including a “filter accelerator” for fine-tuning cursor movement (with an extra layer of indirection).
There’s also a dispatch table for pausing (closing)/resuming (reopening) the driver. The driver’s own separate method table (called throughout the libudev/libevdev lifecycles) includes:
- Handling for libevdev events; specifically ABS (updates a slot), KEY (updates the button_state_now property), MSC (ignored), & SYN (retrieves info to trigger higher-level events, before normalizing coordinates & calling other methods).
- Handling suspends similarly to libevdev SYN events.
- No special handling for removal.
- Frees its slots’ & own memory on .destroy().
- On added or suspended verifies it’s the same device, and possibly wraps it if it doesn’t already wrap a touch device.
- On removed or resumed it disables the touch device, possibly normalizing coordinates & calling other methods in the process as per SYN event or suspends.
- After being added it initializes the state of each slot triggering “proximity” & “tip” events, before normalizing coordinates & calling other methods.
- On touch arbitration toggle/update-rect (which are those events) it does nothing, as per retrieving the “switch state”.
Other drivers are minor variations upon this one, with the main difference being the datamodel. And touchpads have a number of gesture recognizers converting certain input sequences into events other devices might trigger.
But most devices (including all keyboards, touchscreens, & mice) share a userspace driver that just normalizes the input events without interpreting them.
Kernel-space Drivers (power button)
To start describing how the kernelspace input drivers work as implemented in Linux, I’ll start with the “apm-power” driver, which I found very quickly (after realizing, with help, the presence of /drivers/input/ & /drivers/hid/), and which serves as an excellent Hello World-style example. It just receives shutdown signals and queues them up in another kernel subsystem.
When this kernel module initializes & deinitializes it starts by passing a dispatch table to input_(un)register_handler.
input_register_handler starts by taking a lock & initializing a list property on that “handler”, before adding it to an input_handler_list global, calling its .connect() method on all matching devices from a list of devices that still need to be matched, & updating procfs.
The matching first consults a bustype/vendor/product/version/bitflags ID table on the method table, before calling its .match() method.
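That two-stage matching can be modelled roughly as below. The MATCH_* flag values, dictionary layout, and function names are hypothetical simplifications of the kernel's ID table; the EV_* numbers are the ones I know from the Linux input event codes.

```python
# Hypothetical flag bits saying which fields of an ID table entry to compare.
MATCH_BUS, MATCH_VENDOR, MATCH_EVBIT = 1, 2, 4

# Event type numbers as defined by the Linux input event codes.
EV_KEY, EV_REL, EV_PWR = 0x01, 0x02, 0x16

def entry_matches(entry, device):
    flags = entry["flags"]
    if flags & MATCH_BUS and entry["bustype"] != device["bustype"]:
        return False
    if flags & MATCH_VENDOR and entry["vendor"] != device["vendor"]:
        return False
    # Every event bit the entry asks for must be exposed by the device.
    if flags & MATCH_EVBIT and entry["evbit"] & ~device["evbit"]:
        return False
    return True

def find_match(id_table, device, match=None):
    """Consult the ID table first, then the optional .match() callback."""
    for entry in id_table:
        if entry_matches(entry, device) and (match is None or match(device)):
            return entry
    return None

# A handler interested in anything exposing EV_PWR, like "apm-power".
power_ids = [{"flags": MATCH_EVBIT, "evbit": 1 << EV_PWR}]
button = {"bustype": 0x19, "vendor": 0, "evbit": (1 << EV_KEY) | (1 << EV_PWR)}
mouse = {"bustype": 0x03, "vendor": 0x46D, "evbit": (1 << EV_KEY) | (1 << EV_REL)}

print(find_match(power_ids, button) is not None, find_match(power_ids, mouse))
```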
That first global list will be consulted for new devices, after adding it to the second.
The function for initializing new input devices & looking up their drivers is called by numerous other device drivers (I’m seeing the count of 365 files excluding /arch/).
Whilst the input_unregister_handler function calls its .disconnect() method for each previously connected device (the list of which was initialized in input_register_handler) before removing itself from the global linked-list of drivers & updating procfs.
The “apm-power” driver meanwhile registers to handle any device that exposes an EV_PWR event (as determined by a bitflag).
Upon .connect() it allocs/inits a new “handle” to hold this method table & device, before registering the handle & opening the device. To register the handle it adds it to the device’s handles linked-list under a lock (“filters” at front, others at back), before adding it to the handler’s list & calling the (unimplemented) .start() method.
To open the device it increments a refcount & calls the device’s .open() and/or repeatedly .poller->poll() methods.
On disconnect the driver closes the device (by calling the .start() method for each connected handler, calling .close() on the device, no longer calling .poller->poll(), & decrementing refcounts), unregisters the handle (by removing it from its linked lists), & frees its memory.
The .event() method checks which event it’s received & forwards it to apm_queue_event.
Kernel-space Input Drivers (general)
Examining other drivers in Linux/drivers/input/ , there’s an even simpler Hello World which printk’s all events for all input devices. Another driver registers a device file (exposed by the /drivers/leds/ subsystem I don’t want to dig into now) for controlling LEDs on the input device, which are injected as events & tracked in bitflags.
There’s a more complex Joystick driver which applies to multiple device identifiers, includes some locking & state tracking, & its own character device file.
Most of the joystick logic is there to handle its (legacy?) device file interface, with events being queued up in a ringbuffer for it to report. The mouse driver is very much the same.
The events are originally sourced from various drivers following usually trivial hardware protocols, usually over I2C or PCI channels & their probe callback methods.
Kernel-space Input Drivers (device files)
Linux’s input subsystem exposes its device files via another “input handler” matching all devices, which I’ll describe this morning.
Upon .connect() it allocates new version numbers out of a bitmask, allocs/inits its own memory, assigns a label to the wrapped device incorporating the allocated number, adds itself to the device’s listeners, creates a character device with its own method table, & stores it in a list both for that device & globally.
.disconnect() as per usual cleans up everything .connect() did. libudev is notified of these new character device files via their “device” property.
The device file’s ioctl syscall/method as always retrieves specified properties from the device into userspace, or stores userspace data into those properties. Other ioctls may call the device .flush() method, read data, etc.
The .poll() method considers waiting before setting output flags depending on whether the device is still present & whether there are events in its queue.
The read syscall/method for that device file reads the specified number of events from a ringbuffer and returns them to userspace.
The write syscall/method meanwhile injects the provided events into the underlying device if it’s supported according to a bitmask. Which involves calling the .filter() or .event() method on each registered handler.
The open syscall/method allocs/inits a “client” structure before opening the device (by incr’ing a count & maybe starting a poller) & a stream.
“Opening the stream” meanwhile simply means flagging the file pointer as being a stream input.
The .release() method on that device file as always releases the resources loaded upon open.
Upon hangup it sends a SIGIO signal to the appropriate processes & wakes them up.
The flush syscall/method on the device file calls the .flush() method on the underlying device.
The fasync syscall/method on the device file adds it to a global list in another subsystem.
Upon receiving an event, the corresponding “input handler” writes it into each client’s kernelspace ringbuffer after applying any filters, adding EV_SYN events upon overflow. Clients may need to be woken up.
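The overflow behaviour can be illustrated with a small per-client queue. This is a simplified model of my own: the ClientQueue class is hypothetical, a deque replaces the kernel's fixed array with head/tail indices, and the choice to drop the whole backlog and leave an EV_SYN/SYN_DROPPED marker ahead of the newest event is my understanding of the kernel's policy rather than something stated above.

```python
from collections import deque

EV_SYN, EV_KEY = 0x00, 0x01
SYN_DROPPED = 3

class ClientQueue:
    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()

    def push(self, event):
        if len(self.events) == self.capacity:
            # Overflow: discard the backlog & leave a marker so the reader
            # knows it missed events and should resync its state.
            self.events.clear()
            self.events.append((EV_SYN, SYN_DROPPED, 0))
        self.events.append(event)

    def read(self, count):
        """Return up to `count` events, oldest first."""
        out = []
        while self.events and len(out) < count:
            out.append(self.events.popleft())
        return out

queue = ClientQueue(capacity=4)
for value in range(5):  # one more event than the queue can hold
    queue.push((EV_KEY, 30, value))
delivered = queue.read(10)
print(delivered)
```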
Framebuffer Output
Amongst other targets (including the Direct Rendering Manager, which also exposes a framebuffer-compatible interface), Weston can output composited images to framebuffer devices. Or do the compositing in the framebuffer.
VGA16
Upon init, if it’s built-in to the kernel, it parses configuration. Then in either case it registers the driver, then allocates & “adds” the device.
Upon exit it unregisters the device then driver.
Digging into those (un)registration functions it looks like quite the tangent I don’t wish to dig into right now (it can be quite the struggle to stay focused when studying Linux).
But upon registering the driver it provides a method table with a textual name (vga16fb) & methods for probe & remove.
Upon probe it allocs/inits the framebuffer, its apertures, & its image data whilst outputting some debugging information. Then registers it & powers up the screen.
To register the framebuffer it first performs various correctness checks possibly removing conflicting framebuffers, allocates a node ID for it out of an array, initializes additional fields, creates & inits a device, & determines the primary framebuffer.
The VGA’s image memory is in a fixed memory location, though some CPU circuitry may need disabling.
Upon remove it powers off the device & unregisters the framebuffer with the device, console (+rendering), its driver, & its memory usage.
Upon probe the VGA16 framebuffer Linux driver registers a method table. I’ll figure out where these get called tomorrow (so not quite done today!), but:
Upon open it increments a refcount & possibly initializes itself with some bitflags & newly-allocated saved state/mmaped-fonts.
Upon release it decrements the refcount possibly restoring that state.
Upon destroy it frees the allocated memory.
Upon check_var it performs checks & adjustments upon the given config, saving to its par property.
Upon set_par it adjusts its par property & the corresponding hardware registers.
Upon set_col_reg it sets the hardware registers (different for VGA or EGA) mapping that palette to RGB values.
To pan the display it reformats its input to write it to hardware registers.
To “blank”, it toggles various bits in the hardware registers depending on which type of screen is attached.
To fill a rectangle it sets various mode registers depending on the operation, falling back to memory writes.
To copy an area, it reformats the input to output them to hardware registers, very similarly to the fill rect method complete with an in-memory fallback.
And finally “blitting” an image works practically the same way, with separate routines for whether or not there’s any colour depth.
Device File
Linux has a single device file for all framebuffers initialized on bootup (or when that module’s loaded); alongside a procfs device file, “graphics” class, & a virtual console for debugging messages.
The read syscall/method first looks up the framebuffer for its minor number. If it hasn’t changed it calls its fb_read method, falling back to an optional call to fb_sync & copying its memory to userspace.
The write syscall/method calls the framebuffer’s fb_write method, falling back to copying data from userspace.
The ioctl syscall/method as per usual copies the specified property to/from userspace. Though many of the setters require changing the video mode, at which point it’s recomputed, a list is updated, and it calls fb_check_var, fb_get_caps, fb_set_par, fb_pan_display, and/or fb_setcmap/fb_setcolreg before notifying other components.
Various locks may be required there.
Upon setting a console to a framebuffer (via that ioctl method/syscall) it looks up the input, configures and possibly fb_opens the framebuffer, figures out which rendering routines to use, releases the old one possibly calling fb_release and/or fb_set_par, and initializes the cursor rendering.
You can also call the fb_blank and other methods via the ioctl syscall on a framebuffer device, and the framebuffer can provide a method defining more ioctls.
The mmap syscall/method forwards the call onto the fb_mmap method whilst making sure autoencryption doesn’t get in the way, falling back to the fix.smem/mmio_start/len properties.
The open syscall/method loads the appropriate module for the framebuffer, stores the looked-up method table, calls fb_open, & adds an asynchronous I/O method if compiletime-enabled.
The release method forwards on to the fb_release method with some locking.
The get_unmapped_area method if compile-time enabled returns a given offset from the screen_base property.
The fsync method/syscall if compile-time enabled forwards to the deferred_io method on idle.
And the llseek method/syscall updates the f_pos file property as specified.