System Databases

A lot of computing, if not most of it (aside from communicating this information to humans), is dedicated to maintaining databases: updating and querying various collections of data. Some of the databases most central to making your computer function at all are discussed here.

Package Management (Apt)

To assemble all the software developed by Linux, GNU, FreeDesktop.Org, GNOME, KDE, etc, etc, etc into a functioning operating system, “distros” (as we often insist on calling them) create “package manager” tools for installing this software from their curated “package repositories”. Programming languages often have their own dedicated package managers, which I’d consider a great idea if their repositories were decently curated.

For this, Debian has the Advanced Package Tool, “APT”.


Apt’s underlying library is predominantly apt-pkg as described here.

There’s a pkgRecords C++ class which converts a cached array into a package mapping indexed by ID.

There’s a pretty-printer.

There’s a routine to fetch each queued package update with user interaction. There’s a more general routine which applies various queued changes with user interaction.

There’s a datafile listing package header keys in a standardized order.

There’s a class which parses a list of local files listing packages into an array.

There’s a dist-upgrade routine which flags all installed packages to be upgraded, checks whether to install all essential packages, reinstalls all previously installed packages to resolve conflicts, possibly flags held packages to be kept, & repairs any issues, all whilst updating a progress bar.

Another routine goes through all packages selecting which ones are interesting to upgrade. And a couple routines for upgrading packages with or without installing new ones. Another chooses one of these codepaths.

There’s a class which parses a list of index files with error reporting, & exposes results.

There’s a pkgPolicy class which fills in various default properties, including pinnings & priorities.

There’s a class which parses & compares version numbers.
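
To make the version comparison concrete, here is a minimal sketch in Python of the ordering rules as I understand them from Debian policy: an optional epoch, then the upstream version and revision compared as alternating non-digit and digit runs, with “~” sorting before everything (even the end of the string) and letters sorting before other symbols. This is not apt-pkg's actual C++ code, and the function names are my own.

```python
import re

def _order(ch):
    # "~" sorts before everything (even end of string); letters before other symbols.
    if ch == "~":
        return -1
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256

def _compare_part(a, b):
    # Alternate between a non-digit run and a digit run until both strings are used up.
    while a or b:
        na = re.match(r"[^0-9]*", a).group()
        nb = re.match(r"[^0-9]*", b).group()
        a, b = a[len(na):], b[len(nb):]
        for i in range(max(len(na), len(nb))):
            ca = _order(na[i]) if i < len(na) else 0
            cb = _order(nb[i]) if i < len(nb) else 0
            if ca != cb:
                return -1 if ca < cb else 1
        da = re.match(r"[0-9]*", a).group()
        db = re.match(r"[0-9]*", b).group()
        a, b = a[len(da):], b[len(db):]
        ia, ib = int(da or 0), int(db or 0)  # an empty digit run counts as 0
        if ia != ib:
            return -1 if ia < ib else 1
    return 0

def split_version(version):
    # "[epoch:]upstream[-revision]"; the revision follows the last hyphen.
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision

def compare_versions(a, b):
    # Returns -1, 0 or 1, like a classic three-way comparison.
    ea, ua, ra = split_version(a)
    eb, ub, rb = split_version(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_part(ua, ub) or _compare_part(ra, rb)
```

With this sketch, compare_versions("1.0~rc1", "1.0") is negative because of the tilde rule, which is how pre-releases get to sort before the final release.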

I’m not clear what the metaIndex class is doing.

There are routines for parsing configuration & system files to determine which packages can be installed & run on this computer.

There’s a class for tracking & reporting install progress.

There’s of course some (line-by-line) parsing code in a dedicated file, and a second one for RFC-822 “tags” which includes seeking around the file.
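
As an illustration of what RFC-822-style “tags” look like, here is a rough Python sketch of a parser for that kind of file: “Key: value” lines, continuation lines starting with whitespace, and blank lines separating one package's stanza from the next. It reads the text sequentially rather than seeking around a file as Apt's parser does, and parse_stanzas is a name I made up.

```python
def parse_stanzas(text):
    """Parse RFC-822-style stanzas into a list of dicts."""
    stanzas, current, last_key = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current stanza.
            if current:
                stanzas.append(current)
            current, last_key = {}, None
        elif line.startswith("#"):
            continue  # comment line
        elif line[0] in " \t":
            # Continuation of the previous field's value.
            if last_key is not None:
                current[last_key] += "\n" + line.strip()
        else:
            key, _, value = line.partition(":")
            last_key = key.strip()
            current[last_key] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas

# Made-up sample data in the style of a package index.
SAMPLE = """\
Package: hello
Version: 2.10-3
Description: example package
 A longer description
 spanning two lines.

Package: world
Version: 1.0
"""

if __name__ == "__main__":
    for stanza in parse_stanzas(SAMPLE):
        print(stanza["Package"], stanza["Version"])
```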

There’s an abstraction around “index files”.

There’s a routine iterating over Apt’s directories deleting irrelevant files. Apt implements its own directory iterator.

The prioritized partial orderings are computed by a dedicated pkgOrderList class. There’s a class holding & validating info parsed from the cache.
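
To show what ordering packages by their dependencies amounts to, here is a small Python sketch using the standard library's graphlib module. It is a stand-in for the idea, not a port of pkgOrderList: the function name, the package names and the priority numbers are all invented, and priorities are only used to break ties between packages that become ready at the same time.

```python
from graphlib import TopologicalSorter

def install_order(depends, priority):
    """Order packages so that every dependency comes before its dependents.

    depends maps each package to the set of packages it depends on.
    priority maps package names to a number; higher goes first among
    packages that are ready at the same time.
    """
    sorter = TopologicalSorter(depends)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    order = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready(),
                       key=lambda pkg: (-priority.get(pkg, 0), pkg))
        for pkg in ready:
            order.append(pkg)
            sorter.done(pkg)  # unblocks packages depending on pkg
    return order

if __name__ == "__main__":
    # Hypothetical packages, purely for illustration.
    depends = {
        "app": {"libc", "libssl"},
        "libssl": {"libc"},
        "libc": set(),
        "docs": set(),
    }
    print(install_order(depends, {"libc": 10, "docs": 1}))
```

A real package manager also has to cope with dependency cycles and failures part-way through, which this sketch simply reports as an exception.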

There’s a pkgPackageManager class abstracting most of this, with methods for: downloading archives for given packages; marking missing packages to be kept; flagging a package & its dependencies for immediate install; partially ordering dependencies; checking whether dependencies are irrelevant; checking a list of dependencies for conflicts; checking which packages may get broken by an install; running “configuration” over all non-configured packages; performing a wide choice of that “configuration” on individual packages (a couple of methods, including a couple of repeated nested iterations & seeking out reverse-breakages as mentioned earlier); carefully flagging packages to be removed, with user-reporting; carefully flagging/running removal of a package (actual removal is implemented elsewhere); unarchiving packages extremely carefully in a loop with version iterations before running the actual install/configuration; & ordering dependencies again, this time recovering from failures.

There’s a class abstracting file copies, typically off CD or USB, with careful validation (involving parsing, hashing, etc) & progress reporting, with auxiliary methods to carefully compute filepaths.

The cache parser has a separate class to build its abstract syntax tree, normalizing & simplifying results. There’s an interpreter for querying the package cache. And a separate class for Aptitude-syntax.

There’s a class abstracting a cachefile, and related classes.

There’s a class abstracting the configuration files away further, combining various field checks.

There’s a superclass for retrieving packages that implements user-reporting itself.

Debian supports reading packages off a USB or CD as a datasource, traversing all its directories not blocklisted by “.aptignr”, and treating an “i18n” directory specially. Results are scored or ignored based on keywords. It can mount & eject verbosely.
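
The directory traversal described above could look roughly like the following Python sketch. It rests on assumptions I have not checked against Apt's source: that a directory is skipped (along with everything under it) when it contains a file named “.aptignr”, and that interesting directories are those holding a file called “Packages” or “Sources”. The function name and parameters are made up, and the special handling of “i18n” directories and the keyword scoring are left out.

```python
import os

def find_index_dirs(root, ignore_marker=".aptignr",
                    index_names=("Packages", "Sources")):
    """Walk root and return the directories containing index files.

    Assumption: a directory containing ignore_marker is skipped
    together with all of its subdirectories.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if ignore_marker in filenames:
            dirnames[:] = []  # tell os.walk not to descend any further
            continue
        dirnames.sort()  # visit subdirectories in a predictable order
        if any(name in filenames for name in index_names):
            found.append(dirpath)
    return found
```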

There’s a “worker” class applying configuration & running background commands specified there.

There’s a class handling various edge cases with error-recovery and verbose output, since you really want Apt not to fail on you because then your whole OS might fail on you!

There’s a class extracting various subsets from the package cache.

There’s a class providing parsing & serialization utilities for package descriptors.

There’s a class caching per-package dependency information & tracking package installation-state.

There’s a class maintaining a queue of local & (via subprocesses) remote files to open. As well as a class processing each individual item in that queue, including communicating with the subprocesses & logging.

Looking over some somewhat-auxiliary subsystems for apt-pkg I see:


Here I see:

Utilities for commandline UI.

Apt Misc.

Looking through the rest of Apt, I see:

Man Pages

The first package Linux From Scratch has you install during the main userspace build is “Man-Pages”. Documentation is essential for software: can a feature truly be said to exist if no one knows how to use it?

This package mostly provides documentation for POSIX-standardized APIs/commands/etc implemented by GNU & Linux, written in a format where most lines start with a period & a formatting control code. It is organized into 8 numbered sections, each with an “intro” documentation file.
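
To give a feel for that format, here is a tiny Python sketch which separates control lines (those starting with a period, or an apostrophe) from plain text lines and pulls out the section headings marked by the “.SH” macro. The sample page is made up for illustration, and the function names are my own; real man pages use many more macros & escapes than this handles.

```python
def classify(line):
    """Split a man-page source line into (kind, macro name, rest)."""
    if line.startswith((".", "'")):
        parts = line[1:].split(None, 1)
        name = parts[0] if parts else ""
        args = parts[1] if len(parts) > 1 else ""
        return ("request", name, args)
    return ("text", "", line)

def section_headings(source):
    """List the titles given to the .SH (section heading) macro."""
    headings = []
    for line in source.splitlines():
        kind, name, args = classify(line)
        if kind == "request" and name == "SH":
            headings.append(args.strip('"'))
    return headings

# Made-up sample in the style of a man page source file.
SAMPLE = r""".TH INTRO 1 2024-01-01 "Example manual"
.SH NAME
intro \- introduction to user commands
.SH DESCRIPTION
Section 1 of the manual describes user commands.
.B Bold text
follows a macro.
"""

if __name__ == "__main__":
    print(section_headings(SAMPLE))
```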

Man-Pages also provides some scripts they use (or have used?) to normalize the formatting of these man-pages, including:

These are for project management; they don't need to be installed.