If there’s one thing we associate with computing more than the concept of Turing Completeness (a.k.a. The Church-Turing Thesis) it is mathematical computation! CPUs have dedicated hardware to make this rapid & memory-efficient for some finite number of base2-digits (a.k.a. bits). This page documents facilities refining & exposing digital computers’ arithmatic expertise.
bc
Infix Maths
Sometimes you want to perform math on the commandline for which there’s the bc
& dc
commands, bc
being discussed here.
After initiliazing its I/O bc
parses its commandline flags & trailing args both given directly & via BC_ENV_ARGS
envvar. POSIXLY_CORRECT
& BC_LINE_LENGTH
are also checked. Some “collection” (including some common numbers & “load” func), nametree, & labels, are initialized.
Uses backwards bug-compatible emulated floating point.
Then bc
registers a SIGINT
signal handler, opens the first sourcefile or standardin, initializes possibly LibEdit or LibReadline if available & in REPL mode, runs a Bison/Flex immediately-evaluating (conditionally via a generate
or run_code
function) parser consulting those collections, outputs a newline if that was silenced, & successfully exits.
The execution is a straightforward opcode stackmachine interpretor, for the sake of function calls. Has a standard prelude.
Flex is configured to concatenate multiple files & possibly read from libReadline.
P.S. I gather I’d find the older implementation more interesting…
dc
Postfix Maths
Above I described the bc
command, so here I’ll study the corresponding dc
command. Both evaluating math.
dc
starts by initializers some common emulated numbers, virtual registers, & stubs for strings & arrays before parsing commandline flags & iterating over args carefully opened as files or stdin. For each it lexes (via a handwritten lexer) the file treating the tokens as immediately-evaluated opcodes. With seperate interpretors for strings vs numbers.
Outputs flushed on exit.
There’s support submodules in dc
for interpreting array, “system” & I/O, numbers (wrapping bc
’s implementation), stack (which gets incorporated into all the opcodes) & registers, & strings. Amongst a couple core utilities.
LibGNU MultiPrecision
Sometimes you want to avoid bugs caused by the finite number of bits a CPU-native number consists of. GNU provides a handful of libraries for dealing with this, & today I’ll start studying GNU MultiPrecision (LibGMP).
Amongst the main codebase (I’ll go over the subsystems starting tomorrow), I see:
- A handful of globals specifying memory-layout & version-numbers.
- Lookuptables.
- Utilities to get or set memory-allocation callbacks, offering reentrant or debugging values for them.* Raising SIGFPE exceptions.
- Default allocator callbacks.
- Scripts to generate lookuptables for division, factorials, jacobi symbol, fibonacci, etc.
- Wrappers around the division routines.
- Prime number testing (fastpath checks for 1, 3, 5, & 7 factors, otherwise uses slowpath loop for factors > 11).
- Another handful of routines to raise SIGFPE signals.
- Logarithms by testing against repeated bitshifts.
- Couple variations of dividing 1 by a number (“inversion”).
Some of those macros wraps Assembly code.
MiniGMP
Within GMP there’s a submodule named “Mini GMP” which initializes, normalizes, clears, conanicalizes, set fields, compare, & finalizes mpz_t
& mpq_t
structures. And ofcourse there’s mathematical operators. These are all straightforward with possibly a couple branches.
You can swap fields between 2 mpq_t
s.
Convert to or from doubles involves a loop taking an exponent of a power-of-2. Or it can convert to/from strings.
The actual logic with most of the loops is implemented as macros.
There’s another implementation file containing macros implemented similarly. As well as utilities for crashing, managing memory & managing those callbacks, memory-copying & -comparison loops, loop to compute normalized sizes. And ofcourse loops over the words in multiprecision number(s) to perform mathematical operators on them whilst propagating carries, Mini GMP implements these in C.
Divide is more complex involving multiple fastpath routines. Which are involved in stringifying.
There’s routines for computing greatest-common-divisor on single words or multiprecision numbers through repeated refinement. With divide-by-2 fastpaths. Similarly there’s Lowest-Common-Multiple routines.
There’s exponentiation routines which may involve repeated multiplication, or divide&conquer algorithms via division.
There’s a binary-search for squareroots with a fastpath.
There’s checks for perfect squares.
There’s repeated multiplication factorials
There’s repeated bitwise-operators computing Jacobi numbers.
Repeated addition/subtraction/multiplication/division Lucas Modulo operation, supporting its own stringifier.
Repeated exponentiation labelled “Miller-Rabin”.
Relatively simple prime-number testers.
Bittesting, absolutes, bit-subtraction, bitsetting, bitclearing, & more logical operators. Hamming distance via ops on each bit. Array-scanning. Sizing per base. Stringifying abstractions.
LibMPF
GNU MultiPrecision includes a “MPF” sublibrary”, which I’ll summarize today.
This operates upon an mpf_t
structure with the following utilities:
- Accessors (possibly involving validation or bittwiddling/adds/subtracts/absolutes)
- Wrappers around math operators adding data conversions, some optionally reimplementing the operator.
- Including a few for random-numbers
- Accessors for globals
- Subtract-then-divide with fastpath for 0, a “relative diff”
- Optimized repeated multiplication
- Cloning a number flagging it as negative
- Stringifying
- Some of these operators wraps MPN routines
- Others wrap iterations over the digits, or combines the 2
- Some of them get fairly convoluted with different fastpaths for different cases
- Swap fields between 2 MPF structs
- There’s tests on the fields
- Parsing
- Conversion back to floats
- Some headerfiles implementing templates to base other routines on
- Comparing fields for equality or more
- Subtraction is quite involved
- divide-by-power-of-2 fastpath routine
- rounding with logic to determine whether to increment
- multiply-by-power-of-2 fastpath routine
- Addition’s fairly involved too
- Multiply/divide is more straightforward handing off near directly to MPN
LibMPN
GNU MultiPrecision includes a LibMPN subsystem implemented in Assembly for a wide variety of CPUs. Normally I’d discuss the x86_64 implementation since that’s what I’m running, but instead I’ll study the ARM64 implementation since that’s simpler!
This includes routines for:
- Inverting a digit using bitwise-logic, adds, multiplies, & subtracts.
- Hamming distance mostly consists of bitwise logic with extensive controlflow, & an accumulator.
- Digit squaring with low, high, & mid branches.
- Tightloops supporting datatable lookups.
- Greatest-Common-Divisor tightloop.
- Data-copy tightloop routines.
- Lef-shift routine wrapped controlflow over initial bits.
- Various bitwise routines branching over initial bits to select a “top” or “mid” codepaths.
- Another Greatest-Common-Divisor tightloop, with fastpath for smaller numbers.
- Examine initial bits to choose “top” or “mid” codepaths for multiplication.
- Accumulator routines for computing remainders.
- Examine initial bits to choose “top” or “mid” codepaths for addition or subtraction or comparison routines.
- Various other routines are implemented similarly, though they have a couple “lo” codepaths necessitating altering the examination of the bits. I won’t continue going over them.
- This includes popcount, with a divide&conquer coldpath.
- There’s division on digits implemented using multiplication.
- Tightloop specifically for pi (judging by name) with even tighter trailing loop.
There’s dataheaders determining which opcodes to use on Cora53, Cora57, Cora72, Cora73, XGene1, & other ARM CPUs.
On Cora53 it LibMPN implements special variations of it’s comparison routines.
LibMPQ
GNU Multiprecision has a LibMPQ sublibrary implementing fractions, consisting of several mostly-trivial utilities around a struct holding a numerator & denominator as MPN & MPZ numbers. These utilities include:
- Swapping fields between 2
mpq_ptr
s. - Accessors for the 2 fields.
- Conversion from/to long pair (signed or not), MPF, double, string, or file. Most complex part.
- Copy from one MPQ to another.
- Swap numerator & denominator.
0/d is normalized to 0/1.
Also I see utilities for:
- Initialize multiple MPQs, wrapping a utility to initialize one.
- Negate numerator, cloning where necessary.
- Test fields for equality.
- Compare against long pair (signed or not).
- Clear a variadic-list of MPQs.
- Clear a MPQ.
- Normalize, e.g. dividing by GCD.
- Make numerator positive, cloning where necessary.
- Add/subtract multiplies by denominator product whilst normalizing so the op can be applied to numerators.
- Cross-multiply & compare with fastpaths.
- Combined/optimized invert & multiply-call with fastpaths, to implement division.
- Fastpath for multiplying by powers-of-2.
- Multiplying numerators & denominators whilst dividing by GCD.
LibMPZ
The sublibrary around the MPZ numeric datastructure includes:
- Wrappers around GMP random adding MPN normalization.
- Division by power-of-two with reallocation & MPN normalization.
- Wrappers around MPN division routines adding memory management & normalization, with it’s own fastpath.
- Swap fields of 2 MPZ structures.
- Test whether a given bit in the array is set.
- Wrapper around it’s power routines.
- Wrappers around addition/subtraction routines adding memory management with own fastpaths
- The “Strong Lucas Primality Check” with several fastpaths (including considering the Jacobi test) & a division loop
- A more sophisticated wrapper around GMP random adding memory management & MPN normalization
- Wrappers around MPN squareroot routines adding memory management & own fastpaths
- size-in-base wrapper
- Sophisticated bitwise XOR normalization
LibMPQ
Today I’ll continue studying MPQ, which largely serves to wrap various MPN routines with memory management & normalization. These include:
- Copying fields from one MPZ to another.
- Random (several wrappers)
- square-root + remainder (couple)
- test perfect-powers
- Modulo (multiple codepaths)
- Multiply wrapped in a loop for product.
- Multiplication (multiple wrappers, including for powers-of-2)
- Powers (multiple wrappers)
- Lowest-common multiple
- Jacobi (& Kronecker) primality test
Also there’s converters:
- From MPF
- From signed & unsigned longs
- From MPN
- From doubles
- From strings
- To strings (most of the complexity)
- To files
As well as:
- Reallocation (couple)
- Set a particular bit (multiple codepaths)
- Locate first 1 bit (+ & - codepaths)
- Locate first 0 bit
- Select appropriate exponentiation codepath
- Negating a number, possibly with a copy
- Modulo via
mpz_tdiv_r
- Get, modify, finish, & write “limbs” (digits) field routines
Regarding prime numbers:
- Return iteration count of a division
- Calculate primes with a given sieve size
- Iterate over primes via some of these other routines
- Probably-prime test with fastpaths
- Factorize a number
- Compute Lucas numbers
- Compute Lucas modulos
- Miller-Rabin primality test
LibMPZ
Reading through the rest of GNU Multiprecisions MPZ sublibrary, I see:
- Wrappers around initialization & conversion from
- doubles
- another MPZ struct
- signed longs
- unsigned longs
- strings with or without whitespace
- files
- Initialization multiple MPZs via variadic arguments
- Initialize a single MPZ struct by allocating its digits or set a dummy value
- Divide 1 by a given MPZ via
mpz_gcdext
with postprocessing - Inclusive bitwise-or, reallocating where ncessary as it turns it into a bitwise-and operation
- Compute odd factors of factorials (I’ve also just finished skimming other factorizing code) using prime sieves, other factorizing infrastructuring, & repeated multiplication
- Conversion to
- doubles
- signed types via generic template
- signed long ints
- strings
- Accessor for the base2 exponent field as a double
- Wrapper around MPN’s greatest-common-divisor implementation adding memory management
- Fibonacci numbers with fastpaths
- Wrapper around MPN’s alternate Fibonacci implementation
- Wrapper around MPN’s alternate greatest-common-divisor implementation
- Wrapper around MPN’s main greatest-common-divisor implementation with own fastpaths
- Multiple wrappers around MPN division with its own fastpaths
- Set digits of a MPZ struct
- Output to stdout via string serializer
- Test whether the MPZ number is divisible by a given factor, different functions for different factor types
- MPZ to a unsigned-long power
- Compute factorials via lookup-table optimizations
- Get digits
- Compute hamming distance, with own fastpaths
- Test “congruence” against a base2-exponential as a variation around comparisons, a couple routines for this
- Free digits from MPZs passed variadic arguments
- Clear digits of a given MPZ
- Wrappers around MPN comparison
- Clear a particular bit & normalize
- Flip a given bit
- Flip all bits
- Take the absolute value
- Initialize array of MPZs
- Compute a semi-factorial which fits in a single digit MPN digit via a selection of codepaths & possibly a lookuptable
- Multiple common template for wrapping MPN addition & subtraction to select an inner function based on sign & adding memory management
- One of these adds multiplication in the mix
- Bitwise and with memory management & multiple codepaths
- Wrappers taking unsigned long operands
Text serialization
Here I’m looking over GNU MultiPrecision’s textual serialization library.
This has several public API wrappers around the core logic for different function signatures. And methodtables of underlying utilities, specified by those public APIs.
There’s a replacement printf implementation repeatedly looking for “%” & examining subsequent character(s) calling the core logic where appropriate.
The core logic works similarly parsing the core logic into a “param” struct & calling the appropriate serialization function, some of which are defined here. With macros handling the serialization & functions handling the formatting.
Random-number generation
GNU MultiPrecision includes a random-number-generating sublibrary.
This includes a handful of wrapper functions converting the seed value from different types. Or calling methods. There’s some global variables.
There’s a routine which repeatedly generates numbers until its less than n, avoiding infinite loops. With a bitmask to improve odds.
There’s an abstraction around selecting a randomness-formula, or another to set it.
There’s an implementation of the Mersenne Twister PRNG formula upon MPZ. With a seperate methodtable for seedable MT-random. Includes a default-state lookuptable & plenty of bittwiddling I don’t understand.
Or there’s a more traditional Linear-Congruential random-number generator implemented upon MPN. Again, plenty of bittwiddling!
Includes a variant dealing in base2-exponentials.
Text Deserialization
GNU MultiPrecision includes a text deserialization sublibrary. This works basically the same way as the serialization sublibrary, if not simpler.
Handling public APIs & methodtables similarly.
There’s a core number scanner deferring to existing parsers (from MPF, MPQ, or MPZ subparsers), to be wrapped in a scanf replacement.
LibGMP Tuning
GNU Multiprecision cares a lot about performance (and correctness). As such there’s a subsystem which uses C macros to tweak various APIs to capture performance metrics. With the core “cycle-counting” code implemented in Assembly for a variety of CPUs to avoid incurring any overhead & generating misleading numbers.
There’s some noops implemented, presumably to test this infrastructure.
There’s utilities implemented around this to eg. try out some magic numbers to find the most optimal one.
There’s also utility commands to perform statistical analysis on the results & render charts via GNU PlotLib.
There’s some shared utility libraries for these commands, including one to tweak CPU settings. And a benchmark suite, with its own testrunner.
There’s perl-script to generate profiled versions of the MPN library.
And I’m not even going to dig through the testsuite…
##LibMPFR floatingpoint arithmatic LibMPFR implements floatingpoint numbers to a configurable amount of precision similar to how the Floating-Point-Unit in your CPU works. In short mostly this implements various mathematical formulas I barely understand often with a refinement loop & rounding. Whilst adding special-cases for NaN or infinity, frequently fastpaths, & often wrapper functions. Ocassionally lookuptables, some of which are CPU-specific, are consulted for some of these functions. The most complex logic in LibMPFR is involved in parsing & serializing strings.
In long, here’s my commentary as I skimmed through the entire toolbox:
- Various APIs performs type conversion on its arguments before deferring to the core logic. Signed types lower to unsigned wrapper-ops.
- Like most libraries, it can report a version number.
- There’s some sort of identity function, seamingly to deal with the C type system.
- Repated doubling or halving to implement
ceil_exp2
. Orceil_log2
. Or floor variants. - Multiplication with fastpaths calling down to LibMPN.
- Comparison routines, with a seperate one to adhear to total-ordering invariants. Or via LibMPZ.
- Many of the type-conversion wrappers includes fastpaths. Or power might get reimplemented more optimally for different exponent types with its own loops.
- A couple conversions from LibMPZ random to LibMPFR random.
- Unsigned tangents with fastpaths, normalization, & loop until the tangent is 0 or invalid.
- Riemann-Zeta formula with fastpaths & nested loops, via LibMPZ, for ints or MPFR.
- Swap fields between 2 MPFR structs.
- Compute tan() via repeated calls to combined sin/cos, with fastpaths.
- Numeric-stack.
- Signed & unsigned MPFR addition & subtraction. Has nextensive normalization amongst other cases. Possibly calls down to MPN.
- Subnormalization.
- Hyperbolic tangents with fastpaths, normalization, & repeated exponentiation.
- Optimized summation.
- Compute Bessel function of 2nd kind, with fastpaths & loop.
- A few implementations of sin() with fastpaths, normalization into supported range, & repeated “ziv” refinement handling 2nd iteration specially.
- Various squaring fastpaths in one routine, & it’s support functions.
- Combined hyperbolic sin() & cos() computation, by repeatedly refining sinh() & computing cosh() from it.
- Conversions from signed & unsigned ints of different bitsizes, strings of different bases with or without bitshifts, LibMPZ, LibMPQ, LibMPF, doubles of various bitsizes to LIBMPFR.
- Accessors for sign, precision (raw or not), or exponent. With or without handling 0 & NaN specially
- Accessor for global rounding mode, or default precision.
- Overwrite an MPFR struct with NaN, or the minimum or maximum representable value. Or infinity. Or 0.
- Square root computation with various fastpaths, & an approximation to refine a finite times.
- Multiply by 2^x fastpath.
- Stringifying via a Double128 type, with division lookuptables. Or same for Double64 type.
- String parsing.
- Combined sin_cos() function.
- Format-string parser for it’s own scanf reimplementation.
- Test whether we have a power-of-2
- Wrapper around powers adding special cases
- Repeated multiplication & squaring to implement powers-to-machineword. Seperate routines for signed ints
- Output numbers as strings in binary
- Serialize given rounding mode
- Compute relative difference, with special cases
- Public wrappers around string formatting
- Round according to a given mode
- Few more rounding formulas
- Compute remainders from division
- Wrapper around stringifier.
- Test whether a number is odd.
- Memory copying utilities.
- Random number generator wrapping LibGMP.
- Power-to-MPZ-exponent.
- Number negation, with special cases.
- Multiply by power-of-2. Seperate function for signed numbers. Or for other numeric types.
- Bitcounting.
- Support functions for tweaking rounding results.
- Divide&conquer roots with special cases.
- Inverse-squareroot via lookuptables, bittwiddling, special cases, & plenty of math.
- It has implementations of short-product, square, & division via Mulders’ divide&conquer algorithm. Calling into MPN.
- Inverse-squareroot is a recursive algorithm.
- Adapter to Mini GMP.
- Headerfiles.
- Several codepaths implementing multiplication & it’s rounding, possibly with debug output.
- Extract integral & fractional parts of an MPFR number, by wrapper accessors.
- Min & max functions with special cases.
- Memory management.
- Logging utilities.
- Fastpaths for log-base-10 with special cases, focusing on optimizing the division by log(10). Similar functions for an incremented argument. And/or for log-base-2.
- Fastpath with special cases for logs with an usigned-long base, calling down repeatedly to MPZ multiplication.
- Fastpath for incremented argument to log().
- Ofcourse a formula to repeatedly refine a logarithm from an estimate.
- Check whether normal (non NaN, inf, or 0) or whether 0.
- Bitshifting to approximate square & cube roots, with refinements to reduce the error to an acceptable amount.
- Check whether an MPFR number is an actual number, NaN, infinity.
- Checking whether an MPFR number is an integer is slightly more involved.
- Lookuptables & macros for computing inversesquareroots or inversion.
- jn & yn’s implementation shares a template implementing Abamowitz & Stegun’s formulas involving a refinement loop & iteration therein.
- Fastpath for log-base-2 rounded up.
- Compute bessel functions via multiple fastpaths or refinement loop with an inner-iteration.
- Integer parsing to a given base.
- A couple functions to initialize an MPFR argument, & wrapper variadic functions.
- Conversion to MPZ (2exp or not), signed or unsigned ints of various bitsizes, floats of various bitsizes, MPQ, MPF, or strings (most involved). These are nontrivial conversions.
- Accessors for the exponent.
- Random number generation via MPZ.
- Calculate Euclidean distance with special cases.
- Template implementation for inversing a function.
- Calculate gamma(1/3) or gamma(2/3) via repeated refinement with postprocessing.
- Stringify floating point numbers of various bitsizes.
- Wrapper turning multiply-add into multiply-subtract.
- Wrapper around modulo operator converting from unsigned long divisor.
- Combined accessor for floatingpoint mantissas & exponents, with special casees.
- Wrapper around a * b + c * d with addition fastpath.
- Various functions for freeing different caches.
- Tests whether the MPFR number fits in various fixed-bitwidth ints. These show a couple template implementations.
- Compute the fractional of an MPFR number, with special cases.
intmas
gets its own implementation. - Wrappers around MPFR ops converting 2nd operand from MPZ or MPQ. Occasionally these are reimplemented to be faster on these types.
- Support routine implementing the binary-splitting algorithm upon the MPZ & MPN fields.
- Take a factorial of an unsigned long as an MPFR result, using an multiply-iteration-then-round loop.
- Compute gamma numbers with special cases, & one of two repeated refinement loops within which is an iteration & error checks, with or without decrementing exponent.
- Wrapper around exponentiation providing hardcoded bases. Or reimplementations where optimizations are available.
- Various special cases for taking exponents, falling back to support square-multiply loops.
- Multiply-add with various fastpaths.
- Formulas relating to something called “dilogarithms”.
- Alternative random distributions.
- Serialize & deserialize binary data.
- Several functions for computing log(gamma(x)) fastpaths with their special cases, including template implementation involving various fastpaths some of which involves loops. And a post-processing loop with error checks, extensive computation, & iterations.
- Wrapper division taking a double divisor. Or another for an unsigned-long 2-exponent. Amongst others.
- Dot-product between 2 MPFR arrays.
- Wrapper around subtraction op converting 2nd operand from a double. Or for division.
- String serialization for debugging.
- Comparison & equality amongst the various fields.
- Subtraction wrapper which returns 0 instread of negative numbers.
- Compute error range via fastpaths, repeated refinement, & a support routine. And a 2nd variation of these.
- Maintain error bitflags.
- Multiplication with numerous fastpaths calling down to MPN.
- Copying a sign from one MPFR number to another.
- A formulae with loops possibly calling down to MPZ for computing & caching pi, Catalan’s Number, logn(2), & Euler’s constant.
- Some harcoded constants.
- Hyperbolic cosine, involving special cases & a repeated-division loop until we reach desired precision.
- Comparison against fixed-size ints or doubles.
- Comparison wrappers better handling special cases.
- Absolute comparison.
- Compuse cosines with special cases, a fastpath, & repeated refinement involving an inner iteration until we reach desired precision.
- Free an MPFR number, with a variadic wrapper.
- Validate an MPFR number.
- Optimized implementation of
cos(2pi*x/u)
, with wrapper forcos(pi*x)
- Optimized computation of (1+x)^n
- Cubed-root involving special cases, normalization, & relatively minor computation mainly involving a multiply or divide by a 2-exponential.
- Caching infrastructure.
- Compute the digamma function for a given MPFR number, with special cases special cases, pre-rounding, & a choice of two refinement loops (one of which has an innerloop) with their own preprocessing.
- The “beta” function, involving extensive special cases, integral fastpaths, preprocessing, & extensively-branching refinement loop wrapping the gamma operation.
- Compute just the integral component of an exponential.
- Division with specialcases, numerous fastpaths & simplifications all with their own loops, preprocessing of desired precision, fallingback to the divrem function with extensive rounding. Calling down to MPN heavily.
- Simpler division by fixed-size numbers involving special cases, a fastpath for power-of-two divisor, and several codepaths calling down to MPN before applying rounding.
- Functions returning buildflags.
- Wrapper around add for double operand. Or unsigned long ints.
- Compute nth Bernulli numbers, involving several iterations alongside straight arithmatic. Cached!
- Arithmatic-geometric-mean implementation with special cases, a fastpath, & heavily branched sqrt/div refinement loop with inner division loop.
- atan(x)*u/2pi with atan(x)/pi wrapper. Using a refinement loop matching that formula.
- Hyperbolic arctangents with specialcases & divide/log refinement loop.
- Inverse hyperbolic sine involving special cases & squaring/sqrt/log refinement loop.
- arctan2 of an MPFR number involving specialcases, fastpath, & one of 2 heavily-branched refinement loops calling down to atan().
- Refinement loop with special-cases implementing asin(x)*u/2pi, with wrapper for asin(x)/pi.
- arc-sine implementation, with special-cases & sqr/sqrt/div/atan refinement loop.
-
atan( y/x )*u/2pi refinement loop with special cases & fastpaths, & atan( y/x )/pi wrapper. - Add/subtract wrapper handling signs & specialcases.
- arc-cosine implementation with specialcases, a fastpath, & square/square-root/atan/divide refinement loop.
- Similar hyperbolic acos with heavily-branched logarithm/square-root refinement loop.
- acos(x)*u/2pi refinement loop with preprocessing, fastpaths, & specialcases, & acos(x)/pi wrapper.
- arc-tangent implementation involving specialcases, fastpaths, & a heavily-branched refinement loop itself involving a lookuptable, lots of MPZ calls, & multiple inner iterations.
- Abort the program with an MPFR error message.
- Several per-digit addition state-machine with controlflow I fail to follow, some calling down to MPN.
- CPU-specific hardcoded constants.
- The “Airy function” with special cases, preprocessing, & a choice of two preprocessed refinement loops involving several nested iterations around the gamma function.
- By far the most complex MPFR routine is a reimplementation of vasprintf, but since I’ve studied GNU Libc already…
LibMPC
LibMPC is a surprisingly simple wrapper around LibMPFR (which I described the other day) implementing multiprecision complex numbers.
Functions wrapper MPFR’s on a single property includes:
- sine (but not variants)
- Absolute (via trigonometry)
- Compute the “argument” via atan2
- Cosine (but not variants)
- Precision getter
- Imaginary setter
- Getter for real property
There’s a constant version number. There’s headerfiles ofcourse.
There’s memory-management utilities.
MPFR wrappers upon both properties include:
- Randomness
- Subtraction with different types.
- Setters
- Swap fields
- Configure precision
- Copy fields from another complex number
- Addition with different types
- Zero out fields
- Comparison with different types
- Another precision getter
- Multiple initializers
- Multiply by 2-exponential with different types
- Multiply by a signed or unsigned int
- Negation
- Division by 2-exponential with different types
Going from least complex to most of the remaining APIs:
- There’s a wrapper around the parser which validates no non-whitespace trailing text remains.
- Hyperbolic sine wraps normal MPC sine with data conversion. Similar for hyperbolic tangents.
- Data conversion is involved in MPFR-sum wrapper, converting both fields into seperate arrays.
- Ceiling-log2 counts halvings.
- There’s some type conversions around division (where it isn’t reusing APIs). Same for exponentiation.
- Signed exponentiation wraps unsigned exponentiation with branching.
- Exponentiation to an integral complex number has both type conversion & branching.
- Projecting imaginary numbers to a “Riemann Sphere” involves conditional setting of it’s fields.
- There’s a serializer wrapping MPFR’s with parentheses & whitespace.
- Multiplying by an MPFR number involves, with a few special cases, multiplying each field by the MPFR operand.
- LibMPC implements tangents itself involving several special cases & fastpaths as well as a refinement loop handling infinities, zero, & rounding.
- Squaring & multiplying fastpath loop for exponentiating to an int, signed or unsigned. Or it does the conversion.
- Wrapper around MPFR parsers stripping whitespace & possibly parens, populating both fields.
- Multiply by + or - i, using extensive branching, setters, & possibly negation.
- Compute primitive roots of unity involving extensive branching, squareroots, & possibly a multiply+sin/cos loop until fixpoint.
- Compute the squared norm of a complex number, involving special cases & a squaring/rounding loop with extensive postprocessing.
- Macro-generated logging wrapper APIs.
- Log/division/rounding loop implementing log-base-10, special case exact values in real part.
- Divide-by-MPFR involves a dataconversion before calling the division wrapper.
- Multiply-add has a multiply-then-sum fastpath, or a multiply-add loop iterating twice carefully avoiding error cases.
- Take the exponential of a complex via special-case handling or a post-processed exponent/sin-cos/multiply/round loop.
- Serialization utility which postprocesses the MPFR results to be prettier.
- Division by MPFR number converts to a complex number than calls the divide-wrapper.
- Dot-products converting into 2 arrays of products to be summed into a new complex number.
- Scan than parse a complex number from a stream, defaulting to stdin.
- Some “eta series” formula I don’t understand, involves plenty of branching surrounded by a loop.
- Logarithms involving special cases & a 2-iteration roudning-loop wrapper MPFR log. With a second more-robust error-recover loop, & atan2 postprocessing for imaginary part.
- Squaring involves special cases, fastpaths, error-correction loop, & postprocessing. The main fastpath involves heavy branching & multiplication.
- Hyperbolic arc-tangents swaps real & imaginary components to wrap it’s arc-tangent routine.
- Hyperbolic cosine involves swaps real & imaginary components to wrap it’s cosine routine.
- Absolute-comparison involves fastpaths & a normalize-compare loop with cleanup.
- Hyperbolic arc-sine involves swapping properties to wrap it’s arc-sine routine.
- Combined sine/cosine involves various fastpaths, precision-computation, a refinement-loop wrapping MPFR’s function, & extensive postprocessing.
- Square-root involves extensive special cases, precision computation, a refinement loop wrapping MPFR’s squaring/add/subtract/squareroots/divide, extensive heavily branched postprocessing, & validation.
- Division involves similar with extensive error handling.
- Multiplication may involve one of several codepaths, one of which involves loops around MPFR’s multiplication. Or more directly wraps MPFR.
- Hyperbolic arc-tangents involves a 0 special case & one of 2 codepaths wrapping it’s arc-tangent routine.
- Arc-cosine involves numerous special cases (one of which involves a loop), precision computation, & refinement loop around arc-sine.
- Various formulas to aid computing precision often involving branching.
- Various operations on
MPCR
types with their mantissas & exponents. With some reimplemented bitwiddling support routines. Same forMCB
“ball” arithmatic. - “AGM” (whatever that is?) involves several special cases followed by 2 division-refinement loops (and an inner iteration). During which it decides whether take the “angle-zero” fastpath, with it’s own simpler refinement loop.
- Arc-tangent involves several special cases one of which involves a squaring-refinement loop, precision computation, & a couple main refinement loops around MPFR atan2.
- Arc-sine involves several special cases, Taylor-series fastpath, & a complex refinement loop I don’t understand.
- Computer powers involves extensive branching some of which calls down to MPFR or even MPZ, a log/multiply refinement loop, postprocessing for real or imaginary component, & validation/cleanup.