SQLite3 Next Generation Query Planner

NOTE: This page is a recollection of what I've written previously. I do not guarantee it's accuracy.

To make it faster to query on certain columns SQLite3 allows you to define “indexes”, which are BTrees just like everything else in an SQLite database file. And to decide which indexes it’ll use, as well as more fundamental questions like “in which order should I traverse these tables being joined?”, SQLite has a sophisticated optimizer which I will describe here.

This optimizer will start with more manual optimizations like normalizing the operators into “disjunctive normal form” (that is moving OR to be the topmost operators), determining which operators might be satisfied by an index, etc. But there’s only so far such simplistic optimizations can go.

So next it looks through available indices and, if present, the stats table (which is generated manually for stronger performance guarantees) to figure out all the possible loops that could be part of the compilation, even considering generating temporary indexes. Already some of these loops may be discarded if another one is discovered that’s strictly better than it.

Once we have all those loops we need to figure out which should be the inner loops and which should be the outer loops. All it knows in advance is how many loops it’ll need to generate.

So for each slot it goes over all the options to find the ones for which we already have all necessary information and will give us new information. The few best of those will be kept for the next iteration to consider extending, because (as in life) the best solution may not be immediately obvious.

That then gives us all the information we need to translate the query into an optimal VDBE bytecode.