The Peregrinator indexes the full contents of documents, and records for each stem which sentences of which documents the stem appears in.
This arrangement allows queries based not just on keywords but on "phrases" or, more precisely, on unordered sets of stems that must all occur in the same sentence.
Several phrases may be specified: MathSearch then returns only those documents that contain, for each phrase, at least one sentence in which every stem of the phrase occurs.
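The sentence-level index and phrase matching described above can be sketched roughly as follows. This is only an illustration, not the Peregrinator's actual code: the names `stem`, `build_index`, `phrase_docs` and `search` are hypothetical, the stemmer is a crude placeholder, and sentences are split naively on punctuation.

```python
import re
from collections import defaultdict

def stem(word):
    # Placeholder stemmer (an assumption): lowercase and strip a trailing "s".
    w = word.lower()
    return w[:-1] if len(w) > 3 and w.endswith("s") else w

def build_index(docs):
    """Map each stem to {doc_id: set of sentence numbers containing it}."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, text in docs.items():
        # Naive sentence splitting on ., ! and ? (an assumption).
        for n, sentence in enumerate(re.split(r"[.!?]+", text)):
            for word in re.findall(r"[A-Za-z]+", sentence):
                index[stem(word)][doc_id].add(n)
    return index

def phrase_docs(index, phrase):
    """Documents with at least one sentence containing every stem of the phrase."""
    stems = [stem(w) for w in phrase.split()]
    if not stems:
        return set()
    postings = [index.get(s, {}) for s in stems]
    hits = set()
    for doc_id in postings[0]:
        # Sentences of this document in which all stems occur together.
        common = set.intersection(*(set(p.get(doc_id, ())) for p in postings))
        if common:
            hits.add(doc_id)
    return hits

def search(index, phrases):
    """Documents that satisfy every phrase of the query."""
    results = [phrase_docs(index, p) for p in phrases]
    return set.intersection(*results) if results else set()
```

With this sketch, a query for the single phrase "lie group" matches only documents where both stems share a sentence, whereas a query for the two one-word phrases "lie" and "group" also matches documents where the stems occur in different sentences.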
However, MathSearch also provides links to new queries for fewer phrases, and even for subphrases, provided the number of documents involved is not too large. This allows you to specify a query precisely, but to fall back to a broader query (without retyping) if the first one is unsatisfied.
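One way such fallback queries might be generated is sketched below. It is a guess at the idea rather than MathSearch's actual method: the functions `broader_queries` and `fallback_links`, the `count_docs` callback, and the `max_docs` threshold of 200 are all hypothetical, and "not too large" is interpreted here as a simple cap on the number of matching documents.

```python
def broader_queries(phrases):
    """Yield queries obtained by dropping one phrase or one word of a phrase.

    A query is a list of phrases; a phrase is a tuple of words.
    """
    # Fewer phrases: leave out one phrase at a time.
    if len(phrases) > 1:
        for i in range(len(phrases)):
            yield phrases[:i] + phrases[i + 1:]
    # Subphrases: leave out one word from any phrase that has several.
    for i, phrase in enumerate(phrases):
        if len(phrase) > 1:
            for j in range(len(phrase)):
                shorter = phrase[:j] + phrase[j + 1:]
                yield phrases[:i] + [shorter] + phrases[i + 1:]

def fallback_links(phrases, count_docs, max_docs=200):
    """Keep only broader queries whose document count is within max_docs.

    count_docs is a caller-supplied function (hypothetical) that returns
    the number of documents a query would match.
    """
    return [q for q in broader_queries(phrases) if count_docs(q) <= max_docs]
```

Each query returned by `fallback_links` would then be rendered as a link, so the reader can broaden the search without retyping it.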
Try a few queries to MathSearch to see how this works in practice.
At the start of November 1994, the .pag component of the DBM file suddenly increased in length from 30MB to 124MB, apparently because some threshold had been passed. The .pag file should be sparse, but when it is read by MathSearch the holes seem to fill up. As a result, there was no longer enough disk space to hold both the current version and a backup.
This problem was solved by moving all the URL data out of the DBM file, and storing it in two files, again with offsets and lengths in one and data in the other. The total storage required then dropped from 129MB to 9MB. The moral seems to be: avoid using DBM files for large amounts of data.
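The two-file layout (offsets and lengths in one file, data in the other) might look something like the sketch below. The record format is an assumption: the source does not say how the offsets and lengths are encoded, so fixed-width 64-bit integers are used here, and the function names and file paths are hypothetical.

```python
import struct

# One index entry: (offset, length) as two unsigned 64-bit little-endian
# integers. The real on-disk format is not given in the text.
ENTRY = struct.Struct("<QQ")

def write_records(records, index_path, data_path):
    """Write byte strings to data_path and their (offset, length) to index_path."""
    with open(index_path, "wb") as idx, open(data_path, "wb") as dat:
        offset = 0
        for rec in records:
            dat.write(rec)
            idx.write(ENTRY.pack(offset, len(rec)))
            offset += len(rec)

def read_record(n, index_path, data_path):
    """Return record number n using one seek in each file."""
    with open(index_path, "rb") as idx, open(data_path, "rb") as dat:
        idx.seek(n * ENTRY.size)
        offset, length = ENTRY.unpack(idx.read(ENTRY.size))
        dat.seek(offset)
        return dat.read(length)
```

Because the data file is written densely and each index entry has a fixed size, the total storage is just the data plus 16 bytes per record, with none of the empty space a hashed DBM page file can accumulate.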