The following suggestions may help WWW administrators make it easier for the Peregrinator to index their site, and for MathSearch to provide meaningful responses to queries. Some of these points may also help other robots and search engines.
You may wish to include a brief form of the name of your site in page titles. Unlike some other search engines, MathSearch displays the server's DNS name along with the title, but some other form of the site name may be more helpful to people making queries. Note that while such a site name can be useful in titles, it is often unwanted in the page heading (enclosed in <H1>...</H1>).
Don't put large numbers of unrelated files in the same WWW server directory, especially not the top one.
Backups and old copies should preferably not be in the WWW hierarchy at all. If they are, exclude them via /robots.txt, or make sure that they are not accessible by following links from your home page: e.g., forbid server-generated directory indexing, or use other server features such as NCSA httpd's IndexIgnore.
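As a rough illustration of the /robots.txt exclusion mentioned above, the following Python sketch checks which URLs a given set of rules would keep robots away from. The directory names /backup/ and /old/, the host name, and the agent name ExampleRobot are assumptions made for the example; they are not taken from any real site or robot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the directory names are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /backup/
Disallow: /old/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "ExampleRobot") -> bool:
    """Return True if the rules in robots_txt let `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://server/index.html", "http://server/backup/site.tar"):
        print(url, "allowed" if is_allowed(ROBOTS_TXT, url) else "excluded")
```

Running a check like this against your own rules before publishing them is one way to confirm that backup directories are actually excluded. It only covers robots that honor /robots.txt.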
Avoid having multiple paths leading to the same page: robots have difficulty detecting such redundancy, and the result is that several identical responses will appear in query results. A common example which is hard to avoid is that http://server/ and http://server/index.html are often identical. But it is quite easy to avoid less common cases such as soft links, or having /~USER/ and /home/USER/ point to the same directory.
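The common case above, where http://server/ and http://server/index.html are identical, can be sketched as a small URL normalization step. This is only an illustration of the idea, not a description of how the Peregrinator works; the function name canonical_url and the assumption that the index file is called index.html are hypothetical. It cannot detect soft links or aliased directories such as /~USER/ and /home/USER/, which is why those are better avoided on the server.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str, index_name: str = "index.html") -> str:
    """Map a URL ending in the index file to its directory form."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Treat ".../index.html" as the same page as ".../".
    if path.endswith("/" + index_name):
        path = path[: -len(index_name)]
    # Host names are case-insensitive, so lower-case them; drop any fragment.
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))


if __name__ == "__main__":
    links = ["http://server/", "http://server/index.html", "http://Server/docs/index.html"]
    # Collapsing the list to a set shows how many distinct pages it names.
    print(sorted({canonical_url(link) for link in links}))
```

Applied to the list of links on your own pages, a check of this kind can show where two different spellings lead to the same document.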
To display angle brackets in HTML, e.g., in a mail address or Usenet message-id, escape them correctly using the ampersand notation: for < and > put &lt; and &gt; respectively.
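If pages are generated by a program, the escaping can be done mechanically. The sketch below uses the escape function from Python's standard html module; the wrapper name escape_for_html and the sample message-id are made up for the example.

```python
from html import escape


def escape_for_html(text: str) -> str:
    """Replace &, < and > with their ampersand-notation equivalents."""
    # quote=False leaves quotation marks alone; they only need escaping
    # inside attribute values.
    return escape(text, quote=False)


if __name__ == "__main__":
    # Hypothetical Usenet message-id, used only as sample input.
    print(escape_for_html("Message-ID: <1234@news.example.org>"))
```

Note that the ampersand itself is escaped as well, so text that already contains ampersand notation should not be passed through the function a second time.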
CGI scripts should also return an appropriate status code, e.g., an error status rather than a success status when the requested item cannot be found.
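A CGI script can report the outcome to the server through the Status header field defined by the CGI specification. The following is a minimal sketch, assuming a script that either finds the requested item or does not; the function name build_cgi_response and the page bodies are hypothetical.

```python
import sys


def build_cgi_response(found: bool, body: str) -> str:
    """Build CGI output whose Status header matches the actual outcome."""
    # Without a Status header the server normally reports success,
    # so a "not found" page would be indexed as an ordinary document.
    status = "200 OK" if found else "404 Not Found"
    return f"Status: {status}\nContent-Type: text/html\n\n{body}"


if __name__ == "__main__":
    # Example: the requested item does not exist.
    sys.stdout.write(build_cgi_response(False, "<H1>Not found</H1>\n"))
```

With an error status in place, a robot can tell a failure page from real content and leave it out of the index.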