How search engines work

A search engine operates in the following order: 1) Crawling; 2) Deep crawling (depth-first search, DFS); 3) Fresh crawling (breadth-first search, BFS); 4) Indexing; 5) Searching.

Web search engines work by storing information about a large number of web pages, which they retrieve from the WWW itself. These pages are retrieved by a web crawler (also known as a spider), an automated web browser that follows every link it sees. Exclusions can be made by the use of robots.txt. The contents of each page are then analyzed to determine how it should be indexed. Data about web pages is stored in an index database for use in later queries.

Some search engines, such as Google, store all or part of the source page (referred to as a cache) as well as information about the web pages, whereas others, such as AltaVista, store every word of every page they find. The cached page always holds the actual search text, since it is the one that was actually indexed, so it can be very useful when the content of the current page has been updated and the search terms are no longer in it. This problem might be considered a mild form of linkrot, and Google's handling of it increases usability by satisfying user expectations that the search terms will be on the returned web page. This satisfies the principle of least astonishment, since the user normally expects the search terms to be on the returned pages. Increased search relevance makes these cached pages very useful, even beyond the fact that they may contain data that may no longer be available elsewhere.
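
The crawl, index, and search steps described above can be illustrated with a small sketch. It is not how any real search engine is implemented: the in-memory WEB dictionary, the example URLs, the DISALLOWED set (a stand-in for robots.txt exclusions), and the function names crawl, build_index, and search are all hypothetical, chosen only for illustration. The crawl uses a queue, so it visits pages breadth-first; popping from the other end of the queue would give a depth-first style traversal instead.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "a.example/": ("search engines crawl the web",
                   ["a.example/b", "a.example/private"]),
    "a.example/b": ("an index maps words to pages", ["a.example/"]),
    "a.example/private": ("secret notes", []),
}
# Stand-in for robots.txt exclusions (assumption: a plain set of URLs).
DISALLOWED = {"a.example/private"}


def crawl(seed, web, disallowed):
    """Breadth-first crawl from seed; returns a cache of URL -> page text."""
    cache = {}
    queue = deque([seed])
    seen = {seed}
    while queue:
        url = queue.popleft()
        if url in disallowed or url not in web:
            continue  # excluded or unreachable page: skip it
        text, links = web[url]
        cache[url] = text  # keep a copy of the page as it was indexed
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return cache


def build_index(cache):
    """Inverted index: word -> set of URLs whose cached text contains it."""
    index = defaultdict(set)
    for url, text in cache.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


cache = crawl("a.example/", WEB, DISALLOWED)
index = build_index(cache)
hits = search(index, "index pages")
print(hits)  # {'a.example/b'}

# The live page changes after indexing; the cached copy still holds the terms.
WEB["a.example/b"] = ("completely rewritten content", ["a.example/"])
print("index" in cache["a.example/b"])  # True
```

The last two lines mirror the point about cached pages: once the live page has been rewritten, the query terms survive only in the copy that was stored at indexing time.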