Article Source: MiNeeds.com, where consumers get competitive bids from Web Designers and Computer Programmers. Read reviews, compare offers & save. It's free!

Tags: SEO, Search Engines, Spiders, Web Crawlers, Google, MSN, Yahoo

A search engine operates in the following order: 1) Crawling; 2) Deep Crawling (depth-first search, DFS); 3) Fresh Crawling (breadth-first search, BFS); 4) Indexing; 5) Searching.

Web search engines work by storing information about a large number of web pages, which they retrieve from the WWW itself. These pages are retrieved by a web crawler (also known as a spider), an automated web browser that follows every link it sees. Pages can be excluded from crawling through robots.txt. The contents of each page are then analyzed to determine how it should be indexed. Data about web pages is stored in an index database for use in later queries.

Some search engines, such as Google, store all or part of the source page (referred to as a cache) as well as information about the web pages, whereas others, such as AltaVista, store every word of every page they find. The cached page always contains the text that was actually indexed, so it is very useful when the current page has since been updated and no longer contains the search terms. This problem might be considered a mild form of linkrot, and Google's handling of it increases usability by satisfying the principle of least astonishment: the user normally expects the search terms to be on the returned pages. Increased search relevance makes these cached pages very useful, even beyond the fact that they may contain data that may no longer be available elsewhere.

When a user comes to the search engine and makes a query, typically by giving keywords, the engine looks up the index and provides a listing of the best-matching web pages according to its criteria, usually with a short summary containing the document's title and sometimes parts of the text. Most search engines support the Boolean operators AND, OR and NOT to further specify the search query. An advanced feature is proximity search, which allows you to define the maximum distance between keywords.

The usefulness of a search engine depends on the relevance of the results it gives back. While there may be millions of Web pages that include a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most search engines employ methods to rank the results so that the "best" results come first. How a search engine decides which pages are the best matches, and what order the results should be shown in, varies widely from one engine to another. The methods also change over time as Internet usage changes and new techniques evolve.

Most web search engines are commercial ventures supported by advertising revenue and, as a result, some employ the controversial practice of allowing advertisers to pay money to have their listings ranked higher in search results.

The vast majority of search engines are run by private companies using proprietary algorithms and closed databases, the most popular currently being Google, MSN Search, and Yahoo! Search. However, open source search engine technology does exist, such as Dig, Nutch, Senas, Egothor, OpenFTS, DataparkSearch and many others.
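To make the crawling step described earlier more concrete, here is a minimal Python sketch of a crawler frontier. It is an illustration under stated assumptions, not how any particular engine works: the "web" is a hypothetical in-memory dictionary (LINKS), the example.test URLs and the robots.txt rules are made up, and the crawl() function is a name chosen for this example. Only the robots.txt handling uses a real library, Python's standard urllib.robotparser.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

# Hypothetical in-memory "web": page URL -> links found on that page.
LINKS = {
    "http://example.test/": ["http://example.test/a", "http://example.test/b"],
    "http://example.test/a": ["http://example.test/a/deep",
                              "http://example.test/private/x"],
    "http://example.test/b": ["http://example.test/"],
    "http://example.test/a/deep": [],
    "http://example.test/private/x": [],
}

# Made-up robots.txt content that excludes one directory for all crawlers.
ROBOTS_TXT = ["User-agent: *", "Disallow: /private/"]


def crawl(start, links, robots_lines, depth_first=False):
    """Return the URLs visited, in visiting order.

    depth_first=False treats the frontier as a queue (breadth-first);
    depth_first=True treats it as a stack (depth-first style).
    """
    robots = RobotFileParser()
    robots.parse(robots_lines)

    frontier = deque([start])
    seen = {start}          # URLs already queued, so no page is fetched twice
    order = []
    while frontier:
        url = frontier.pop() if depth_first else frontier.popleft()
        if not robots.can_fetch("*", url):
            continue        # excluded by robots.txt
        order.append(url)   # a real crawler would download and parse here
        for link in links.get(url, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order


if __name__ == "__main__":
    print(crawl("http://example.test/", LINKS, ROBOTS_TXT))
    print(crawl("http://example.test/", LINKS, ROBOTS_TXT, depth_first=True))
```

The only difference between the two modes is which end of the frontier the next URL is taken from. In both modes the page under /private/ is skipped because of the Disallow rule.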
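The indexing and searching steps, including the Boolean operators and proximity search mentioned in the article, can be sketched with a small positional inverted index. This is a simplified illustration, not the method of any real search engine: the three sample documents and the function names build_index, docs_with and near are assumptions made for this example, and ranking is left out entirely.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each word to {doc_id: [word positions in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        for pos, word in enumerate(words):
            index[word].setdefault(doc_id, []).append(pos)
    return index


def docs_with(index, word):
    """Set of document ids that contain the word."""
    return set(index.get(word.lower(), {}))


def near(index, a, b, max_distance):
    """Documents where a and b occur at most max_distance words apart."""
    pos_a = index.get(a.lower(), {})
    pos_b = index.get(b.lower(), {})
    hits = set()
    for doc_id in pos_a.keys() & pos_b.keys():
        if any(abs(i - j) <= max_distance
               for i in pos_a[doc_id] for j in pos_b[doc_id]):
            hits.add(doc_id)
    return hits


# Made-up sample documents.
DOCS = {
    1: "web crawler follows every link",
    2: "the search engine stores an index of web pages",
    3: "a crawler feeds the index used by the search engine",
}

if __name__ == "__main__":
    index = build_index(DOCS)
    web, crawler = docs_with(index, "web"), docs_with(index, "crawler")
    print("web AND crawler:", web & crawler)
    print("web OR crawler: ", web | crawler)
    print("web NOT crawler:", web - crawler)
    print("index NEAR/3 engine:", near(index, "index", "engine", 3))
```

Because each word maps to a set of documents, AND, OR and NOT reduce to set intersection, union and difference. Storing word positions as well is what makes the proximity check possible.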