logo

natomy of a Large-Scale Hypertextual Web Search Engine

PDF Publication Title:

natomy of a Large-Scale Hypertextual Web Search Engine ( natomy-large-scale-hypertextual-web-search-engine )

Previous Page View | Next Page View | Return to Search List

Text from PDF Page: 013

idea of how a change in the ranking function affects the search results. 5 Results and Performance The most important measure of a search engine is the quality of its search results. While a complete user evaluation is beyond the scope of this paper, our own experience with Google has shown it to produce better results than the major commercial search engines for most searches. As an example which illustrates the use of PageRank, anchor text, and proximity, Figure 4 shows Google’s results for a search on "bill clinton". These results demonstrates some of Google’s features. The results are clustered by server. This helps considerably when sifting through result sets. A number of results are from the whitehouse.gov domain which is what one may reasonably expect from such a search. Currently, most major commercial search engines do not return any results from whitehouse.gov, much less the right ones. Notice that there is no title for the first result. This is because it was not crawled. Instead, Google relied on anchor text to determine this was a good answer to the query. Similarly, the fifth result is an email address which, of course, is not crawlable. It is also a result of anchor text. All of the results are reasonably high quality pages and, at last check, none were broken links. This is largely because they all have high PageRank. The PageRanks are the percentages in red along with bar graphs. Finally, there are no results about a Bill other than Clinton or about a Clinton other than Bill. This is because we place heavy importance on the proximity of word occurrences. Of course a true test of the quality of a search engine would involve an extensive user study or results analysis which we do not have room for here. Instead, we invite the reader to try Google for themselves at http://google.stanford.edu. 5.1 Storage Requirements Query: bill clinton http://www.whitehouse.gov/ 100.00% (no date) (0K) http://www.whitehouse.gov/ Office of the President 99.67% (Dec 23 1996) (2K) http://www.whitehouse.gov/WH/EOP/OP/html/OP_Home.html Welcome To The White House 99.98% (Nov 09 1997) (5K) http://www.whitehouse.gov/WH/Welcome.html Send Electronic Mail to the President 99.86% (Jul 14 1997) (5K) http://www.whitehouse.gov/WH/Mail/html/Mail_President.html mailto:president@whitehouse.gov 99.98% mailto:President@whitehouse.gov 99.27% The "Unofficial" Bill Clinton 94.06% (Nov 11 1997) (14K) http://zpub.com/un/un-bc.html Bill Clinton Meets The Shrinks 86.27% (Jun 29 1997) (63K) http://zpub.com/un/un-bc9.html President Bill Clinton - The Dark Side 97.27% (Nov 10 1997) (15K) http://www.realchange.org/clinton.htm $3 Bill Clinton 94.73% (no date) (4K) http://www.gatewy.net/~tjohnson/clinton1.html Figure 4. Sample Results from Google

PDF Image | natomy of a Large-Scale Hypertextual Web Search Engine

natomy-large-scale-hypertextual-web-search-engine-013

PDF Search Title:

natomy of a Large-Scale Hypertextual Web Search Engine

Original File Name Searched:

google-anatomy-of-a-search-engine.pdf

DIY PDF Search: Google It | Yahoo | Bing

Cruise Ship Reviews | Luxury Resort | Jet | Yacht | and Travel Tech More Info

Cruising Review Topics and Articles More Info

Software based on Filemaker for the travel industry More Info

The Burgenstock Resort: Reviews on CruisingReview website... More Info

Resort Reviews: World Class resorts... More Info

The Riffelalp Resort: Reviews on CruisingReview website... More Info

CONTACT TEL: 608-238-6001 Email: greg@cruisingreview.com | RSS | AMP