ALGORITHMS FOR PAGERANK SENSITIVITY DISSERTATION

instead of a potential “spam” webpage. The contributions of this thesis fall into the category of new link analysis algorithms and metrics.

ranking regression
The text and link analyses are used in a ranking function that determines the final order of the results. This function is often generated by a machine learning approach, which selects features that produce rankings corresponding to the pages people think are most important. The details of these ranking functions are not readily available: they are the real trade secrets of the search engines.

produce rankings
Of course, “the last step” is to integrate all the previous analyses on a set of documents that contain the words in the query, and to produce this list in milliseconds.

PageRank, then, is one of a series of link analysis algorithms employed by a search algorithm to help with a single component of the search engine.[3]

It is time to define PageRank informally. Consider someone browsing the web. At every page, the surfer either follows a link on the page to another page or does something else.[4] When following a link, without any other information, the surfer picks a link from the page at random and follows that one. When “doing something else,” the surfer moves to a random page on the web and restarts the surfing. A technical term for “doing something else” is teleporting or resetting, by analogy with teleporting to a location after entering a URL or resetting the browser by closing and opening it. To generate a simple model, we assume both of these behaviors even though they may seem ridiculous. When stated mathematically, this random surfer model is called a Markov chain because the behavior of the surfer depends only upon the current page and not the history of previous pages.
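The Markov property mentioned above can be written in symbols. The notation here is introduced only for illustration and is not taken from the thesis: X_t denotes the page the surfer is viewing at step t, and i, j, i_0, ..., i_{t-1} denote pages.

```latex
\[
  \Pr\bigl(X_{t+1} = j \mid X_t = i,\ X_{t-1} = i_{t-1},\ \dots,\ X_0 = i_0\bigr)
  = \Pr\bigl(X_{t+1} = j \mid X_t = i\bigr).
\]
```

In words, once the current page is known, the earlier pages carry no further information about where the surfer goes next.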
Let α be the probability that the surfer follows a link; then 1 − α is the probability that the surfer “does something else.” Pictorially, the model is illustrated by figure 1.1, where the big black circle represents the current page.

Figure 1.1 – A pictorial illustration of the PageRank model. With probability α the surfer follows one of the three links, represented as arrows, to the bottom three pages, represented as circles. With probability 1 − α the surfer moves to one of the pages in the blob.[5] Jumping into the blob is also known as teleporting or resetting because it can move the surfer anywhere in the web.

[3] As a personal aside, I hope this setup properly contextualizes my research for those who ask if my plan is to “start the next Google” once I tell them I’m working on PageRank. I’m not.

[4] Modern web browsers open the possibility of doing both of these activities through the use of tabbed browsing.

[5] For those deeply familiar with the PageRank model, this explanation is slightly inaccurate but it contains the essential pieces. We will formalize everything in due course.
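The informal random surfer model can be tried out with a short simulation. What follows is a minimal sketch under stated assumptions, not the thesis's algorithm: the function name `simulate_random_surfer`, the four-page `toy_web` graph, the value α = 0.85, the uniform choice of teleport target, and the rule that a page with no out-links always teleports are all illustrative choices made here.

```python
import random
from collections import Counter


def simulate_random_surfer(links, alpha=0.85, steps=100_000, seed=0):
    """Estimate how often a random surfer visits each page.

    links: dict mapping each page to a list of pages it links to.
    alpha: probability of following a link (assumed value, for illustration).
    """
    rng = random.Random(seed)      # fixed seed so runs are repeatable
    pages = sorted(links)
    current = rng.choice(pages)    # start at a random page
    visits = Counter()
    for _ in range(steps):
        visits[current] += 1
        out_links = links[current]
        if out_links and rng.random() < alpha:
            # Follow one of the current page's links, chosen at random.
            current = rng.choice(out_links)
        else:
            # "Do something else": teleport to a uniformly random page.
            # Assumption: pages without out-links always teleport.
            current = rng.choice(pages)
    return {page: visits[page] / steps for page in pages}


# Hypothetical four-page web; page "d" has no incoming links.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

freqs = simulate_random_surfer(toy_web)
for page, freq in freqs.items():
    print(f"{page}: {freq:.3f}")
```

Because the next page is chosen using only `current`, the simulation is a Markov chain in the sense described above. In this toy graph, page "d" is reached only by teleporting, so its visit frequency should settle near (1 − α)/4, while the heavily linked page "c" is visited far more often.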

Original File Name Searched:

gleich.pdf
