PDF Publication Title:
Text from PDF Page: 108
88 4 ⋅ random alpha pagerank 4.8.2 PageRank on a large graph The graphs in the previous section are small compared with the size of the true web graph. Now we address computing the quantities on a graph with 78 million nodes and just under 3 billion edges: the uk-2006 web spam test graph [Castillo et al., 2006].29 Our distributions of interest are A1 ∼ Beta(2, 16, [0, 1]) and A2 ∼ Beta(1, 1, [0, 1]). We chose the former because E[A1] = 0.85, the canonical value of α, and the latter because E[A2] = 0.5, a recently proposed alternative value of α. Both of these distributions have small a and support that extends all the way to 1. This makes computing the solution with path damping a difficult proposition, so we choose to use Gaussian quadrature. For A1 we used a 25-point rule, and for A2 we used a 10- point rule. The error bounds on quadrature state that these results may have considerable error from the quadrature approximation. But for big problems, running hundreds of Gauss points is not feasible.30 While the Matlab codes given throughout this chapter handle this graph through the bvgraph package, working in Matlab is roughly half the speed of an optimized computation. Consequently, we used a C++ implementation of the inner-outer iteration to solve the PageRank systems and compute the aggregated solution using a bvgraph structure to hold the graph in mem- ory [Boldi and Vigna, 2004]. The time required for our deterministic solves (tolerance 10−12) was α = 0.85 204 minutes, α = 0.5 51 minutes. Computing the expectation and standard deviation in the RAPr model re- quired A1 6199 minutes, A2 1569 minutes. Our codes solved each PageRank vector to a weighted tolerance of 10−12. This accuracy is far more than required when given the intrinsic error in the quadrature approximation mentioned above. Nevertheless, we might as well get something accurate with these computations when we can. To analyze the output, we use two schemes. First, we apply the truncated τ measure to the expectation and standard deviation vectors (table 4.4). The comparison shows that E [x(A)] ≈ x(E [A]) in terms of ranking and that the standard deviation vectors behave differently under this measure. Interest- ingly, the standard deviation vector for A2 appears to invert the orderings of all other measures and the magnitude of its anti-correlation is much stronger than for A1. 29 Even this graph is tiny compared with the real web graph. 30 In chapter 7, we discuss a few ideas to make the codes more scalable.PDF Image | MODELS AND ALGORITHMS FOR PAGERANK SENSITIVITY
PDF Search Title:
MODELS AND ALGORITHMS FOR PAGERANK SENSITIVITYOriginal File Name Searched:
gleich-pagerank-thesis.pdfDIY PDF Search: Google It | Yahoo | Bing
Cruise Ship Reviews | Luxury Resort | Jet | Yacht | and Travel Tech More Info
Cruising Review Topics and Articles More Info
Software based on Filemaker for the travel industry More Info
The Burgenstock Resort: Reviews on CruisingReview website... More Info
Resort Reviews: World Class resorts... More Info
The Riffelalp Resort: Reviews on CruisingReview website... More Info
CONTACT TEL: 608-238-6001 Email: greg@cruisingreview.com (Standard Web Page)