logo

ALGORITHMS FOR PAGERANK SENSITIVITY DISSERTATION

PDF Publication Title:

ALGORITHMS FOR PAGERANK SENSITIVITY DISSERTATION ( algorithms-for-pagerank-sensitivity-dissertation )

Previous Page View | Next Page View | Return to Search List

Text from PDF Page: 157

nodes. Although gigantic, the network and its subsets have considerable structure. Both libbvg and bvgraph provide an interface for web graphs compressed in the Boldi-Vigna (BV) scheme [Boldi and Vigna, 2005]. With this scheme, web graphs often use fewer than three bits per link. Standard graph storage techniques need more than four bytes per link.20 The Boldi-Vigna compres- sion scheme exploits two empirically observed properties of web graphs to obtain such remarkable compression rates. First, when the URLs of each node are ordered lexicographically, then the link distances (∣i − j∣ where i and j are the indices of the URLs of the link) follow a power-law. Boldi and Vigna designed a special coding scheme to compress these power-law distributed numbers. Second, many URLs within a site repeat the same links. Allow- ing nodes to “copy” links from previous nodes (within a limited window of previous nodes) allows them to compress these structures. Together, these techniques are incredibly effective and stable—they have compressed web graphs collected between 2001 and 2007 at nearly the same rate (three bits per edge). The only problem with their compression scheme is that it makes random access to the out-edges of a node less efficient than sequential (or streaming) access. This occurs because the desired out-edges may reside in a node that copies its links from a previous node, which also copies its links from a previ- ous node, and so on. To combat the “infinite” copying, the implementation imposes an optional limit on the maximum copy depth. Efficient access to large graphs is a powerful experimental tool for algo- rithms like PageRank. In the remainder of this section, we describe the two libraries and how they let us compute on enormous graphs with limited re- sources. 6.5.1 libbvg In libbvg, we provide a C99 interface for web graphs compressed in the Boldi-Vigna scheme. The library enables both in-core and out-of-core access to these graphs and intends to mirror the Java WebGraph framework [Vigna, 2008]. The following program uses the library to extract the degree for each node in a graph and store them in the array degs: bvgraph g = {0}; bvgraph_load(&g, graphfilename, strlen(graphfilename), -1); /* load out-of-core */ bvgraph_iterator iter; unsigned int deg; double *degs = malloc(sizeof(double)*g.n); int rval = bvgraph_nonzero_iterator(&g, &iter); for (; bvgraph_iterator_valid(&iter); bvgraph_iterator_next(&iter)) { bvgraph_iterator_outedges(&iter, NULL, °); *(degs++) = (double)deg; } bvgraph_iterator_free(&iter); The two types bvgraph and bvgraph_iterator encompass most of the func- tionality of the library. The bvgraph type stores the data about the graph and the bvgraph_iterator type sequentially accesses vertices and their edges. 20 This estimate assumes that graphs are stored with compressed sparse row arrays with 32-bit indices (four bytes) and without any compression. 6.5 ⋅ libbvg and bvgraph’s in matlab 137

PDF Image | ALGORITHMS FOR PAGERANK SENSITIVITY DISSERTATION

algorithms-for-pagerank-sensitivity-dissertation-157

PDF Search Title:

ALGORITHMS FOR PAGERANK SENSITIVITY DISSERTATION

Original File Name Searched:

gleich.pdf

DIY PDF Search: Google It | Yahoo | Bing

Cruise Ship Reviews | Luxury Resort | Jet | Yacht | and Travel Tech More Info

Cruising Review Topics and Articles More Info

Software based on Filemaker for the travel industry More Info

The Burgenstock Resort: Reviews on CruisingReview website... More Info

Resort Reviews: World Class resorts... More Info

The Riffelalp Resort: Reviews on CruisingReview website... More Info

CONTACT TEL: 608-238-6001 Email: greg@cruisingreview.com | RSS | AMP