
PDF Publication Title:
Text from PDF Page: 157
nodes. Although gigantic, the network and its subsets have considerable structure. Both libbvg and bvgraph provide an interface for web graphs compressed in the Boldi-Vigna (BV) scheme [Boldi and Vigna, 2005]. With this scheme, web graphs often use fewer than three bits per link. Standard graph storage techniques need more than four bytes per link.20 The Boldi-Vigna compres- sion scheme exploits two empirically observed properties of web graphs to obtain such remarkable compression rates. First, when the URLs of each node are ordered lexicographically, then the link distances (∣i − j∣ where i and j are the indices of the URLs of the link) follow a power-law. Boldi and Vigna designed a special coding scheme to compress these power-law distributed numbers. Second, many URLs within a site repeat the same links. Allow- ing nodes to “copy” links from previous nodes (within a limited window of previous nodes) allows them to compress these structures. Together, these techniques are incredibly effective and stable—they have compressed web graphs collected between 2001 and 2007 at nearly the same rate (three bits per edge). The only problem with their compression scheme is that it makes random access to the out-edges of a node less efficient than sequential (or streaming) access. This occurs because the desired out-edges may reside in a node that copies its links from a previous node, which also copies its links from a previ- ous node, and so on. To combat the “infinite” copying, the implementation imposes an optional limit on the maximum copy depth. Efficient access to large graphs is a powerful experimental tool for algo- rithms like PageRank. In the remainder of this section, we describe the two libraries and how they let us compute on enormous graphs with limited re- sources. 6.5.1 libbvg In libbvg, we provide a C99 interface for web graphs compressed in the Boldi-Vigna scheme. The library enables both in-core and out-of-core access to these graphs and intends to mirror the Java WebGraph framework [Vigna, 2008]. The following program uses the library to extract the degree for each node in a graph and store them in the array degs: bvgraph g = {0}; bvgraph_load(&g, graphfilename, strlen(graphfilename), -1); /* load out-of-core */ bvgraph_iterator iter; unsigned int deg; double *degs = malloc(sizeof(double)*g.n); int rval = bvgraph_nonzero_iterator(&g, &iter); for (; bvgraph_iterator_valid(&iter); bvgraph_iterator_next(&iter)) { bvgraph_iterator_outedges(&iter, NULL, °); *(degs++) = (double)deg; } bvgraph_iterator_free(&iter); The two types bvgraph and bvgraph_iterator encompass most of the func- tionality of the library. The bvgraph type stores the data about the graph and the bvgraph_iterator type sequentially accesses vertices and their edges. 20 This estimate assumes that graphs are stored with compressed sparse row arrays with 32-bit indices (four bytes) and without any compression. 6.5 ⋅ libbvg and bvgraph’s in matlab 137PDF Image | ALGORITHMS FOR PAGERANK SENSITIVITY DISSERTATION
PDF Search Title:
ALGORITHMS FOR PAGERANK SENSITIVITY DISSERTATIONOriginal File Name Searched:
gleich.pdfDIY PDF Search: Google It | Yahoo | Bing
Cruise Ship Reviews | Luxury Resort | Jet | Yacht | and Travel Tech More Info
Cruising Review Topics and Articles More Info
Software based on Filemaker for the travel industry More Info
The Burgenstock Resort: Reviews on CruisingReview website... More Info
Resort Reviews: World Class resorts... More Info
The Riffelalp Resort: Reviews on CruisingReview website... More Info
| CONTACT TEL: 608-238-6001 Email: greg@cruisingreview.com | RSS | AMP |