PDF Publication Title:
Text from PDF Page: 143
of the parallel code to manipulate the graph stored by out-edges (the natural order) or by in-edges (the Gauss-Seidel order). Boldi and Vigna’s bvgraph structure [Boldi and Vigna, 2004] efficiently iterates over the edges emanating from a vertex for a fixed ordering of the nodes. These could be either out- or in-edges, depending on how the graph is stored. To implement the parallel matrix-vector multiply for p processors, we make one pass through the file and store p − 1 locations in the structure that roughly divide the edges of the graph evenly between p processors. When the graph is stored by out-edges, the serial matrix-vector operation is xi=x[i]/degree(i); for (j in edges of i) { y[j]+=xi; } which writes to arbitrary and possibly overlapping locations in memory. In the OpenMP version, the update of y becomes the atomic operation xi=x[i]/degree(i); for (j in edges of i) { atomic(y[j]+=xi); }. When the graph is stored by in-edges, the original serial update works without modification as the processors never write to the same location in the vector y. We evaluated a few variants of the matrix-vector multiplication when the graph is stored by out-edges and found the atomic operation variant has similar performance to storing a separate y vector for each processor and aggregating the vectors at the end of the operation. Variants that replaced separate y vectors with separate hash tables were slower in our tests. Figure 5.3 demonstrates the scalability of the codes when the graph is stored by out- and in-edges. In the figure, we distinguish between relative and true speedup. Relative speedup is the time on p processors compared with the time on 1 processor for the same algorithm. True speedup is the time 88 767 4 66 55 44 5 3 33 22 11 42 88 00 1234567812345678 Number of processors Number of processors Figure 5.3 – Parallel performance of the inner-outer iteration. Parallel scalability for the three large graphs arabic-2005, sk-2005, and uk-2007 with the matrix stored by out-edges (left) and in-edges (right). Light gray or teal lines are the relative speedup compared with the same algorithm on one processor. Black lines are the true speedup compared with the best single processor code. At the right of each sub-figure is a small enlargement of the 8 processor results. The parameters wereα = 0.99,β = 0.5,and η = 10−2. In this plot, speedup is the ratio vr/vt. 5.6 ⋅ numerical results 121 linear power relative inout relative 1e−3 power 1e−3 inout 1e−5 power 1e−5 inout 1e−7 power 1e−7 inout linear power relative inout relative 1e−3 power 1e−3 inout 1e−5 power 1e−5 inout 1e−7 power 1e−7 inout Speedup relative to best 1 processor Speedup relative to best 1 processorPDF Image | Instagram Cheat Sheet
PDF Search Title:
Instagram Cheat SheetOriginal File Name Searched:
pagerank-sensitivity-thesis-online.pdfDIY PDF Search: Google It | Yahoo | Bing
Cruise Ship Reviews | Luxury Resort | Jet | Yacht | and Travel Tech More Info
Cruising Review Topics and Articles More Info
Software based on Filemaker for the travel industry More Info
The Burgenstock Resort: Reviews on CruisingReview website... More Info
Resort Reviews: World Class resorts... More Info
The Riffelalp Resort: Reviews on CruisingReview website... More Info
CONTACT TEL: 608-238-6001 Email: greg@cruisingreview.com (Standard Web Page)