High performance sorting and searching using graphics processors
Download
1 / 115

high performance sorting and searching - PowerPoint PPT Presentation


  • 318 Views
  • Updated On :

High Performance Sorting and Searching using Graphics Processors. Naga K. Govindaraju Microsoft Concurrency. Sorting and Searching. “I believe that virtually every important aspect of programming arises somewhere in the context of sorting or searching !” -Don Knuth. Sorting and Searching.

Related searches for high performance sorting and searching

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'high performance sorting and searching ' - ryanadan


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
High performance sorting and searching using graphics processors l.jpg

High Performance Sorting and Searching using Graphics Processors

Naga K. GovindarajuMicrosoft Concurrency


Sorting and searching l.jpg
Sorting and Searching

“I believe that virtually every important aspect of programming arises somewhere in the context of sorting or searching!”

-Don Knuth


Sorting and searching3 l.jpg
Sorting and Searching

  • Well studied

    • High performance computing

    • Databases

    • Computer graphics

    • Programming languages

    • ...

  • Google map reduce algorithm

  • Spec benchmark routine!


Massive databases l.jpg
Massive Databases

  • Terabyte-data sets are common

    • Google sorts more than 100 billion terms in its index

    • > 1 Trillion records in web indexed!

  • Database sizes are rapidly increasing!

    • Max DB sizes increases 3x per year (http://www.wintercorp.com)

    • Processor improvements not matching information explosion


Cpu vs gpu l.jpg

CPU(3 GHz)

AGP Memory(512 MB)

CPU vs. GPU

GPU (690 MHz)

Video Memory(512 MB)

2 x 1 MB Cache

System Memory(2 GB)

PCI-E Bus(4 GB/s)

GPU (690 MHz)

Video Memory(512 MB)


Massive data handling on cpus l.jpg
Massive Data Handling on CPUs

  • Require random memory accesses

    • Small CPU caches (< 2MB)

    • Random memory accesses slower than even sequential disk accesses

  • High memory latency

    • Huge memory to compute gap!

  • CPUs are deeply pipelined

    • Pentium 4 has 30 pipeline stages

    • Do not hide latency - high cycles per instruction (CPI)

  • CPU is under-utilized for data intensive applications


Massive data handling on cpus7 l.jpg
Massive Data Handling on CPUs

  • Sorting is hard!

  • GPU a potentially scalable solution to terabyte sorting and scientific computing

    • We beat the sorting benchmark with GPUs and provide a scaleable solution


G raphics p rocessing u nits gpu s l.jpg
Graphics Processing Units (GPUs)

  • Commodity processor for graphics applications

  • Massively parallel vector processors

  • High memory bandwidth

    • Low memory latency pipeline

    • Programmable

  • High growth rate

  • Power-efficient


Gpu commodity processor l.jpg
GPU: Commodity Processor

Laptops

Consoles

Cell phones

PSP

Desktops


G raphics p rocessing u nits gpu s10 l.jpg
Graphics Processing Units (GPUs)

  • Commodity processor for graphics applications

  • Massively parallel vector processors

    • 10x more operations per sec than CPUs

  • High memory bandwidth

    • Better hides memory latency pipeline

    • Programmable

  • High growth rate

  • Power-efficient


Parallelism on gpus l.jpg
Parallelism on GPUs

Graphics FLOPS

GPU – 1.3 TFLOPS

CPU – 25.6 GFLOPS


G raphics p rocessing u nits gpu s12 l.jpg
Graphics Processing Units (GPUs)

  • Commodity processor for graphics applications

  • Massively parallel vector processors

  • High memory bandwidth

    • Better hides latency pipeline

    • Programmable

    • 10x more memory bandwidth than CPUs

  • High growth rate

  • Power-efficient


Graphics pipeline l.jpg

Low pipeline depth

Graphics Pipeline

56 GB/s

programmable vertex

processing (fp32)

vertex

polygon setup,

culling, rasterization

setup

polygon

rasterizer

Hides memory latency!!

programmable per-

pixel math (fp32)

pixel

per-pixel texture,

fp16 blending

texture

Z-buf, fp16 blending,

anti-alias (MRT)

memory

image


Non graphics pipeline abstraction l.jpg
NON-Graphics Pipeline Abstraction

programmable MIMD

processing (fp32)

data

Courtesy: David Kirk,Chief Scientist, NVIDIA

SIMD

“rasterization”

setup

lists

rasterizer

programmable SIMD

processing (fp32)

data

data fetch,

fp16 blending

data

predicated write, fp16

blend, multiple output

memory

data


G raphics p rocessing u nits gpu s15 l.jpg
Graphics Processing Units (GPUs)

  • Commodity processor for graphics applications

  • Massively parallel vector processors

  • High memory bandwidth

    • Better hides latency pipeline

    • Programmable

  • High growth rate

  • Power-efficient


Slide16 l.jpg

GPU Growth Rate

CPU Growth Rate

Exploiting Technology Moving

Faster than Moore’s Law


G raphics p rocessing u nits gpu s17 l.jpg
Graphics Processing Units (GPUs)

  • Commodity processor for graphics applications

  • Massively parallel vector processors

  • High memory bandwidth

    • Better hides latency pipeline

    • Programmable

  • High growth rate

  • Power-efficient


Cpu vs gpu henry moreton nvidia aug 2005 l.jpg
CPU vs. GPU(Henry Moreton: NVIDIA, Aug. 2005)


Gpus for sorting and searching issues l.jpg
GPUs for Sorting and Searching: Issues

  • No support for arbitrary writes

    • Optimized CPU algorithms do not map!

  • Lack of support for general data types

  • Cache-efficient algorithms

    • Small data caches

    • No cache information from vendors

  • Out-of-core algorithms

    • Limited GPU memory


Outline l.jpg
Outline

  • Overview

  • Sorting and Searching on GPUs

  • Applications

  • Conclusions and Future Work


Sorting on gpus l.jpg
Sorting on GPUs

  • Adaptive sorting algorithms

    • Extent of sorted order in a sequence

  • General sorting algorithms

  • External memory sorting algorithms


Adaptive sorting on gpus l.jpg
Adaptive Sorting on GPUs

  • Prior adaptive sorting algorithms require random data writes

  • GPUs optimized for minimum depth or visible surface computation

    • Using depth test functionality

  • Design adaptive sorting using only minimum computations


Adaptive sorting algorithm l.jpg

N. Govindaraju, M. Henson, M. Lin and D. Manocha, Proc. Of ACM I3D, 2005

Adaptive Sorting Algorithm

  • Multiple iterations

  • Each iteration uses a two pass algorithm

    • First pass – Compute an increasing sequence M

    • Second pass - Compute the sorted elements in M

    • Iterate on the remaining unsorted elements


Increasing sequence l.jpg
Increasing Sequence

Given a sequence S={x1,…, xn}, an element xi belongs to M if and only if xi ≤ xj, i<j, xj in S


Increasing sequence25 l.jpg

Increasing Sequence

X1 X2 X3… Xi-1 Xi Xi+1 … Xn-2 Xn-1 Xn

M is an increasing sequence


Increasing sequence computation l.jpg

Compute

Increasing SequenceComputation

X1 X2 … Xi-1 Xi Xi+1 … Xn-1 Xn


Slide27 l.jpg

Compute

Xn≤ ∞

Increasing SequenceComputation

Xn


Slide28 l.jpg

Compute

Yes. Prepend xi to MMin = xi

xi≤Min?

Increasing SequenceComputation

XiXi+1 … Xn-1 Xn


Slide29 l.jpg

Compute

Increasing SequenceComputation

X1 X2 … Xi-1 Xi Xi+1 … Xn-1 Xn

x1≤{x2,…,xn}?


Computing sorted elements l.jpg
Computing Sorted Elements

Theorem 1:Given the increasing sequence M, rank of an element xi in M is determined if xi < min (I-M)


Computing sorted elements31 l.jpg
Computing Sorted Elements

X1 X2 X3… Xi-1 Xi Xi+1 … Xn-2 Xn-1 Xn


Computing sorted elements32 l.jpg

Computing Sorted Elements

X1X3… XiXi+1 … Xn-2 Xn-1

X2 … Xi-1 … Xn


Computing sorted elements33 l.jpg
Computing Sorted Elements

  • Linear-time algorithm

    • Maintaining minimum


Computing sorted elements34 l.jpg

Compute

Computing Sorted Elements

X1X3… XiXi+1 … Xn-2 Xn-1

X2 … Xi-1 … Xn


Computing sorted elements35 l.jpg

Compute

No. Update min

Xi in M?

Yes. Append Xito sorted list

Xi≤ min?

Computing Sorted Elements

X1 X2 … Xi-1Xi


Computing sorted elements36 l.jpg

Compute

Computing Sorted Elements

X1 X2 … Xi-1 Xi Xi+1 … Xn-1Xn


Algorithm analysis l.jpg
Algorithm Analysis

Knuth’s measure of disorder:

Given a sequence I and its longest increasing sequence LIS(I), the sequence of disordered elements Y = I - LIS(I)

Theorem 2: Given a sequence I and LIS(I), our adaptive algorithm sorts in at most (2 ||Y|| + 1) iterations


Pictorial proof l.jpg
Pictorial Proof

X1 X2 …Xl Xl+1 Xl+2 …Xm Xm+1 Xm+2 ...Xq Xq+1 Xq+2 ...Xn


Pictorial proof39 l.jpg

2 iterations

2 iterations

2 iterations

Pictorial Proof

X1 X2 …XlXl+1 Xl+2 …XmXm+1 Xm+2 ...XqXq+1 Xq+2 ...Xn


Example l.jpg

Example

8

1 2 3 4 5 6 7 9


Example41 l.jpg

Sorted

Example

8

9

1 2 3 4 5 6 7



Advantages l.jpg
Advantages

  • Linear in the input size and sorted extent

    • Works well on almost sorted input

  • Maps well to GPUs

    • Uses depth test functionality for minimum operations

    • Useful for performing 3D visibility ordering

    • Perform transparency computations on dynamic 3D environments

  • Cons:

    • Expected time: O(n2 – 2 n√n) on random sequences


Video transparent powerplant l.jpg
Video: Transparent PowerPlant

  • 790K polygons

  • Depth complexity ~ 13

  • 1600x1200 resolution

  • NVIDIA GeForce 6800

  • 5-8 fps



General sorting on gpus l.jpg

N. Govindaraju,N. Raghuvanshi and D. Manocha, Proc. Of ACM SIGMOD, 2005

General Sorting on GPUs

  • General datasets

  • High performance


General sorting on gpus47 l.jpg
General Sorting on GPUs

  • Design sorting algorithms with deterministic memory accesses – “Texturing” on GPUs

    • 56 GB/s peak memory bandwidth

    • Can better hide the memory latency!!

  • Require minimum and maximum computations – “Blending functionality” on GPUs

    • Low branching overhead

  • No data dependencies

    • Utilize high parallelism on GPUs


Gpu based sorting networks l.jpg
GPU-Based Sorting Networks

  • Represent data as 2D arrays

  • Multi-stage algorithm

    • Each stage involves multiple steps

  • In each step

    • Compare one array element against exactly one other element at fixed distance

    • Perform a conditional assignment (MIN or MAX) at each element location



2d memory addressing l.jpg
2D Memory Addressing

  • GPUs optimized for 2D representations

    • Map 1D arrays to 2D arrays

    • Minimum and maximum regions mapped to row-aligned or column-aligned quads



1d 2d mapping52 l.jpg

MIN

1D – 2D Mapping

Effectively reduce instructions per element


Sorting on gpu pipelining and parallelism l.jpg
Sorting on GPU: Pipelining and Parallelism

Input Vertices

Texturing, Caching and 2D Quad

Comparisons

Sequential Writes


Comparison with gpu based algorithms l.jpg
Comparison with GPU-Based Algorithms

3-6x faster than prior GPU-based algorithms!


Gpu vs high end multi core cpus l.jpg
GPU vs. High-End Multi-Core CPUs

2-2.5x faster thanIntel high-end processors

Single GPU performance comparable tohigh-end dual core Athlon

Hand-optimized CPU code from Intel Corporation!


Gpu vs high end multi core cpus56 l.jpg
GPU vs. High-End Multi-Core CPUs

2-2.5x faster thanIntel high-end processors

Single GPU performance comparable tohigh-end dual core Athlon

Slash Dot and Toms Hardware Guide Headlines, June 2005


Gpu cache model l.jpg

N. Govindaraju,S. Larsen,J. Gray, and D. Manocha, Proc. Of ACM SuperComputing, 2006

GPU Cache Model

  • Small data caches

    • Better hide the memory latency

    • Vendors do not disclose cache information – critical for scientific computing on GPUs

  • We design simple model

    • Determine cache parameters (block and cache sizes)

    • Improve sorting, FFT and SGEMM performance


Cache evictions l.jpg
Cache Evictions

Cache

Cache Eviction

Cache Eviction

Cache Eviction


Cache issues l.jpg
Cache issues

h

Cache misses per step = 2 W H/ (h B)

Cache

Cache Eviction

Cache Eviction

Cache Eviction


Analysis60 l.jpg
Analysis

  • lg n possible steps in bitonic sorting network

  • Step k is performed (lg n – k+1) times and h = 2k-1

  • Data fetched from memory = 2 n f(B) where f(B)=(B-1) (lg n -1) + 0.5 (lg n –lg B)2






Super moore s law growth l.jpg
Super-Moore’s Law Growth

50 GB/s on a single GPU

Peak Performance: Effectively hide memory latency with 15 GOP/s


External memory sorting l.jpg

N. Govindaraju,J. Gray, R. Kumar and D. Manocha, Proc. of ACM SIGMOD 2006 (to appear)

External Memory Sorting

  • Performed on Terabyte-scale databases

  • Two phases algorithm [Vitter01, Salzberg90, Nyberg94, Nyberg95]

    • Limited main memory

    • First phase – partitions input file into large data chunks and writes sorted chunks known as “Runs”

    • Second phase – Merge the “Runs” to generate the sorted file


External memory sorting67 l.jpg
External Memory Sorting

  • Performance mainly governed by I/O

    Salzberg Analysis: Given the main memory size M and the file size N, if the I/O read size per run is T in phase 2, external memory sorting achieves efficient I/O performance if the run size R in phase 1 is given by R ≈ √(TN)


Salzberg analysis l.jpg
Salzberg Analysis

  • If N=100GB, T=2MB, then R ≈ 230MB

  • Large data sorting is inefficient on CPUs

    • R » CPU cache sizes – memory latency


External memory sorting69 l.jpg
External memory sorting

  • External memory sorting on CPUs can have low performance due to

    • High memory latency

    • Or low I/O performance

  • Our algorithm

    • Sorts large data arrays on GPUs

    • Perform I/O operations in parallel on CPUs



I o performance l.jpg
I/O Performance

Salzberg Analysis: 100 MB Run Size


I o performance72 l.jpg
I/O Performance

Salzberg Analysis: 100 MB Run Size

Pentium IV: 25MB Run Size

Less work and only 75% IO efficient!


I o performance73 l.jpg
I/O Performance

Salzberg Analysis: 100 MB Run Size

Dual 3.6 GHz Xeons: 25MB Run size

More cores, less work but only 85% IO efficient!


I o performance74 l.jpg
I/O Performance

Salzberg Analysis: 100 MB Run Size

7800 GT: 100MB run size

Ideal work, and 92% IO efficient with single CPU!


Task parallelism l.jpg
Task Parallelism

Performance limited by IO and memory


Overall performance l.jpg
Overall Performance

Faster and more scalable than Dual Xeon processors (3.6 GHz)!


Performance l.jpg
Performance/$

1.8x faster than current Terabyte sorter

World’s best performance/$ system


Advantages78 l.jpg
Advantages

  • Exploit high memory bandwidth on GPUs

    • Higher memory performance than CPU-based algorithms

  • High I/O performance due to large run sizes


Advantages79 l.jpg
Advantages

  • Offload work from CPUs

    • CPU cycles well-utilized for resource management

  • Scalable solution for large databases

  • Best performance/price solution for terabyte sorting


Searching for quantiles l.jpg

N. Govindaraju,B. Lloyd, W. Wang, M. Lin and D. Manocha, Proc. Of ACM SIGMOD, 2004

Searching for Quantiles

  • Given a set of values in the GPU, compute the Kth-largest number

  • Traditional CPU algorithms require arbitrary data writes - require new algorithm without

    • Data rearrangement

    • Data readback to CPU

  • Our solution – search for the Kth-largest number


K th largest number l.jpg
K-th Largest Number

  • Let vk denote the k-th largest number

  • How do we generate a number m equal to vk?

    • Without knowing vk’s value

    • Count the number of values ≥ some given value

    • Starting from the most significant bit, determine the value of each bit at a time


K th largest number82 l.jpg
K-th Largest Number

  • Given a set S of values

    • c(m) —number of values ≥ m

    • vk — the k-th largest number

  • We have

    • If c(m) ≥ k, then m ≤ vk

    • If c(m) < k, then m > vk

  • c(m) computed using occlusion queries


2 nd largest in 9 values l.jpg
2nd Largest in 9 Values

0011

1011

1101

m = 0000

v2 = 1011

0111

0101

0001

0111

1010

0010


Draw a quad at depth 8 compute c 1000 l.jpg
Draw a Quad at Depth 8 Compute c(1000)

0011

1011

1101

m = 1000

v2 = 1011

0111

0101

0001

0111

1010

0010


1 st bit 1 l.jpg
1st bit = 1

0011

1011

1101

m = 1000

v2 = 1011

0111

0101

0001

c(m) = 3

0111

1010

0010


Draw a quad at depth 12 compute c 1100 l.jpg
Draw a Quad at Depth 12 Compute c(1100)

0011

1011

1101

m = 1100

v2 = 1011

0111

0101

0001

0111

1010

0010


2 nd bit 0 l.jpg
2nd bit = 0

0011

1011

1101

m = 1100

v2 = 1011

0111

0101

0001

c(m) = 1

0111

1010

0010


Draw a quad at depth 10 compute c 1010 l.jpg
Draw a Quad at Depth 10 Compute c(1010)

0011

1011

1101

m = 1010

v2 = 1011

0111

0101

0001

0111

1010

0010


3 rd bit 1 l.jpg
3rd bit = 1

0011

1011

1101

m = 1010

v2 = 1011

0111

0101

0001

c(m) = 3

0111

1010

0010


Draw a quad at depth 11 compute c 1011 l.jpg
Draw a Quad at Depth 11 Compute c(1011)

0011

1011

1101

m = 1011

v2 = 1011

0111

0101

0001

0111

1010

0010


4 th bit 1 l.jpg
4th bit = 1

0011

1011

1101

m = 1011

v2 = 1011

0111

0101

0001

c(m) = 2

0111

1010

0010


Our algorithm l.jpg
Our algorithm

  • Initialize m to 0

  • Start with the MSB and scan all bits till LSB

  • At each bit, put 1 in the corresponding bit-position of m

  • If c(m) < k, make that bit 0

  • Proceed to the next bit



Median l.jpg
Median

3x performance improvement per year!


Application collision detection l.jpg
Application: Collision Detection

  • Compute the overlapping geometric primitives

  • Fundamental to

    • Robotics and motion planning, computer graphics, computational geometry, computer animations, virtual reality, etc.

  • Overlap computation is a sorting problem!


Application collision detection96 l.jpg
Application: Collision Detection

  • Exploit interactivity in graphics applications

    • Objects do not move significantly – use coherence

    • Adaptive visibility ordering for dynamic data sets!

  • Define precedence based on visibility

    • An object O precedes a set of objects S if it is completely in front of S


Visibility of objects l.jpg

View

O

Visibility of Objects

O1

O2


Visibility for collisions geometric interpretation l.jpg

View

O

Visibility for Collisions: Geometric Interpretation

Sufficient but not a necessary condition for existence of separating surface with unit depth complexity

More culling efficiency than bounding volumes

O1

O2


Interactive cloth simulation l.jpg

Govindaraju et al., Proc. Of ACM SIGGRAPH, 2005

Interactive Cloth Simulation

  • 450 ms per frame

  • Over 100x performance improvement over prior algorithms


Numerical algorithms on gpus l.jpg
Numerical Algorithms on GPUs

  • Fundamental to scientific, data mining, multimedia and high performance computing applications

  • Highly compute- and memory-intensive

    • LU, QR and singular value decomposition O(n3)

    • SORT O(n log2n)

    • FFT O(n logn)

    • SGEMM O(n3)


Lu decompostion l.jpg

N. Galoppo, N. Govindaraju,M. Henson and D. Manocha, Proc. of ACM SuperComputing 2005

LU-Decompostion

Faster than Atlas!

(Aug 2004)

(Jun 2005)


Numerical algorithms on gpus102 l.jpg
Numerical Algorithms on GPUs

Slash Dot News Headlines, May 2006


Numerical libraries l.jpg
Numerical Libraries

  • GPUSORT – http://gamma.cs.unc.edu/GPUSORT(Over 1200 downloads since 07/2005)

  • GPUFFTW – http://gamma.cs.unc.edu/GPUFFTW(Over 900 downloads since 06/2006)

  • LUGPU – http://gamma.cs.unc.edu/LUGPULIB(Over 300 downloads since 11/2005)


Conclusions l.jpg
Conclusions

  • Designed new sorting algorithms on GPUs

    • Adaptive, general and external memory sorting algorithms

    • Applied to interactive graphics applications

  • Presented novel GPU memory models

    • Memory efficient sorting algorithm with peak memory performance of (50 GB/s) on GPUs

    • Applicable to all scientific computing algorithms on GPUs and GPU-like architectures


Conclusions105 l.jpg
Conclusions

  • Novel external memory sorting algorithm as a scalable solution

    • Achieves peak I/O performance on CPUs

    • Best performance/price solution – world’s fastest sorting system

  • High performance growth rate characteristics

    • Improve 2-3 times/yr


Future work power and heat l.jpg
Future Work: Power and Heat

  • Designed high performance/price solutions

    • High wattage of CPUs - major scalability bottleneck!

    • Effective cooling solutions for CPUs

  • Design high performance/watt solutions using GPUs

    • Two orders of magnitude more power efficient!

    • Improve the reliability, fault tolerance and stability

  • Adaptive power management algorithms on GPUs


Future work scientific libraries l.jpg
Future Work: Scientific libraries

  • Scientific libraries utilizing high parallelism and memory bandwidth

    • Scientific routines on LU, QR, SVD, FFT, etc.

    • BLAS library on GPUs

    • Build GPU-LAPACK and Matlab routines

  • Multi-cores are emerging

    • Exploit parallelism in multiple CPUs and GPUs;

    • Quad-GPU solutions provide > 200GB/s


Future work l.jpg
Future Work

  • Design and analyze new scientific and HPC algorithms

    • Exploit processor parallelism

    • Lower memory latency


Acknowledgements l.jpg
Acknowledgements

Research Sponsors:

  • Army Research Office

  • Defense and Advanced Research Projects Agency

  • National Science Foundation

  • Naval Research Laboratory

  • Intel Corporation

  • RDECOM


Acknowledgements110 l.jpg
Acknowledgements

  • NVIDIA, ATI and Microsoft Corporation

    • Architecture

      • David Kirk, Michael Doggett, Steven Molnar

    • Drivers

      • Evan Hart, Paul Keller, Nick Triantos

    • Developer relations

      • Mark Harris, Mark Segal

    • Microsoft DirectX

      • David Blythe, Craig Peeper, Peter-Pike Sloan


Acknowledgements111 l.jpg
Acknowledgements

Collaborators:

  • Anastassia Ailamaki (CMU)

  • Ian Buck (NVIDIA)

  • Jim Gray (MSR)

  • Ming Lin (UNC Chapel Hill)

  • David Luebke (NVIDIA)

  • Dinesh Manocha (UNC)

  • John Owens (UC Davis)

  • Timothy Purcell (NVIDIA)

  • Stephane Redon (INRIA)

  • Rasmus Tamstorf (Walt Disney Animation Feature)


Acknowledgements112 l.jpg
Acknowledgements

  • Supporters:

    • Fred Brooks (UNC Chapel Hill)

    • Pat Hanrahan (Stanford University)

    • Michael Macedonia (ARDA)

    • Jan Prins (UNC Chapel Hill)

    • David Tarditi (Microsoft Research)

    • Jingren Zhou (Microsoft Research)

    • Faculty members of UNC Chapel Hill


Acknowledgements113 l.jpg
Acknowledgements

  • Ritesh Kumar

  • Brandon Lloyd

  • Nikunj Raghuvanshi

  • Avneesh Sud

  • Talha Zaman

  • Students:

    • Nico Galoppo

    • Russel Gayle

    • Michael Henson

    • Nitin Jain

    • Ilknur Kabul

  • UNC Systems, GAMMA and Walkthrough groups


Thank you l.jpg
Thank You

  • Questions or Comments?

[email protected]

http://www.cs.unc.edu/~naga


Page rank update l.jpg
Page rank update

  • http://www.webmasterworld.com/forum3/2657.htmhttp://www.dnlodge.com/dnl-resource-collection/1291-google-page-rank-update-history.html


ad