PageRank_Hadoop

Goal: Implement PageRank in MapReduce to explore the behavior of an iterative graph algorithm.

Overall Workflow Summary

Pre-processing Job: Turns the input Wikipedia data into a graph represented as adjacency lists.
PageRank Job: 10 iterations of PageRank.
Top-k Job: From the output of the last PageRank iteration, get the 100 pages with the highest PageRank and output them, along with their ranks, from highest to lowest.