?Introduction Before the inception of Google in the late 1990s, the results obtained from the typical search engine left one to sift through large amounts of irrelevant web pages the just happened to match the search text. Google search listings always seemed to deliver more pertinent results up front. The genius behind the world’s most dominant search engine is its PageRank algorithm, which quantitatively values the relative importance of each webpage. This allows Google to rank the pages, and consequently present the most relevant and useful ones first.Page and Brin describe PageRank a model of user behavior.
They begin with the assumption that a random web surfer who is given a random Web page begins clicking on links, never hitting the “back” button. Whenever he gets bored, he jumps to another random page. The probability that this random surfer visits a specific page is its PageRank. At each page, the algorithm also analyses the probability that the random surfer will get bored and jump to another random page. The unsystematic process that illustrates the surfer’s behavior is known as a Markov Chain.The intrinsic characteristics of Markov’s theorem imply that regardless of the starting point, the probability that our random web surfer lands on a specific page is the same.
Calculating PageRank The searchable web currently has an immense number of nodes (pages) and edges (links). Pages can have both forward links and backlinks. Google takes advantage of this link structure to produce a global ranking of each webpage’s importance. It can be generally implied that a site with a high number of back links is quite important. However, PageRank implements a more sophisticated method for link counting and weighing.
In order to elaborate, we will consider the following simplified web with only four pages: In web of n pages, where each page is indexed by an interger k, where 1 ? k ? n, an arrow from on page A to page B indicates a link from A to B. Xk will be used to denote the importance score of page k in the web. All scores are non-negative (the least possible importance score is 0), and Xj ? Xk indicates that page j has a more important ranking than page k. We could approach arriving at a value for Xk by counting the back links for page k.
Considering the figure here at the left, we have