Rank-stability and rank-similarity of link-based Web ranking algorithms in authority-connected graphs
Abstract
Web search algorithms that rank Web pages by examining the link structure of the Web are attractive from both theoretical and practical aspects. Today's prevailing link-based ranking algorithms rank Web pages by using the dominant eigenvector of certain matrices-like the co-citation matrix or variations thereof. Recent analyses of ranking algorithms have focused attention on the case where the corresponding matrices are irreducible, thus avoiding singularities of reducible matrices. Consequently, rank analysis has been concentrated on authority connected graphs, which are graphs whose co-citation matrix is irreducible (after deleting zero rows and columns). Such graphs conceptually correspond to thematically related collections, in which most pages pertain to a single, dominant topic of interest. A link-based search algorithm A is rank-stable if minor changes in the link structure of the input graph, which is usually a subgraph of the Web, do not affect the ranking it produces; algorithms A,B are rank-similar if they produce similar rankings. These concepts were introduced and studied recently for various existing search algorithms. This paper studies the rank-stability and rank-similarity of three link-based ranking algorithms-PageRank, HITS and SALSA-in authority connected graphs. For this class of graphs, we show that neither HITS nor PageRank is rank stable. We then show that HITS and PageRank are not rank similar on this class, nor is any of them rank similar to SALSA. © 2005 Springer Science + Business Media, Inc.