Consider a directed rooted tree T=(V,E) of maximal degree d representing a collection V of web pages connected via a set E of links all reachable from a source home page, represented by the root of T. Each leaf web page carries a weight representative of the frequency with which it is visited. By adding hotlinks, shortcuts from a node to one of its descendents, we are interested in minimizing the expected number of steps needed to visit the leaf pages from the home page. We give an O(N2) time algorithm for assigning hotlinks so that the expected number of steps to reach the leaves from the root of the tree is at most H(p)/(log(d+1)-(d/(d+1))logd)+(d+1)/d, where H(p) is the entropy of the probability (frequency) distribution p=〈p1,p 2,⋯,pN〉 on the N leaves of the given tree, i.e., pi is the weight on the ith leaf. The best known lower bound for this problem is H(p)/log(d+1). We also show how to adapt our algorithm to complete trees of a given degree d and in this case we prove it is optimal, asymptotically in d.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Signal Processing
- Information Systems
- Computer Science Applications
- Probability distribution