Evaluating top-k queries over web-accessible databases

Nicolas Bruno, Luis Gravano, Amélie Marian

Research output: Contribution to conference › Paper › peer-review

219 Scopus citations


A query to a web search engine usually consists of a list of keywords, to which the search engine responds with the best or "top" k pages for the query. This top-k query model is prevalent over multimedia collections in general, but also over plain relational data for certain applications. For example, consider a relation with information on available restaurants, including their location, price range for one diner, and overall food rating. A user who queries such a relation might simply specify the user's location and target price range, and expect in return the best 10 restaurants in terms of some combination of proximity to the user, closeness of match to the target price range, and overall food rating. Processing such top-k queries efficiently is challenging for a number of reasons. One critical such reason is that, in many web applications, the relation attributes might not be available other than through external web-accessible form interfaces, which we will have to query repeatedly for a potentially large set of candidate objects. In this paper, we study how to process top-k queries efficiently in this setting, where the attributes for which users specify target values might be handled by external, autonomous sources with a variety of access interfaces. We present several algorithms for processing such queries, and evaluate them thoroughly using both synthetic and real web-accessible data.
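The restaurant example can be made concrete with a small sketch. It is a naive baseline, not one of the paper's algorithms: it assumes every attribute of every object is already available locally, which is exactly what the web-accessible setting does not provide. The `Restaurant` type, the weights, the 0-5 rating scale, the 10 km distance cutoff, and the linear normalizations are all illustrative assumptions that do not come from the paper.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Restaurant:
    # Hypothetical record layout, chosen for illustration only.
    name: str
    distance_km: float  # distance from the user's location
    price: float        # price for one diner
    rating: float       # overall food rating, assumed to lie in [0, 5]


def score(r, target_price, max_distance_km=10.0, weights=(0.4, 0.3, 0.3)):
    """Combine three per-attribute scores in [0, 1] into one weighted score.

    The weights and normalizations are assumptions, not the paper's
    scoring function.
    """
    proximity = max(0.0, 1.0 - r.distance_km / max_distance_km)
    price_match = max(0.0, 1.0 - abs(r.price - target_price) / target_price)
    food = r.rating / 5.0
    w_prox, w_price, w_food = weights
    return w_prox * proximity + w_price * price_match + w_food * food


def top_k(restaurants, k, target_price):
    """Naive baseline: score every object, then keep the k best."""
    return heapq.nlargest(k, restaurants, key=lambda r: score(r, target_price))


if __name__ == "__main__":
    candidates = [
        Restaurant("A", distance_km=1.0, price=20.0, rating=4.5),
        Restaurant("B", distance_km=8.0, price=60.0, rating=5.0),
        Restaurant("C", distance_km=3.0, price=25.0, rating=3.0),
    ]
    for r in top_k(candidates, k=2, target_price=20.0):
        print(r.name, round(score(r, 20.0), 3))
```

Scoring every candidate is cheap here because the data is in memory. When each attribute value must instead be fetched from an external source, every call to `score` would stand for several remote probes, which is the cost the paper's algorithms aim to reduce.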

Original language: English (US)
Number of pages: 12
State: Published - 2002
Externally published: Yes
Event: 18th International Conference on Data Engineering - San Jose, CA, United States
Duration: Feb 26 2002 – Mar 1 2002


Other: 18th International Conference on Data Engineering
Country/Territory: United States
City: San Jose, CA

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Information Systems

