Chinese abstract |
Fusion and clustering are two approaches to improving the effectiveness of
information retrieval (IR). In fusion, ranked lists are combined by various
means. The motivation is that different IR systems will complement each other,
because they usually emphasize different query features when determining
relevance and retrieve different sets of documents. In clustering, documents are
grouped either before or after retrieval. The motivation is that similar documents
tend to be relevant to the same query, so identifying clusters of similar documents
is likely to surface more relevant documents. In this
paper, we present a novel fusion technique that can be combined with clustering to
achieve consistent improvements in retrieval effectiveness over conventional approaches. Our method
involves three steps: (1) clustering similar documents, (2) re-ranking retrieval
results, and (3) combining retrieval results. |