Abstract

Resource selection is an important step in a federated search environment. The goal of this work was to improve collection selection by choosing collections on the basis of both relevance and diversity, so as to best answer a user's query. Sampled documents from the Central Sample Database are first ranked by the Indri retrieval algorithm and then re-ranked by a mean-standard deviation method that reduces uncertainty and improves the diversity of collection sources. A comparative evaluation using R-based diversification metrics shows that the proposed method significantly outperforms the baseline diversification methods (ReDDE+MMR and ReDDE+MAP-IA) and the state-of-the-art resource selection methods (ReDDE and CORI) on all metrics.