摘要

We introduce a new combinatorial problem referred to as the component set identification problem, arising in the context of knowledge discovery, information integration, and knowledge source/service composition. The main motivation for studying this problem is the widespread proliferation of digital knowledge sources and services. Considering a granular knowledge domain consisting of a large number of individual bits and pieces of domain knowledge (properties) and a large number of knowledge sources and services that provide mappings between sets of properties, the objective of the component set identification problem is to select a minimum cost combination of knowledge sources that can provide a joint mapping from a given set of initially available properties (initial knowledge) to a set of initially unknown properties (target knowledge). We position the component set identification problem relative to other combinatorial problems and provide a classification scheme for the different variations of the problem. The problem is next modeled on a directed graph and analyzed in terms of its complexity. The directed graph representation is then augmented and transformed into a time-expanded network representation that is subsequently used to develop an exact solution procedure based on integer programming and branch-and-bound. We enhance the solver by developing preprocessing techniques. All findings are supported by computational experiments.