Abstract

Data integration and interoperability pose significant challenges in hydrological and environmental research, largely because of the semantic and structural heterogeneity of massive datasets drawn from non-uniform, autonomous data sources. To address these challenges, we propose a unified data integration framework called the Hydrological Integrated Data Environment (HIDE). HIDE is based on a labeled-tree data integration model referred to as the Datallode tree. Using this framework, the characteristics of datasets gathered from diverse data sources, each with its own logical and access organization, can be extracted, classified as Time-Space-Attribute (TSA) labels, and then arranged in a Datallode tree. The distinguishing feature of our approach is that it combines the semantics of the scientific domain with datasets of differing logical organization to form a unified view. We also adopt a metadata-based approach to specifying the TSA-Datallode tree in order to achieve flexibility and extensibility. The search engine of our HIDE prototype system evaluates a simple user query systematically on the TSA-Datallode tree and presents the integrated results in a standardized format, which facilitates both effective and efficient data integration.
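To make the labeled-tree idea concrete, the following is a minimal sketch of how datasets might be filed under Time-Space-Attribute labels and then queried. It is not the HIDE implementation. The names `Node`, `insert`, and `query`, the level order time, then space, then attribute, and the sample labels and dataset identifiers are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One node of a labeled tree; `label` is a time, space, or attribute value."""
    label: str
    children: Dict[str, "Node"] = field(default_factory=dict)
    datasets: List[str] = field(default_factory=list)  # ids stored at leaf nodes


def insert(root: Node, time: str, space: str, attribute: str, dataset_id: str) -> None:
    """File a dataset under the path time -> space -> attribute (assumed order)."""
    node = root
    for label in (time, space, attribute):
        node = node.children.setdefault(label, Node(label))
    node.datasets.append(dataset_id)


def query(root: Node,
          time: Optional[str] = None,
          space: Optional[str] = None,
          attribute: Optional[str] = None) -> List[str]:
    """Return sorted dataset ids matching the labels; None matches any label."""
    frontier = [root]
    for wanted in (time, space, attribute):
        # Descend one level, keeping only children whose label matches.
        frontier = [
            child
            for node in frontier
            for label, child in node.children.items()
            if wanted is None or label == wanted
        ]
    return sorted(ds for node in frontier for ds in node.datasets)


# Hypothetical sample data, for illustration only.
root = Node("root")
insert(root, "2009", "basin_x", "streamflow", "ds_a")
insert(root, "2009", "basin_x", "precipitation", "ds_b")
insert(root, "2010", "basin_x", "streamflow", "ds_c")
```

With this sample data, `query(root, attribute="streamflow")` collects the matching datasets from both years regardless of which source they came from, which is the kind of unified view the framework aims for.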

  • Publication date: 2010-12-15
  • Affiliation: Virginia Tech, USA