Approach to Classifying Freight Data Elements Across Multiple Data Sources

Authors: Seedah Dan P K*; Sankaran Bharathwaj; O'Brien William J
Source: Transportation Research Record, 2015, 2529(2529): 56-65.
DOI: 10.3141/2529-06

Abstract

Multiple freight data sources, both public and private, are available to practitioners for understanding freight demand and evaluating current and future freight transportation capacity. The challenge of working with multiple data sources is dealing with the syntactic and semantic heterogeneity in these sources. To assist practitioners in addressing this challenge, a unified perspective from which data elements from multiple sources can be examined is proposed. The role-based classification schema (RBCS) organizes and classifies data elements within their respective parent databases such that similar data elements across multiple sources can be grouped. RBCS is based on two levels of classification: a primary group that characterizes data elements according to the type of object that they describe and a secondary group that differentiates between elements that identify objects and those that describe features related to the objects. When similar data elements are ascertained, the subsequent process of resolving syntactic and semantic heterogeneity becomes much clearer, especially with hundreds of data elements. The proposed schema was validated by classifying 1,624 data elements from 28 freight data sources, and it was compared with the existing mnemonic CODMRT, which defined key attributes of freight-related shipments: commodity, origin, destination, mode, route, and time. Examples of applications in the areas of data bridging and multidatabase querying are also presented.
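
The two-level classification described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the source names, element names, and the choice of primary group labels (borrowed here from the CODMRT attributes) are hypothetical and serve only to show how tagging each element with a primary group (the type of object described) and a secondary group (identifier versus feature) lets similar elements from different sources be grouped.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    source: str     # parent database (hypothetical names below)
    name: str       # element name as it appears in that source (hypothetical)
    primary: str    # type of object described (labels assumed for illustration)
    secondary: str  # "identifier" (identifies the object) or "feature" (describes it)


def group_similar(elements):
    """Group elements that share the same (primary, secondary) classification."""
    groups = defaultdict(list)
    for element in elements:
        groups[(element.primary, element.secondary)].append(element)
    return dict(groups)


# Hypothetical elements from two hypothetical sources.
ELEMENTS = [
    DataElement("SourceA", "ORIG_STATE", "origin", "identifier"),
    DataElement("SourceB", "origin_fips", "origin", "identifier"),
    DataElement("SourceA", "TONS", "commodity", "feature"),
    DataElement("SourceB", "commodity_code", "commodity", "identifier"),
]

GROUPS = group_similar(ELEMENTS)

for (primary, secondary), members in GROUPS.items():
    names = ", ".join(f"{m.source}.{m.name}" for m in members)
    print(f"{primary}/{secondary}: {names}")
```

Once elements that share a classification are side by side, differences in their coding (for example, a state abbreviation versus a numeric code) become the syntactic and semantic heterogeneity that still has to be resolved.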

  • Publication date: 2015