An Automatic Approach to Extracting Geographic Information From Internet

Authors: Zhang, Ying; Gao, Minghe; Zhang, Xin*; Yang, Puhai; Ma, Qunfei; Wang, Cheng; He, Hui; Hu, Xiang
Source: IEEE Access, 2018, 6: 36732-36743.
DOI: 10.1109/ACCESS.2018.2844470

Abstract

Geographic information, especially in the form of points of interest (POIs), is critical for identifying locations and provides the basis for various location-based services. Currently, geospatial POI data are available through open map services (e.g., Google Maps and OpenStreetMap). However, the data behind these services are either collected through expensive commercial purchasing and company investment or gathered from volunteered contributions of high uncertainty. Given the rapid growth of geospatial data on the Web, we propose an automatic approach to extracting geographic information and building POI resources from the results returned by Web search engines, in order to mitigate the drawbacks of these traditional means. In this approach, we first submit the POI types extracted from Google Maps and the street names obtained from OpenStreetMap to the Google search engine, and then retrieve the potential addresses of POIs by parsing the search results. Second, we query the Google search engine again with the retrieved addresses to extract the potential place names. Finally, we query the Google search engine a third time with both the place names and the corresponding addresses to verify whether the place names are correct. The output of the work is a place-name data set. To validate the approach, we select 20 blocks in each of Chicago and Houston, USA, and run it on them. In the experiments, we use Google Maps, which has high data quality, as the reference and compare our results with those from OpenStreetMap and Wikimapia. The final results indicate that the proposed approach can effectively produce place-name data sets on a par with Google Maps and outperforms OpenStreetMap and Wikimapia.
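The three search passes described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `SearchFn` interface, the address regular expression, the "text before the address" name heuristic, the `min_hits` threshold, and the in-memory `fake_search` stand-in for a real search engine are all hypothetical.

```python
import re
from typing import Callable, List, Set, Tuple

# Hypothetical interface: a query string in, a list of result snippets out.
SearchFn = Callable[[str], List[str]]


def find_addresses(snippets: List[str], street: str) -> Set[str]:
    """Pass 1 helper: pull '<number> [N|S|E|W] <street>' strings out of snippets.

    The pattern is an assumed, deliberately simple one.
    """
    pattern = re.compile(
        r"\b\d{1,5}\s+(?:[NSEW]\s+)?" + re.escape(street) + r"\b",
        re.IGNORECASE,
    )
    return {m.group(0) for s in snippets for m in pattern.finditer(s)}


def candidate_names(snippets: List[str], address: str) -> Set[str]:
    """Pass 2 helper: treat the text preceding the address as a candidate name."""
    names = set()
    for s in snippets:
        pos = s.lower().find(address.lower())
        if pos > 0:
            name = s[:pos].strip(" -,|:")
            if name:
                names.add(name)
    return names


def is_verified(search: SearchFn, name: str, address: str, min_hits: int) -> bool:
    """Pass 3 helper: accept a pair if enough snippets mention both together."""
    hits = [
        s for s in search(f'"{name}" "{address}"')
        if name.lower() in s.lower() and address.lower() in s.lower()
    ]
    return len(hits) >= min_hits


def extract_pois(
    search: SearchFn,
    poi_types: List[str],
    streets: List[str],
    min_hits: int = 2,
) -> List[Tuple[str, str]]:
    """Run the three passes and return verified (place name, address) pairs."""
    verified = set()
    for poi_type in poi_types:
        for street in streets:
            # Pass 1: POI type + street name -> potential addresses.
            addresses = find_addresses(search(f"{poi_type} {street}"), street)
            for address in addresses:
                # Pass 2: address -> potential place names.
                for name in candidate_names(search(address), address):
                    # Pass 3: name + address -> verification.
                    if is_verified(search, name, address, min_hits):
                        verified.add((name, address))
    return sorted(verified)


# Toy in-memory stand-in for a search engine (illustrative data only).
DOCS = [
    "Joe's Pizza - 123 N State St, Chicago, IL restaurant",
    "Joe's Pizza, 123 N State St: restaurant reviews",
    "Random blog about State St history",
    "Luna Cafe - 456 N State St restaurant",
]


def fake_search(query: str) -> List[str]:
    """Return documents that contain every token of the query."""
    tokens = query.replace('"', "").lower().split()
    return [d for d in DOCS if all(t in d.lower() for t in tokens)]


pois = extract_pois(fake_search, ["restaurant"], ["State St"])
print(pois)
```

In this toy setup, a place name that co-occurs with its address in only one snippet is rejected at the default `min_hits=2`, which mirrors the role of the third pass as a filter on the candidates produced by the first two.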