Abstract

This study examines tropical cyclones (TCs) making landfall over China using two data mining methods: a clustering algorithm based on the Finite Mixture Model (FMM) and the Classification and Regression Tree (CART). Using the 1951-2012 TC best-track dataset released by the Shanghai Typhoon Institute of the China Meteorological Administration, the tracks of TCs making landfall on the Chinese coast were classified into three clusters with the FMM. Several climate indices were then analysed for each of the three clusters using the CART algorithm. The CART prediction model for summer track frequency was trained on a random sample of 46 years (about 75% of the total years), with training accuracies of 100% (Cluster-1), 89.96% (Cluster-2) and 100% (Cluster-3). The remaining 16 years (about 25%) were used for testing, with prediction accuracies of 87.5% (Cluster-1), 62.5% (Cluster-2) and 68.75% (Cluster-3). The study focuses on Cluster-1 of the summer landfalling TCs because of its high frequency, strong intensity, severe impacts and long lifespan. The results suggest that the FMM algorithm is effective for classifying the tracks of TCs making landfall over China. In addition, the CART model built to classify the track frequency of Cluster-1 showed high accuracy, and its results are easy to explain and interpret. Together, the two methods provide a novel framework for forecasting the frequency of TCs making landfall over China.
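The train/test arithmetic above can be illustrated with a minimal sketch. It assumes only what the abstract states: a 62-year record (1951-2012) split at random into 46 training years and 16 test years. The function names, the random seed and the per-cluster counts of correctly classified test years (14, 10 and 11 out of 16, inferred from the reported percentages) are illustrative assumptions, not the study's actual code or sampling.

```python
import random


def split_years(first=1951, last=2012, n_train=46, seed=0):
    """Randomly split the years first..last into training and test years.

    The seed is an arbitrary, hypothetical choice; the study's actual
    random sample is not given in the abstract.
    """
    years = list(range(first, last + 1))
    rng = random.Random(seed)
    train = sorted(rng.sample(years, n_train))
    train_set = set(train)
    test = [y for y in years if y not in train_set]
    return train, test


def accuracy(n_correct, n_total):
    """Percentage of years whose frequency class was predicted correctly."""
    return 100.0 * n_correct / n_total


train_years, test_years = split_years()
print(len(train_years), len(test_years))

# Counts of correctly classified test years implied by the reported
# test accuracies (an inference from the percentages, not source data).
implied_correct = {"Cluster-1": 14, "Cluster-2": 10, "Cluster-3": 11}
for cluster, n_correct in implied_correct.items():
    print(cluster, accuracy(n_correct, len(test_years)))
```

With only 16 test years, each misclassified year changes the test accuracy by 6.25 percentage points, which is worth keeping in mind when comparing the three clusters.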