Abstract

Crash rates are mixed discrete-continuous quantities that are left-censored at zero. Previous studies have contributed in many ways to linking crash rates with their influencing factors, but two aspects remain inadequately addressed: the distributional characteristics of crash rates, especially in the positive domain, and the ability to capture heterogeneous patterns of influence of those factors. This study proposes a hurdle finite mixture lognormal (HUFML) model in which each observation may come from more than one component distribution. The proposed model can account for potentially heterogeneous patterns of influence of factors on crash rates, and it provides a more flexible distributional framework for fitting positive crash rates. An expectation-maximization (EM) algorithm is formulated to estimate the model parameters. The HUFML model is applied to an observed crash data set, with the number of component distributions ranging from one to ten. With an appropriate number of components, the HUFML model outperforms the traditional hurdle lognormal (HUL) model. More importantly, the finite-mixture setting proves effective in capturing heterogeneous patterns of influence for certain variables that the traditional HUL model cannot capture.
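
For orientation, a hurdle finite mixture lognormal density can be written in the following generic form. This is only a sketch of the general model class and not necessarily the exact specification used in this study. All symbols are assumptions introduced here for illustration: $y_i$ is the crash rate of observation $i$, $p_i$ is the probability that $y_i$ is positive, $K$ is the number of components, $w_k$ are mixing weights with $\sum_{k=1}^{K} w_k = 1$, and $\mu_{ik}$ and $\sigma_k$ are the log-scale mean and standard deviation of component $k$.

```latex
% Generic hurdle finite mixture lognormal density (illustrative notation, assumed here)
f(y_i) =
\begin{cases}
  1 - p_i, & y_i = 0, \\[6pt]
  p_i \displaystyle\sum_{k=1}^{K} w_k \,
  \frac{1}{y_i \sigma_k \sqrt{2\pi}}
  \exp\!\left( -\frac{(\ln y_i - \mu_{ik})^2}{2\sigma_k^2} \right), & y_i > 0.
\end{cases}
```

Under this notation, setting $K = 1$ gives a single lognormal component for the positive part, which corresponds to a hurdle lognormal form. Letting $\mu_{ik}$ depend on covariates through component-specific coefficients is one way such a model could express heterogeneous patterns of influence.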