English Abstract |
Cellular data suffer from oscillation and missing-data problems that hinder their applicability. Because most travelers follow similar travel trajectories, this study proposes an interpolation technique based on trajectory similarity and the longest common subsequence (LCS) method. The technique integrates linear interpolation and clustering interpolation, with weights that adapt to the length of the missing data. The interpolated data are then applied to mode classification: four inputs (trip speed, trip length, trajectory similarity to the locations of railway stations, and similarity to bus trajectories) are used to classify five travel modes with three supervised machine learning classification algorithms: decision tree (DT), random forest (RF), and back-propagation network (BPN). The results show that the BPN model trained on data interpolated by the proposed integrated method outperforms the other mode classification models trained on original or linearly interpolated cellular data, suggesting the applicability of the proposed model. |
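
The two ideas named in the abstract, LCS-based trajectory similarity and a gap-length-dependent blend of two interpolated positions, can be illustrated with a minimal sketch. This is not the study's implementation. The normalization by the shorter sequence, the weight function `linear_weight`, its `scale` parameter, and all function names are assumptions introduced here for illustration only; the abstract does not specify them.

```python
from typing import Hashable, Sequence, Tuple


def lcs_length(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Length of the longest common subsequence of two cell-ID sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """LCS length normalized by the shorter sequence (assumed normalization)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / min(len(a), len(b))


def linear_weight(gap_len: int, scale: float = 5.0) -> float:
    """Hypothetical weight for the linear estimate: 1 for no gap, decaying
    as the gap grows, so longer gaps lean on the clustering estimate."""
    return 1.0 / (1.0 + gap_len / scale)


def blend_positions(
    linear_xy: Tuple[float, float],
    cluster_xy: Tuple[float, float],
    gap_len: int,
    scale: float = 5.0,
) -> Tuple[float, float]:
    """Weighted combination of a linear and a clustering-based estimate."""
    w = linear_weight(gap_len, scale)
    return (
        w * linear_xy[0] + (1.0 - w) * cluster_xy[0],
        w * linear_xy[1] + (1.0 - w) * cluster_xy[1],
    )
```

Here a trajectory is treated as an ordered sequence of cell identifiers, so `lcs_similarity(["A", "B", "C", "D"], ["A", "C", "D", "E"])` compares two trips by the cells they share in the same order. The blend leans on the linear estimate for short gaps and on the estimate from similar trajectories for long ones, which is one plausible reading of "weights that adapt to the length of the missing data".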