Huge amounts of geo-referenced spatial location data and moving object trajectory data are being generated at ever increasing rates. Patterns discovered from these data are valuable in understanding human mobility and facilitating traffic mitigation. In this study, we propose a new approach to mining frequent patterns from large-scale GPS trajectory data after mapping GPS traces to road network segments. Different from applying association rule-based frequent sequence mining algorithms directly, which generally have high computation overhead and are not scalable, our approach utilizes hierarchies of road networks. After contracting nodes and creating shortcuts by contraction hierarchy algorithms, the original road segment sequences are transformed into sequences of shortcuts with much smaller data volumes. By using computed shortest paths as simulated GPS trajectories, our experiments on 17,558 taxi trip records in NYC in 2009 have shown that runtimes of frequent sequence mining on shortcut sequences are orders of magnitude faster than on original road segment sequences. In addition, frequent subsequences in shortcuts are more informative and interpretable based on the betweenness centralities of the shortcuts than visualizing betweenness centralities of individual road segments.
Jianting Zhang (2013). Efficient Frequent Sequence Mining on Taxi
Trip Records Using Road Network Shortcuts. Technical Report [PDF].