Outlier Detection and Comparison of Origin-Destination Flows Using Data Depth

Authors: Myeong-Hun Jeong, Junjun Yin, Shaowen Wang




File

LIPIcs.GISCIENCE.2018.6.pdf
  • Filesize: 1.41 MB
  • 14 pages

Author Details

Myeong-Hun Jeong
  • Department of Civil Engineering, Chosun University, Gwangju, Republic of Korea
Junjun Yin
  • Social Science Research Institute
  • Institute for CyberScience, Penn State University, PA, USA
Shaowen Wang
  • CyberGIS Center for Advanced Digital and Spatial Studies
  • Department of Geography and Geographic Information Science, University of Illinois at Urbana-Champaign, IL, USA

Cite As

Myeong-Hun Jeong, Junjun Yin, and Shaowen Wang. Outlier Detection and Comparison of Origin-Destination Flows Using Data Depth. In 10th International Conference on Geographic Information Science (GIScience 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 114, pp. 6:1-6:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.GISCIENCE.2018.6

Abstract

Advances in location-aware technology have produced massive volumes of trajectory data. Origin-destination (OD) trajectories provide rich information on urban flows and transport demand. This study describes a new method for detecting outliers in OD flows and for testing hypotheses about differences between two OD flow datasets in terms of the variation in their spatial extent, that is, their spread. The proposed method is based on data depth, which measures the centrality and outlyingness of a point with respect to a given dataset in R^d. Using the center-outward ordering that data depth induces, the method analyzes underlying characteristics of OD flows such as location, outlyingness, and spread. The method's ability to detect OD anomalies is compared with that of the Mahalanobis distance approach, and an F-test is used to verify the difference in scale. Empirical evaluation demonstrates that the method effectively identifies OD flow outliers in an interactive way. Furthermore, by considering the overall structure of the data, the method can offer new perspectives, such as spatial extent, when comparing two different OD flows in terms of scale.
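
To make the idea of depth-based outlyingness concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes each OD flow is encoded as a 4-dimensional point (origin x, origin y, destination x, destination y), and it approximates halfspace (Tukey) depth by taking the minimum over random projection directions. The function names, the number of directions, and the synthetic data are all illustrative assumptions. A Mahalanobis distance is computed alongside it, since the abstract names that approach as the comparison baseline.

```python
import numpy as np


def approx_halfspace_depth(points, data, n_dirs=500, seed=0):
    """Approximate the halfspace (Tukey) depth of each row of `points` in `data`.

    For every random unit direction, the data are projected onto a line and the
    fraction of observations on each side of the query point is counted. The
    smallest fraction seen over all directions is an upper bound on the true
    depth that tightens as n_dirs grows.
    """
    rng = np.random.default_rng(seed)
    points = np.atleast_2d(np.asarray(points, dtype=float))
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    dirs = rng.normal(size=(n_dirs, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    depth = np.ones(len(points))
    for u in dirs:
        proj = np.sort(data @ u)
        q = points @ u
        # Fraction of observations at or below / at or above the query point.
        at_or_below = np.searchsorted(proj, q, side="right") / n
        at_or_above = 1.0 - np.searchsorted(proj, q, side="left") / n
        depth = np.minimum(depth, np.minimum(at_or_below, at_or_above))
    return depth


def mahalanobis_distance(points, data):
    """Mahalanobis distance of each row of `points` from the sample mean of `data`."""
    points = np.atleast_2d(np.asarray(points, dtype=float))
    data = np.asarray(data, dtype=float)
    diff = points - data.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(data, rowvar=False))
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))


# Synthetic OD flows (an assumption for illustration): origins near (0, 0),
# destinations near (5, 5).
rng = np.random.default_rng(42)
flows = rng.normal(loc=[0.0, 0.0, 5.0, 5.0], scale=1.0, size=(300, 4))

# One typical flow and one flow whose destination lies far from the others.
queries = np.array([[0.0, 0.0, 5.0, 5.0], [0.0, 0.0, 40.0, -30.0]])
print("approximate depth:", approx_halfspace_depth(queries, flows))
print("Mahalanobis distance:", mahalanobis_distance(queries, flows))
```

In this sketch a low depth marks a flow as outlying, whereas a large Mahalanobis distance does the same under an implicit elliptical assumption. Depth uses only the center-outward ordering of the data and makes no such distributional assumption.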

Subject Classification

ACM Subject Classification
  • Computing methodologies → Anomaly detection
Keywords
  • Movement Analysis
  • Trajectory Data Mining
  • Data Depth
  • Outlier Detection
