Why Is Greenwich so Common? Quantifying the Uniqueness of Multivariate Observations (Short Paper)

Authors: Andrea Ballatore, Stefano Cavazzi



Author Details

Andrea Ballatore
  • Department of Digital Humanities, King’s College London, UK
Stefano Cavazzi
  • Ordnance Survey, Southampton, UK

Cite As

Andrea Ballatore and Stefano Cavazzi. Why Is Greenwich so Common? Quantifying the Uniqueness of Multivariate Observations (Short Paper). In 12th International Conference on Geographic Information Science (GIScience 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 277, pp. 15:1-15:6, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.GIScience.2023.15

Abstract

The concept of uniqueness can play an important role when the assessment of an observation’s distinctiveness is essential. This article introduces a distance-based uniqueness measure that quantifies the relative rarity or commonness of a multivariate observation within a dataset. Unique observations exhibit rare combinations of values, not necessarily extreme values. Taking a cognitive psychological perspective, our measure defines uniqueness as the sum of distances between a target observation and all other observations. After presenting the measure u and its corresponding standardised version u_z, we propose a method to calculate a p value through a probability density function. We then demonstrate the measure’s behaviour in a case study on the uniqueness of Greater London boroughs, based on real-world socioeconomic variables. This initial investigation indicates that u can support exploratory data analysis.
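
As a rough illustration of the definition in the abstract, the sketch below computes u as the sum of distances from each observation to all others, and u_z as a standardised version of u. Several details are assumptions, not taken from the paper: the function name `uniqueness` is hypothetical, the distance is Euclidean, each variable is z-scored before distances are taken, and u_z is obtained by z-scoring u across observations. The p value step is left out because the abstract does not specify the density function used.

```python
import numpy as np

def uniqueness(X):
    """Return (u, u_z) for each row of X (hypothetical helper, not the authors' code)."""
    X = np.asarray(X, dtype=float)

    # Assumption: z-score each variable so that no single variable dominates the distance.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant variables unscaled instead of dividing by zero
    Z = (X - X.mean(axis=0)) / std

    # Assumption: Euclidean distance between every pair of observations.
    diff = Z[:, None, :] - Z[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # u: sum of distances between each observation and all others
    # (the distance from an observation to itself is 0, so it adds nothing).
    u = D.sum(axis=1)

    # Assumption: u_z is u standardised across observations.
    s = u.std()
    u_z = (u - u.mean()) / s if s > 0 else np.zeros_like(u)
    return u, u_z

# Toy data: four nearby observations and one that lies far from the rest.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10]]
u, u_z = uniqueness(X)
print(np.argmax(u))  # index of the most unique observation under these assumptions
```

With a different distance function or scaling, the ranking of observations may change, so the choices above should be read as placeholders.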

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Multivariate statistics
  • Information systems → Geographic information systems
Keywords
  • uniqueness
  • distinctiveness
  • similarity
  • outlier detection
  • multivariate data
