Towards Statistically Significant Taxonomy Aware Co-Location Pattern Detection (Short Paper)

Authors: Subhankar Ghosh, Arun Sharma, Jayant Gupta, Shashi Shekhar

Author Details

Subhankar Ghosh
  • Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA
Arun Sharma
  • Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA
Jayant Gupta
  • Oracle Inc., Nashua, NH, USA
Shashi Shekhar
  • Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA

Subhankar Ghosh, Arun Sharma, Jayant Gupta, and Shashi Shekhar. Towards Statistically Significant Taxonomy Aware Co-Location Pattern Detection (Short Paper). In 16th International Conference on Spatial Information Theory (COSIT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 315, pp. 25:1-25:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Given a collection of Boolean spatial feature types, their instances, a neighborhood relation (e.g., proximity), and a hierarchical taxonomy of the feature types, the goal is to find the subsets of feature types or their parents whose spatial interaction is statistically significant. This problem is important for taxonomy-reliant applications such as ecology (e.g., finding new symbiotic relationships across the food chain), spatial pathology (e.g., immunotherapy for cancer), and retail. The problem is computationally challenging due to the exponential number of candidate co-location patterns generated by the taxonomy. Most approaches for co-location pattern detection overlook the hierarchical relationships among spatial features, and the statistical significance of the detected patterns is not always considered, leading to potential false discoveries. This paper introduces two methods for incorporating taxonomies and assessing the statistical significance of co-location patterns. The baseline approach iteratively checks the significance of co-locations between leaf nodes or their ancestors in the taxonomy. The advanced approach uses the Benjamini-Hochberg procedure to control the false discovery rate. This approach reduces the risk of false discoveries while maintaining the power to detect true co-location patterns. Experimental evaluation and case study results show the effectiveness of the approach.
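As an illustration of the Benjamini-Hochberg step mentioned in the abstract, the following is a minimal sketch of the standard step-up procedure, not the authors' implementation. It assumes that a p-value has already been computed for each candidate co-location pattern; how the paper derives those p-values is not described here. The function name, pattern labels, and p-values are hypothetical and chosen only for the example.

```python
from typing import Dict, List


def benjamini_hochberg(p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Return the candidates declared significant at false discovery rate alpha.

    p_values maps a (hypothetical) pattern label to its p-value.
    """
    # Rank candidates by ascending p-value.
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ranked)

    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    cutoff = 0
    for k, (_, p) in enumerate(ranked, start=1):
        if p <= k / m * alpha:
            cutoff = k

    # Every candidate ranked at or below the cutoff is rejected,
    # even if its own p-value missed its own threshold.
    return [name for name, _ in ranked[:cutoff]]


# Hypothetical p-values for four candidate patterns (illustrative only).
candidates = {
    "{A, B}": 0.001,
    "{A, C}": 0.012,
    "{parent(A), C}": 0.03,
    "{B, C}": 0.2,
}
significant = benjamini_hochberg(candidates, alpha=0.05)
```

With four candidates and alpha = 0.05, the per-rank thresholds are 0.0125, 0.025, 0.0375, and 0.05, so the first three hypothetical patterns are retained and the fourth is not.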

Subject Classification

ACM Subject Classification
  • Information systems → Data mining
  • Computing methodologies → Spatial and physical reasoning

Keywords
  • Co-location patterns
  • spatial data mining
  • taxonomy
  • hierarchy
  • statistical significance
  • false discovery rate
  • family-wise error rate


