Towards Formalizing Concept Drift and Its Variants: A Case Study Using Past COSIT Proceedings (Short Paper)

Authors: Meilin Shi, Krzysztof Janowicz, Zilong Liu, Kitty Currier



Author Details

Meilin Shi
  • Department of Geography and Regional Research, University of Vienna, Austria
Krzysztof Janowicz
  • Department of Geography and Regional Research, University of Vienna, Austria
Zilong Liu
  • Department of Geography and Regional Research, University of Vienna, Austria
Kitty Currier
  • Department of Geography, University of California, Santa Barbara, CA, USA

Cite As

Meilin Shi, Krzysztof Janowicz, Zilong Liu, and Kitty Currier. Towards Formalizing Concept Drift and Its Variants: A Case Study Using Past COSIT Proceedings (Short Paper). In 16th International Conference on Spatial Information Theory (COSIT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 315, pp. 23:1-23:8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.COSIT.2024.23

Abstract

In the classic Philosophical Investigations, Ludwig Wittgenstein suggests that the meaning of words is rooted in their use in ordinary language, challenging the idea of fixed rules determining the meaning of words. Likewise, we believe that the meaning of keywords and concepts in academic papers is shaped by their usage within the articles and evolves as research progresses. For example, the terms natural hazards and natural disasters were once used interchangeably, but this is rarely the case today. When searching for archived documents, such as those related to disaster relief, choosing the appropriate keyword is crucial and requires a deeper understanding of the historical context. To improve interoperability and promote reusability from a Research Data Management (RDM) perspective, we examine the dynamic nature of concepts, providing formal definitions of concept drift and its variants. By employing a case study of past COSIT (Conference on Spatial Information Theory) proceedings to support these definitions, we argue that a quantitative formalization can help systematically detect subsequent changes and enhance the overall interpretation of concepts.
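As a rough illustration of what a quantitative view of drift could look like, the sketch below compares the words that surround a term in an earlier and a later set of documents and reports the cosine distance between the two context profiles. This is not the formalization given in the paper: the function names, the co-occurrence window, and the choice of cosine distance are all assumptions made for illustration, and the example sentences are invented.

```python
import math
from collections import Counter


def context_vector(docs, term, window=2):
    """Count the words found within `window` tokens of `term` (hypothetical helper)."""
    counts = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        for i, tok in enumerate(tokens):
            if tok == term:
                lo = max(0, i - window)
                hi = min(len(tokens), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        counts[tokens[j]] += 1
    return counts


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # Assumption: a term with no observed context counts as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def drift_score(early_docs, late_docs, term, window=2):
    """Assumed drift measure: distance between a term's early and late contexts."""
    return cosine_distance(
        context_vector(early_docs, term, window),
        context_vector(late_docs, term, window),
    )


# Invented toy sentences, not taken from the COSIT proceedings.
early = ["natural hazard and disaster relief planning"]
late = ["hazard ontology for spatial reasoning"]
print(drift_score(early, late, "hazard"))
```

A score near 0 means the term appears in similar contexts in both periods, while a score near 1 means the contexts barely overlap. A realistic analysis would need tokenization that handles multi-word keywords such as "natural hazards" and a far larger corpus per time slice.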

Subject Classification

ACM Subject Classification
  • Information systems → Digital libraries and archives
  • Information systems → Similarity measures
  • Computing methodologies → Information extraction
Keywords
  • Concept Drift
  • Semantic Aging
  • Research Data Management

References

  1. Iz Beltagy, Kyle Lo, and Arman Cohan. SciBERT: A pretrained language model for scientific text. In Conference on Empirical Methods in Natural Language Processing, 2019. URL: https://doi.org/10.18653/v1/D19-1371.
  2. Klaus Berberich, Srikanta J. Bedathur, Mauro Sozio, and Gerhard Weikum. Bridging the terminology gap in web archive search. In International Workshop on the Web and Databases, 2009. URL: https://api.semanticscholar.org/CorpusID:6709650.
  3. Giuseppe Capobianco, Danilo Cavaliere, and Sabrina Senatore. OntoDrift: a semantic drift gauge for ontology evolution monitoring. In CEUR Workshop Proceedings, 2020. URL: https://api.semanticscholar.org/CorpusID:233432249.
  4. Antske Fokkens, Serge ter Braake, Isa Maks, and Davide Ceolin. On the semantics of concept drift: Towards formal definitions of concept drift and semantic change. In Drift-a-LOD@EKAW, 2016. URL: https://ceur-ws.org/Vol-1799/Drift-a-LOD2016_paper_2.pdf.
  5. Prashant Gupta and Mark Gahegan. Categories are in flux, but their computational representations are fixed: That’s a problem. Transactions in GIS, 24:291-314, 2020. URL: https://doi.org/10.1111/tgis.12602.
  6. William L. Hamilton, Jure Leskovec, and Dan Jurafsky. Diachronic word embeddings reveal statistical laws of semantic change. ArXiv, abs/1605.09096, 2016. URL: https://api.semanticscholar.org/CorpusID:5480561.
  7. Francis Harvey, Werner Kuhn, Hardy Pundt, Yaser Bishr, and Catharina Riedemann. Semantic interoperability: A central issue for sharing geographic information. The annals of regional science, 33:213-232, 1999. URL: https://doi.org/10.1007/s001680050102.
  8. Vivek Kulkarni, Rami Al-Rfou, Bryan Perozzi, and Steven Skiena. Statistically significant detection of linguistic change. In Proceedings of the 24th international conference on world wide web, pages 625-635, 2015. URL: https://doi.org/10.1145/2736277.2741627.
  9. Jie Lu, Anjin Liu, Fan Dong, Feng Gu, João Gama, and Guangquan Zhang. Learning under concept drift: A review. IEEE Transactions on Knowledge and Data Engineering, 31(12):2346-2363, 2019. URL: https://doi.org/10.1109/TKDE.2018.2876857.
  10. Tomas Mikolov, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. In International Conference on Learning Representations, 2013. URL: https://api.semanticscholar.org/CorpusID:5959482.
  11. Martin Raubal. Representing concepts in time. In Spatial Cognition VI. Learning, Reasoning, and Talking about Space: International Conference Spatial Cognition 2008, Freiburg, Germany, September 15-19, 2008. Proceedings 6, pages 328-343. Springer, 2008. URL: https://doi.org/10.1007/978-3-540-87601-4_24.
  12. Christoph Schlieder. Digital heritage: Semantic challenges of long-term preservation. Semantic Web, 1(1-2):143-147, 2010. URL: https://doi.org/10.3233/SW-2010-0013.
  13. Thanos G Stavropoulos, Stelios Andreadis, Efstratios Kontopoulos, and Ioannis Kompatsiaris. SemaDrift: A hybrid method and visual tools to measure semantic drift in ontologies. Journal of Web Semantics, 54:87-106, 2019. URL: https://doi.org/10.1016/j.websem.2018.05.001.
  14. Wolfgang G Stock. Concepts and semantic relations in information science. Journal of the American Society for Information Science and Technology, 61(10):1951-1969, 2010. URL: https://doi.org/10.1002/asi.21382.
  15. Tabea Tietz, Mehwish Alam, Harald Sack, and Marieke van Erp. Challenges of knowledge graph evolution from an NLP perspective. In WHiSe@ESWC, pages 71-76, 2020. URL: https://ceur-ws.org/Vol-2695/paper8.pdf.
  16. Stella Verkijk, Ritten Roothaert, Romana Pernisch, and Stefan Schlobach. Do you catch my drift? on the usage of embedding methods to measure concept shift in knowledge graphs. In Proceedings of the 12th Knowledge Capture Conference 2023, pages 70-74, 2023. URL: https://doi.org/10.1145/3587259.3627555.
  17. Shenghui Wang, Stefan Schlobach, and Michel Klein. Concept drift and how to identify it. Journal of Web Semantics, 9(3):247-265, 2011. URL: https://doi.org/10.1016/j.websem.2011.05.003.
  18. Gerhard Widmer and Miroslav Kubat. Learning in the presence of concept drift and hidden contexts. Machine learning, 23:69-101, 1996. URL: https://doi.org/10.1007/BF00116900.
  19. Mark D Wilkinson, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, Jan-Willem Boiten, Luiz Bonino da Silva Santos, Philip E Bourne, et al. The FAIR guiding principles for scientific data management and stewardship. Scientific data, 3(1):1-9, 2016. URL: https://doi.org/10.1038/sdata.2016.18.
  20. Ludwig Wittgenstein. Philosophical Investigations. Blackwell, Oxford, 1953.
  21. Yating Zhang, Adam Jatowt, Sourav S Bhowmick, and Katsumi Tanaka. The past is not a foreign country: Detecting semantically similar terms across time. IEEE Transactions on Knowledge and Data Engineering, 28(10):2793-2807, 2016. URL: https://doi.org/10.1109/TKDE.2016.2591008.