Predicting Distance and Direction from Text Locality Descriptions for Biological Specimen Collections

Liao, Ruoxuan; Das, Pragyan P.; Jones, Christopher B.; Aflaki, Niloofar; Stock, Kristin

doi:10.4230/LIPIcs.COSIT.2022.4

Abstract

A considerable proportion of records describing biological specimens (flora, soil, invertebrates), especially those collected decades ago, have no associated geographical coordinates; their location is given only as a textual description (e.g. North Canterbury, Selwyn River near bridge on Springston-Leeston Rd). Without geographical coordinates, millions of records stored in museum collections around the world cannot be mapped. We present a method for predicting the distance and direction associated with natural language location descriptions. The method focuses on the interpretation of geospatial prepositions and the way in which they modify the location represented by an associated reference place name (e.g. near the Manawatu River). We study eight distance-oriented prepositions and eight direction-oriented prepositions and use machine learning regression to predict distance or direction, relative to the reference place name, from a collection of training data. The results show that, compared with a simple baseline, our model improved distance predictions by up to 60% and direction predictions by up to 31%.
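The abstract does not describe implementation details, so the following is only a minimal sketch of the geometry that surrounds such a prediction task, not the authors' method. It assumes a spherical Earth and shows how a distance and direction relative to a reference place could be derived from two coordinate pairs (for example, to build training targets), how a predicted distance and direction could be turned back into coordinates, and how a direction error can be measured so that 350° and 10° count as 20° apart. All function names and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation (assumption)


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance (km) and initial bearing (degrees clockwise
    from north) from point 1 (reference place) to point 2 (specimen site)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlon = math.radians(lon2 - lon1)
    # Haversine formula for the distance.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2
    distance_km = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
    # Initial bearing along the great circle.
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    bearing_deg = math.degrees(math.atan2(y, x)) % 360.0
    return distance_km, bearing_deg


def offset_point(lat, lon, distance_km, bearing_deg):
    """Coordinates reached by travelling distance_km from (lat, lon)
    along the given initial bearing."""
    phi1, lam1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM  # angular distance in radians
    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    lon2 = (math.degrees(lam2) + 540.0) % 360.0 - 180.0  # normalise to [-180, 180)
    return math.degrees(phi2), lon2


def angular_error(predicted_deg, true_deg):
    """Smallest absolute difference between two directions, in degrees (0 to 180)."""
    diff = abs(predicted_deg - true_deg) % 360
    return min(diff, 360 - diff)


if __name__ == "__main__":
    # Illustrative values only: an arbitrary reference point and a
    # hypothetical prediction of 5 km due east of it.
    ref_lat, ref_lon = -43.5, 172.0
    est_lat, est_lon = offset_point(ref_lat, ref_lon, 5.0, 90.0)
    print(est_lat, est_lon)
    print(distance_and_bearing(ref_lat, ref_lon, est_lat, est_lon))
    print(angular_error(350, 10))
```

The wrap-around handling in `angular_error` matters when comparing direction predictions against a baseline: a plain absolute difference would penalise a prediction of 350° against a true value of 10° as if it were almost opposite.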

