Predicting Distance and Direction from Text Locality Descriptions for Biological Specimen Collections

Authors: Ruoxuan Liao, Pragyan P. Das, Christopher B. Jones, Niloofar Aflaki, Kristin Stock


Author Details

Ruoxuan Liao
  • Massey Geoinformatics Collaboratory, Massey University, Auckland, New Zealand
Pragyan P. Das
  • Massey Geoinformatics Collaboratory, Massey University, Auckland, New Zealand
Christopher B. Jones
  • School of Computer Science and Informatics, Cardiff University, UK
Niloofar Aflaki
  • Massey Geoinformatics Collaboratory, Massey University, Auckland, New Zealand
Kristin Stock
  • Massey Geoinformatics Collaboratory, Massey University, Auckland, New Zealand

Cite As

Ruoxuan Liao, Pragyan P. Das, Christopher B. Jones, Niloofar Aflaki, and Kristin Stock. Predicting Distance and Direction from Text Locality Descriptions for Biological Specimen Collections. In 15th International Conference on Spatial Information Theory (COSIT 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 240, pp. 4:1-4:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Abstract


A considerable proportion of records that describe biological specimens (flora, soil, invertebrates), and especially those that were collected decades ago, are not attached to corresponding geographical coordinates, but rather have their location described only through textual descriptions (e.g. North Canterbury, Selwyn River near bridge on Springston-Leeston Rd). Without geographical coordinates, millions of records stored in museum collections around the world cannot be mapped. We present a method for predicting the distance and direction associated with human language location descriptions which focuses on the interpretation of geospatial prepositions and the way in which they modify the location represented by an associated reference place name (e.g. near the Manawatu River). We study eight distance-oriented prepositions and eight direction-oriented prepositions and use machine learning regression to predict distance or direction, relative to the reference place name, from a collection of training data. The results show that, compared with a simple baseline, our model improved distance predictions by up to 60% and direction predictions by up to 31%.
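
To make the two predicted quantities concrete, the sketch below shows one way a distance and a direction relative to a reference place could be computed from coordinates, and how errors and the improvement over a baseline could be scored. It is illustrative only: the abstract does not state how the authors measure distance, direction, or improvement, so the spherical (haversine) distance, the compass-bearing convention, the circular error, and every function name here are assumptions rather than the paper's method.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation (assumption)


def distance_and_bearing(ref_lat, ref_lon, lat, lon):
    """Return (great-circle distance in km, initial bearing in degrees
    clockwise from north) from a reference place to a target location."""
    phi1, phi2 = math.radians(ref_lat), math.radians(lat)
    dphi = phi2 - phi1
    dlmb = math.radians(lon - ref_lon)

    # Haversine formula for the great-circle distance.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    distance_km = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

    # Initial bearing, normalised to the range [0, 360).
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    bearing_deg = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
    return distance_km, bearing_deg


def angular_error(predicted_deg, actual_deg):
    """Smallest absolute difference between two directions, in degrees.
    Directions wrap around, so 350 and 10 are 20 degrees apart, not 340."""
    diff = abs(predicted_deg - actual_deg) % 360.0
    return min(diff, 360.0 - diff)


def relative_improvement(baseline_error, model_error):
    """Fractional error reduction of a model relative to a baseline
    (hypothetical scoring; 0.6 corresponds to a 60% improvement)."""
    return (baseline_error - model_error) / baseline_error


if __name__ == "__main__":
    # A point one degree of longitude east of a reference on the equator.
    d, b = distance_and_bearing(0.0, 0.0, 0.0, 1.0)
    print(round(d, 1), round(b, 1))
    print(angular_error(350.0, 10.0))
    print(relative_improvement(10.0, 4.0))
```

The circular error matters for direction because a plain absolute difference would penalise a prediction of 350 degrees against a true value of 10 degrees far more heavily than the 20 degrees that actually separate them.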

Subject Classification

ACM Subject Classification
  • Computing methodologies → Machine learning

Keywords
  • geospatial prepositions
  • biological specimen collections
  • georeferencing
  • natural language processing
  • locative expressions
  • locality descriptions
  • geoparsing
  • geocoding
  • geographic information retrieval
  • regression
  • machine learning


