DROPS

Artifact

Dataset

Evaluating the Ability of Large Language Models to Reason about Cardinal Directions -- Dataset

Authors: Anthony G Cohn and Robert E Blackwell

Cite as

Anthony G Cohn, Robert E Blackwell. Evaluating the Ability of Large Language Models to Reason about Cardinal Directions -- Dataset (Dataset). Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

@misc{dagstuhl-artifact-22498,
   title = {{Evaluating the Ability of Large Language Models to Reason about Cardinal Directions -- Dataset}}, 
   author = {Cohn, Anthony G and Blackwell, Robert E},
   note = {Dataset, version 1.0. This work was supported by the Fundamental Research priority area of The Alan Turing Institute. Cohn, Anthony G: AGC thanks the Turing’s Defence and Security programme through a partnership with the UK government in accordance with the framework agreement between GCHQ and The Alan Turing Institute, and for support provided by the Economic and Social Research Council (ESRC) under grant ES/W003473/1. swhId: \href{https://archive.softwareheritage.org/swh:1:dir:37c617e865cfba41c74743123b5d3785379caacc;origin=https://github.com/alan-turing-institute/cosit-2024-evaluating-the-ability-of-llms-to-reason-about-cardinal-directions;visit=swh:1:snp:7629d8b01a3d5e05c8ea9cf7956480d3b94b40fd;anchor=swh:1:rev:f80b374d4b36dc616425175a99844d94cd36d62d}{\texttt{swh:1:dir:37c617e865cfba41c74743123b5d3785379caacc}} (visited on 2024-11-28)},
   url = {https://github.com/alan-turing-institute/cosit-2024-evaluating-the-ability-of-llms-to-reason-about-cardinal-directions},
   doi = {10.4230/artifacts.22498},
}

Document

Short Paper

DOI: 10.4230/LIPIcs.COSIT.2024.28

Evaluating the Ability of Large Language Models to Reason About Cardinal Directions (Short Paper)

Authors: Anthony G Cohn and Robert E Blackwell

Published in: LIPIcs, Volume 315, 16th International Conference on Spatial Information Theory (COSIT 2024)

Abstract

We investigate the abilities of a representative set of Large Language Models (LLMs) to reason about cardinal directions (CDs). To do so, we create two datasets: the first, co-created with ChatGPT, focuses largely on recall of world knowledge about CDs; the second is generated from a set of templates, comprehensively testing an LLM’s ability to determine the correct CD given a particular scenario. The templates allow for a number of degrees of variation, such as the means of locomotion of the agent involved and whether the scenario is set in the first, second, or third person. Our experiments show that although LLMs are able to perform well on the simpler dataset, on the second, more complex dataset no LLM is able to reliably determine the correct CD, even with a temperature setting of zero.

Cite as

Anthony G Cohn and Robert E Blackwell. Evaluating the Ability of Large Language Models to Reason About Cardinal Directions (Short Paper). In 16th International Conference on Spatial Information Theory (COSIT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 315, pp. 28:1-28:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

@InProceedings{cohn_et_al:LIPIcs.COSIT.2024.28,
  author =	{Cohn, Anthony G and Blackwell, Robert E},
  title =	{{Evaluating the Ability of Large Language Models to Reason About Cardinal Directions}},
  booktitle =	{16th International Conference on Spatial Information Theory (COSIT 2024)},
  pages =	{28:1--28:9},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-330-0},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{315},
  editor =	{Adams, Benjamin and Griffin, Amy L. and Scheider, Simon and McKenzie, Grant},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.COSIT.2024.28},
  URN =		{urn:nbn:de:0030-drops-208432},
  doi =		{10.4230/LIPIcs.COSIT.2024.28},
  annote =	{Keywords: Large Language Models, Spatial Reasoning, Cardinal Directions}
}
