Discrepancies Between Database- and Pragmatically Driven NLG: Insights from QUD-Based Annotations

Authors Christoph Hesse, Maurice Langner, Anton Benz, Ralf Klabunde

Author Details

Christoph Hesse
  • Leibniz-Zentrum Allgemeine Sprachwissenschaft, Berlin, Germany
Maurice Langner
  • Department of Linguistics, Ruhr-University Bochum, Germany
Anton Benz
  • Leibniz-Zentrum Allgemeine Sprachwissenschaft, Berlin, Germany
Ralf Klabunde
  • Department of Linguistics, Ruhr-University Bochum, Germany

Cite As

Christoph Hesse, Maurice Langner, Anton Benz, and Ralf Klabunde. Discrepancies Between Database- and Pragmatically Driven NLG: Insights from QUD-Based Annotations. In 3rd Conference on Language, Data and Knowledge (LDK 2021). Open Access Series in Informatics (OASIcs), Volume 93, pp. 32:1-32:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


We present findings from annotating a corpus of driving reports, informational texts with elaborate pragmatics, in order to generate corresponding texts automatically. The generation process requires access to a database providing the technical details of the vehicles, as well as an annotated corpus for sophisticated, pragmatically motivated text planning. We focus on the annotation results because they form the basic framework for linking text planning with database queries and microplanning. We show that the annotations point to a variety of linguistic phenomena that have received little or no attention in the literature so far, and that they raise corresponding questions about how the generation process accesses information from databases.

Subject Classification

ACM Subject Classification
  • Applied computing → Arts and humanities

Keywords
  • NLG
  • question-under-discussion analysis
  • information structure
  • database retrieval


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads

