Documents authored by Bosch, Sonja


Short Paper
Translation-Based Dictionary Alignment for Under-Resourced Bantu Languages

Authors: Thomas Eckart, Sonja Bosch, Dirk Goldhahn, Uwe Quasthoff, and Bettina Klimek

Published in: OASIcs, Volume 70, 2nd Conference on Language, Data and Knowledge (LDK 2019)


Abstract
Despite a large number of active speakers, most Bantu languages can be considered as under- or less-resourced languages. This includes especially the current situation of lexicographical data, which is highly unsatisfactory concerning the size, quality and consistency in format and provided information. Unfortunately, this does not only hold for the amount and quality of data for monolingual dictionaries, but also for their lack of interconnection to form a network of dictionaries. Current endeavours to promote the use of Bantu languages in primary and secondary education in countries like South Africa show the urgent need for high-quality digital dictionaries. This contribution describes a prototypical implementation for aligning Xhosa, Zimbabwean Ndebele and Kalanga language dictionaries based on their English translations using simple string matching techniques and via WordNet URIs. The RDF-based representation of the data using the Bantu Language Model (BLM) and - partial - references to the established WordNet dataset supported this process significantly.
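The string-matching step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `normalize` and `align`, the dictionary layout (lemma mapped to a list of English glosses), and the normalization rules are all assumptions made for illustration. The sketch does not cover the RDF/BLM representation or the WordNet URI matching.

```python
from collections import defaultdict


def normalize(gloss: str) -> str:
    """Hypothetical normalization: lowercase, collapse whitespace, drop a leading 'to '."""
    text = " ".join(gloss.lower().split())
    if text.startswith("to "):
        text = text[3:]
    return text


def align(dict_a: dict, dict_b: dict) -> list:
    """Pair lemmas from two dictionaries that share a normalized English gloss.

    Both inputs map a lemma to a list of English translations (assumed layout).
    Returns sorted (lemma_a, lemma_b, shared_gloss) tuples.
    """
    # Index the second dictionary by normalized gloss for direct lookup.
    index = defaultdict(set)
    for lemma, glosses in dict_b.items():
        for gloss in glosses:
            index[normalize(gloss)].add(lemma)

    # Look up every normalized gloss of the first dictionary in that index.
    pairs = set()
    for lemma, glosses in dict_a.items():
        for gloss in glosses:
            key = normalize(gloss)
            for other in index.get(key, ()):
                pairs.add((lemma, other, key))
    return sorted(pairs)


if __name__ == "__main__":
    # Placeholder lemmas, not real lexical data.
    lang_a = {"a_lemma_1": ["Water"], "a_lemma_2": ["to walk", "to go"]}
    lang_b = {"b_lemma_1": ["water "], "b_lemma_2": ["go"]}
    for pair in align(lang_a, lang_b):
        print(pair)
```

Exact matching on normalized glosses is deliberately simple; it will miss synonyms and conflate homonyms, which is where sense-level identifiers such as WordNet URIs could help disambiguate.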

Cite as

Thomas Eckart, Sonja Bosch, Dirk Goldhahn, Uwe Quasthoff, and Bettina Klimek. Translation-Based Dictionary Alignment for Under-Resourced Bantu Languages. In 2nd Conference on Language, Data and Knowledge (LDK 2019). Open Access Series in Informatics (OASIcs), Volume 70, pp. 17:1-17:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@InProceedings{eckart_et_al:OASIcs.LDK.2019.17,
  author =	{Eckart, Thomas and Bosch, Sonja and Goldhahn, Dirk and Quasthoff, Uwe and Klimek, Bettina},
  title =	{{Translation-Based Dictionary Alignment for Under-Resourced Bantu Languages}},
  booktitle =	{2nd Conference on Language, Data and Knowledge (LDK 2019)},
  pages =	{17:1--17:11},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-105-4},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{70},
  editor =	{Eskevich, Maria and de Melo, Gerard and F\"{a}th, Christian and McCrae, John P. and Buitelaar, Paul and Chiarcos, Christian and Klimek, Bettina and Dojchinovski, Milan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.LDK.2019.17},
  URN =		{urn:nbn:de:0030-drops-103819},
  doi =		{10.4230/OASIcs.LDK.2019.17},
  annote =	{Keywords: Cross-language dictionary alignment, Bantu languages, translation, linguistic linked data, under-resourced languages}
}