AAA4LLL - Acquisition, Annotation, Augmentation for Lively Language Learning

Authors: Bartholomäus Wloka, Werner Winiwarter

Author Details

Bartholomäus Wloka
  • University of Vienna, Centre for Translation Studies, Vienna, Austria
Werner Winiwarter
  • University of Vienna, CSLEARN - Educational Technologies, Vienna, Austria

Cite As

Bartholomäus Wloka and Werner Winiwarter. AAA4LLL - Acquisition, Annotation, Augmentation for Lively Language Learning. In 3rd Conference on Language, Data and Knowledge (LDK 2021). Open Access Series in Informatics (OASIcs), Volume 93, pp. 29:1-29:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Abstract

In this paper we describe a method for enhancing the study of Japanese through a user-centered approach. The approach has three parts: an innovative way of acquiring learning material from topic seeds, multifaceted sentence analysis to present sentence annotations, and browser-integrated augmentation for perusing Wikipedia pages of special interest to the learner. Such browsing may yield new topic seeds, which in turn produce additional learning content, thus repeating the cycle.

Subject Classification

ACM Subject Classification
  • Information systems → Browsers
  • Computing methodologies → Lexical semantics
  • Applied computing → E-learning

Keywords
  • Web-based language learning
  • augmented browsing
  • natural language annotation
  • corpus alignment
  • Japanese computing
  • semantic representation

