HISTORIAE, History of Socio-Cultural Transformation as Linguistic Data Science. A Humanities Use Case

Authors: Florentina Armaselu, Elena-Simona Apostol, Anas Fahad Khan, Chaya Liebeskind, Barbara McGillivray, Ciprian-Octavian Truică, Giedrė Valūnaitė Oleškevičienė

Florentina Armaselu
  • Centre for Contemporary and Digital History (C²DH), University of Luxembourg, Luxembourg
Elena-Simona Apostol
  • Department of Computer Science and Engineering, Faculty of Automatic Control and Computer, University Politehnica of Bucharest, Romania
Anas Fahad Khan
  • Institute for Computational Linguistics «A. Zampolli», National Research Council of Italy, Pisa, Italy
Chaya Liebeskind
  • Department of Computer Science, Jerusalem College of Technology, Israel
Barbara McGillivray
  • Theoretical and Applied Linguistics, Faculty of Modern and Medieval Languages and Linguistics, University of Cambridge, UK
  • The Alan Turing Institute, London, UK
Ciprian-Octavian Truică
  • Department of Computer Science and Engineering, Faculty of Automatic Control and Computer, University Politehnica of Bucharest, Romania
Giedrė Valūnaitė Oleškevičienė
  • Institute of Humanities, Mykolas Romeris University, Vilnius, Lithuania

Florentina Armaselu, Elena-Simona Apostol, Anas Fahad Khan, Chaya Liebeskind, Barbara McGillivray, Ciprian-Octavian Truică, and Giedrė Valūnaitė Oleškevičienė. HISTORIAE, History of Socio-Cultural Transformation as Linguistic Data Science. A Humanities Use Case. In 3rd Conference on Language, Data and Knowledge (LDK 2021). Open Access Series in Informatics (OASIcs), Volume 93, pp. 34:1-34:13, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021)


The paper proposes an interdisciplinary approach that draws on methods from disciplines such as the history of concepts, linguistics, natural language processing (NLP) and the Semantic Web to create a comparative framework for detecting semantic change in multilingual historical corpora and generating diachronic ontologies as linguistic linked open data (LLOD). Initiated as a use case (UC4.2.1) within the COST Action Nexus Linguarum, European network for Web-centred linguistic data science, the study will explore emerging trends in knowledge extraction, analysis and representation from linguistic data science, and apply the devised methodology to datasets in the humanities to trace the evolution of concepts from the domain of socio-cultural transformation. The paper will describe the main elements of the methodological framework and preliminary planning of the intended workflow.

Subject Classification

ACM Subject Classification
  • Computing methodologies → Semantic networks
  • Computing methodologies → Ontology engineering
  • Computing methodologies → Temporal reasoning
  • Computing methodologies → Lexical semantics
  • Computing methodologies → Language resources
  • Computing methodologies → Information extraction

Keywords
  • linguistic linked open data
  • natural language processing
  • semantic change
  • diachronic ontologies
  • digital humanities


