Search Results

Documents authored by Goosen, Twan


Document
Opening Digitized Newspapers Corpora: Europeana’s Full-Text Data Interoperability Case

Authors: Nuno Freire, Antoine Isaac, Twan Goosen, Daan Broeder, Hugo Manguinhas, and Valentine Charles

Published in: OASIcs, Volume 70, 2nd Conference on Language, Data and Knowledge (LDK 2019)


Abstract
Cultural heritage institutions hold collections of printed newspapers that are valuable resources for the study of history, linguistics and other Digital Humanities scientific domains. Effective retrieval of newspapers content based on metadata only is a task nearly impossible, making the retrieval based on (digitized) full-text particularly relevant. Europeana, Europe’s Digital Library, is in the position to provide access to large newspapers collections with full-text resources. Full-text corpora are also relevant for Europeana’s objective of promoting the usage of cultural heritage resources for use within research infrastructures. We have derived requirements for aggregating and publishing Europeana’s newspapers full-text corpus in an interoperable way, based on investigations into the specific characteristics of cultural data, the needs of two research infrastructures (CLARIN and EUDAT) and the practices being promoted in the International Image Interoperability Framework (IIIF) community. We have then defined a "full-text profile" for the Europeana Data Model, which is being applied to Europeana’s newspaper corpus.

Cite as

Nuno Freire, Antoine Isaac, Twan Goosen, Daan Broeder, Hugo Manguinhas, and Valentine Charles. Opening Digitized Newspapers Corpora: Europeana’s Full-Text Data Interoperability Case. In 2nd Conference on Language, Data and Knowledge (LDK 2019). Open Access Series in Informatics (OASIcs), Volume 70, pp. 22:1-22:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{freire_et_al:OASIcs.LDK.2019.22,
  author =	{Freire, Nuno and Isaac, Antoine and Goosen, Twan and Broeder, Daan and Manguinhas, Hugo and Charles, Valentine},
  title =	{{Opening Digitized Newspapers Corpora: Europeana’s Full-Text Data Interoperability Case}},
  booktitle =	{2nd Conference on Language, Data and Knowledge (LDK 2019)},
  pages =	{22:1--22:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-105-4},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{70},
  editor =	{Eskevich, Maria and de Melo, Gerard and F\"{a}th, Christian and McCrae, John P. and Buitelaar, Paul and Chiarcos, Christian and Klimek, Bettina and Dojchinovski, Milan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.LDK.2019.22},
  URN =		{urn:nbn:de:0030-drops-103869},
  doi =		{10.4230/OASIcs.LDK.2019.22},
  annote =	{Keywords: Metadata, Full-text, Interoperability, Data aggregation, Cultural Heritage, Research Infrastructures}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail