Representing the Under-Represented: a Dataset of Post-Colonial, and Migrant Writers

Authors: Marco Antonio Stranisci, Viviana Patti, Rossana Damiano

  • 14 pages

Author Details

Marco Antonio Stranisci
  • Department of Computer Science, University of Turin, Italy
Viviana Patti
  • Department of Computer Science, University of Turin, Italy
Rossana Damiano
  • Department of Computer Science, University of Turin, Italy

Cite As

Marco Antonio Stranisci, Viviana Patti, and Rossana Damiano. Representing the Under-Represented: a Dataset of Post-Colonial, and Migrant Writers. In 3rd Conference on Language, Data and Knowledge (LDK 2021). Open Access Series in Informatics (OASIcs), Volume 93, pp. 7:1-7:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Abstract

In today’s media and in the Web of Data, non-Western people remain under-represented. We address this issue by presenting a pipeline for collecting and semantically encoding Wikipedia biographies of writers who are under-represented because of their non-Western origins or their legal status in a country. We describe the two main components of the ontology, together with a framework for mapping textual biographies to their corresponding semantic representations. We also describe the data set and provide examples of converting biographical texts to the ontology classes.

Subject Classification

ACM Subject Classification
  • Information systems → Ontologies

Keywords
  • Ontologies
  • Knowledge Graph
  • Language Resources
  • Migrations



