Animacy Detection in Stories

Authors: Folgert Karsdorp, Marten van der Meulen, Theo Meder, Antal van den Bosch



Cite As

Folgert Karsdorp, Marten van der Meulen, Theo Meder, and Antal van den Bosch. Animacy Detection in Stories. In 6th Workshop on Computational Models of Narrative (CMN 2015). Open Access Series in Informatics (OASIcs), Volume 45, pp. 82-97, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)
https://doi.org/10.4230/OASIcs.CMN.2015.82

Abstract

This paper presents a linguistically uninformed computational model for animacy classification. The model combines word n-grams with lower-dimensional word embedding representations learned from a web-scale corpus. We compare the model to a number of linguistically informed models that use features such as dependency tags, and show that it achieves competitive results. We apply our animacy classifier to a large collection of Dutch folktales to obtain a list of all characters in the stories. We then draw a semantic map of all automatically extracted characters, which provides a unique entry point to the collection.

Keywords
  • animacy detection
  • word embeddings
  • folktales
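
The approach described in the abstract, word n-gram context features combined with dense word embeddings, can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two-dimensional embedding values, the three English training sentences, the feature names, and the choice of a logistic regression trained by stochastic gradient descent (the abstract does not name the classifier). The sketch uses only the Python standard library.

```python
import math
import random

# Hypothetical 2-d vectors standing in for embeddings learned from a large corpus.
EMBEDDINGS = {
    "king": [0.90, 0.10], "wolf": [0.80, 0.20], "girl": [0.85, 0.15],
    "stone": [0.10, 0.90], "table": [0.15, 0.80], "house": [0.20, 0.85],
}

# Toy training data: (tokens, indices of animate tokens). Invented for this sketch.
TRAIN = [
    ("the king spoke to the wolf".split(), {1, 5}),
    ("the girl sat on the stone".split(), {1}),
    ("the table stood in the house".split(), set()),
]


def features(tokens, i, window=1):
    """Return sparse n-gram context features plus embedding dimensions for token i."""
    words = [t.lower() for t in tokens]
    feats = {}
    # Unigram features for the target word and its neighbours.
    for offset in range(-window, window + 1):
        j = i + offset
        word = words[j] if 0 <= j < len(words) else "<pad>"
        feats[f"w[{offset}]={word}"] = 1.0
    # One bigram feature: previous word joined with the target word.
    left = words[i - 1] if i > 0 else "<pad>"
    feats[f"bigram={left}_{words[i]}"] = 1.0
    # Dense embedding dimensions; unknown words get a zero vector.
    for d, value in enumerate(EMBEDDINGS.get(words[i], [0.0, 0.0])):
        feats[f"emb[{d}]"] = value
    return feats


def build_examples(data):
    """Turn annotated sentences into (feature dict, 0/1 label) pairs, one per token."""
    return [(features(tokens, i), 1.0 if i in animate else 0.0)
            for tokens, animate in data for i in range(len(tokens))]


def predict_proba(weights, feats):
    """Logistic model: probability that the token is animate."""
    z = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=50, lr=0.5, seed=0):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    rng = random.Random(seed)
    examples = list(examples)
    weights = {}
    for _ in range(epochs):
        rng.shuffle(examples)
        for feats, label in examples:
            error = label - predict_proba(weights, feats)
            for name, value in feats.items():
                weights[name] = weights.get(name, 0.0) + lr * error * value
    return weights


def is_animate(weights, tokens, i):
    """Classify token i as animate when its predicted probability exceeds 0.5."""
    return predict_proba(weights, features(tokens, i)) > 0.5


weights = train(build_examples(TRAIN))

if __name__ == "__main__":
    for sentence, index in [("the wolf spoke", 1), ("the stone stood", 1)]:
        tokens = sentence.split()
        print(tokens[index], is_animate(weights, tokens, index))
```

The sparse n-gram features let the classifier memorise informative local contexts, while the embedding dimensions let it generalise to words that behave like known animate or inanimate nouns. A real system would replace the toy vectors with embeddings trained on a large corpus and would use far more annotated data.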
