Towards Ordinal Data Science

Authors: Gerd Stumme, Dominik Dürrschnabel, and Tom Hanika

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge

Order is one of the main instruments to measure the relationship between objects in (empirical) data. However, compared to methods that use numerical properties of objects, the number of ordinal methods developed is rather small. One reason for this is the limited availability of computational resources in the last century that would have been required for ordinal computations. Another reason - particularly important for this line of research - is that order-based methods are often seen as too mathematically rigorous to apply to real-world data. In this paper, we will therefore discuss different means for measuring and ‘calculating’ with ordinal structures - a specific class of directed graphs - and show how to infer knowledge from them. Our aim is to establish Ordinal Data Science as a fundamentally new research agenda. Besides cross-fertilization with other cornerstone machine learning and knowledge representation methods, a broad range of disciplines will benefit from this endeavor, including psychology, sociology, economics, web science, knowledge engineering, and scientometrics.

Gerd Stumme, Dominik Dürrschnabel, and Tom Hanika. Towards Ordinal Data Science. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 6:1-6:39, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Stumme, Gerd and D\"{u}rrschnabel, Dominik and Hanika, Tom},
  title =	{{Towards Ordinal Data Science}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{6:1--6:39},
  ISSN =	{2942-7517},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.6},
  URN =		{urn:nbn:de:0030-drops-194801},
  doi =		{10.4230/TGDK.1.1.6},
  annote =	{Keywords: Order relation, data science, relational theory of measurement, metric learning, general algebra, lattices, factorization, approximations and heuristics, factor analysis, visualization, browsing, explainability}
08391 Abstracts Collection – Social Web Communities

Authors: Harith Alani, Steffen Staab, and Gerd Stumme

Published in: Dagstuhl Seminar Proceedings, Volume 8391, Social Web Communities (2008)

From September 21st to September 26th 2008, the Dagstuhl Seminar 08391 "Social Web Communities" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Harith Alani, Steffen Staab, and Gerd Stumme. 08391 Abstracts Collection – Social Web Communities. In Social Web Communities. Dagstuhl Seminar Proceedings, Volume 8391, pp. 1-10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)

  author =	{Alani, Harith and Staab, Steffen and Stumme, Gerd},
  title =	{{08391 Abstracts Collection – Social Web Communities}},
  booktitle =	{Social Web Communities},
  pages =	{1--10},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{8391},
  editor =	{Harith Alani and Steffen Staab and Gerd Stumme},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08391.1},
  URN =		{urn:nbn:de:0030-drops-17928},
  doi =		{10.4230/DagSemProc.08391.1},
  annote =	{Keywords: Social Web Communities, Social Network Analysis, Collaborative Tagging}
08391 Executive Summary – Social Web Communities

Authors: Harith Alani, Steffen Staab, and Gerd Stumme

Published in: Dagstuhl Seminar Proceedings, Volume 8391, Social Web Communities (2008)

Blogs, Wikis, and Social Bookmark Tools have rapidly emerged on the Web. The reasons for their immediate success are that people are happy to share information, and that these tools provide an infrastructure for doing so without requiring any specific skills. At the moment, there exists no foundational research for these systems, and they provide only very simple structures for organising knowledge. Individual users create their own structures, but these can currently not be exploited for knowledge sharing. The objective of the seminar was to provide theoretical foundations for upcoming Web 2.0 applications and to investigate further applications that go beyond bookmark- and file-sharing. The main research question can be summarized as follows: How will current and emerging resource sharing systems support users to leverage more knowledge and power from the information they share on Web 2.0 applications? Research areas like Semantic Web, Machine Learning, Information Retrieval, Information Extraction, Social Network Analysis, Natural Language Processing, Library and Information Sciences, and Hypermedia Systems have been working for a while on these questions. In the workshop, researchers from these areas came together to assess the state of the art and to set up a road map describing the next steps towards the next generation of social software.

Harith Alani, Steffen Staab, and Gerd Stumme. 08391 Executive Summary – Social Web Communities. In Social Web Communities. Dagstuhl Seminar Proceedings, Volume 8391, pp. 1-5, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)

  author =	{Alani, Harith and Staab, Steffen and Stumme, Gerd},
  title =	{{08391 Executive Summary – Social Web Communities}},
  booktitle =	{Social Web Communities},
  pages =	{1--5},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{8391},
  editor =	{Harith Alani and Steffen Staab and Gerd Stumme},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08391.2},
  URN =		{urn:nbn:de:0030-drops-17864},
  doi =		{10.4230/DagSemProc.08391.2},
  annote =	{Keywords: }
08391 Group Summary – Mining for Social Serendipity

Authors: Alexandre Passant, Ian Mulvany, Peter Mika, Nicolas Maisonneuve, Alexander Löser, Ciro Cattuto, Christian Bizer, Christian Bauckhage, and Harith Alani

Published in: Dagstuhl Seminar Proceedings, Volume 8391, Social Web Communities (2008)

A common social problem at an event in which people do not personally know all of the other participants is the natural tendency for cliques to form and for discussions to mainly happen between people who already know each other. This limits the possibility for people to make interesting new acquaintances and acts as a retarding force in the creation of new links in the social web. Encouraging users to socialize with people they don't know by revealing to them hidden surprising links could help to improve the diversity of interactions at an event. The goal of this paper is to propose a method for detecting "surprising" relationships between people attending an event. By "surprising" relationships we mean those relationships that are not known a priori, and that imply shared information not directly related to the local context of the event (location, interests, contacts) at which the meeting takes place. To demonstrate and test our concept we used the Flickr community. We focused on a community of users associated with a social event (a computer science conference) and represented in Flickr by means of a photo pool devoted to the event. We use Flickr metadata (tags) to mine for user similarity not related to the context of the event, as represented in the corresponding Flickr group. For example, we look for two group members who have been in the same highly specific place (identified by means of geo-tagged photos), but are not friends of each other and share no other common interests or social neighborhood.

Alexandre Passant, Ian Mulvany, Peter Mika, Nicolas Maisonneuve, Alexander Löser, Ciro Cattuto, Christian Bizer, Christian Bauckhage, and Harith Alani. 08391 Group Summary – Mining for Social Serendipity. In Social Web Communities. Dagstuhl Seminar Proceedings, Volume 8391, pp. 1-11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)

  author =	{Passant, Alexandre and Mulvany, Ian and Mika, Peter and Maisonneuve, Nicolas and L\"{o}ser, Alexander and Cattuto, Ciro and Bizer, Christian and Bauckhage, Christian and Alani, Harith},
  title =	{{08391 Group Summary – Mining for Social Serendipity}},
  booktitle =	{Social Web Communities},
  pages =	{1--11},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{8391},
  editor =	{Harith Alani and Steffen Staab and Gerd Stumme},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08391.3},
  URN =		{urn:nbn:de:0030-drops-17910},
  doi =		{10.4230/DagSemProc.08391.3},
  annote =	{Keywords: Serendipity, online activity, context, ubiquitous computing}
08391 Group Summary – The Berners-Lee Hypothesis: Power laws and Group Structure in Flickr

Authors: Andrea Baldassarri, Alain Barrat, Andrea Capocci, Harry Halpin, Ulrike Lehner, Jose Ramasco, Valentin Robu, and Dario Taraborelli

Published in: Dagstuhl Seminar Proceedings, Volume 8391, Social Web Communities (2008)

An intriguing hypothesis, first suggested by Tim Berners-Lee, is that the structure of online groups should conform to a power law distribution. We relate this hypothesis to earlier work around the Dunbar Number, which is a supposed limit to the number of social contacts a user can have in a group. As preliminary results, we show that the number of contacts of a typical Flickr user, the number of groups a user belongs to, and the size of Flickr groups all follow power law distributions. Furthermore, we find some unexpected differences in the internal structure of public and private Flickr groups. For future research, we further operationalize the Berners-Lee hypothesis by supposing that users with a group membership distribution that follows a power law will produce more content for social Web systems.

Andrea Baldassarri, Alain Barrat, Andrea Capocci, Harry Halpin, Ulrike Lehner, Jose Ramasco, Valentin Robu, and Dario Taraborelli. 08391 Group Summary – The Berners-Lee Hypothesis: Power laws and Group Structure in Flickr. In Social Web Communities. Dagstuhl Seminar Proceedings, Volume 8391, pp. 1-11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)

  author =	{Baldassarri, Andrea and Barrat, Alain and Capocci, Andrea and Halpin, Harry and Lehner, Ulrike and Ramasco, Jose and Robu, Valentin and Taraborelli, Dario},
  title =	{{08391 Group Summary – The Berners-Lee Hypothesis: Power laws and Group Structure in Flickr}},
  booktitle =	{Social Web Communities},
  pages =	{1--11},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{8391},
  editor =	{Harith Alani and Steffen Staab and Gerd Stumme},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08391.4},
  URN =		{urn:nbn:de:0030-drops-17893},
  doi =		{10.4230/DagSemProc.08391.4},
  annote =	{Keywords: Social group flickr powerlaw}
08391 Group Summary – The Evolution and Dynamics of Research Networks

Authors: Vladimir Batagelj, Bettina Hoser, Claudia Müller, Steffen Staab, and Gerd Stumme

Published in: Dagstuhl Seminar Proceedings, Volume 8391, Social Web Communities (2008)

Existing collaboration and innovation in scientific communities can be enhanced by understanding the underlying patterns and hidden relations. Social network analysis is an appropriate method to reveal such patterns. Nevertheless, research in this area is mainly focused on social networks. One promising approach is to use homophily networks as well. Furthermore, extending the static network model to a dynamic one makes it possible to understand existing interdependencies in these networks. A mathematical description of possible analyses is given. Finally, resulting research questions are illustrated and the necessity of an interdisciplinary research approach is pointed out.

Vladimir Batagelj, Bettina Hoser, Claudia Müller, Steffen Staab, and Gerd Stumme. 08391 Group Summary – The Evolution and Dynamics of Research Networks. In Social Web Communities. Dagstuhl Seminar Proceedings, Volume 8391, pp. 1-8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)

  author =	{Batagelj, Vladimir and Hoser, Bettina and M\"{u}ller, Claudia and Staab, Steffen and Stumme, Gerd},
  title =	{{08391 Group Summary – The Evolution and Dynamics of Research Networks}},
  booktitle =	{Social Web Communities},
  pages =	{1--8},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{8391},
  editor =	{Harith Alani and Steffen Staab and Gerd Stumme},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08391.5},
  URN =		{urn:nbn:de:0030-drops-17906},
  doi =		{10.4230/DagSemProc.08391.5},
  annote =	{Keywords: Homophily networks, social networks, evolution, scientific community}
08391 Working Group Summary – Analyzing Tag Semantics Across Tagging Systems

Authors: Dominik Benz, Marko Grobelnik, Andreas Hotho, Robert Jäschke, Dunja Mladenic, Vito D. P. Servedio, Sergej Sizov, and Martin Szomszor

Published in: Dagstuhl Seminar Proceedings, Volume 8391, Social Web Communities (2008)

The objective of our group was to exploit state-of-the-art Information Retrieval methods for finding associations and dependencies between tags, capturing and representing differences in tagging behavior and vocabulary of various folksonomies, with the overall aim to better understand the semantics of tags and the tagging process. To this end, we analyze the semantic content of tags in the Flickr and Delicious folksonomies. We find that: tag context similarity leads to meaningful results in Flickr, despite its narrow folksonomy character; the comparison of tags across Flickr and Delicious shows little semantic overlap, with tags in Flickr associated more with visual aspects and tags in Delicious seemingly associated more with technological ones; there are regions in the tag-tag space, equipped with the cosine similarity metric, that are characterized by high density; the order of tags inside a post has semantic relevance.

Dominik Benz, Marko Grobelnik, Andreas Hotho, Robert Jäschke, Dunja Mladenic, Vito D. P. Servedio, Sergej Sizov, and Martin Szomszor. 08391 Working Group Summary – Analyzing Tag Semantics Across Tagging Systems. In Social Web Communities. Dagstuhl Seminar Proceedings, Volume 8391, pp. 1-15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)

  author =	{Benz, Dominik and Grobelnik, Marko and Hotho, Andreas and J\"{a}schke, Robert and Mladenic, Dunja and Servedio, Vito D. P. and Sizov, Sergej and Szomszor, Martin},
  title =	{{08391 Working Group Summary – Analyzing Tag Semantics Across Tagging Systems}},
  booktitle =	{Social Web Communities},
  pages =	{1--15},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{8391},
  editor =	{Harith Alani and Steffen Staab and Gerd Stumme},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08391.6},
  URN =		{urn:nbn:de:0030-drops-17854},
  doi =		{10.4230/DagSemProc.08391.6},
  annote =	{Keywords: Social Web Communities, Folksonomy, Tag, Semantics}
A Short Note on Social-Semiotic Networks from the Point of View of Quantitative Semantics

Authors: Alexander Mehler

Published in: Dagstuhl Seminar Proceedings, Volume 8391, Social Web Communities (2008)

In this extended abstract we discuss four related characteristics of semantic spaces as the standard model of meaning representation in quantitative semantics. We argue that these characteristics are challenged from the point of view of social web communities and the possibilities which they offer in terms of exploring semantic and pragmatic data. More specifically, we plead for a reconstruction of the weak contextual hypothesis in order to account for non-linguistic, pragmatic aspects of context. Finally, we mention two consequences of such a pragmatic turn, namely in the areas of named entity recognition and language evolution.

Alexander Mehler. A Short Note on Social-Semiotic Networks from the Point of View of Quantitative Semantics. In Social Web Communities. Dagstuhl Seminar Proceedings, Volume 8391, pp. 1-5, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)

  author =	{Mehler, Alexander},
  title =	{{A Short Note on Social-Semiotic Networks from the Point of View of Quantitative Semantics}},
  booktitle =	{Social Web Communities},
  pages =	{1--5},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{8391},
  editor =	{Harith Alani and Steffen Staab and Gerd Stumme},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08391.7},
  URN =		{urn:nbn:de:0030-drops-17884},
  doi =		{10.4230/DagSemProc.08391.7},
  annote =	{Keywords: Semantic space, social web community, quantitative semantics, weak contextual hypothesis}
Information-Theoretic Models of Tagging

Authors: Harry Halpin

Published in: Dagstuhl Seminar Proceedings, Volume 8391, Social Web Communities (2008)

In earlier work, we showed using Kullback-Leibler (KL) divergence that tags form a power law distribution very quickly. Yet there is one major observed deviation from the ideal power law distribution for the top 25 tags, a large "bump" in increased frequency for the top 7-10 tags. We originally hypothesized that the "bump" in the data could be caused by a preferential attachment mechanism. However, an experiment that tested both feedback and no-feedback conditions over tagging (200+ subjects) shows that the power law distribution arises regardless of any feedback effect. We hypothesize that an information-theoretic analysis of tags leads to a power law without feedback.

Harry Halpin. Information-Theoretic Models of Tagging. In Social Web Communities. Dagstuhl Seminar Proceedings, Volume 8391, pp. 1-2, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)

  author =	{Halpin, Harry},
  title =	{{Information-Theoretic Models of Tagging}},
  booktitle =	{Social Web Communities},
  pages =	{1--2},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{8391},
  editor =	{Harith Alani and Steffen Staab and Gerd Stumme},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08391.8},
  URN =		{urn:nbn:de:0030-drops-17876},
  doi =		{10.4230/DagSemProc.08391.8},
  annote =	{Keywords: Tagging information theory feedback}
Ontology Merging with Formal Concept Analysis

Authors: Gerd Stumme

Published in: Dagstuhl Seminar Proceedings, Volume 4391, Semantic Interoperability and Integration (2005)

In this short paper, we summarize two methods for merging ontologies: FCA-Merge and OntEx. Both methods are based on Formal Concept Analysis.

Gerd Stumme. Ontology Merging with Formal Concept Analysis. In Semantic Interoperability and Integration. Dagstuhl Seminar Proceedings, Volume 4391, pp. 1-5, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2005)

  author =	{Stumme, Gerd},
  title =	{{Ontology Merging with Formal Concept Analysis}},
  booktitle =	{Semantic Interoperability and Integration},
  pages =	{1--5},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2005},
  volume =	{4391},
  editor =	{Y. Kalfoglou and M. Schorlemmer and A. Sheth and S. Staab and M. Uschold},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.04391.15},
  URN =		{urn:nbn:de:0030-drops-493},
  doi =		{10.4230/DagSemProc.04391.15},
  annote =	{Keywords: Ontology Engineering , Ontology Merging , Formal Concept Analysis}
