Social XR: The Future of Communication and Collaboration

Report from Dagstuhl Seminar 23482
Mark Billinghurst (Editor / Organizer), University of South Australia – Adelaide, AU
Pablo Cesar (Editor / Organizer), CWI – Amsterdam, NL
Mar Gonzalez-Franco (Editor / Organizer), Google – Seattle, US
Katherine Isbister (Editor / Organizer), University of California at Santa Cruz, US
Julie Williamson (Editor / Organizer), University of Glasgow, GB
Alexandra Kitson (Editorial Assistant / Collector), Simon Fraser University – Surrey, CA

Abstract

We are rapidly moving towards a hybrid world where communication and collaboration occur in reality, virtuality, and everywhere in between. But are current technologies ready for such a shift? Social Extended Reality (XR) systems promise to overcome the limitations of current real-time teleconferencing systems, enabling a better sense of immersion, enhancing the sense of presence, and fostering more successful interpersonal interactions. The possibility for familiar, meaningful, and strategically heightened social interaction in XR has positioned immersive technology as the future of real-time communication and collaboration. This Dagstuhl Seminar gathered academics and practitioners from different disciplines to address the open challenges of immersive interaction, including the ethical, legal, and societal aspects of possible futures. Participants shared their work through rapid talks and XR demos. The seminar organizers provided provocation talks before small groups convened to discuss three topics over three days: XR design approaches, ethics and values; capturing and modelling; and proxemics, metrics, instrumentation and evaluation. We conclude with a set of grand challenges in the field of social XR in the areas of empathic computing, blended reality, assets and datasets, and survey instruments.

Keywords and phrases:
Social XR, Augmented Reality, Virtual Reality, Extended Reality, Social Computing
Seminar:
November 26 – December 1, 2023 – https://www.dagstuhl.de/23482
2012 ACM Subject Classification:
Human-centered computing → Mixed / augmented reality; Human-centered computing → Collaborative and social computing
Copyright and License:
Except where otherwise noted, content of this report is licensed under a Creative Commons BY 4.0 International license

1 Executive Summary

Mark Billinghurst (University of South Australia, Adelaide, AU)
Pablo Cesar (CWI, Amsterdam, NL)
Mar Gonzalez-Franco (Google, Seattle, US)
Katherine Isbister (University of California at Santa Cruz, US)
Alexandra Kitson (Simon Fraser University, Surrey, CA)
Julie Williamson (University of Glasgow, GB)

License: Creative Commons BY 4.0 International license © Mark Billinghurst, Pablo Cesar, Mar Gonzalez-Franco, Katherine Isbister, Alexandra Kitson, and Julie Williamson

This Dagstuhl Seminar focused on Social XR and the future of communication and collaboration, with a particular interest in:

  • Capturing and modelling of humans, ensuring realistic representation of the users and thus allowing for realistic and immersive experiences;

  • Digital proxemics and social metrics, that help and enrich communication and collaboration between the participants;

  • Instrumentation and evaluation, focusing on the possibility of evaluating and monitoring the experience of the users;

  • Principles of Social XR, for making sure that the right values and principles are followed;

  • Exploration of design approaches for Social XR, that support communication and connection by enabling and strategically heightening social signalling and dynamics.

To start the seminar, each participant presented relevant social XR research through rapid talks (see Section 3 and Figure 2), which were then used to finalize the topics to discuss on the remaining days of the seminar. We settled on three topics that participants in small groups would focus on, led by members of the organizing committee with a provocation at the start of each day:
TUESDAY: Social XR Design Approaches, Ethics, and Values led by Katherine Isbister and Alexandra Kitson,
WEDNESDAY: Capturing and Modeling Social XR led by Mark Billinghurst and Mar Gonzalez-Franco,
THURSDAY: Proxemics, Metrics, Instrumentation, and Evaluation of Social XR led by Pablo Cesar and Julie Williamson,
THURSDAY: Grand Challenges of Social XR led by the organizers.

As a major result of the seminar, we identified the following grand challenges:

  1. Subjectivity of scientific evaluation of empathy

  2. Ethical concerns of sharing physiological data and social XR relationships

  3. Ethics of the growing digital divide

  4. Blending realities, beyond visual and audio

  5. Semantic understanding of the physical and social context

  6. Social stitching to create a cohesive scene or world

  7. Preserving privacy given the increasing fidelity of capture devices

  8. Tension between transparency and social superpowers

  9. Devising a shared platform that facilitates collaborative recording, replaying, and immersive experiences

  10. New metrics and questionnaires for social XR

In addition to the rapid talks and topic discussions, participants shared demos of their work on Tuesday:

  • Alexandra Kitson – Embodied Telepresent Connection: An interactive art piece designed to support connection and pseudohaptics through visuals and audio http://ispace.iat.sfu.ca/project/etc/

  • Aljosa Smolic – Volograms: record a video and turn it into an AR experience https://www.volograms.com/

  • Anthony Steed – Ubiq: a free, open-source networking library for research, teaching and development https://ubiq.online/

  • Zerrin Yumak – FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis using Self-Supervised Speech Representation Learning https://github.com/galib360/FaceXHuBERT

Social activities in the music room, cellar, games room, and sauna led to some discussions around capturing and modelling leading into Wednesday’s session (see this social media post for some examples) as well as an impromptu research study on cross-reality asymmetrical co-located social games by playing two games: DAVIGO and Acron.

Figure 1: Participants enjoy social events around Dagstuhl.

In terms of outputs and future collaborations, we plan to share our findings in an opinion article or forum. We have analyzed, written, and submitted the results of the impromptu research study in the cellar and games room to a top-tier conference in our field. Additionally, we have discussed a potential book on social XR with the seminar participants based on the topics of this seminar. We plan to hold follow-up events and workshops at relevant conferences to further explore the grand challenges that we identified through this seminar.

2 Table of Contents

Executive Summary

Mark Billinghurst, Pablo Cesar, Mar Gonzalez-Franco, Katherine Isbister, Alexandra Kitson, and Julie Williamson

Overview of Talks

Shared Realities in Social XR

Sun Joo (Grace) Ahn

Towards Volumetric Video Conferencing

Pablo Cesar

Ubiquitous Metadata: Integrated Fingerprints for Real-World Object Identification and Augmentation

M. Doga Dogan

From Multi-modal to Multi-device interactions in XR

Eric J. Gonzalez

Perceptual Manipulations in XR During Face-to-Face Social Interactions

Jan Gugenheimer

Meaningful Social VR Environments

Linda Hirsch

Designing Social VR Meeting Spaces

Katherine Isbister

Social Communication and Connection in XR

Alexandra Kitson

Augmented Social Perception

Kai Kunze

Designing and Evaluating User Experiences in Social Virtual Reality (VR)

Jie Li

Philosophy and XR Technology

Neil McDonnell

The Empathic Metaverse: An Assistive Bioresponsive Platform for Emotional Experience Sharing in Social XR

Yun Suen Pai

Social VR for Social Skills Training

Sylvia Xueni Pan

Goal-adaptive Collaborative Spatial Experiences with GenAI

Payod Panda

What Can Social XR Do for Us that Traditional Communication Technology Cannot, and How Can We Know?

Alexander Raake

Instrumenting for Understanding Social XR Experiences

David A. Shamma

Bringing Real People into XR

Aljosa Smolic

Requirements for Future Social XR Applications

Anthony Steed

Adaptive Social XR

Kashyap Todi

Human-centric Factors in Immersive Communication

Irene Viola

Fostering Well-being, Communication & Empathy with XR

Nadine Wagener

AI-driven 3D Digital Humans in XR

Zerrin Yumak

TUESDAY Working Groups

Group A: Synchronizing Asymmetric Individual & Shared User Perspectives in XR

Linda Hirsch, Katherine Isbister, Payod Panda, David Ayman Shamma, Kashyap Todi, Zerrin Yumak

Group B: What Kind of XR Future Do We Hope to Have? (Or Rather: “What Kind of Aspects Do We Foresee to be Relevant for an XR Future”?)

Eric J. Gonzalez, Josh Greenberg, Jie Li, Alexander Raake, Aljosa Smolic

Group C: Redefining Common Grounds in Social XR

Sun Joo (Grace) Ahn, M. Doga Dogan, Jan Gugenheimer, Yun Suen Pai, Sylvia Xueni Pan

Group D: Development and Implementation of Social XR Systems

Kai Kunze, Neil McDonnell, Anthony Steed, Irene Viola, Nadine Wagener

TUESDAY Demos

WEDNESDAY Working Groups

THURSDAY Working Groups

Group A: Empathic Computing

Sun Joo (Grace) Ahn, Mark Billinghurst, Linda Hirsch, Alexandra Kitson, Yun Suen Pai, Nadine Wagener

Group B: Blended Reality

M. Doga Dogan, Eric J. Gonzalez, Katherine Isbister, Kai Kunze, Payod Panda, Sylvia Xueni Pan

Group C: Assets and Datasets

Jan Gugenheimer, David Ayman Shamma, Aljosa Smolic, Kashyap Todi, Zerrin Yumak

Group D: Survey Instruments

Jie Li, Sylvia Xueni Pan, Alexander Raake, Anthony Steed, Irene Viola

Participants

3 Overview of Talks

Figure 2: Participants present their work on social XR. Left: Julie Williamson talks about the seminar themes. Center: participants listen to presentations and provocations. Right: Anthony Steed presents Ubiq.

3.1 Shared Realities in Social XR

Sun Joo (Grace) Ahn (University of Georgia – Athens, US; sjahn@uga.edu)

License: Creative Commons BY 4.0 International license © Sun Joo (Grace) Ahn

I am the founding director of the Center for Advanced Computer-Human Ecosystems at the University of Georgia (https://www.ugavr.com). Our work has looked at how virtual experiences can transfer into the physical world to continue changing attitudes, behaviors, and worldviews. In particular, social XR can provide a common ground of shared experiences to multiple users, leading to stronger group cohesion. In addition to sharing experiences interpersonally, social XR allows human users to share experiences with virtual agents. Although prior literature in XR has generally focused on embodiment as the way to give users access to the experiences of others, emerging evidence suggests that shared experiences that allow users to walk alongside others are more effective than briefly embodying an avatar body. Our recent research project investigates how social XR with other users and virtual agents can establish a sense of shared reality and lead to the generation of collective minds and empathy. We will present how we have integrated community-based participatory research approaches in developing a prototype virtual experience of sharing the reality of redlining, a past zoning policy that has created long-term structural inequity, leading to negative public health issues for marginalized communities across the US.

3.2 Towards Volumetric Video Conferencing

Pablo Cesar (CWI – Amsterdam, NL; P.S.Cesar@cwi.nl)

License: Creative Commons BY 4.0 International license © Pablo Cesar

I lead the Distributed and Interactive Systems (DIS) group at Centrum Wiskunde & Informatica (CWI), the national research institute for mathematics and computer science in the Netherlands, and I am Professor (“Human-Centered Multimedia Systems” Chair) at TU Delft, in the Multimedia Computing group. The work in the group combines human-computer interaction and multimedia systems, focusing on facilitating and improving the way people use interactive systems and how people communicate with each other. We combine data science with a strong human-centric, empirical approach to understand the experience of users. This enables us to design and develop next-generation intelligent and empathic systems. With Social Extended Reality (XR) emerging as a new medium, where users can remotely experience immersive content with others, the vision of a true feeling of ‘being there together’ has become a realistic goal. Together with my group, we have been working towards such a goal, including the development and deployment of an open-source volumetric video conference system, VR2Gather. The system allows for highly realistic digital humans, based on point cloud capture, encoding, and transmission. Based on results from practical case studies in different sectors (e.g., cultural heritage, performing arts…) in projects such as 5D Culture, TRANSMIXR, and MediaScape XR, we can better understand the existing challenges and discover the opportunities of this new medium.

3.3 Ubiquitous Metadata: Integrated Fingerprints for Real-World Object Identification and Augmentation

M. Doga Dogan (MIT – Cambridge, US; doga@mit.edu)

License: Creative Commons BY 4.0 International license © M. Doga Dogan

In the evolving landscape of immersive experiences, my research focuses on seamlessly integrating physical objects with their digital counterparts through innovative identification, sensing, and tagging methods. By embedding machine-readable tags that convey an object’s identity, origin, and function, I establish gateways to “ubiquitous metadata” in the real world. This concept, akin to digital file metadata, empowers users to augment real-world objects with multimedia content, foster interactive experiences in AR/VR, and retrieve contextual information via digital product passports.

My work intersects with Social XR by enabling collaborative interactions in real-world scenarios. By contextualizing and identifying objects in XR, users may for example collaboratively annotate, share, and interact with their physical surroundings. Whether taking or checking notes during shopping, or enhancing home communication with dynamic, interactive messages, my research aims to enhance the intersection of AR and real-world collaboration. During the seminar, I am eager to explore diverse Social XR applications and address critical considerations such as privacy implications in this evolving landscape.

3.4 From Multi-modal to Multi-device interactions in XR

Eric J. Gonzalez (Google – Seattle, US; ejgonz@google.com)

License: Creative Commons BY 4.0 International license © Eric J. Gonzalez

I am a researcher in the Blended Interactions Research & Devices Lab at Google, where I lead the exploration of multi-modal and multi-device experiences for XR. Currently, my work focuses on how we can leverage existing ecosystems of devices (e.g., smartphones, smartwatches) to augment and supplement natural input techniques (e.g., gaze, gesture, touch). My work connects to Social XR by enabling collaborative interaction scenarios mediated by familiar devices and modalities. For example, in the near term, it is very likely that users in immersive XR (and those around them) will have a smartphone in their pocket. Supporting device-mediated interactions in XR not only allows immersed users to leverage the sensing and computation offered by their phone (e.g., for precise multi-touch input), but also enables surrounding collaborators to view and interact with shared XR content through their own devices. I am excited to discuss the future of input in XR as well as other interesting topics such as AI-mediated collaborative experiences.

3.5 Perceptual Manipulations in XR During Face-to-Face Social Interactions

Jan Gugenheimer (TU Darmstadt, DE & Telecom Paris, FR; jan.gugenheimer@tu-darmstadt.de)

License: Creative Commons BY 4.0 International license © Jan Gugenheimer

My research focuses on two directions at the intersection of XR and HCI: (1) understanding how XR technology has to change to be integrated into everyday usage scenarios (ubiquitous XR), and providing software and hardware solutions to that end; and (2) understanding what potential negative and abusive scenarios (perceptual manipulations, dark patterns) could arise in this future of ubiquitous XR and how we can start shaping the technology to avoid them. In the field of social XR, we started to explore how AR technology could impact face-to-face social interactions during ubiquitous XR usage (https://dl.acm.org/doi/abs/10.1145/3411764.3445597, https://dl.acm.org/doi/abs/10.1145/3491102.3502140). I think one of the core challenges in XR is to understand and leverage its ability to impact the user’s perception of themselves and the real world. The biggest difference between traditional digital media (smartphones and PCs) and XR is XR’s ability to alter the user’s perception of the real world (not only the digital one). This comes with many exciting possibilities to improve the technology (e.g., redirected walking, haptic illusions) but also with potential risks (perceptual manipulations). The methods we use in our research are partially grounded in traditional engineering approaches (prototyping and empirical evaluations) but are now extended more and more with design research methods (speculative design, design fiction). I am very eager to discuss how the abilities of XR to alter the user’s perception of the world can impact face-to-face social interactions, both positively and negatively, in the future.

3.6 Meaningful Social VR Environments

Linda Hirsch (LMU Munich, DE; linda.hirsch@ifi.lmu.de)

License: Creative Commons BY 4.0 International license © Linda Hirsch

My research focuses on implicitly increasing socio-cultural connectedness and awareness in shared environments. This includes tracking, moderating, and visualizing users’ activities in VR over different periods (e.g., what happened the day before or two years ago). In this way, meaningful user experiences are created by contextualizing VR interactivity and fostering a deeper connection with the VR environment and other users. The means to realize a deeper connection are endless. Yet the challenge lies in balancing the amount and quality of information, the communication channels, and the translation into comprehensible information. For this, I apply methods and theory from anthropology, materials experience design, and environmental psychology research (e.g., the meaning of place framework of place attachment), in addition to common HCI methods. Choosing the “right” method is context-dependent, e.g., are we looking at physical or virtual reality contexts? In addition, it is very important to consider the long-term effects and the “history” of a shared VR space regarding its socio-cultural effects on the virtual, physical, and mixed social reality.

3.7 Designing Social VR Meeting Spaces

Katherine Isbister (University of California, Santa Cruz, US; katherine.isbister@ucsc.edu)

License: Creative Commons BY 4.0 International license © Katherine Isbister

For the last several years my research team has been building Research through Design prototypes of social VR meeting spaces, taking a “beyond being there” approach, with funding first from Mozilla, then from the US National Science Foundation. We’ve written papers about the general approach (see https://dl.acm.org/doi/10.1145/3411763.3450377 and https://www.tandfonline.com/doi/abs/10.1080/07370024.2021.1994860) and have released a toolkit that others are welcome to use (http://info.socialsuperpowers.net/) from this work. More recently, we’ve received a grant from the Sloan Foundation to build social VR prototypes to support scientific sensemaking around spatialized data.

3.8 Social Communication and Connection in XR

Alexandra Kitson (Simon Fraser University – Surrey, CA; akitson@sfu.ca)

License: Creative Commons BY 4.0 International license © Alexandra Kitson

Joint work of: Alexandra Kitson, John Desnoyers-Stewart, Ekaterina R. Stepanova, Pinyao Liu, Patrick Parra Pennefather, Vladislav Ryzhov, Bernhard E. Riecke, Alissa Antle, Petr Slovak, Katherine Isbister, Ashu Adhikari, Kenneth Karthik

I design, develop, implement, and evaluate VR applications for both social transformation and emotional well-being. Two projects that relate to Social XR: (1) “Embodied Telepresent Connection” gives the illusion of social touch and bodily connection through visuals, sounds, biosignals, and embodied metaphors in VR, connecting distanced people in the same virtual space. (2) Go-along interviews in VRChat with adolescents to better understand the social spaces youth are using and the distinctive features of those spaces that contribute to successful emotion regulation. Some of the core challenges I see in the field:

  1. Interaction and communication with others in social XR.

  2. Representing people in a virtual space, including across mixed platforms.

  3. Safeguards and spaces for vulnerable people (e.g., children) in social XR.

  4. Design tools that aren’t prototyping.

I will share my experiences using participatory and embodied design methods, as well as ideas around pseudo-haptics and biosignal integration to enhance social communication and connection in XR. I’m most interested in discussing design approaches, values, and ethics.

3.9 Augmented Social Perception

Kai Kunze (Keio University – Yokohama, JP; kai.kunze@pm.me)

License: Creative Commons BY 4.0 International license © Kai Kunze

My research is centered on the exploration and development of technology tool-sets designed to augment human capabilities and overcome our physical and cognitive limitations. The human head, being the center of our senses, vital signs, and actions, presents an ideal location for simultaneous sensing and interactions of assistance applications. By integrating sensing and interaction modalities into the form factor of eyeglasses, we can create multi-purpose wearable monitoring and assistance devices.

3.10 Designing and Evaluating User Experiences in Social Virtual Reality (VR)

Jie Li (EPAM – Hoofddorp, Noord-Holland, NL; jie_li@epam.com)

License: Creative Commons BY 4.0 International license © Jie Li

My interest in Social VR is focused on designing novel user experiences and developing metrics and methods to understand and measure aspects such as user engagement, cognitive load, enjoyment, quality of interaction, and social connectedness. As a researcher in the industry who also collaborates closely with academic researchers, I often observe a disconnect between the two worlds. Industry projects often prioritize application and market readiness, sometimes neglecting the foundational reasons for designing and developing social VR experiences. In contrast, academia usually concentrates on fundamental research and may overlook the practical application of lab innovations for everyday public use. The future of social XR calls for collaborative efforts between academia and industry, demanding not only advanced fundamental research but also sophisticated user experience design. This could include the development of a standardized design system for XR, containing well-tested design components that can be directly used to create the basic user experience architecture. Such collaboration will ensure that diverse users are engaged and included, with accessible hardware and software, complemented by open-sourced evaluation methods, metrics, or shared platforms to facilitate the consistent improvement of user experiences.

3.11 Philosophy and XR Technology

Neil McDonnell (University of Glasgow, GB; Neil.McDonnell@glasgow.ac.uk)

License: Creative Commons BY 4.0 International license © Neil McDonnell

I am a philosopher at the University of Glasgow. I used to work in the 3D viz industry and, as a result, I do a lot of interdisciplinary work and lead major projects concerning XR deployments. I have a practical eye for XR deployment issues in research and education. I write about what causation is, whether virtual things are real or valuable, policy papers about XR and education, and the nature of evidence in safety systems. I approach this wide range of topics with analytic philosophy training from metaphysics. I am not an ethicist, and I do not think at all about the meaning of life. I think the big issues around social XR are about access, acceptance and adoption. Who can access and who cannot – who are we leaving behind? Why are so many people so resistant to this incredible technology? We need to answer the first two before widescale ubiquitous adoption will be achieved.

3.12 The Empathic Metaverse: An Assistive Bioresponsive Platform for Emotional Experience Sharing in Social XR

Yun Suen Pai (The University of Auckland, NZ; yun.suen.pai@auckland.ac.nz)

License: Creative Commons BY 4.0 International license © Yun Suen Pai

My research explores the social impact of XR and how it can be used to assist, augment and understand others. The Metaverse is poised to be a future platform that redefines what it means to communicate, socialize, and interact with each other. Yet it is important for us to avoid the pitfalls of the social media platforms we use today: cyberbullying, lack of transparency, and an overall false mental model of society. In this seminar, I would like to discuss the Empathic Metaverse, a virtual platform that prioritizes emotional sharing for assistance. It aims to cultivate prosocial behaviour, either egoistically or altruistically, so that our future society can better feel for each other and assist one another. To achieve this, I propose that the platform be bioresponsive: it reacts and adapts to an individual’s physiological and cognitive state and reflects this via carefully designed avatars, environments, and interactions. I will discuss this concept in terms of three research directions: bioresponsive avatars, mediated communications and assistive tools. A preprint draft of this concept can be found at the following link: https://doi.org/10.48550/arXiv.2311.16610

3.13 Social VR for Social Skills Training

Sylvia Xueni Pan (Goldsmiths, University of London, GB; x.pan@gold.ac.uk)

License: Creative Commons BY 4.0 International license © Sylvia Xueni Pan

My research is about using VR to make our real life better. I am interested in creating VR applications with virtual humans that can be applied in different areas such as training, therapy, and education. For instance, in our early work in 2007 we used a friendly virtual character to help participants practise their social skills, so they can build more confidence for real-life social interactions. More recently we developed a few scenarios in the area of health and healthcare-related communication skills training, including understanding the psychological impact of domestic violence for social workers. Another important aspect of my work is to use Social VR to help us understand real-world social interactions, which then informs the future design of social VR. For instance, we collaborate with neuroscientists to design and implement experimental studies which contributed towards understanding the brain mechanism behind autism.

3.14 Goal-adaptive Collaborative Spatial Experiences with GenAI

Payod Panda (Microsoft Research – Cambridge, GB; payod.panda@microsoft.com)

License: Creative Commons BY 4.0 International license © Payod Panda

The collaborative lifecycle involves more than meetings. Collaboration occurs at several timescales – the “work planning” timescale (days to weeks), the micro timescale (or “in-the-moment” interactions – scale of seconds), and the macro timescale (e.g., at the scale of projects – months to years). Additionally, effective collaboration is effortful, but traditional collaboration systems offer little support for reducing this effort across the collaborative lifecycle. For example, meetings often do not list what the goals of the meeting are, nor what is expected from meeting attendees. HCI has largely addressed the micro timescale of interactions – what kinds of interactions should a system provide in order to support collaborative tasks? We need to shift from designing for “moments” to designing for “workflows”, which should be driven by collaborative goals. How could we assist users to transition between activities within and across timescales in order to accomplish their short-, mid-, and long-term goals? I propose using Generative AI (GenAI) systems in order to adapt the meeting interface to the individual, team, and organizational goals, involving interactions like reconfiguring a collaborative space (3D virtual environment) and rearranging task elements in the space (distribution of task space).

3.15 What Can Social XR Do for Us that Traditional Communication Technology Cannot, and How Can We Know?

Alexander Raake (Audiovisual Technology Group, I3TC – TU Ilmenau, DE; alexander.raake@tu-ilmenau.de)

License: Creative Commons BY 4.0 International license © Alexander Raake

Our team conducts research on audiovisual technology, perception and experience. A specific focus lies on telepresence technology used for human-to-human communication. We address Augmented, Virtual or Mixed Reality (AR/VR/MR, eXtended reality, XR), as well as robotics. Here, we integrate the multimedia-driven, initial approaches of Quality of Experience (QoE) assessment with the experience evaluation methods evolving in the AR/VR/MR community over decades, such as presence, social presence and co-presence, plausibility, or cybersickness. Besides direct evaluation methods using questionnaires, we employ indirect methods such as behavior and conversation analysis, regarding verbal and non-verbal communication. Here, the impact of non-obvious technical properties is of interest, such as that of transmission delay. In this case, quality and (audiovisual) fidelity may appear very high, but the individual temporal realities may be out of sync. In previous research, we showed that attribution may then be to the other person(s), not the system, e.g., considering the (previously unknown) other as less extrovert or open, when delay was on the line. Besides the visual modality, our group is interested in the impact of audio and hearing, as well as audiovisual integration for attention, cognition and communication. Beyond the Social XR experience perspective, we are interested in the “resources” involved, in terms of sustainability: (1) The human mental and physical resources spent, for example measuring fatigue for MR-based telepresence versus meeting face-to-face, or the positive impact on wellbeing with mediated social presence. (2) The amount of energy and natural resources consumed along the end-to-end chain (e.g., by a given media system implementation versus another), or resources saved (e.g., meeting via videoconferencing or MR rather than travelling).

For the seminar, I would like to jointly specify a common set of research methods and use cases to be considered, aiming to address key challenges brought to the seminar. Moreover, I am interested in collaboratively elaborating a selection of these challenges and possibly developing initial ideas on how to address them.

3.16 Instrumenting for Understanding Social XR Experiences

David A. Shamma (Toyota Research Institute – Los Altos, US; aymans@acm.org)

License: Creative Commons BY 4.0 International license © David A. Shamma

Research on Social XR has seen two fronts. One is the exciting, far vision of the future inspired by design fiction narratives and imagining technology beyond our capacity to build. The other is what we can make with today’s impressive but limited technology to adapt to tablets, web browsers, and head-mounted displays. Between these two is a field ripe for research because one can measure, test, and evaluate how people behave, interact, and enrich their lives with XR technology. As we step forward, our research should address theory-informed social conditions in the real world and explore how these patterns manifest in XR environments. It is not enough to instrument the virtual and augmented worlds. We should alter what is physically possible into the impossible, as XR’s great potential lies in creating non-realistic experiences. These unreal XR experiences have the ultimate potential to unlock stronger interactions and collaborations, and they warrant exploration as technology takes each step forward.

3.17 Bringing Real People into XR

Aljosa Smolic (Lucerne University of Applied Sciences and Arts – Rotkreuz, CH; aljosa.smolic@hslu.ch)

License: Creative Commons BY 4.0 International license © Aljosa Smolic
Social XR inherently requires digital representations of humans. For the visual part this is some kind of 3D computer graphics model. In most XR applications today we find purely computer-generated models, which may be referred to as avatars. As an alternative, it is possible to reconstruct 3D models from images and video by means of 3D computer vision. The result is often referred to as volumetric video/holograms (VV). While many aspects of VV technology, from capture to display, have reached a high level of maturity, many problems remain unresolved before it becomes widely acceptable for social XR and telepresence applications.

3.18 Requirements for Future Social XR Applications

Anthony Steed (Department of Computer Science, University College London, GB; A.Steed@ucl.ac.uk)

License: Creative Commons BY 4.0 International license © Anthony Steed
My research started out in the low-level engineering of collaborative virtual reality systems. In my talk I started by presenting some of our early work on social VR applications, and how, at the time, the main problems were with the graphics and network engineering. I then presented some more recent work that focuses on identifying the key technical challenges for systems that support effective communication, for example the role of eye-tracking, body-tracking and latency.

Our previous work focused on technical demonstrations and lab-based experiments. Going forward, we are trying to do more longitudinal studies of social XR use to identify how users adapt over time. We are very interested in building social applications that support users with different literacies and competencies with VR.

Finally, I talked about and demonstrated our Ubiq toolkit. Ubiq supports a variety of AR and XR devices. Client APIs, demonstrations and server code are completely open source so that anyone can set up secure, GDPR-compliant social systems. We talked about how we recently extended it to support instrumentation and scalability. We built a Virtual Dagstuhl social VR demo.

3.19 Adaptive Social XR

Kashyap Todi (Meta – Redmond, US; kashyap.todi@gmail.com)

License: Creative Commons BY 4.0 International license © Kashyap Todi

As a research scientist at RL-R, I work at the intersection of Human–Computer Interaction (HCI) and Artificial Intelligence (AI) towards solving emergent XR interaction problems. My expertise and interests are mainly around applying computational methods to address core HCI problems systematically. I have been doing so in domains of generating user interfaces via models of interactions and adapting user interfaces to individual users and their context. I believe that contextually adaptive UIs and interactions will be critical for enabling highly performant and usable XR applications and experiences. This will require extensive research on key components including modeling users, environments, and interactions, developing AI and/or computational approaches for optimization and adaptation, collecting and formatting extensive training datasets, identifying highly reliable quantitative metrics and evaluation methods, and finally close alignment with end-user applications. While I’ve worked extensively on “Solo-XR” scenarios in my research so far, I believe this philosophy and approach will be crucial and beneficial for Social XR settings as well. As such, I encourage and urge everyone to consider: what might adaptive social XR look and feel like in the future?

3.20 Human-centric Factors in Immersive Communication

Irene Viola (Centrum Wiskunde en Informatica – Amsterdam, NL; irene.viola@cwi.nl)

License: Creative Commons BY 4.0 International license © Irene Viola
My research relates to quality of experience in immersive multimedia systems. In particular, I am interested in understanding the user at the center of immersive systems: how they behave, what they are interested in, and how they interact with each other and with the media objects. There are some core challenges in how we measure the user experience, whether it is qualitative or quantitative, explicit or implicit; how we can predict the reaction of the user in such experiences, whether it is the way they will move, what they will focus on, or whether they’ll want to replicate the experience; and how we can use such measurements and predictions to optimize the system and make it user-centric.

3.21 Fostering Well-being, Communication & Empathy with XR

Nadine Wagener (University of Bremen, DE; nwagener@uni-bremen.de)

License: Creative Commons BY 4.0 International license © Nadine Wagener
In my research I explore how to design technology fostering well-being, mental health, communication and empathy with XR. I explore two main aspects: (1) How can an XR system support a self-exploration approach to one’s own emotions and “inner worlds”, e.g., by offering passive haptic feedback and prompts to facilitate self-awareness and self-reflection? (2) How can we share (and collaboratively explore) these “inner worlds” with a social ecosystem? As one example, users can invite friends or colleagues into VR spaces that they autonomously create to represent their emotions (e.g., in regard to a shared conflict), and can also collaboratively express their emotions through art in VR. This approach can provide the foundation for developing a shared language, mutual understanding, and effective conflict management. I am further interested in including physiological data to make inner states accessible to oneself and others, focusing on finding means to represent those data in a qualitative way, e.g., through mirroring stress with a VR thunderstorm. I look forward to discussing different modes of biosignal inclusion and ethics in regard to Social XR.

3.22 AI-driven 3D Digital Humans in XR

Zerrin Yumak (Utrecht University, NL; z.yumak@uu.nl)

License: Creative Commons BY 4.0 International license © Zerrin Yumak
My research is about 3D digital human technologies in games and Social XR applications. In particular, I am focusing on AI-driven non-verbal behavior synthesis algorithms for facial expressions, gestures and gaze behavior using deep learning. The goal of the research is to automatically generate convincing and believable animations conditioned, for instance, on audio and text. This is useful for supporting costly game development pipelines as well as for interactive applications where motions need to be generated on-the-fly. My work is data-driven, and the Motion Capture and Virtual Reality Lab at Utrecht University is instrumental for collecting human movement data for my research. I presented an overview of our research work during the seminar, including FaceXHubert and FaceDiffuser. Another aspect of my research is the perception of animations, to better understand what aspects of these characters make them accepted by users. I have also done research on socially interactive characters, in particular on the topics of emotion and memory modeling and multi-party interaction. The connection between AI, XR and HCI is the core of my research, which is also discussed in our IEEE VR MASSXR Workshop.

4 TUESDAY Working Groups

During the Tuesday session, we set the challenge of “Co-envisioning Social XR futures and how we may achieve them”, breaking researchers into four working groups. Working groups used shared Miro boards to brainstorm key concerns and organize them into output summaries, shared below. When considering possible futures, participants were asked to identify key assumptions and values, as well as tools that will be needed, alongside their visions.

4.1 Group A: Synchronizing Asymmetric Individual & Shared User Perspectives in XR

Linda Hirsch (LMU Munich, DE)
Katherine Isbister (University of California at Santa Cruz, US)
Payod Panda (Microsoft Research, Cambridge, GB)
David Ayman Shamma (Centrum Wiskunde & Informatica, Amsterdam, NL)
Kashyap Todi (Meta, Redmond, WA, US)
Zerrin Yumak (Utrecht University, Utrecht, NL)

License: Creative Commons BY 4.0 International license © Linda Hirsch, Katherine Isbister, Payod Panda, David Ayman Shamma, Kashyap Todi, Zerrin Yumak

Envisioning future developments of social XR, we imagine that asymmetric social experiences will become ubiquitous, easy to use, and integrated into daily routines. This change toward blended realities introduces multiple challenges. One challenge concerns the balance between personalization and individual interests on the one hand and shared understanding and social experiences on the other. The increase in personal devices and in possibilities to customize one’s virtual environment to create one’s own “reality” runs counter to the idea of social XR experiences. In contrast, social experiences require a shared understanding of shared activities, a common language, or agreed-upon social norms and practices. In the future, we expect a multiverse of XR realities, including “my reality”, “your reality” and “our reality”. Such a multiverse raises questions about privacy settings and the degree to which realities blend. For example, would putting flowers on someone else’s table in VR impact the physical private household, and to what extent? Similarly, we assume we will have multiple virtual proxies interacting with others in meetings or social events for us. Proxies should be distinguished as passive or active, with passive proxies being virtually present without further interaction and active proxies being interactive. Interacting with a proxy instead of a “real user” will challenge social norms and relationships, which will have to be observed in the future. Additionally, being present at all times through proxies is the next step to being available at all times. User overload and induced stress are already consequences today, and they will potentially worsen. Thus, an important next step is to balance users’ attention and put their well-being in focus. We summarize inventions that will drive and challenge future social XR: AI & Proxies, Adaptive Interfaces, Multi-Device Constellations, Multi-Location Constellations, and Privacy.

4.2 Group B: What Kind of XR Future Do We Hope to Have? (Or Rather: “What Kind of Aspects Do We Foresee to be Relevant for an XR Future”?)

Eric J. Gonzalez (Google, Seattle, US)
Josh Greenberg (Alfred P. Sloan Foundation, New York, US)
Jie Li (EPAM, Hoofddorp, NL)
Alexander Raake (Audiovisual Technology Group, TU Ilmenau, DE)
Aljosa Smolic (Hochschule Luzern, Rotkreuz, CH)

License: Creative Commons BY 4.0 International license © Eric J. Gonzalez, Josh Greenberg, Jie Li, Alexander Raake, Aljosa Smolic

We started with broad brainstorming, then clustered the responses across a spectrum from technical aspects (see Fig. 3, left) to overall goals and ethical considerations (middle) and super-powers (right). Applications of a more general nature are on the bottom left, with commercial applications further to the right. An aspect discussed at greater length is that of feature-access control, based on desired privacy and the properties of bilateral or multilateral relationships (e.g., sharing photorealistic information only with dedicated others). Can access to such information be controlled asymmetrically? Will the space be one world or a sort of multiverse where persons can be in instances of the same space?
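
One way to picture asymmetric feature-access control is a per-owner grant table: each person decides how much fidelity each viewer receives, so the two directions of a relationship need not match. The Python sketch below is illustrative only. The class name `FeatureAccess`, the fidelity levels, and the user names are assumptions made for this example and were not part of the group's discussion.

```python
from collections import defaultdict

# Hypothetical fidelity levels, ordered from least to most revealing.
FIDELITY_LEVELS = ["hidden", "abstract_avatar", "stylized_avatar", "photorealistic"]


class FeatureAccess:
    """Illustrative per-owner grant table for representation fidelity."""

    def __init__(self, default="abstract_avatar"):
        self.default = default
        self._grants = defaultdict(dict)  # owner -> {viewer: level}

    def grant(self, owner, viewer, level):
        # The owner decides what this particular viewer may see.
        if level not in FIDELITY_LEVELS:
            raise ValueError(f"unknown fidelity level: {level}")
        self._grants[owner][viewer] = level

    def visible_level(self, owner, viewer):
        """Level at which `viewer` may see `owner`; falls back to the default."""
        return self._grants.get(owner, {}).get(viewer, self.default)


acl = FeatureAccess()
acl.grant("alice", "bob", "photorealistic")  # hypothetical users
print(acl.visible_level("alice", "bob"))     # what bob sees of alice
print(acl.visible_level("bob", "alice"))     # what alice sees of bob
```

Because grants are stored per owner, Bob can see Alice photorealistically while Alice still sees only Bob's default representation, which is one possible reading of "asymmetric" control.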

Figure 3: Overview of results Group B, What kind of XR future do we hope to have? (Or rather: “What kind of aspects do we foresee to be relevant for an XR future”?).

4.3 Group C: Redefining Common Grounds in Social XR

Sun Joo (Grace) Ahn (University of Georgia, Athens, US)
M. Doga Dogan (MIT, Cambridge, US)
Jan Gugenheimer (TU Darmstadt, DE & Telecom Paris, FR)
Yun Suen Pai (The University of Auckland, NZ)
Sylvia Xueni Pan (Goldsmiths, University of London, GB)

License: Creative Commons BY 4.0 International license © Sun Joo (Grace) Ahn, M. Doga Dogan, Jan Gugenheimer, Yun Suen Pai, Sylvia Xueni Pan

We discussed how XR platforms today do not share a common ground, that is, a “ground truth” that a society needs to function properly. Fragmentation of media platforms has led to a reduction and loss of common grounds. This creates difficulty in determining the agenda or importance of problems that must be solved (e.g., war? climate change? gender issues?). When there is no common ground, who determines what the ground truth is? How do we determine what is normal and what is not? How do we prepare against misinformation/deepfakes? These discussion points are illustrated in Figure 4. We propose that social VR can be used to re-establish common ground via the following mechanisms:

  • Allow companies to publish their core values. Users will select which social XR service they want to engage in.

  • Use social XR and immersive experiences to highlight important problems and solutions (e.g., globalize local events)

  • Social XR researchers will need to learn how to ’break’ the experience first so that we can prepare safety/protective tools (we can figure out what can go wrong)

  • Users’ ability to determine the credibility of the information source will be important

Figure 4: Overview of results for Group C’s discussion regarding the loss of common grounds in XR.

4.4 Group D: Development and Implementation of Social XR Systems

Kai Kunze (Keio University, Yokohama, JP)
Neil McDonnell (University of Glasgow, GB)
Anthony Steed (University College London, GB)
Irene Viola (Centrum Wiskunde en Informatica, Amsterdam, NL)
Nadine Wagener (Universität Bremen, DE)

License: Creative Commons BY 4.0 International license © Kai Kunze, Neil McDonnell, Anthony Steed, Irene Viola, Nadine Wagener
The group explored various aspects of Social VR system development, raising concerns and considerations from different angles. We emphasized the challenges of creating user-friendly, data-protected, and persistent VR systems, while focusing on the balance between avatar personalization and privacy. Data control, privacy, and potential attention manipulation through eye tracking in VR scenes were discussed, highlighting the need for ethical considerations.

Opportunities and threats associated with making internal physiological data visible in XR were explored, considering individual willingness to share such personal information. The challenges and ethical concerns in XR, particularly personalized advertising and societal behavior impact, were highlighted. Topics included the fragmentation of reality in XR, the potential negative impact on communities, and the balance between individualism and commonality.

The discussion touched on the societal implications of allowing individuals to curate their XR environments. Concerns about data exploitation in XR, issues with GDPR enforcement, and the need for responsible data management were expressed. The difficulty of conducting experiments with social XR platforms due to data privacy issues and GDPR bureaucracy was discussed.

The discussion concluded with a focus on designing future XR spaces with consideration for minimizing exploitative practices, keeping data local, and ensuring accountability for companies involved in creating XR experiences. Overall, the discussion underscored the complex ethical and practical considerations in the development and implementation of Social VR systems.

5 TUESDAY Demos

On Tuesday afternoon, participants were invited to share demonstrations of their research related to social XR. We tried out some of the latest social XR hardware (see Figure 5: left), while other participants presented several software solutions, including tools for supporting social connection (see Figure 6) and avatar expression, as well as tools for networking (see Figures 5: center and 2: right) and recording XR content (see Figures 5: right and 7).

Figure 5: Participants try out different tools and hardware for social XR. Left: a participant tries out AR glasses. Center: a screenshot of a video that has compiled five separate Vologram captures of participants dancing, superimposed onto the Dagstuhl steps. Right: a participant trying out Ubiq in virtual reality.
Figure 6: Participants try out the Embodied Telepresent Connection (ETC) demo.

6 WEDNESDAY Working Groups

The Capturing and Modeling session explored the future of capturing and modeling people and places.

It began by discussing the history of place capture, from early attempts in the 1990s to more recent advances like Microsoft’s Holoportation or Avatar Codec. The presentation then looked at the future of people capture, with a focus on high-quality streamable free-viewpoint video and the rise of virtual humans (or “vTubers”).

The session then discussed some of the challenges and ethical implications of capturing and modeling people and places. It asked thought-provoking questions about what communication cues should be captured and shared, what elements of the environment should be shared, and how we can separate communication cues from representation. The presentation also discussed the potential long-term social impact of this technology, and asked how we prepare people for the additional cognition needed to deal with the virtual and AI worlds.

The session concluded with a hands-on task in which participants were asked to build and test many of the tools presented and to reflect on their own experiences with capturing and sharing place and people. They were asked to consider how difficult it was to capture place and people with current tools, how effective the current captured content is for communication, and what would need to happen for their grandparents to capture people and place.

The session was meant to be a thought-provoking and engaging exploration of the future of capturing and modeling people and places. It raised important questions about the ethical implications of this technology and encouraged participants to think critically about their own experiences with capturing and sharing place and people.

A sample of the creations from this session is available at: https://x.com/twi_mar/status/1730062159151800399?s=20.

Figure 7: Participants try out the Vologram Capturing tool and experiment with creating AR volograms with their smart phones.

7 THURSDAY Working Groups

7.1 Group A: Empathic Computing

Sun Joo (Grace) Ahn (University of Georgia, Athens, US)
Mark Billinghurst (University of South Australia, Adelaide, AU)
Linda Hirsch (LMU München, DE)
Alexandra Kitson (Simon Fraser University, Surrey, CA)
Yun Suen Pai (The University of Auckland, Auckland, NZ)
Nadine Wagener (Universität Bremen, DE)

License: Creative Commons BY 4.0 International license © Sun Joo (Grace) Ahn, Mark Billinghurst, Linda Hirsch, Alexandra Kitson, Yun Suen Pai, Nadine Wagener

Our working group was tasked with answering the following three questions: What theories should be driving this research? How can we incorporate empathic computing into our communications? What is the long-term impact of engaging with empathic experiences? Figure 8 documents some of our notes and discussion.

Figure 8: Notes on empathic computing from Thursday Working Group A taken in the News Room at Dagstuhl.

7.1.1 Relevant Theories and Conceptualization of Empathy

We discussed different models of empathy, recognizing that there is not a consensus among the scientific community around the precise definition of empathy and its constructs.

Goleman & Ekman [1]: Three dimensions of empathy (cognitive, emotional, compassion)

  1. Cognitive refers to perspective taking (I understand you)

  2. Emotional refers to feeling personal distress and sharing emotions (I feel you)

  3. Compassion refers to actionable outcomes (I help you)

Davis [2]: Individual differences in empathy (perspective taking, fantasy, empathic concern, personal distress)

  • Measured through Interpersonal Reactivity Index (IRI)

Batson et al. [3]: Altruism vs. self-interest

  • What is the motivation of empathy?

Based on the above theories and our discussion, we formed a working definition of empathic computing: Forming meaningful relationships in social XR – empathic computing may provide the foundations for meaningful social XR.

7.1.2 Current Challenges of Empathic Computing

Second, we discussed the current challenges of incorporating empathic computing into our communications.

  • Objective vs. Subjective assessment

  • Defining and conceptualizing empathy

  • Systems struggle with the multi-layered complexities of context (e.g., place, user differences, social relationships)

  • Physiological Signals are also very context dependent

  • Can AI “train” people to become more empathic?

  • Accessibility and usability of wearables and sensors in empathic systems, including problems of scaling

  • How do you express micro-cues to facilitate communication (e.g., turn-taking)?

7.1.3 Empathic Computing in Social XR

Next, we narrowed in on two main challenges of incorporating empathic computing into our communications: context-aware XR and representing emotions in XR.

Context-Aware XR: The necessary contexts for empathy include Place, User, Social, Past Experience, Relationships, and Systems that acknowledge their limitations and disclose their learning process (transparency).

Representing emotions in XR: Some of the ways include Avatars, (objects in) the VE, and separate virtual entities.

7.1.4 Long-term Impact

Finally, we discussed some of the long-term impacts of empathic computing in social XR.

  • Interpersonal conflict due to different relationship intimacy between users and empathic systems

  • Privacy concerns: the tension between privacy and context awareness. For example, users want to understand others’ emotions but do not want to reveal their own

  • Ethical concerns related to long-term relationships with empathic systems

  • Concerns related to over-gamified systems (can you gamify relationships?)

  • Potential risk of hyper-empathy (caring too much)

7.1.5 Grand Challenges

Scientific Evaluation of Empathy. Measuring empathy and making it comparable is a continuous challenge because emotions are highly subjective. Research increasingly complements qualitative data with quantitative biodata measurements. For social XR, we need an understanding on both an individual and a collective level. Thus, we suggest a focus on mixed-methods approaches for the scientific evaluation of empathy and see benefits in supporting AI models that take into account users’ traits, background, etc.

Ethical Concerns. Virtual technologies allow customization to a great extent based on how users feel and their preferences. Empathic computing provides the data and technological setup to implement this for individual and social contexts. However, sharing biodata on this level raises privacy concerns, which can easily be taken advantage of. Similarly, empathic computing can support understanding others’ emotions. However, this also means that we hand over our empathy to a system to tell us, in return, about how another person feels. This requires great trust in an empathic system, further requiring transparency and a certain level of user control.

Furthermore, empathic computing increases the lack of transparency about users’ intentions when engaging in social interaction or showing empathy. Sometimes, this will be beneficial, such as when a doctor talks to patients about their diagnosis. However, in a more intimate relationship, fake or pretended empathy is not sustainable for a healthy relationship. This raises the question of how we can disclose user intentions and increase system transparency to protect against misuse and contribute to “good” social relationships and connections.

Context Adaptation & Reaction. Emotions are context- and person-dependent. This requires empathic computing integrated into social XR to be context-aware and -sensitive when gathering, evaluating, processing, and displaying the data. Challenges arise on different levels. One relates to training models on being context-sensitive and having a comparable data set over multiple situations. Another derives from a user perspective where emotions are expressed very differently for different purposes depending on the cultural background. This also leads to the system’s sensitivity regarding when to disclose a user’s emotions to others and when not.

Evolving XR. One of the grand challenges is the anticipation of how social XR will evolve over generations of users and technological advancements. It might lead to a greater digital divide between users of social XR and non-users, followed by a diverging understanding of (social) reality. Similarly, AI systems and proxies will become regular members of our social system, communication, and collaboration. Yet, it is currently not predictable to what extent, in what ways, and how we, as human users, can keep control.

References

  • [1] Daniel Goleman, The brain and emotional intelligence: New insights, More Than Sound, Northampton, MA, 94, 47–48, 2011.
  • [2] Mark H. Davis, A multidimensional approach to individual differences in empathy, American Psychological Association, Washington, DC, 1980.
  • [3] C. Daniel Batson, Janine L. Dyck, Randall J. Brandt, Judy G. Batson, Anne L. Powell, Rosalie M. McMaster, Cari Griffitt, Five studies testing two new egoistic alternatives to the empathy-altruism hypothesis, Journal of Personality and Social Psychology, American Psychological Association, 55:1, 1–52, 1988.

7.2 Group B: Blended Reality

M. Doga Dogan (MIT, Cambridge, US)
Eric J. Gonzalez (Google, Seattle, US)
Katherine Isbister (University of California at Santa Cruz, US)
Kai Kunze (Keio University, Yokohama, JP)
Payod Panda (Microsoft Research, Cambridge, GB)
Sylvia Xueni Pan (Goldsmiths, University of London, GB)

License: Creative Commons BY 4.0 International license © M. Doga Dogan, Eric J. Gonzalez, Katherine Isbister, Kai Kunze, Payod Panda, Sylvia Xueni Pan

7.2.1 Multi-sensory Social Experiences

Much advancement in the area of Virtual and Augmented Reality has been driven mainly by our ability to push the boundary of computer graphics and audio display. Real-life experiences do not stop at what we see and what we hear. One of the challenges in creating effective multi-sensory social experiences is the simulation and display of senses beyond the visual and auditory, such as social touch. There are a few reasons behind this challenge:

  • Those senses could play a big part in shared experience; for instance, the smell (olfactory) of mulled wine and roast chestnuts could remind people of the Christmas Market experience. However, as they are normally experienced at a subconscious level, it is difficult to describe them and come up with a set of rules to code for in a simulated environment.

  • Device challenge – using haptics as an example, there is no general device that could address the richness of touch (e.g., tactile, weight, pressure). This is also a big challenge in making a business case for a particular type of haptics in a consumer device.

  • Individual differences in our perceptual thresholds: our sensitivity in distinguishing different weights or temperatures could be quite different, making it very difficult to control the experience with pre-defined code. There is also the challenge of cultural differences: if someone from Japan were to greet someone from France, should the Japanese bow be translated into a hug and two kisses on the cheek? Shall we introduce asymmetric social interaction to calibrate the social experience, or should we maintain the authenticity at the cost of creating a potentially very awkward social interaction?

7.2.2 Semantic Understanding of Physical and Social Context

To effectively blend real and virtual environments, objects, and experiences, XR systems must have a rich understanding of the physical and virtual worlds. This includes having a sense of which objects in the environment the user can touch or interact with (e.g., a desktop surface) and which should be avoided (e.g., a glass of water on the desk). Not only does this make the system “smarter” and allow it to provide information that is more relevant to the user’s surroundings, it can also improve interaction affordances by opportunistically aligning real and virtual content for improved haptic experiences (e.g., displaying touch UI on a tabletop). The same concepts extend to social contexts, where it may be beneficial to alter how certain XR information is displayed depending on, for example, whether the user is in private, having a conversation, driving, or in a public setting. Leveraging AI and computer vision tools will be essential for enabling this richer understanding.
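
As a rough illustration of context-dependent display, the Python sketch below maps the four social contexts named above to display settings and adds an opportunistic surface-alignment flag. The policy values, the setting names, and the function `display_settings` are assumptions made for this example; in a real system the context and the presence of a touchable surface would come from sensing and scene understanding rather than from function arguments.

```python
from enum import Enum


class SocialContext(Enum):
    PRIVATE = "private"
    CONVERSATION = "conversation"
    DRIVING = "driving"
    PUBLIC = "public"


# Hypothetical policy table: the values are illustrative, not recommendations.
DISPLAY_POLICY = {
    SocialContext.PRIVATE: {"notifications": "full", "personal_content": True},
    SocialContext.CONVERSATION: {"notifications": "minimal", "personal_content": False},
    SocialContext.DRIVING: {"notifications": "none", "personal_content": False},
    SocialContext.PUBLIC: {"notifications": "minimal", "personal_content": False},
}


def display_settings(context, touchable_surface_nearby=False):
    """Return display settings for a given social context.

    `touchable_surface_nearby` stands in for the output of a scene
    understanding step (e.g., a detected tabletop).
    """
    settings = dict(DISPLAY_POLICY[context])  # copy so the table is not mutated
    # Align touch UI with a real surface only when one is available
    # and the user is not driving.
    settings["anchor_ui_to_surface"] = (
        touchable_surface_nearby and context is not SocialContext.DRIVING
    )
    return settings


print(display_settings(SocialContext.PRIVATE, touchable_surface_nearby=True))
print(display_settings(SocialContext.DRIVING, touchable_surface_nearby=True))
```

A fixed lookup table keeps the sketch readable; a learned or user-adjustable policy would replace `DISPLAY_POLICY` in practice.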

7.3 Group C: Assets and Datasets

Jan Gugenheimer (TU Darmstadt, DE & Telecom Paris, FR)
David Ayman Shamma (Centrum Wiskunde & Informatica, Amsterdam, NL)
Aljosa Smolic (Hochschule Luzern, Rotkreuz, CH)
Kashyap Todi (Meta, Redmond, US)
Zerrin Yumak (Utrecht University, NL)

License: Creative Commons BY 4.0 International license © Jan Gugenheimer, David Ayman Shamma, Aljosa Smolic, Kashyap Todi, Zerrin Yumak

7.3.1 How Can We Recreate “Social XR @ Dagstuhl 2023”?

During the week, we have collected and/or created various bits and bytes of data that capture varying aspects of this seminar. This includes:

  • 3D Objects & Scenes

  • Volumetric Video

  • Images

  • Audio

  • Video

There are some immediate technical challenges that we can observe around data collection and use:

  • Formats: interoperability, open vs. closed, etc.

  • Scalability: quality, quantity, compression, etc.

  • Metadata: additional info, synchronization, …

  • License: which one (CC BY-NC-ND?)

But the big looming question is: how do we really capture and recreate the event experience? How does someone reliving this Dagstuhl understand and grasp “What happens at 20:00?” (spoiler: cheese plates are served in the wine cellar).

7.3.2 Beyond Space for Social XR Capture

We are missing data and information that might be essential for adequately capturing the event:

  • Time (program/schedule/events), people, speech, emotional states, reactions, routines, experience.

  • What is needed to recreate complex social organization? (Two teams occupying same space)

Coming out of these discussions, we have identified three grand challenges around assets and datasets for social XR, summarized in Section 7.3.3.

7.3.3 Grand Challenges

Social Stitching. Can a sparse set of assets be stitched together with AI to fabricate missing parts? What kind of quantitative and qualitative evaluation would be required? For this challenge, a sparse collection of assets from a scanned space is distributed. The goal is to make a cohesive scene/world. For the Dagstuhl assets, one would need to include a program (meals, breaks, seminar schedule, etc.) and floor plans (as seen around the building). A scan of a few guest rooms would also be good to add. Challengers would stitch the assets together to make Dagstuhl. Tools like GenAI could be used to hallucinate or speculate what’s in each room, who might be occupying the room, and what people are doing in the rooms. This would have two evaluations: quantitative and qualitative. Both would be evaluated given a program and personnel load. Evaluators should be able to drop into (or jump to) any time and feel the scene.

  • For the quantitative evaluation, one would need a high-resolution scan of a few rooms, against which accuracy could be measured.

  • For the qualitative evaluation, it would have to feel like Dagstuhl.

An AR variant of the stitching challenge could be an additional evaluation metric.
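
The text does not say how accuracy against the high-resolution scan would be measured. One common choice for comparing point clouds is the Chamfer distance, used here purely as an assumed stand-in. The brute-force Python sketch below is only suitable for tiny inputs, and the example coordinates are invented.

```python
import math


def nearest_distance(p, cloud):
    """Euclidean distance from point p to its closest point in cloud."""
    return min(math.dist(p, q) for q in cloud)


def chamfer_distance(a, b):
    """Symmetric mean nearest-neighbour distance between two point sets."""
    a_to_b = sum(nearest_distance(p, b) for p in a) / len(a)
    b_to_a = sum(nearest_distance(q, a) for q in b) / len(b)
    return a_to_b + b_to_a


# Invented toy data: a reference scan and a stitched reconstruction.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
stitched = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
print(chamfer_distance(reference, stitched))
```

A score of zero means the two point sets coincide; larger values indicate a less faithful reconstruction. For real scans a spatial index would be needed to avoid the quadratic cost.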

Authoring and Sense-making. What approaches can enable sense-making and authoring to reconstruct the spirit of the event? How can pieces in the form of different assets (3D, audio, video, text, etc.) be put together (authoring) to create a whole “event”? For that, what is a “model” of an event (space, time, humans, objects, audio, temperature, interrelations, etc.), and what belongs to it in terms of abstract types? Given a (sparse) set of assets, related to an “event model” and sampled from an event:

  • How can we interpolate and complete it, i.e. “make sense”?

  • How can this go beyond asset types that can be captured directly (e.g. video, 3D) to include those that have to be inferred or derived (e.g. the mental state of a person, interrelations)?

  • Can those be represented by some kind of network of nodes?
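
To make the “network of nodes” question more tangible, the Python sketch below models an event as a small typed graph in which captured and inferred items are both nodes. The node types, relation names, and example identifiers are all hypothetical; the group did not define an event model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    node_type: str          # e.g. "person", "space", "asset", "state"
    inferred: bool = False  # True if derived rather than directly captured


class EventGraph:
    def __init__(self):
        self.nodes = {}
        self.edges = defaultdict(list)  # node_id -> [(relation, node_id)]

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def relate(self, src, relation, dst):
        self.edges[src].append((relation, dst))

    def neighbours(self, src, relation=None):
        """Targets reachable from src, optionally filtered by relation."""
        return [d for r, d in self.edges[src] if relation is None or r == relation]

    def inferred_nodes(self):
        """Sorted ids of nodes that were inferred rather than captured."""
        return sorted(n.node_id for n in self.nodes.values() if n.inferred)


# Invented example content.
g = EventGraph()
g.add_node(Node("person_a", "person"))
g.add_node(Node("wine_cellar", "space"))
g.add_node(Node("clip_01", "asset"))
g.add_node(Node("person_a_mood", "state", inferred=True))
g.relate("person_a", "located_in", "wine_cellar")
g.relate("clip_01", "depicts", "person_a")
g.relate("person_a", "has_state", "person_a_mood")
print(g.neighbours("person_a", "located_in"), g.inferred_nodes())
```

Flagging inferred nodes explicitly keeps derived information, such as a mental state, distinguishable from what was actually recorded.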

Preserving Privacy. How can privacy be preserved given the increasing fidelity of capture devices? What is privacy? What needs to be “preserved”? What are the risks and opportunities of embedding or representing data in latent spaces? What hardware alternatives to RGB cameras could be used to capture environments in a way that might be inherently privacy preserving (e.g., LiDAR)? Our understanding of “private” data might need to be extended beyond the captured data to also incorporate data inferred from large reconstructions (e.g., can I use a full recording of the Dagstuhl event to figure out what Kash’s favorite food is?).

7.4 Group D: Survey Instruments

Jie Li (EPAM – Hoofddorp, NL)
Sylvia Xueni Pan (Goldsmiths, University of London, GB)
Alexander Raake (Audiovisual Technology Group, TU Ilmenau, DE)
Anthony Steed (University College London, GB)
Irene Viola (Centrum Wiskunde en Informatica, Amsterdam, NL)

License: Creative Commons BY 4.0 International license © Jie Li, Sylvia Xueni Pan, Alexander Raake, Anthony Steed, Irene Viola

7.4.1 Tension between Transparency and Superpowers

The tension between transparency and the desire for social superpowers is a central challenge in the context of social XR. On one hand, the pursuit of social transparency can be a way to effectively create a medium in which people interact and communicate in a way that is indistinguishable from real life. This involves supporting situations that are already socially intricate, such as brainstorming sessions, and ensuring support for accessibility, with a focus on the clarity of identity. The acknowledgment of existing biases, such as the Proteus effect, underscores the need for careful consideration: would we need to replicate such effects in a transparent social VR medium? Is it desirable? Usability is emphasized as a crucial factor, aiming to prevent any loss of information. Transparency is certainly important for very broad accessibility, such as bringing in family, professional or support groups with social dynamics that are important to preserve. One example is assisting someone in undertaking a task in a virtual world that is an extension of a situation in the real world (e.g. some forms of training or support).

On the other hand, the collective interest in social “superpowers” is evident, as individuals seek to don “mask” personalities in novel situations. However, the attractiveness of these situations may vary depending on individual personalities. The intention is to replace or enrich experiences to re-empower diminished cues, recognizing the potential risk of “after-effects.” A key point is that individual superpowers might interfere with others, or create an equivalent of filter bubbles.

Thus, when taking a step back, there might be a notion of a collective superpower that is itself well understood by participants. For example, empathy-enhancing powers might be very desirable if everyone agrees that this is acceptable, and no one person has special insight. A very specific type of collective superpower might be to create “equity” (for example, the Altspace eye-line normalization?). That is, the idea that any biases that might be inherent in the real world (e.g. simply height) are reduced or removed, to create a fairer interaction space. While this could be interpreted negatively at an individual level, the idea would be that collectively this would be advantageous. This then leads to interesting questions of the role of anonymity or social conventions in these situations (cf. discussions held in a “Chatham House” style, or simply the purpose of masking in programs such as “The Masked Singer”).

This intricate balance between transparency and superpowers thus poses significant conceptual and practical challenges in the realm of social XR.

7.4.2 How to Achieve a Shared Platform?

The second grand challenge lies in devising a shared platform that facilitates collaborative recording, replaying, and immersive experiences. Instruments must be crafted to analyze recorded formats, ensuring both reusability and auditability. The platform must seamlessly support motion and behavior capture, alongside accommodating video for non-recordable volumetric scenarios. Drawing inspiration from tools utilized in Computer-Supported Cooperative Work (CSCW), challenges include incorporating features like video annotation and time-series analysis. Overcoming obstacles, such as navigating funding schemes and addressing the apparent need for individualized recognition, presents a distinctive challenge within the collaborative landscape of Social XR.
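
As a minimal illustration of recording and replaying with a reusable, auditable format, the Python sketch below stores timestamped samples as JSON Lines and replays a time window. The record fields, channel names, and example data are assumptions made for this sketch, not a format agreed by the group.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Sample:
    t: float          # seconds since session start
    participant: str
    channel: str      # e.g. "head_pose" or "annotation"
    value: object     # JSON-serialisable payload


def dump(samples):
    """Serialise samples as JSON Lines, one record per line, sorted by time."""
    ordered = sorted(samples, key=lambda s: s.t)
    return "\n".join(json.dumps(asdict(s), sort_keys=True) for s in ordered)


def load(text):
    """Parse JSON Lines back into Sample records."""
    return [Sample(**json.loads(line)) for line in text.splitlines() if line]


def replay(samples, t0, t1):
    """Yield samples with t0 <= t < t1 in time order."""
    for s in sorted(samples, key=lambda s: s.t):
        if t0 <= s.t < t1:
            yield s


# Invented example data.
recorded = [
    Sample(1.5, "p1", "head_pose", [0.0, 1.6, 0.0]),
    Sample(0.5, "p2", "annotation", "laughs"),
]
log = load(dump(recorded))
print([s.participant for s in replay(log, 1.0, 2.0)])
```

A line-oriented text format is easy to inspect and diff, which helps auditability; annotations and motion data can share one timeline by using different channels.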

7.4.3 Is There a Missing Instrument Such As a Questionnaire?

The absence of an optimal questionnaire poses a significant challenge. For example, the sensitivity of social presence to prior familiarity underscores the need for tailored assessment tools. For unfamiliar participants, the focus may shift towards gauging how much they learn about others, while for familiar participants, understanding mutual comprehension becomes a key interest. Moreover, certain concepts that were predominant […]. A potential avenue for questionnaire development involves reverting to the “usability” concept in social systems, building upon established frameworks like the System Usability Scale (SUS) questionnaire. The discourse extends to the validity and relevance of “presence” in mixed and augmented reality, questioning the balance between usability and utility in virtual reality applications and considering the likelihood of repeated use for specific purposes.
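
For reference, the SUS questionnaire mentioned above has a fixed scoring rule: ten items answered on a 1 to 5 scale, where odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score from 0 to 100. The Python sketch below implements that rule; the function name is our own, and any social XR adaptation of the instrument would need its own validated scoring.

```python
def sus_score(responses):
    """Compute the System Usability Scale score from ten 1-5 responses."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs exactly ten responses, each from 1 to 5")
    total = 0
    for item, r in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5


print(sus_score([3] * 10))    # all-neutral answers
print(sus_score([5, 1] * 5))  # best possible answers
```

All-neutral answers give 50.0 and the best possible pattern gives 100.0.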

Conclusions drawn from these considerations emphasize the identification of a set of constructs deemed crucial for Social XR. Addressing the topic of scenarios, a proposed strategy involves breaking down activities into modular components across space and time, creating encounters with specific affordances. This approach aims to enhance comparability between systems and contribute to a more nuanced understanding of Social XR dynamics.

8 Participants

  • Sun Joo (Grace) Ahn – University of Georgia – Athens, US

  • Mark Billinghurst – University of South Australia – Adelaide, AU

  • Pablo Cesar – CWI – Amsterdam, NL

  • Mustafa Doga Dogan – MIT – Cambridge, US

  • Eric J Gonzalez – Google – Seattle, US

  • Mar Gonzalez-Franco – Google – Seattle, US

  • Josh Greenberg – Alfred P. Sloan Foundation – New York, US

  • Jan Gugenheimer – TU Darmstadt, DE

  • Linda Hirsch – LMU München, DE

  • Katherine Isbister – University of California at Santa Cruz, US

  • Alexandra Kitson – Simon Fraser University – Surrey, CA

  • Kai Kunze – Keio University – Yokohama, JP

  • Jie Li – EPAM – Hoofddorp, NL

  • Neil McDonnell – University of Glasgow, GB

  • Yun Suen Pai – Keio University – Yokohama, JP

  • Sylvia Xueni Pan – Goldsmiths, University of London, GB

  • Payod Panda – Microsoft Research – Cambridge, GB

  • Alexander Raake – TU Ilmenau, DE

  • David Ayman Shamma – Toyota Research Institute – Los Altos, US

  • Aljosa Smolic – Hochschule Luzern – Rotkreuz, CH

  • Anthony Steed – University College London, GB

  • Kashyap Todi – Meta Reality Labs – Redmond, US

  • Irene Viola – CWI – Amsterdam, NL

  • Nadine Wagener – Universität Bremen, DE

  • Julie Williamson – University of Glasgow, GB

  • Zerrin Yumak – Utrecht University, NL
