
Documents authored by Xu, Kai


Document
Human-Centered Approaches for Provenance in Automated Data Science (Dagstuhl Seminar 23372)

Authors: Anamaria Crisan, Lars Kotthoff, Marc Streit, and Kai Xu

Published in: Dagstuhl Reports, Volume 13, Issue 9 (2024)


Abstract
The scope of automated machine learning (AutoML) technology has extended beyond its initial boundaries of model selection and hyperparameter tuning and towards end-to-end development and refinement of data science pipelines. These advances, both theoretical and realized, make the tools of data science more readily available to domain experts that rely on low- or no-code tooling options to analyze and make sense of their data. To ensure that automated data science technologies are applied both effectively and responsibly, it becomes increasingly urgent to carefully audit the decisions made both automatically and with guidance from humans. This Dagstuhl Seminar examines human-centered approaches for provenance in automated data science. While prior research concerning provenance and machine learning exists, it does not address the expanded scope of automated approaches and the consequences of applying such techniques at scale to the population of domain experts. In addition, most of the previous works focus on the automated part of this process, leaving a gap on the support for the sensemaking tasks users need to perform, such as selecting the datasets and candidate models and identifying potential causes for poor performance. The seminar brought together experts from across provenance, information visualization, visual analytics, machine learning, and human-computer interaction to articulate the user challenges posed by AutoML and automated data science, discuss the current state of the art, and propose directions for new research. 
More specifically, this seminar:
- articulates the state of the art in AutoML and automated data science for supporting the provenance of decision making,
- describes the challenges that data scientists and domain experts face when interfacing with automated approaches to make sense of an automated decision,
- examines the interface between data-centric, model-centric, and user-centric models of provenance and how they interact with automated techniques, and
- encourages exploration of human-centered approaches; for example leveraging visualization.

Cite as

Anamaria Crisan, Lars Kotthoff, Marc Streit, and Kai Xu. Human-Centered Approaches for Provenance in Automated Data Science (Dagstuhl Seminar 23372). In Dagstuhl Reports, Volume 13, Issue 9, pp. 116-136, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@Article{crisan_et_al:DagRep.13.9.116,
  author =	{Crisan, Anamaria and Kotthoff, Lars and Streit, Marc and Xu, Kai},
  title =	{{Human-Centered Approaches for Provenance in Automated Data Science (Dagstuhl Seminar 23372)}},
  pages =	{116--136},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2024},
  volume =	{13},
  number =	{9},
  editor =	{Crisan, Anamaria and Kotthoff, Lars and Streit, Marc and Xu, Kai},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.13.9.116},
  URN =		{urn:nbn:de:0030-drops-198236},
  doi =		{10.4230/DagRep.13.9.116},
  annote =	{Keywords: Dagstuhl Seminar, Provenance, AutoML, Data Science, Information Visualisation, Visual Analytics, Machine Learning, Human-Computer Interaction}
}
Document
Provenance and Logging for Sense Making (Dagstuhl Seminar 18462)

Authors: Jean-Daniel Fekete, T. J. Jankun-Kelly, Melanie Tory, and Kai Xu

Published in: Dagstuhl Reports, Volume 8, Issue 11 (2019)


Abstract
Sense making is one of the biggest challenges in data analysis faced by both the industry and the research community. It involves understanding the data and uncovering its model, generating a hypothesis, selecting analysis methods, creating novel solutions, designing evaluation, and also critical thinking and learning wherever needed. The research and development for such sense making tasks lags far behind the fast-changing user needs, such as those that emerged recently as the result of so-called "Big Data". As a result, sense making is often performed manually and the limited human cognition capability becomes the bottleneck of sense making in data analysis and decision making. One of the recent advances in sense making research is the capture, visualization, and analysis of provenance information. Provenance is the history and context of sense making, including the data/analysis used and the users' critical thinking process. It has been shown that provenance can effectively support many sense making tasks. For instance, provenance can provide an overview of what has been examined and reveal gaps like unexplored information or solution possibilities. Besides, provenance can support collaborative sense making and communication by sharing the rich context of the sense making process. Besides data analysis and decision making, provenance has been studied in many other fields, sometimes under different names, for different types of sense making. For example, the Human-Computer Interaction community relies on the analysis of logging to understand user behaviors and intentions; the WWW and database community has been working on data lineage to understand uncertainty and trustworthiness; and finally, reproducible science heavily relies on provenance to improve the reliability and efficiency of scientific research. This Dagstuhl Seminar brought together researchers from the diverse fields that relate to provenance and sense making to foster cross-community collaboration. 
Shared challenges were identified and progress has been made towards developing novel solutions.

Cite as

Jean-Daniel Fekete, T. J. Jankun-Kelly, Melanie Tory, and Kai Xu. Provenance and Logging for Sense Making (Dagstuhl Seminar 18462). In Dagstuhl Reports, Volume 8, Issue 11, pp. 35-62, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)



@Article{fekete_et_al:DagRep.8.11.35,
  author =	{Fekete, Jean-Daniel and Jankun-Kelly, T. J. and Tory, Melanie and Xu, Kai},
  title =	{{Provenance and Logging for Sense Making (Dagstuhl Seminar 18462)}},
  pages =	{35--62},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2019},
  volume =	{8},
  number =	{11},
  editor =	{Fekete, Jean-Daniel and Jankun-Kelly, T. J. and Tory, Melanie and Xu, Kai},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.8.11.35},
  URN =		{urn:nbn:de:0030-drops-103554},
  doi =		{10.4230/DagRep.8.11.35},
  annote =	{Keywords: Logging, Provenance, Sensemaking, Visualization}
}