LIPIcs, Volume 295

5th Symposium on Foundations of Responsible Computing (FORC 2024)




Event

FORC 2024, June 12-14, 2024, Harvard University, Cambridge, MA, USA

Editor

Guy N. Rothblum
  • Apple, Cupertino, CA, USA

Publication Details

  • published at: 2024-06-10
  • Publisher: Schloss Dagstuhl – Leibniz-Zentrum für Informatik
  • ISBN: 978-3-95977-319-5
  • DBLP: db/conf/forc/forc2024

Documents
Complete Volume
LIPIcs, Volume 295, FORC 2024, Complete Volume

Authors: Guy N. Rothblum


Abstract
LIPIcs, Volume 295, FORC 2024, Complete Volume

Cite as

5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 1-208, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@Proceedings{rothblum:LIPIcs.FORC.2024,
  title =	{{LIPIcs, Volume 295, FORC 2024, Complete Volume}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{1--208},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024},
  URN =		{urn:nbn:de:0030-drops-200828},
  doi =		{10.4230/LIPIcs.FORC.2024},
  annote =	{Keywords: LIPIcs, Volume 295, FORC 2024, Complete Volume}
}
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Guy N. Rothblum


Abstract
Front Matter, Table of Contents, Preface, Conference Organization

Cite as

5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 0:i-0:x, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{rothblum:LIPIcs.FORC.2024.0,
  author =	{Rothblum, Guy N.},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{0:i--0:x},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.0},
  URN =		{urn:nbn:de:0030-drops-200830},
  doi =		{10.4230/LIPIcs.FORC.2024.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
Effects of Privacy-Inducing Noise on Welfare and Influence of Referendum Systems

Authors: Suat Evren and Praneeth Vepakomma


Abstract
Social choice functions aggregate individual preferences, while differentially private mechanisms provide formal privacy guarantees for releasing the answers of queries that operate on sensitive data. However, preserving differential privacy requires introducing noise into the system and may therefore lead to undesired byproducts. Does an increase in the level of privacy for releasing the outputs of social choice functions increase or decrease the level of influence and welfare, and at what rate? In this paper, we address this question in precise terms in a referendum setting with two candidates when the celebrated randomized response mechanism is used. We show that the level of privacy is inversely proportional to society’s welfare and influence.
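The randomized response mechanism named in the abstract is standard and can be sketched in a few lines; the debiasing estimator below is the textbook one, and the parameter values (epsilon, vote share, sample size) are illustrative only, not drawn from the paper.

```python
import math
import random

def randomized_response(vote: bool, epsilon: float, rng: random.Random) -> bool:
    # Report the true vote with probability e^eps / (1 + e^eps); flip it otherwise.
    # This satisfies eps-differential privacy for a single binary response.
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return vote if rng.random() < p_truth else (not vote)

def debiased_yes_share(noisy_votes, epsilon: float) -> float:
    # Invert the known bias of the mechanism to estimate the true "yes" share s:
    # E[noisy mean] = (2p - 1) * s + (1 - p), where p = e^eps / (1 + e^eps).
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    noisy_mean = sum(noisy_votes) / len(noisy_votes)
    return (noisy_mean - (1.0 - p)) / (2.0 * p - 1.0)

rng = random.Random(0)
true_votes = [i < 7000 for i in range(10000)]          # 70% "yes" votes
noisy = [randomized_response(v, epsilon=1.0, rng=rng) for v in true_votes]
estimate = debiased_yes_share(noisy, epsilon=1.0)      # close to 0.7 for large n
```

Larger epsilon means less flipping, hence a lower-variance estimate; this is the privacy/utility trade-off the paper quantifies for welfare and influence.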

Cite as

Suat Evren and Praneeth Vepakomma. Effects of Privacy-Inducing Noise on Welfare and Influence of Referendum Systems. In 5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 1:1-1:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{evren_et_al:LIPIcs.FORC.2024.1,
  author =	{Evren, Suat and Vepakomma, Praneeth},
  title =	{{Effects of Privacy-Inducing Noise on Welfare and Influence of Referendum Systems}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{1:1--1:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.1},
  URN =		{urn:nbn:de:0030-drops-200841},
  doi =		{10.4230/LIPIcs.FORC.2024.1},
  annote =	{Keywords: Welfare, influence, social choice functions, differential privacy, randomized response}
}
Incentivized Collaboration in Active Learning

Authors: Lee Cohen and Han Shao


Abstract
In collaborative active learning, where multiple agents try to learn labels from a common hypothesis, we introduce an innovative framework for incentivized collaboration. Here, rational agents aim to obtain labels for their data sets while keeping label complexity at a minimum. We focus on designing (strict) individually rational (IR) collaboration protocols, ensuring that agents cannot reduce their expected label complexity by acting individually. We first show that given any optimal active learning algorithm, the collaboration protocol that runs the algorithm as is over the entire data is already IR. However, computing the optimal algorithm is NP-hard. We therefore provide collaboration protocols that achieve (strict) IR and are comparable with the best known tractable approximation algorithm in terms of label complexity.

Cite as

Lee Cohen and Han Shao. Incentivized Collaboration in Active Learning. In 5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 2:1-2:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{cohen_et_al:LIPIcs.FORC.2024.2,
  author =	{Cohen, Lee and Shao, Han},
  title =	{{Incentivized Collaboration in Active Learning}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{2:1--2:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.2},
  URN =		{urn:nbn:de:0030-drops-200851},
  doi =		{10.4230/LIPIcs.FORC.2024.2},
  annote =	{Keywords: pool-based active learning, individual rationality, incentives, Bayesian, collaboration}
}
Can Copyright Be Reduced to Privacy?

Authors: Niva Elkin-Koren, Uri Hacohen, Roi Livni, and Shay Moran


Abstract
There is a growing concern that generative AI models will generate outputs closely resembling the copyrighted materials on which they are trained. This worry has intensified as the quality and complexity of generative models have immensely improved and as the availability of extensive datasets containing copyrighted material has expanded. Researchers are actively exploring strategies to mitigate the risk of generating infringing samples, with a recent line of work proposing to employ techniques such as differential privacy and other forms of algorithmic stability to provide guarantees on the lack of infringing copying. In this work, we examine whether such algorithmic stability techniques are suitable for ensuring the responsible use of generative models without inadvertently violating copyright laws. We argue that while these techniques aim to verify the presence of identifiable information in datasets, and are thus privacy-oriented, copyright law aims to promote the use of original works for the benefit of society as a whole, provided that no unlicensed use of protected expression occurred. These fundamental differences between privacy and copyright must not be overlooked. In particular, we demonstrate that while algorithmic stability may be perceived as a practical tool to detect copying, such copying does not necessarily constitute copyright infringement. Therefore, if adopted as a standard for detecting and establishing copyright infringement, algorithmic stability may undermine the intended objectives of copyright law.

Cite as

Niva Elkin-Koren, Uri Hacohen, Roi Livni, and Shay Moran. Can Copyright Be Reduced to Privacy? In 5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 3:1-3:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{elkinkoren_et_al:LIPIcs.FORC.2024.3,
  author =	{Elkin-Koren, Niva and Hacohen, Uri and Livni, Roi and Moran, Shay},
  title =	{{Can Copyright Be Reduced to Privacy?}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{3:1--3:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.3},
  URN =		{urn:nbn:de:0030-drops-200866},
  doi =		{10.4230/LIPIcs.FORC.2024.3},
  annote =	{Keywords: Copyright, Privacy, Generative Learning}
}
Balanced Filtering via Disclosure-Controlled Proxies

Authors: Siqi Deng, Emily Diana, Michael Kearns, and Aaron Roth


Abstract
We study the problem of collecting a cohort or set that is balanced with respect to sensitive groups when group membership is unavailable or prohibited from use at deployment time. Specifically, our deployment-time collection mechanism does not reveal significantly more about the group membership of any individual sample than can be ascertained from base rates alone. To do this, we study a learner that can use a small set of labeled data to train a proxy function that can later be used for this filtering or selection task. We then associate the range of the proxy function with sampling probabilities; given a new example, we classify it using our proxy function and then select it with probability corresponding to its proxy classification. Importantly, we require that the proxy classification does not reveal significantly more information about the sensitive group membership of any individual example compared to population base rates alone (i.e., the level of disclosure should be controlled) and show that we can find such a proxy in a sample- and oracle-efficient manner. Finally, we experimentally evaluate our algorithm and analyze its generalization properties.
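The selection step the abstract describes (classify each example with the proxy, then admit it with a probability attached to its proxy class) can be sketched as follows; the parity proxy and the 0/1 probabilities are toy stand-ins, not the paper's learned, disclosure-controlled proxy.

```python
import random

def filter_cohort(examples, proxy, select_prob, rng):
    # proxy maps an example to a proxy class; select_prob maps each class to a
    # sampling probability. Each example is admitted independently with the
    # probability attached to its proxy classification.
    cohort = []
    for x in examples:
        c = proxy(x)
        if rng.random() < select_prob[c]:
            cohort.append(x)
    return cohort

# Toy usage: a parity "proxy" that admits all even numbers and no odd ones.
rng = random.Random(0)
cohort = filter_cohort(range(10), proxy=lambda x: x % 2,
                       select_prob={0: 1.0, 1: 0.0}, rng=rng)
```

In the paper the interesting part is choosing the proxy and the probabilities so that the resulting cohort is balanced while the proxy's output reveals little about sensitive group membership beyond base rates; the mechanics of the filtering pass are as above.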

Cite as

Siqi Deng, Emily Diana, Michael Kearns, and Aaron Roth. Balanced Filtering via Disclosure-Controlled Proxies. In 5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 4:1-4:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{deng_et_al:LIPIcs.FORC.2024.4,
  author =	{Deng, Siqi and Diana, Emily and Kearns, Michael and Roth, Aaron},
  title =	{{Balanced Filtering via Disclosure-Controlled Proxies}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{4:1--4:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.4},
  URN =		{urn:nbn:de:0030-drops-200872},
  doi =		{10.4230/LIPIcs.FORC.2024.4},
  annote =	{Keywords: Algorithms, Sampling, Ethical/Societal Implications}
}
Distribution-Specific Auditing for Subgroup Fairness

Authors: Daniel Hsu, Jizhou Huang, and Brendan Juba


Abstract
We study the problem of auditing classifiers for statistical subgroup fairness. Kearns et al. [Kearns et al., 2018] showed that the problem of auditing combinatorial subgroup fairness is as hard as agnostic learning. Essentially all work on remedying statistical measures of discrimination against subgroups assumes access to an oracle for this problem, despite the fact that no efficient algorithms are known for it. If we assume the data distribution is Gaussian, or even merely log-concave, then a recent line of work has discovered efficient agnostic learning algorithms for halfspaces. Unfortunately, the reduction of Kearns et al. was formulated in terms of weak, "distribution-free" learning, and thus did not establish a connection for families such as log-concave distributions. In this work, we give positive and negative results on auditing for Gaussian distributions: On the positive side, we present an alternative approach to leverage these advances in agnostic learning and thereby obtain the first polynomial-time approximation scheme (PTAS) for auditing nontrivial combinatorial subgroup fairness: we show how to audit statistical notions of fairness over homogeneous halfspace subgroups when the features are Gaussian. On the negative side, we find that under cryptographic assumptions, no polynomial-time algorithm can guarantee any nontrivial auditing, even under Gaussian feature distributions, for general halfspace subgroups.

Cite as

Daniel Hsu, Jizhou Huang, and Brendan Juba. Distribution-Specific Auditing for Subgroup Fairness. In 5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 5:1-5:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{hsu_et_al:LIPIcs.FORC.2024.5,
  author =	{Hsu, Daniel and Huang, Jizhou and Juba, Brendan},
  title =	{{Distribution-Specific Auditing for Subgroup Fairness}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{5:1--5:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.5},
  URN =		{urn:nbn:de:0030-drops-200882},
  doi =		{10.4230/LIPIcs.FORC.2024.5},
  annote =	{Keywords: Fairness auditing, agnostic learning, intractability}
}
Modeling Diversity Dynamics in Time-Evolving Collaboration Networks

Authors: Christopher Archer and Gireeja Ranade


Abstract
Increasing diversity in a community or an organization requires paying attention to many different aspects, including recruitment, hiring, retention, climate, and more. In this paper, we focus on how climate, captured through network interactions, can affect the growth or decay of minority populations within that community. Building on previous work, we develop a dynamic stochastic block model that grows according to a weighted version of preferential attachment, while having some memory of previous edges as well. This models how interactions between nodes in the network can influence the recruitment of new nodes to the network. We derive a deterministic approximation of this random system and prove its convergence is determined by the network parameters. Additionally, we show how the memory of the network affects convergence under different parameter regimes, and we validate this model by assessing the growth of women scientists in the American Physics Society’s co-authorship network.
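A minimal sketch of weighted preferential attachment in this spirit appears below; it omits the paper's memory term and block structure, and the group weight and minority arrival rate are assumed for illustration only.

```python
import random

def grow_network(steps, p_minority, w_minority, rng):
    # Start from one majority-group (0) node and one minority-group (1) node.
    # Each round a new node arrives (minority with probability p_minority) and
    # attaches to an existing node chosen with probability proportional to
    # degree * group weight: a weighted form of preferential attachment.
    groups, degree = [0, 1], [1, 1]
    for _ in range(steps):
        g_new = 1 if rng.random() < p_minority else 0
        weights = [degree[i] * (w_minority if groups[i] == 1 else 1.0)
                   for i in range(len(groups))]
        target = rng.choices(range(len(groups)), weights=weights)[0]
        degree[target] += 1          # the chosen node gains an edge
        groups.append(g_new)         # the new node joins with degree 1
        degree.append(1)
    return groups, degree

rng = random.Random(0)
groups, degree = grow_network(steps=500, p_minority=0.3, w_minority=1.5, rng=rng)
minority_share = sum(groups) / len(groups)
```

The paper's question is how such interaction weights (climate) shift the long-run minority share; this sketch only shows the growth mechanism being analyzed.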

Cite as

Christopher Archer and Gireeja Ranade. Modeling Diversity Dynamics in Time-Evolving Collaboration Networks. In 5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 6:1-6:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{archer_et_al:LIPIcs.FORC.2024.6,
  author =	{Archer, Christopher and Ranade, Gireeja},
  title =	{{Modeling Diversity Dynamics in Time-Evolving Collaboration Networks}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{6:1--6:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.6},
  URN =		{urn:nbn:de:0030-drops-200897},
  doi =		{10.4230/LIPIcs.FORC.2024.6},
  annote =	{Keywords: Network Models, Diversity, Collaboration Networks, Stochastic Block Model}
}
Drawing Competitive Districts in Redistricting

Authors: Gabriel Chuang, Oussama Hanguir, and Clifford Stein


Abstract
In the process of redistricting, one important metric is the number of competitive districts, that is, districts where both parties have a reasonable chance of winning a majority of votes. Competitive districts are important for achieving proportionality, responsiveness, and other desirable qualities; some states even directly list competitiveness in their legally-codified districting requirements. In this work, we discuss the problem of drawing plans with at least a fixed number of competitive districts. In addition to the standard "vote-band" measure of competitiveness (i.e., how close was the last election?), we propose a measure that explicitly considers "swing voters" - the segment of the population that may choose to vote either way, or not vote at all, in a given election. We present two main, contrasting results. First, from a computational complexity perspective, we show that the task of drawing plans with competitive districts is NP-hard, even on very natural instances where the districting task itself is easy (e.g., small rectangular grids of population-balanced cells). Second, however, we show that a simple hill-climbing procedure can in practice find districtings of real states in which all the districts are competitive. We present the results of the latter on the precinct-level graphs of the U.S. states of North Carolina and Arizona, and discuss trade-offs between competitiveness and other desirable qualities.
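The hill-climbing procedure the abstract mentions can be sketched generically; the plan representation (a list of district margins), the neighborhood, and the 5-point vote band below are toy stand-ins, not the precinct-graph formulation used in the paper.

```python
import random

def hill_climb(plan, neighbors, num_competitive, rng, iters=2000):
    # Greedy local search: repeatedly try a random neighboring plan and keep it
    # whenever it has at least as many competitive districts.
    best = plan
    for _ in range(iters):
        cand = rng.choice(neighbors(best))
        if num_competitive(cand) >= num_competitive(best):
            best = cand
    return best

# Toy model: a "plan" is a list of district vote margins; a district is
# competitive when its margin falls inside a 5-point vote band.
def neighbors(plan):
    out = []
    for i, m in enumerate(plan):
        step = -0.01 if m > 0 else 0.01              # nudge one margin toward zero
        out.append(plan[:i] + [m + step if m != 0 else 0.0] + plan[i + 1:])
    return out

def num_competitive(plan):
    return sum(1 for m in plan if abs(m) < 0.05)

rng = random.Random(0)
result = hill_climb([0.20, -0.30], neighbors, num_competitive, rng)
```

On real instances the neighborhood would be precinct reassignments subject to contiguity and population balance, which is where the NP-hardness bites; the search loop itself stays this simple.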

Cite as

Gabriel Chuang, Oussama Hanguir, and Clifford Stein. Drawing Competitive Districts in Redistricting. In 5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 7:1-7:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{chuang_et_al:LIPIcs.FORC.2024.7,
  author =	{Chuang, Gabriel and Hanguir, Oussama and Stein, Clifford},
  title =	{{Drawing Competitive Districts in Redistricting}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{7:1--7:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.7},
  URN =		{urn:nbn:de:0030-drops-200902},
  doi =		{10.4230/LIPIcs.FORC.2024.7},
  annote =	{Keywords: Redistricting, Computational Complexity, Algorithms}
}
Score Design for Multi-Criteria Incentivization

Authors: Anmol Kabra, Mina Karzand, Tosca Lechner, Nati Srebro, and Serena Wang


Abstract
We present a framework for designing scores to summarize performance metrics. Our design has two multi-criteria objectives: (1) improving on scores should improve all performance metrics, and (2) achieving Pareto-optimal scores should achieve Pareto-optimal metrics. We formulate our design to minimize the dimensionality of scores while satisfying the objectives. We give algorithms to design scores, which are provably minimal under mild assumptions on the structure of performance metrics. This framework draws motivation from real-world practices in hospital rating systems, where misaligned scores and performance metrics lead to unintended consequences.

Cite as

Anmol Kabra, Mina Karzand, Tosca Lechner, Nati Srebro, and Serena Wang. Score Design for Multi-Criteria Incentivization. In 5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 8:1-8:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{kabra_et_al:LIPIcs.FORC.2024.8,
  author =	{Kabra, Anmol and Karzand, Mina and Lechner, Tosca and Srebro, Nati and Wang, Serena},
  title =	{{Score Design for Multi-Criteria Incentivization}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{8:1--8:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.8},
  URN =		{urn:nbn:de:0030-drops-200919},
  doi =		{10.4230/LIPIcs.FORC.2024.8},
  annote =	{Keywords: Multi-criteria incentives, Score-based incentives, Incentivizing improvement, Computational geometry}
}
Privacy Can Arise Endogenously in an Economic System with Learning Agents

Authors: Nivasini Ananthakrishnan, Tiffany Ding, Mariel Werner, Sai Praneeth Karimireddy, and Michael I. Jordan


Abstract
We study price-discrimination games between buyers and a seller where privacy arises endogenously - that is, utility maximization yields equilibrium strategies where privacy occurs naturally. In this game, buyers with a high valuation for a good have an incentive to keep their valuation private, lest the seller charge them a higher price. This yields an equilibrium where some buyers will send a signal that misrepresents their type with some probability; we refer to this as buyer-induced privacy. When the seller is able to publicly commit to providing a certain privacy level, we find that their equilibrium response is to commit to ignore buyers' signals with some positive probability; we refer to this as seller-induced privacy. We then turn our attention to a repeated interaction setting where the game parameters are unknown and the seller cannot credibly commit to a level of seller-induced privacy. In this setting, players must learn strategies based on information revealed in past rounds. We find that, even without commitment ability, seller-induced privacy arises as a result of reputation building. We characterize the resulting seller-induced privacy and seller’s utility under no-regret and no-policy-regret learning algorithms and verify these results through simulations.

Cite as

Nivasini Ananthakrishnan, Tiffany Ding, Mariel Werner, Sai Praneeth Karimireddy, and Michael I. Jordan. Privacy Can Arise Endogenously in an Economic System with Learning Agents. In 5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 9:1-9:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{ananthakrishnan_et_al:LIPIcs.FORC.2024.9,
  author =	{Ananthakrishnan, Nivasini and Ding, Tiffany and Werner, Mariel and Karimireddy, Sai Praneeth and Jordan, Michael I.},
  title =	{{Privacy Can Arise Endogenously in an Economic System with Learning Agents}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{9:1--9:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.9},
  URN =		{urn:nbn:de:0030-drops-200921},
  doi =		{10.4230/LIPIcs.FORC.2024.9},
  annote =	{Keywords: Privacy, Game Theory, Online Learning, Price Discrimination}
}
Online Algorithms with Limited Data Retention (Extended Abstract)

Authors: Nicole Immorlica, Brendan Lucier, Markus Mobius, and James Siderius


Abstract
We introduce a model of online algorithms subject to strict constraints on data retention. An online learning algorithm encounters a stream of data points, one per round, generated by some stationary process. Crucially, each data point can request that it be removed from memory m rounds after it arrives. To model the impact of removal, we do not allow the algorithm to store any information or calculations between rounds other than a subset of the data points (subject to the retention constraints). At the conclusion of the stream, the algorithm answers a statistical query about the full dataset. We ask: what level of performance can be guaranteed as a function of m? We illustrate this framework for multidimensional mean estimation and linear regression problems. We show it is possible to obtain an exponential improvement over a baseline algorithm that retains all data as long as possible. Specifically, we show that m = Poly(d, log(1/ε)) retention suffices to achieve mean squared error ε after observing O(1/ε) d-dimensional data points. This matches the error bound of the optimal, yet infeasible, algorithm that retains all data forever. We also show a nearly matching lower bound on the retention required to guarantee error ε. One implication of our results is that data retention laws are insufficient to guarantee the right to be forgotten even in a non-adversarial world in which firms merely strive to (approximately) optimize the performance of their algorithms. Our approach makes use of recent developments in the multidimensional random subset sum problem to simulate the progression of stochastic gradient descent under a model of adversarial noise, which may be of independent interest.
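The baseline the abstract improves upon, retaining each point for as long as permitted and answering the query from whatever remains, can be sketched for mean estimation; the stream and retention bound below are illustrative, and this is not the paper's SGD-based algorithm.

```python
from collections import deque

def mean_with_retention(stream, m):
    # Baseline: every point requests deletion m rounds after arrival, so with
    # one arrival per round the algorithm may hold only the m most recent
    # points. At the end of the stream it answers the mean query from the
    # points still retained.
    buf = deque(maxlen=m)     # older points are evicted automatically
    for x in stream:
        buf.append(x)
    return sum(buf) / len(buf)

# e.g. mean_with_retention([1.0, 2.0, 3.0, 4.0], m=2) -> 3.5 (mean of [3.0, 4.0])
```

The paper shows that for stationary streams a cleverer algorithm needs only m = Poly(d, log(1/ε)) retention to match the error of retaining everything forever, so this baseline wastes most of its memory budget.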

Cite as

Nicole Immorlica, Brendan Lucier, Markus Mobius, and James Siderius. Online Algorithms with Limited Data Retention (Extended Abstract). In 5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 10:1-10:8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{immorlica_et_al:LIPIcs.FORC.2024.10,
  author =	{Immorlica, Nicole and Lucier, Brendan and Mobius, Markus and Siderius, James},
  title =	{{Online Algorithms with Limited Data Retention}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{10:1--10:8},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.10},
  URN =		{urn:nbn:de:0030-drops-200937},
  doi =		{10.4230/LIPIcs.FORC.2024.10},
  annote =	{Keywords: online algorithms, machine learning, data, privacy, law}
}
