Search Results

Documents authored by Livni, Roi


Document
Can Copyright Be Reduced to Privacy?

Authors: Niva Elkin-Koren, Uri Hacohen, Roi Livni, and Shay Moran

Published in: LIPIcs, Volume 295, 5th Symposium on Foundations of Responsible Computing (FORC 2024)


Abstract
There is a growing concern that generative AI models will generate outputs closely resembling the copyrighted materials on which they are trained. This worry has intensified as the quality and complexity of generative models have improved immensely and the availability of extensive datasets containing copyrighted material has expanded. Researchers are actively exploring strategies to mitigate the risk of generating infringing samples, with a recent line of work suggesting the use of techniques such as differential privacy and other forms of algorithmic stability to provide guarantees on the absence of infringing copying. In this work, we examine whether such algorithmic stability techniques are suitable to ensure the responsible use of generative models without inadvertently violating copyright laws. We argue that while these techniques aim to verify the presence of identifiable information in datasets, and are thus privacy-oriented, copyright law aims to promote the use of original works for the benefit of society as a whole, provided that no unlicensed use of protected expression occurred. These fundamental differences between privacy and copyright must not be overlooked. In particular, we demonstrate that while algorithmic stability may be perceived as a practical tool to detect copying, such copying does not necessarily constitute copyright infringement. Therefore, if adopted as a standard for detecting and establishing copyright infringement, algorithmic stability may undermine the intended objectives of copyright law.
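For context, the flagship "algorithmic stability" guarantee in the line of work the abstract refers to is differential privacy. The following is the standard (ε, δ)-definition, a textbook statement due to Dwork et al., included here for reference; it is not reproduced from the paper itself:

% Standard (epsilon, delta)-differential privacy, stated for context only.
% A randomized algorithm A is (\epsilon, \delta)-differentially private if,
% for every pair of datasets S, S' differing in a single example and every
% measurable set of outputs T,
\[
  \Pr[A(S) \in T] \;\le\; e^{\epsilon}\,\Pr[A(S') \in T] + \delta .
\]

Informally, the output distribution is almost unchanged when any single training example is added or removed; the paper's argument is that this example-level notion of copying does not coincide with copyright's standard of unlicensed use of protected expression.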

Cite as

Niva Elkin-Koren, Uri Hacohen, Roi Livni, and Shay Moran. Can Copyright Be Reduced to Privacy? In 5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 3:1-3:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{elkinkoren_et_al:LIPIcs.FORC.2024.3,
  author =	{Elkin-Koren, Niva and Hacohen, Uri and Livni, Roi and Moran, Shay},
  title =	{{Can Copyright Be Reduced to Privacy?}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{3:1--3:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.3},
  URN =		{urn:nbn:de:0030-drops-200866},
  doi =		{10.4230/LIPIcs.FORC.2024.3},
  annote =	{Keywords: Copyright, Privacy, Generative Learning}
}
Document
Making Progress Based on False Discoveries

Authors: Roi Livni

Published in: LIPIcs, Volume 287, 15th Innovations in Theoretical Computer Science Conference (ITCS 2024)


Abstract
We consider Stochastic Convex Optimization as a case study for Adaptive Data Analysis. A basic question is how many samples are needed in order to compute ε-accurate estimates of the O(1/ε²) gradients queried by gradient descent. We provide two intermediate answers to this question. First, we show that for a general analyst (not necessarily gradient descent) Ω(1/ε³) samples are required, which is more than the number of samples required to simply optimize the population loss. Our construction builds upon a new lower bound (which may be of interest in its own right) for an analyst that may ask several non-adaptive queries in each batch, over a fixed and known number T of rounds of adaptivity, and requires a certain fraction of true discoveries. We show that for such an analyst Ω(√T/ε²) samples are necessary. Second, we show that, under certain assumptions on the oracle, Ω̃(1/ε^{2.5}) samples are necessary in an interaction with gradient descent, which is again suboptimal in terms of optimization. Our assumptions are that the oracle has only first-order access and is post-hoc generalizing. First-order access means that it can only compute the gradients of the sampled functions at points queried by the algorithm. Our assumption of post-hoc generalization follows from existing lower bounds for statistical queries. More generally, we provide a generic reduction from the standard setting of statistical queries to the problem of estimating gradients queried by gradient descent. Overall, these results stand in contrast with classical bounds showing that with O(1/ε²) samples one can optimize the population risk to accuracy O(ε), but, as it turns out, with spurious gradients.
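To make the interaction concrete, here is a minimal Python sketch of the setting the abstract studies: gradient descent acting as an adaptive analyst that queries gradient estimates computed from a single sample. The toy quadratic objective and all function names (sample_losses, empirical_gradient_oracle, gradient_descent) are illustrative assumptions, not constructions from the paper.

import numpy as np

rng = np.random.default_rng(0)

def sample_losses(n, d):
    # n i.i.d. losses f(w; z) = 0.5 * ||w - z||^2, represented by their data z
    return rng.normal(size=(n, d))

def empirical_gradient_oracle(w, Z):
    # First-order access: the oracle only returns gradients of the sampled
    # losses at the point w queried by the algorithm.
    return w - Z.mean(axis=0)  # gradient of the empirical loss at w

def gradient_descent(Z, d, eps, eta=0.5):
    # The analyst: issues roughly 1/eps^2 gradient queries, each query point
    # depending on all previous answers. Per the abstract, this adaptivity is
    # what forces more samples than non-adaptive optimization would need.
    w = np.zeros(d)
    for _ in range(int(1 / eps**2)):
        g = empirical_gradient_oracle(w, Z)
        w = w - eta * g
    return w

Z = sample_losses(n=100, d=10)
w_hat = gradient_descent(Z, d=10, eps=0.1)

The point of the paper's lower bounds is that answers computed from the same sample across adaptive rounds can be accurate on the sample yet spurious for the population: Ω(1/ε³) samples (general analyst) or Ω̃(1/ε^{2.5}) samples (gradient descent with a first-order, post-hoc generalizing oracle) are needed, rather than the O(1/ε²) that suffice for optimization alone.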

Cite as

Roi Livni. Making Progress Based on False Discoveries. In 15th Innovations in Theoretical Computer Science Conference (ITCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 287, pp. 76:1-76:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{livni:LIPIcs.ITCS.2024.76,
  author =	{Livni, Roi},
  title =	{{Making Progress Based on False Discoveries}},
  booktitle =	{15th Innovations in Theoretical Computer Science Conference (ITCS 2024)},
  pages =	{76:1--76:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-309-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{287},
  editor =	{Guruswami, Venkatesan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2024.76},
  URN =		{urn:nbn:de:0030-drops-196040},
  doi =		{10.4230/LIPIcs.ITCS.2024.76},
  annote =	{Keywords: Adaptive Data Analysis, Stochastic Convex Optimization, Learning Theory}
}
Document
Improper Learning by Refuting

Authors: Pravesh K. Kothari and Roi Livni

Published in: LIPIcs, Volume 94, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018)


Abstract
The sample complexity of learning a Boolean-valued function class is precisely characterized by its Rademacher complexity. This has little bearing, however, on the sample complexity of efficient agnostic learning. We introduce refutation complexity, a natural computational analog of the Rademacher complexity of a Boolean concept class, and show that it exactly characterizes the sample complexity of efficient agnostic learning. Informally, the refutation complexity of a class C is the minimum number of example-label pairs required to efficiently distinguish between the case that the labels correlate with the evaluation of some member of C (structure) and the case where the labels are i.i.d. Rademacher random variables (noise). The easy direction of this relationship was implicitly used in the recent framework for improper PAC learning lower bounds of Daniely and co-authors, via connections to the hardness of refuting random constraint satisfaction problems. Our work can be seen as making explicit the equivalence between agnostic learning and refutation that is implicit in their work. In a recent, independent work, Salil Vadhan discovered a similar relationship between refutation and PAC learning in the realizable (i.e., noiseless) case.
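To illustrate the distinguishing task in the definition above, here is a minimal Python sketch for the particularly simple class of unit-norm linear functions, where the best in-class correlation has a closed form. The class choice, the threshold, and the names best_linear_correlation and refuter are assumptions made for illustration; the paper treats general Boolean concept classes and computational efficiency formally.

import numpy as np

rng = np.random.default_rng(1)

def best_linear_correlation(X, y):
    # max over unit vectors w of (1/m) * sum_i y_i * <w, x_i>,
    # attained at w proportional to the mean of y_i * x_i.
    return np.linalg.norm((y[:, None] * X).mean(axis=0))

def refuter(X, y, threshold=0.3):
    # Structure: labels correlate with some class member, so the statistic is
    # bounded away from 0. Noise: i.i.d. Rademacher labels make it concentrate
    # near sqrt(d/m).
    return "structure" if best_linear_correlation(X, y) > threshold else "noise"

m, d = 500, 20
X = rng.normal(size=(m, d))
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)

y_structure = np.sign(X @ w_star)           # correlated with a class member
y_noise = rng.choice([-1.0, 1.0], size=m)   # i.i.d. Rademacher labels

print(refuter(X, y_structure))  # expected: structure
print(refuter(X, y_noise))      # expected: noise

The number of example-label pairs m at which such a distinguisher succeeds efficiently is exactly what the refutation complexity of the class measures.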

Cite as

Pravesh K. Kothari and Roi Livni. Improper Learning by Refuting. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 94, pp. 55:1-55:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


BibTeX

@InProceedings{kothari_et_al:LIPIcs.ITCS.2018.55,
  author =	{Kothari, Pravesh K. and Livni, Roi},
  title =	{{Improper Learning by Refuting}},
  booktitle =	{9th Innovations in Theoretical Computer Science Conference (ITCS 2018)},
  pages =	{55:1--55:10},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-060-6},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{94},
  editor =	{Karlin, Anna R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2018.55},
  URN =		{urn:nbn:de:0030-drops-83488},
  doi =		{10.4230/LIPIcs.ITCS.2018.55},
  annote =	{Keywords: learning theory, computational learning}
}