
Differentially Private Sequential Learning

Yuxin Liu, Department of Industrial Engineering, University of Pittsburgh, Pittsburgh, PA, USA
M. Amin Rahimian, Department of Industrial Engineering, University of Pittsburgh, Pittsburgh, PA, USA
Abstract

In a differentially private sequential learning setting, agents introduce endogenous noise into their public actions to limit information leakage about their private signals. The impact of this privacy noise varies depending on whether the signals are continuous or binary. For continuous signals and a finite privacy budget $\epsilon > 0$, we propose a smooth randomized response mechanism that adapts the noise level based on the distance to a decision threshold, in contrast to the standard randomized response with uniform noise. This allows agents' actions to better reflect both their private signals and the public history, achieving an accelerated convergence rate of $\Theta_\epsilon(\log n)$, surpassing the $\Theta(\sqrt{\log n})$ rate of the non-private regime. In this case, privacy noise helps to amplify the log-likelihood ratio over time, improving information aggregation.

For binary signals, differential privacy consistently degrades learning performance by reducing the probability of correct cascades relative to the non-private baseline. In this case, agents use a constant randomized response strategy before the information cascade occurs; this constant privacy noise reduces the informativeness of their actions and hinders effective learning until a cascade begins. Even so, the probability of correct cascades does not vary monotonically with the privacy budget $\epsilon$: at certain values of $\epsilon$, the probability of a correct cascade increases as the privacy budget decreases, because the threshold for initiating an information cascade increases by one when the privacy budget crosses below those values.

Keywords and phrases:
Differential Privacy, Sequential Learning, Randomized Response, Learning Efficiency
Category:
Extended Abstract
Copyright and License:
© Yuxin Liu and M. Amin Rahimian; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Security and privacy → Social aspects of security and privacy
Related Version:
Full Version: https://arxiv.org/abs/2502.19525
Acknowledgements:
The authors thank Satish Iyengar, Wade Hann-Caruthers, and David Hirshleifer for valuable feedback and discussions. We also thank the organizers, reviewers and participants of the 6th annual Symposium on Foundations of Responsible Computing, 6th AAAI Workshop on Privacy-Preserving Artificial Intelligence, 2025 Conference on Network Science and Economics, and 2024 Theory and Practice of Differential Privacy Workshop for helpful comments and suggestions.
Funding:
This material is based on work supported by the National Science Foundation under Grant No. 2318844.
Editors:
Mark Bun

1 Motivation & Background

When individuals make decisions, they rely on both private signals and public observations, such as previous actions [6, 2]. Sequential learning models formalize this process and examine whether learning converges to the truth and how quickly it occurs [5, 7, 1, 15]. Although early work emphasizes that herding can prevent accurate learning, later studies highlight the role of network structure and signal strength in shaping learning speed [11, 8].

A critical but often overlooked factor in these models is privacy. In sensitive settings, such as those involving healthcare or political expression, actions can reveal sensitive private information, causing people to distort their observable behavior in order to hide their private signals. Such distortions (e.g., adding random noise to limit information leakage) can hinder information aggregation and disrupt learning [3], but can also prevent harmful cascades caused by early misinformation [14].

We adopt the differential privacy framework [9] to formalize the trade-off between privacy and learning and propose a structured mechanism that selectively adds noise when necessary. This approach preserves privacy while supporting efficient learning, particularly during the early stages when actions are most informative.

2 Contributions & Related Work

This work contributes to the literature on sequential learning, a foundational topic in economics and decision theory. Early models [5, 7] formalize information cascades, showing how rational agents ignore their private signals and mimic their predecessors, leading to inefficient learning (i.e., inefficient aggregation of private signals over time). Later studies identify conditions for asymptotic learning under limited observations [17, 1, 16]. Another line of research investigates learning speed, highlighting slow convergence when private signals are Gaussian [15, 11], and examining the effects of behavioral biases [4, 10], signal structures [13], and group dynamics [12].

We contribute to this literature by incorporating differential privacy into the sequential learning framework and by providing, to our knowledge, the first formal analysis of convergence rates under differential privacy constraints. For binary signals, we model the learning dynamics as a Markov process and characterize the likelihood of correct cascades. For continuous signals, we propose a smooth randomized response mechanism and establish an accelerated learning rate of $\Theta_\epsilon(\log n)$, improving upon the $\Theta(\sqrt{\log n})$ rate in the non-private case. Our findings challenge the traditional trade-off between privacy and efficiency, showing that well-designed privacy mechanisms can both protect individual data and enhance collective learning.

3 Problem Formulation

3.1 The Binary Model

We consider a classical sequential learning model with an unknown, uniformly distributed binary state $\theta \in \{-1,+1\}$. Each agent $n$ receives an i.i.d. private signal $s_n \in \{-1,+1\}$ satisfying $\mathbb{P}(s_n = \theta) = p > 1/2$. Without privacy concerns, agents choose actions $a_n \in \{-1,+1\}$ based on their signal and the history of past actions $h_{n-1}$, receiving utility 1 if $a_n = \theta$, and 0 otherwise.

To model privacy concerns, we assume that agents report their actions via a mechanism satisfying local differential privacy. Specifically, we adopt the randomized response strategy [9], whereby an agent flips their intended action (the rational choice) with probability $u_n$, so that the public action $x_n$ may differ from $a_n$. We formalize this binary setting below.

Definition 1.

A randomized strategy $\mathcal{M}$ is said to be $\epsilon$-differentially private in the binary model if, for all actions $x_n \in \{-1,+1\}$, for all signals $s_n, s_n' \in \{-1,+1\}$, and for any history of past actions $h_{n-1} \in \{\pm 1\}^{n-1}$, we have:

$$\mathbb{P}\left(\mathcal{M}(s_n; h_{n-1}) = x_n\right) \le \exp(\epsilon)\,\mathbb{P}\left(\mathcal{M}(s_n'; h_{n-1}) = x_n\right).$$
Definition 2 (Randomized Response Strategy).

A mechanism $\mathcal{M}(s_n; h_{n-1})$ is said to follow a randomized response strategy with flip probability $u_n$ if it outputs $x_n \in \{-1,+1\}$ by flipping the agent's true action $a_n$ with probability $u_n$, that is,

$$\mathbb{P}(x_n = -a_n \mid a_n) = u_n, \quad \text{and} \quad \mathbb{P}(x_n = a_n \mid a_n) = 1 - u_n,$$

where $a_n$ is the rational action determined from the signal $s_n$ and history $h_{n-1}$ without privacy concerns.
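To make the mechanism concrete, the following minimal Python sketch (our illustration; the function name is hypothetical) implements the randomized response of Definition 2 and checks the $\epsilon$-DP bound of Definition 1. Since the intended action $a_n$ is a deterministic function of the signal $s_n$ given the history, bounding the likelihood ratio over intended actions also bounds it over signals.

```python
import numpy as np

def randomized_response(a_n: int, u_n: float, rng: np.random.Generator) -> int:
    """Report the intended action a_n in {-1, +1}, flipped with probability u_n."""
    return -a_n if rng.random() < u_n else a_n

# With flip probability u = 1/(1 + e^eps), the likelihood ratio of any fixed
# report under two different intended actions (hence under two different
# signals) is (1 - u)/u = e^eps, exactly meeting the bound in Definition 1.
eps = 1.0
u = 1.0 / (1.0 + np.exp(eps))
rng = np.random.default_rng(0)
print(randomized_response(+1, u, rng), (1 - u) / u, np.exp(eps))
```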

3.2 The Continuous Model

Building on the binary model, we extend our analysis to the continuous case. Here, private signals $s_n$ are i.i.d. Gaussian with mean $\theta \in \{\pm 1\}$ and variance $\sigma^2$. To define differential privacy in this setting, we consider neighboring signals $s, s'$ satisfying $\|s - s'\|_1 \le 1$, and measure how variations within this range affect the agent's reported action. This extension aligns with the continuous-domain privacy models introduced in [9].

Definition 3.

A randomized strategy $\mathcal{M}$ with domain $\mathbb{R}$ is said to be $\epsilon$-differentially private in the continuous model if, for all actions $x_n \in \{-1,+1\}$ and for all adjacent signals $s_n, s_n'$ satisfying $\|s_n - s_n'\|_1 \le 1$, we have:¹

$$\mathbb{P}\left(\mathcal{M}(s_n; h_{n-1}) = x_n\right) \le \exp(\epsilon)\,\mathbb{P}\left(\mathcal{M}(s_n'; h_{n-1}) = x_n\right). \qquad (1)$$

¹ In the full version (arXiv:2502.19525), we consider a relaxed adjacency condition $\|s_n - s_n'\|_1 \le a$, which amounts to a simple rescaling and does not affect the main results.

4 Main Results

4.1 The Binary Model

A critical parameter in the binary model is the cascade threshold $k$, defined as the minimal absolute difference between the number of $+1$ and $-1$ actions in the observed history $h_{n-1}$ at which agent $n$ chooses an action independently of their private signal. Once this difference reaches $k$, the agent follows the majority regardless of what private signal they receive, initiating an information cascade.

Theorem 4.

Before an information cascade occurs, agents use the randomized response strategy with flip probability $u_n = \frac{1}{1+e^{\epsilon}}$. After the cascade begins, agents report their actions truthfully, i.e., $u_n = 0$.
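The resulting dynamics are easy to simulate. The sketch below is our own hypothetical illustration of Theorem 4, not the paper's analysis: pre-cascade agents follow their own signal and flip the resulting action with probability $1/(1+e^{\epsilon})$, and the cascade threshold $k$ is passed in as a parameter (in the model it is determined by $\epsilon$ and $p$, per Theorem 5 below).

```python
import numpy as np

def simulate_binary(theta: int, p: float, eps: float, k: int,
                    n_agents: int, rng: np.random.Generator) -> list[int]:
    """Sketch of the binary model under Theorem 4 (hypothetical helper).

    Pre-cascade agents follow their own signal and flip the resulting action
    with probability 1/(1 + e^eps); once the absolute difference between
    observed +1 and -1 actions reaches the cascade threshold k, agents
    follow the majority and report truthfully (u_n = 0).
    """
    u = 1.0 / (1.0 + np.exp(eps))
    diff, history = 0, []
    for _ in range(n_agents):
        s = theta if rng.random() < p else -theta     # private signal, accuracy p
        if abs(diff) >= k:                            # cascade: follow the majority
            x = 1 if diff > 0 else -1
        else:                                         # follow the signal, then maybe flip
            x = -s if rng.random() < u else s
        history.append(x)
        diff += x
    return history

rng = np.random.default_rng(0)
print(simulate_binary(theta=+1, p=0.7, eps=1.0, k=2, n_agents=30, rng=rng))
```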

Theorem 5.

Let $p$ be the signal accuracy and $\alpha = (1-p)/p$. Define $\epsilon_k = \log\left((1-w_k)/w_k\right)$, where $w_k = \frac{1-\alpha^{(k-2)/(k-1)}}{1-\alpha^{(k-2)/(k-1)}+\alpha^{-1/(k-1)}-\alpha}$. Then for any $\epsilon \in (\epsilon_{k+1}, \epsilon_k]$, the cascade threshold remains $k$, and the probability of a correct cascade increases with $\epsilon$.

The learning performance in the binary model is quantified by the probability of a correct cascade, which is lower bounded by $p$ for all $\epsilon > 0$ and is upper bounded by its non-private baseline ($\epsilon \to \infty$) of $p^2/(1-2p+2p^2)$. However, this probability does not increase monotonically from $p$ to $p^2/(1-2p+2p^2)$ as the privacy budget increases over the range $0 < \epsilon < \infty$. Within each of the intervals $(\epsilon_{k+1}, \epsilon_k]$ identified in Theorem 5, the probability of a correct cascade increases monotonically with the privacy budget, consistent with a traditional view of privacy-utility trade-offs. However, reducing the privacy budget to just below $\epsilon_{k+1}$ increases the probability of a correct cascade: as the budget crosses the lower limit of the interval, the cascade threshold increases by one, from $k$ to $k+1$, causing the correct-cascade probability to increase as well.
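As a worked numeric illustration (our own, using the expressions from Theorem 5 and the bounds above, with $p = 0.7$):

```python
import numpy as np

def cascade_prob_bounds(p: float) -> tuple[float, float]:
    """Lower bound p (any eps > 0) and the non-private baseline
    p^2/(1 - 2p + 2p^2) = p^2/(p^2 + (1-p)^2), reached as eps -> infinity."""
    return p, p**2 / (1 - 2*p + 2*p**2)

def eps_k(p: float, k: int) -> float:
    """Interval endpoint eps_k from Theorem 5 (k >= 3; the k = 2 endpoint
    is infinite in this parameterization)."""
    a = (1 - p) / p
    w = (1 - a**((k - 2)/(k - 1))) / (1 - a**((k - 2)/(k - 1)) + a**(-1/(k - 1)) - a)
    return float(np.log((1 - w) / w))

p = 0.7
print(cascade_prob_bounds(p))                      # (0.7, 0.8448...)
print([round(eps_k(p, k), 3) for k in (3, 4, 5, 6)])
# -> decreasing endpoints; within each (eps_{k+1}, eps_k] the cascade
#    threshold is k, and eps_k -> 0 as k grows.
```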

4.2 The Continuous Model

With Gaussian signals, asymptotic learning is always guaranteed in the non-private case. Without loss of generality, assume $\theta = +1$; then the public belief of agent $n$, $\mathbb{P}(\theta = +1 \mid x_1, \ldots, x_{n-1})$, that is, their belief in the true state given the observed actions of the $n-1$ preceding agents but excluding their private signal $s_n$, converges to one as $n \to \infty$. Analysis of the learning dynamics in the continuous case is based on the log-likelihood ratio of the public belief after observing the reports $x_1, \ldots, x_{n-1}$ of the first $n-1$ agents, defined as follows:

$$l_n = \log\frac{\mathbb{P}(\theta = +1 \mid x_1, \ldots, x_{n-1})}{\mathbb{P}(\theta = -1 \mid x_1, \ldots, x_{n-1})}. \qquad (2)$$

Given $\theta = +1$, $l_n$ eventually ceases to change sign and remains positive from a certain time onward, with probability one. We are then interested in the convergence rate of the public log-likelihood ratio $l_n$, determined by a function $f(n)$ that exhibits the same long-term behavior as $l_n$, i.e., $\lim_{n \to \infty} l_n / f(n) = 1$ with probability 1. This function $f(n)$ then serves as a measure of the speed of asymptotic learning.

Gaussian signals preclude information cascades because agents can always receive private signals strong enough to overturn the public belief. For agent $n$, we denote by $t(l_n)$ the threshold value of the private signal at which the agent's optimal action changes from $-1$ to $+1$, noting that this threshold is determined by the log-likelihood ratio of the public belief, $l_n$. The absence of information cascades in the continuous case implies that, under the standard randomized response strategy without a carefully tailored privacy mechanism, all actions are flipped with a constant probability and asymptotic learning fails: even though agents eventually know the true state and the public log-likelihood ratio converges to $+\infty$ for $\theta = +1$, the reported actions continue to be flipped at a constant rate. Here, we propose the smooth randomized response strategy, which modulates the flip probability based on the signal's distance from the decision threshold, while preserving differential privacy.

Definition 6.

A mechanism $\mathcal{M}(s_n; h_{n-1})$ follows a smooth randomized response strategy with flip probability $u_n$ if, given an initial decision $a_n$, the reported action $x_n$ is determined as follows:

$$\mathbb{P}(x_n = +1 \mid a_n = -1) = \mathbb{P}(x_n = -1 \mid a_n = +1) = u_n(s_n),$$
$$\mathbb{P}(x_n = +1 \mid a_n = +1) = \mathbb{P}(x_n = -1 \mid a_n = -1) = 1 - u_n(s_n),$$

with the flip probability $u_n$ defined as:

$$u_n(s_n) = \begin{cases} \frac{1}{1+e^{\epsilon}}\, e^{-\epsilon(|s_n - t(l_n)| - 1)}, & \text{if } s_n \in (-\infty,\, t(l_n)-1) \cup (t(l_n)+1,\, \infty), \\[4pt] \frac{1}{1+e^{\epsilon}}, & \text{if } s_n \in [t(l_n)-1,\, t(l_n)+1]. \end{cases}$$

Here, $t(l_n)$ represents the threshold value for different actions as a function of the public belief $l_n$: if $s_n > t(l_n)$, the agent prefers action $+1$ before flipping, and if $s_n < t(l_n)$, the agent prefers action $-1$ before flipping.
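The following sketch (our illustration; helper names are hypothetical) implements $u_n(s_n)$ and numerically spot-checks the $\epsilon$-DP ratio of Definition 3 over adjacent signals $|s - s'| \le 1$ at a fixed threshold. The worst-case ratio is attained by pairs straddling the threshold and equals $e^{\epsilon}$.

```python
import numpy as np

def flip_prob(s: float, t: float, eps: float) -> float:
    """Flip probability u_n(s) of Definition 6: constant 1/(1+e^eps) within
    distance 1 of the threshold t, decaying exponentially at rate eps outside."""
    base = 1.0 / (1.0 + np.exp(eps))
    d = abs(s - t)
    return base if d <= 1 else base * np.exp(-eps * (d - 1))

def report_prob_plus(s: float, t: float, eps: float) -> float:
    """Probability of reporting x = +1 given signal s and decision threshold t."""
    u = flip_prob(s, t, eps)
    return 1 - u if s > t else u

# Spot-check the eps-DP ratio over adjacent signal pairs |s - s'| <= 1 at a
# fixed threshold (a numeric sanity check on a grid, not a proof of Theorem 7).
eps, t = 1.0, 0.0
worst = 0.0
for s in np.linspace(-5, 5, 2001):
    for s2 in (s + 1.0, s - 1.0):
        p1, p2 = report_prob_plus(s, t, eps), report_prob_plus(s2, t, eps)
        worst = max(worst, p1 / p2, p2 / p1, (1 - p1) / (1 - p2), (1 - p2) / (1 - p1))
print(f"worst-case ratio {worst:.4f} <= e^eps = {np.exp(eps):.4f}")
```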

Hann-Caruthers, Martynov, and Tamuz characterize the asymptotic speed of learning precisely [11], showing that for Gaussian signals the log-likelihood ratio evolves at a rate of $\Theta(\sqrt{\log n})$, which is very slow. This result implies that, while asymptotic learning is guaranteed, the expected time for learning to begin is infinite. The smooth randomized response mechanism above achieves an accelerated convergence rate of $\Theta_\epsilon(\log n)$, a significant improvement over the $\Theta(\sqrt{\log n})$ rate of the non-private case [11].

Theorem 7.

The smooth randomized response strategy with parameter $\epsilon$ satisfies $\epsilon$-differential privacy under Gaussian signals. Under the smooth randomized response strategy and for any $\epsilon \in (0, \infty)$, asymptotic learning occurs in the continuous model with Gaussian signals.

Theorem 8.

Let $C(\epsilon) = \frac{\epsilon e^{\epsilon} \sigma^2}{2(1+e^{\epsilon})}\left(e^{\epsilon + \frac{\epsilon^2\sigma^2}{2}} - e^{-\epsilon + \frac{\epsilon^2\sigma^2}{2}}\right)$. For $\epsilon \in (0, \infty)$, the convergence rate of the public log-likelihood ratio under this strategy is given by $f(n) = \frac{2}{\epsilon\sigma^2}\ln\left(C(\epsilon)\, n\right) \sim \frac{2}{\epsilon\sigma^2}\ln(n)$.
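To see the mechanism's effect on the learning dynamics, here is a small Monte Carlo sketch (our own, under stated assumptions: $\theta = +1$, signals $\mathcal{N}(\theta, \sigma^2)$, and the Bayesian decision threshold $t(l) = -l\sigma^2/2$ implied by the Gaussian signal log-likelihood ratio $2s/\sigma^2$). It updates the public log-likelihood ratio $l_n$ exactly, using numerically integrated action likelihoods, and compares its growth to the $\frac{2}{\epsilon\sigma^2}\ln n$ rate of Theorem 8.

```python
import numpy as np

rng = np.random.default_rng(1)
eps, sigma, n_agents = 1.0, 1.0, 5000
s_grid = np.linspace(-12.0, 12.0, 4001)   # covers the signal densities
ds = s_grid[1] - s_grid[0]
base = 1.0 / (1.0 + np.exp(eps))

def gauss(theta):
    """Density of N(theta, sigma^2) on the signal grid."""
    return np.exp(-(s_grid - theta)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

def flip_prob(s, t):
    """Smooth randomized response flip probability u_n(s) from Definition 6."""
    d = np.abs(s - t)
    return np.where(d <= 1, base, base * np.exp(-eps * (d - 1)))

def p_plus(t, theta):
    """P(x_n = +1 | theta, threshold t): report probability against the density."""
    u = flip_prob(s_grid, t)
    report_plus = np.where(s_grid > t, 1 - u, u)
    return np.sum(report_plus * gauss(theta)) * ds

l = 0.0
for n in range(1, n_agents + 1):
    t = -l * sigma**2 / 2                     # assumed Bayesian threshold form
    s = rng.normal(1.0, sigma)                # private signal under theta = +1
    a = 1 if s > t else -1                    # intended action
    x = -a if rng.random() < float(flip_prob(s, t)) else a   # reported action
    q1, q0 = p_plus(t, +1.0), p_plus(t, -1.0)  # action likelihoods under +/- states
    l += np.log(q1 / q0) if x == 1 else np.log((1 - q1) / (1 - q0))

print(f"l_n = {l:.2f} after n = {n_agents}; "
      f"(2/(eps*sigma^2)) ln n = {2/(eps*sigma**2)*np.log(n_agents):.2f}")
```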

In the non-private setting, the expected stopping time for achieving the first correct decision and the expected number of incorrect actions both diverge to infinity, meaning that early agents may continue making incorrect decisions for an unreasonably long period.

In the full version (arXiv:2502.19525), we show that when $\epsilon < 2/\sigma^2$, both of these measures remain finite and are minimized at some $0 < \epsilon < 2/\sigma^2$, suggesting a sharp and surprising improvement over sequential learning outcomes in the non-private regime. Learning with continuous signals in the private regime is more efficient because the varying probability of flipping actions between the true and false states under the smooth randomized response amplifies the log-likelihood ratio over time and improves the aggregation of private information between the agents.

Our findings challenge conventional views on privacy-utility trade-offs, suggesting that well-designed privacy mechanisms can simultaneously achieve strong privacy and fast learning, offering new insights into the design of privacy-aware decision systems.

References

  • [1] Daron Acemoglu, Munther A. Dahleh, Ilan Lobel, and Asuman Ozdaglar. Bayesian learning in social networks. The Review of Economic Studies, 78(4):1201–1236, 2011.
  • [2] Daron Acemoglu, Ali Makhdoumi, Azarakhsh Malekian, and Asuman Ozdaglar. Learning from reviews: The selection effect and the speed of learning. Econometrica, 90(6):2857–2899, 2022.
  • [3] S. Nageeb Ali. Herding with costly information. Journal of Economic Theory, 175:713–729, 2018. doi:10.1016/J.JET.2018.02.009.
  • [4] Itai Arieli, Yakov Babichenko, Stephan Müller, Farzad Pourbabaee, and Omer Tamuz. The hazards and benefits of condescension in social learning. arXiv preprint arXiv:2301.11237, 2023. doi:10.48550/arXiv.2301.11237.
  • [5] Abhijit V. Banerjee. A Simple Model of Herd Behavior. The Quarterly Journal of Economics, 107(3):797–817, 1992.
  • [6] Sushil Bikhchandani, David Hirshleifer, Omer Tamuz, and Ivo Welch. Information cascades and social learning. Working Paper 28887, National Bureau of Economic Research, June 2021.
  • [7] Sushil Bikhchandani, David Hirshleifer, and Ivo Welch. A theory of fads, fashion, custom, and cultural change as informational cascades. Journal of Political Economy, 100(5):992–1026, 1992.
  • [8] Krishna Dasaratha and Kevin He. Aggregative efficiency of bayesian learning in networks. arXiv preprint arXiv:1911.10116, 2019.
  • [9] Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3-4):211–407, 2014. doi:10.1561/0400000042.
  • [10] Mira Frick, Ryota Iijima, and Yuhta Ishii. Belief convergence under misspecified learning: A martingale approach. The Review of Economic Studies, 90(2):781–814, 2023.
  • [11] Wade Hann-Caruthers, Vadim V. Martynov, and Omer Tamuz. The speed of sequential asymptotic learning. Journal of Economic Theory, 173:383–409, 2018. doi:10.1016/J.JET.2017.11.009.
  • [12] Matan Harel, Elchanan Mossel, Philipp Strack, and Omer Tamuz. Rational groupthink. The Quarterly Journal of Economics, 136(1):621–668, 2021.
  • [13] Wanying Huang. Learning about informativeness. arXiv preprint arXiv:2406.05299, 2024.
  • [14] Yuval Peres, Miklós Z Rácz, Allan Sly, and Izabella Stuhl. How fragile are information cascades? The Annals of Applied Probability, 30(6):2796–2814, 2020.
  • [15] Dinah Rosenberg and Nicolas Vieille. On the efficiency of social learning. Econometrica, 87(6):2141–2168, 2019.
  • [16] Lones Smith and Peter Norman Sorensen. Rational social learning by random sampling. Available at SSRN 1138095, 2013.
  • [17] Lones Smith and Peter Sørensen. Pathological outcomes of observational learning. Econometrica, 68(2):371–398, 2000.