The exponential increase in the amount of available data makes taking advantage of them without violating users' privacy one of the fundamental problems of computer science. This question has been investigated thoroughly under the framework of differential privacy. However, most of the literature has not focused on settings where the amount of data is so large that we are not even able to compute the exact answer in the non-private setting (such as in the streaming setting, sublinear-time setting, etc.). This can often make the use of differential privacy unfeasible in practice.

In this paper, we show a general approach for making Monte-Carlo randomized approximation algorithms differentially private. We only need to assume the error R of the approximation algorithm is sufficiently concentrated around 0 (e.g. 𝔼[|R|] is bounded) and that the function being approximated has a small global sensitivity Δ. Specifically, if we have a randomized approximation algorithm with sufficiently concentrated error which has time/space/query complexity T(n,ρ) with ρ being an accuracy parameter, we can generally speaking get an algorithm with the same accuracy and complexity T(n,Θ(ε ρ)) that is ε-differentially private.

Our technical results are as follows. First, we show that if the error is subexponential, then the Laplace mechanism with error magnitude proportional to the sum of the global sensitivity Δ and the subexponential diameter of the error of the algorithm makes the algorithm differentially private. This is true even if the worst-case global sensitivity of the algorithm is large or infinite. We then introduce a new additive noise mechanism, which we call the zero-symmetric Pareto mechanism. We show that using this mechanism, we can make an algorithm differentially private even if we only assume a bound on the first absolute moment of the error 𝔼[|R|].

Finally, we use our results to give either the first known or improved sublinear-complexity differentially private algorithms for various problems. This includes results for frequency moments, estimating the average degree of a graph in subliinear time, rank queries, or estimating the size of the maximum matching. Our results raise many new questions and we state multiple open problems.