On White-Box Learning and Public-Key Encryption

Liu, Yanyi; Mazor, Noam; Pass, Rafael

doi:10.4230/LIPIcs.ITCS.2025.73

On White-Box Learning and Public-Key Encryption

Yanyi Liu Cornell Tech, New York, NY, USA Noam Mazor Tel Aviv University, Israel Rafael Pass Cornell Tech, New York, NY, USA
Technion, Haifa, Israel
Tel Aviv University, Israel

Abstract

We consider a generalization of the Learning With Error problem, referred to as the white-box learning problem: You are given the code of a sampler that with high probability produces samples of the form $y,f(y)+\epsilon$ where $\epsilon$ is small, and $f$ is computable in polynomial-size, and the computational task consist of outputting a polynomial-size circuit $C$ that with probability, say, $1/3$ over a new sample $y^{\prime}$ according to the same distributions, approximates $f(y^{\prime})$ (i.e., $|C(y^{\prime})-f(y^{\prime})|$ is small). This problem can be thought of as a generalizing of the Learning with Error Problem (LWE) from linear functions $f$ to polynomial-size computable functions.

We demonstrate that worst-case hardness of the white-box learning problem, conditioned on the instances satisfying a notion of computational shallowness (a concept from the study of Kolmogorov complexity) not only suffices to get public-key encryption, but is also necessary; as such, this yields the first problem whose worst-case hardness characterizes the existence of public-key encryption. Additionally, our results highlights to what extent LWE “overshoots” the task of public-key encryption.

We complement these results by noting that worst-case hardness of the same problem, but restricting the learner to only get black-box access to the sampler, characterizes one-way functions.

Keywords and phrases:

Public-Key Encryption, White-Box Learning

Funding:

Yanyi Liu: Research partly supported by NSF CNS-2149305.

Noam Mazor: Research partly supported by NSF CNS-2149305, AFOSR Award FA9550-23-1-0312 and AFOSR Award FA9550-23-1-0387 and ISF Award 2338/23.

Rafael Pass: Supported in part by NSF Award CNS 2149305, AFOSR Award FA9550-23-1-0387, AFOSR Award FA9550-23-1-0312 and AFOSR Award FA9550-24-1-0267 and ISF Award 2338/23. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Government, or the AFOSR.

Copyright and License:

2012 ACM Subject Classification:

Theory of computation

\rightarrow

Computational complexity and cryptography

Editors:

Raghu Meka

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

Public-Key Encryption (PKE) [12, 31] is one of the most central cryptographic primitives enabling secure communication on the Internet: it is the primitive that enables entities that have not physically met to engage in confidential conversations and collaborations.

In contrast to private-key primitives, such as symmetric-key encryption and digital signatures, that can be securely build from the minimal primitive of one-way functions (and for which many candidate problems are known), we only know of a handful of candidate hard problems from which public-key encryption can be constructed. More specifically, these include (a) number-theory problems based on either factoring [31, 29] or discrete logarithms [12, 13], (b) coding-theory based problems [27], (c) lattice problems as finding shortest/longest vectors in lattices [1, 30, 8], and (d) noisy linear-algebra based problems [2, 5]. Out of these, the number-theory based problems can be efficiently solved by quantum algorithms [32], and the coding-theory, lattice and noisy linear algebra problems are all very related – in essence, they can all be viewed as different instances of solving noisy systems of linear equations (on either particular natural distributions, or even in the worst-case, when restricting attention to systems of equations satisfying the condition that appropriate solutions exist.¹¹1In more detail, we require worst-case hardness to hold when conditioning on instances that define a lattice where the shortest vector is long compared to amount of noise.

The main purpose of this paper is to provide an assumption that generalizes all these assumptions (i.e., is implied by all of them), yet suffices for obtaining secure PKE. Indeed, the main result of this paper is that hardness of a notion of white-box learning achieves this goal.

White-box Learning

Perhaps the most common noisy linear algebra-based assumption is the hardness of the Learning With Error (LWE) problem [30], which, in essence, stipulates the hardness of recovering a vector ${\bf x}$ given ${\bf A},{\bf Ax+e}$ , where ${\bf A}$ is a matrix, ${\bf e}$ is some “small” noise vector and all arithmetic is modulo some prime $p$ . In more detail, we typically require an stronger condition: not only that is hard to recover $x$ , but also that it is hard to compute a value “close” to ${\bf a}^{T}{\bf x}$ for a random vector ${\bf a}$ . In other words, we can think of ${\bf x}$ as the description of a function $f_{\bf x}({\bf a})={\bf a}^{T}{\bf x}$ that we are trying to improperly approximately learn given noisy samples – thus the name “learning with error”.

In fact, there is also a different way to think about the LWE problem that will be useful for our purposes (which follows from the construction of Regev’s PKE [30]): We are given the code of a sampler $P_{x}$ that enables providing samples of the form $({\bf a},f_{\bf x}({\bf a})+{\bf e})$ , and the goal is to approximate $f_{\bf x}({\bf a}^{\prime})$ on a new fresh sample ${\bf a}^{\prime}$ according to the same distribution as ${\bf a}$ .²²2The reason for this is that given ${\bf A},{\bf Ax+e}$ , we can generate noisy samples of $f_{\bf x}({\bf a}^{\prime})$ by taking linear combinations. See [30] and the full version of this paper for more details. We refer to this alternative way of thinking of LWE as an instance of (improper) white-box learning, where, more generally, we are given the code of a sampler $P$ that generates noisy samples of the form $(y,f(y)+\epsilon)$ and the goal is to approximate $f(y^{\prime})$ for a fresh sample $y^{\prime}$ according to the same distribution, using a polynomial-size circuit. In essence, this problem is generalizing the LWE problem from (improperly) learning linear functions from noisy samples, to (improperly) learning a polynomial-size circuit from noisy samples, and given white-box access to the sampler³³3As we will discuss in more detail later, it is also more general than the LWE problem in the aspect the LWE sampler has a particular form, but we here may consider more general classes of samplers.; the white-box access feature can be viewed as a generalization of Valiant classic PAC-learning model [35] to a setting where the learner not only gets random samples, but also gets the code of the sampler.

In more detail, given a sampler circuit $P$ that samples “labeled instances” $(y,z)$ (where we think of $z\in{\mathbb{N}}$ as a, perhaps, noisy label for $y$ ), let ${\sf Comp}_{\epsilon}^{\Delta}(P)$ denote the set of circuits $f$ that with probability $1-\epsilon$ over $(y,z)\leftarrow P$ satisfy the property that $|z-f(y)|\leq\Delta$ (when interpreting both $z$ and $f(z)$ as elements in ${\mathbb{N}}$ ). For a function $\Delta\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ , let $\Delta$ - ${\sf WBLearn}$ denote the following learning problem:

$\blacksquare$

Input: Circuit $P\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}$ , with the promise that there exists a circuit $\widehat{C}\in{\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}}$ such that $\widehat{C}\in{\sf Comp}^{\Delta(1^{n})/n^{10}}_{n^{-10}}(P)$ .
$\blacksquare$

Output: Circuit $C\in{\sf Comp}^{\Delta(1^{n})}_{1/3}(P)$ .⁴⁴4We remark that for efficient algorithms, outputting a circuit that approximates a function $f$ is essentially equivalent to approximating the value of $f(y^{\prime})$ on a random sample $y^{\prime}$ . Indeed, the circuit implementation of an algorithm for the latter task, is a valid output for the first task.

In other words, we are given a sampler that with very high ( $1-n^{-10}$ ) probability outputs labelled samples where the label is very close ( $\Delta(1^{n})/n^{10}$ ) to being correct, and the goal is to, given the code of the sampler $P$ , simply find a circuit $C$ that with probability $2/3$ approximates the label within $\Delta(1^{n})$ (i.e., with a factor $n^{10}$ higher error). We use ${\sf ExactWBLearn}$ to denote $\Delta$ - ${\sf WBLearn}$ when $\Delta(1^{n})=0$ .

Hardness of Learning v.s. (Public-Key) Cryptography

Roughly speaking, the main result of this paper will be to show that under certain restrictions on samplers $P$ (which come from the study of time-bounded Kolmogorov complexity [23, 34, 11, 22, 17, 33] and in particular the notion of computational depth [4]), this generalization of the LWE problem not only suffices to realize a PKE, but will also be necessary. In more detail, worst-case hardness of $\Delta$ - ${\sf WBLearn}$ under these conditions will characterize the existence of public-key encryption. As such, our results yield insight on the extent with which LWE “overshoots” the task of public-key encryption.

We highlight that we are not the first ones to consider connections between learning-theory and cryptography; however, as far as we know, all earlier connections were between the hardness of learning and private-key cryptography (i.e., the notion of one-way functions), as opposed to public-key cryptography. Indeed, classic results from [21, 7] demonstrate the equivalence of the hardness of a notion of average-case PAC-learning of polynomial-size circuits (i.e., black-box learning) and one-way functions.⁵⁵5And the results of [20] can be thought of as characterizing one-way functions through a different type of learning. In contrast, our focus here is on PKE, and instead of considering black-box learning, we consider white-box learning. An additional difference is that we consider worst-case hardness, as opposed to average-case hardness, of the learning theory problem. As was recently shown in [18], this issue can be overcome (in the context of characterizing one-way functions) through the use of a (alternative) notion of computational depth (more details on the relationship with this work below). To put out result in context, we additionaly show that exactly the same problem that we demonstrate characterizes PKE, also characterizes one-way functions once modified to only provide the learner black-box access to the sampler.

1.1 Our Results

Towards explaining our results in more detail, let us first recall the notion of computational depth [4]. Let the $t$ -computational depth of $x$ , denoted $cd^{t}(x)$ , be defined as $cd^{t}(x)={\mathrm{K}}^{t}(x)-{\mathrm{K}}(x)$ where ${\mathrm{K}}(x)$ denotes the Kolmogorov complexity of $x$ and ${\mathrm{K}}^{t}$ denotes the $t$ -bounded Kolmogorov complexity of $x$ . That is, $cd^{t}(x)$ measures how much more $x$ could be compressed if the decompression algorithm may be computationally unbounded, as opposed to it running in time bounded by $t(|x|)$ .

We will focus on instances $P$ of the $\Delta\textit{-}{\sf WBLearn}$ problem having the property that $(P,\widehat{C})$ is “computationally shallow”, where $\widehat{C}$ is a circuit that agrees with $P$ with high probability: let $\Delta$ - ${\sf WBLearn}|_{{\sf CS}^{t}}$ denote $\Delta$ - ${\sf WBLearn}$ with the additional promise that $cd^{t}(P,\widehat{C})\leq 2\log n$ .⁶⁶6We note that there is nothing special about the constant 2; it can be anything that is strictly larger than 1.

Characterizing PKE through Hardness of White-Box Learning

Our main results is that the hardness of this problem characterizes the existence of public-key encryption:

Theorem 1.

Let $\epsilon>0$ be a constant and let $\Delta\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ be an efficiently computable function such that $\Delta(1^{n})\leq 2^{n^{(1-\epsilon)}}$ , and $t\colon{\mathbb{N}}\to{\mathbb{N}}$ be polynomial such that $t(n)\geq n^{1+\epsilon}$ . Then, following are equivalent:

$\blacksquare$

PKE exists.
$\blacksquare$

${\Delta\textit{-}{\sf WBLearn}|_{{\sf CS}^{t}}}\notin{\sf ioFBPP}$ .

Computational depth is typically thought of as measure of “unnaturality” of strings: strings with low computational depth are considered “natural” and those with high computational depth are considered “unnatural”.⁷⁷7The reason why low computational depth captures “natural string” is as follows: random strings are known to have low computational depth; furthermore, known results (c.f. slow growth laws [18]) show (at least under derandomization assumptions) that one needs to have a long running time to even find a string with high computational depth. So strings with high computational depth are rare and “hard to find”, which is why they can be thought of as “unnatural”. Given this interpretation, our characterization of public key encryption is thus in terms of the worst-case hardness of white-box learning for “natural” instances.

As far as we know, this thus yields the first problem whose worst-case hardness not only suffices for public-key encryption (such as e.g., [30]) but also is necessary.

On the Use of Computational Depth

We note that Antunes and Fortnow [3] elegantly used computational depth to connect worst-case hardness of a problem when restricting attention to elements with small computational depth and errorless average-case hardness on sampleable distributions; errorless hardness, however, is not sufficient for cryptographic applications. Nevertheless, inspired by the work of [3], worst-case hardness conditioned on instances with small computational depth was used in [26] (and independently using a variant of this notion in [18]) to characterize one-way functions; additionally, an (interactive) variant of such a notion was also implicitly used in [6] to characterize key exchange protocols. Our techniques are similar to those employed in [26, 6] but instead of applying them to study the hardness of a time-bounded Kolmogorov complexity problem (following [25]), we here instead apply them to study a learning theory problem (namely, white-box learning).

We note that learning theory problems conditioned on small computational depth were recently used in [18] to characterize one-way functions, but our techniques here are more similar to [26, 6]. In particular, [18] does not actually use the standard notion of computational depth but instead define a new alternative variant; in contrast, we here rely on just the standard notion.

Relating Exact and Approximate White-Box Learning

Note that in Theorem 1, the equivalence hold for any (sufficiently small) choice of $\Delta$ , as such we directly get as a corollary the equivalence of Exact and Approximate White-box Learning:

Corollary 2.

Let $\epsilon>0$ be a constant and let $\Delta\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ be any efficiently computable function such that $\Delta(1^{n})\leq 2^{n^{(1-\epsilon)}}$ , and let $t\colon{\mathbb{N}}\to{\mathbb{N}}$ be any polynomial such that $t(n)\geq n^{1+\epsilon}$ . Then the following are equivalent:

$\blacksquare$

${\sf ExactWBLearn}|_{{\sf CS}^{t}}\notin{\sf ioFBPP}$
$\blacksquare$

$\Delta$ - ${\sf WBLearn}|_{{\sf CS}^{t}}\notin{\sf ioFBPP}$

Bounded-Degree Learning and LWE

While in the $\Delta$ - ${\sf WBLearn}$ problem, we allow the function we are trying to learn to be any polynomial-size circuit, we may also consider a restricted version of the problem, denoted $\Delta$ - ${\sf WBLearn}^{d}_{q}$ , where we restrict attention to functions that can be computed by a degree $d$ polynomial, and we assume that arithmetic is now over ${\mathbb{Z}}_{q}$ .

We first remark that our main theorem can next be generalized (basically using padding) to show that it suffices to use learning of degree- $n^{\epsilon}$ polynomials to characterize PKE:

Theorem 3.

Let $\epsilon>0$ be a constant and let $\Delta,q\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ be efficiently computable functions with $\Delta(1^{n})\leq q(1^{n})/4$ and $q(1^{n})\leq 2^{n^{(1-\epsilon)}}$ , and $t\colon{\mathbb{N}}\to{\mathbb{N}}$ be polynomial such that $t(n)\geq n^{1+\epsilon}$ . The following are equivalent:

$\blacksquare$

PKE exists.
$\blacksquare$

${\Delta\textit{-}{\sf WBLearn}|_{{\sf CS}^{t}}}\notin{\sf ioFBPP}$ .
$\blacksquare$

${\Delta\textit{-}{\sf WBLearn}^{n}_{q}|_{{\sf CS}^{t}}}\notin{\sf ioFBPP}$ .
$\blacksquare$

${\Delta\textit{-}{\sf WBLearn}^{n^{\epsilon}}_{q}|_{{\sf CS}^{t}}}\notin{\sf ioFBPP}$ .

Additionally, as informally discussed above, we note that the hardness of the LWE problem, implies hardness of the $\Delta$ - ${\sf WBLearn}^{1}_{q}$ problem (i.e., white-box learning of linear functions.)

Lemma 4 (Informal).

Assuming the hardness of LWE, there exists a polynomial $t$ such that ${\Delta\textit{-}{\sf WBLearn}^{1}_{q}|_{{\sf CS}^{t}}}\not\in{\sf ioFBPP}$ where $\Delta=q/4$ .

Lemma 4 thus shows that white-box learning of linear functions is at least as weak an assumption as LWE. At first sight, one would expect that a converse result may also hold due to known worst-case to average-case reductions for the LWE problem [30, 28, 9] and thus that LWE is equivalent to white-box learning of linear functions. However, the problem with proving the converse direction is that the known worst-case to average-case reductions for LWE only work when the LWE instance defines a lattice where the shortest vector is long compared to amount of noise. Instances sampled from $P$ may not necessarily satisfy this promise, and thus is it not clear how to use an LWE oracle to generally solving white-box learning of linear functions. Thus, it would seem that hardness even of just ${\Delta\textit{-}{\sf WBLearn}^{1}_{q}|_{{\sf CS}^{t}}}$ is seemingly a weaker (and therefore more general) assumption than LWE. (Of course, it may be that a stronger worst-case to average-case reduction can be established for the LWE problem, in which case equivalence would hold.)

We finally investigate what happens in the regime of “intermediate-degree” polynomials. We remark that using standard linearization techniques, the constant-degree problem is equivalent to the case of degree 1:

Lemma 5.

For every constant $d\in{\mathbb{N}}$ , and functions $t,q,\Delta\colon{\mathbb{N}}\to{\mathbb{N}}$ ,
there exist $t^{\prime},q^{\prime},\Delta^{\prime}\colon{\mathbb{N}}\to{\mathbb{N}}$ such that

\Delta\textit{-}{\sf WBLearn}^{d}_{q}|_{{\sf CS}^{t}}\leq_{p}\Delta^{\prime}% \textit{-}{\sf WBLearn}^{1}_{q^{\prime}}|_{{\sf CS}^{t^{\prime}}},

and

\Delta\textit{-}{\sf WBLearn}^{d}_{q}\leq_{p}\Delta^{\prime}\textit{-}{\sf WBLearn% }^{1}_{q^{\prime}}.

Black-box Learning

Finally, to put our results in context, we consider the standard PAC learning model [35] where the learner only get access to samples: Let $\Delta$ - ${\sf BBLearn}$ denote identically the same problem as ${\sf WBLearn}$ with the exception that the learner gets oracle access (i.e., black-box access) to the sampler (as opposed to white-box access to the sampler). This notion is equivalent to the notion of improper $\Delta$ -approximate PAC learning for polynomial-size circuits (and when $\Delta=0$ to simply improper PAC learning).

As before, let $\Delta$ - ${\sf BBLearn}|_{{\sf CS}^{t}}$ denote the problem $\Delta$ - ${\sf BBLearn}$ with the additional promise that $cd^{t}(P,\widehat{C})\leq 2\log n$ , and let ${\sf ExactBBLearn}$ and ${\sf ExactBBLearn}|_{{\sf CS}^{t}}$ to denote $\Delta$ - ${\sf BBLearn}$ and $\Delta$ - ${\sf BBLearn}|_{{\sf CS}^{t}}$ when $\Delta(1^{n})=0$ .

The following theorem can be viewed as the worst-case analog of the classic result of [21, 7] characterizing one-way functions through the hardness of average-case PAC learning.

Theorem 6.

Let $\epsilon>0$ be a constant and let $\Delta\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ be any efficiently computable function such that $\Delta(1^{n})\leq 2^{n^{(1-\epsilon)}}$ , and let $t\colon{\mathbb{N}}\to{\mathbb{N}}$ be any polynomial such that $t(n)\geq n^{1+\epsilon}$ . Then the following are equivalent:

$\blacksquare$

One-way function exists
$\blacksquare$

$\Delta$ - ${\sf BBLearn}|_{{\sf CS}^{t}}\notin{\sf ioFBPP}$

As mentioned, [18] also recently obtained a worst-case characterization of one-way functions through a learning problem, and using a notion of computational depth. The problems, however, are somewhat different. As opposed to [18], we here consider the standard PAC learning problem (whereas they consider a more general learning problem), and condition on the standard notion of low computational depth (whereas they condition on a new notion of low computational depth that they introduce).⁸⁸8Of course, on a conceptual level, these results are similar; the key point we are trying to make here is that exactly the same learning problem characterizes either one-way functions or PKE, depending on whether the learner gets black-box or white-box access to the sampler.

We remark that our Theorem 6 differs from [21, 7] not only in the worst-case condition, but also generalizes those results in the sense that we handle the hardness of $\Delta$ -approximate learning for any $\Delta$ . As a consequence, we again get an equivalence of approximate and exact black-box learning:

Corollary 7.

Let $\epsilon>0$ be a constant and let $\Delta\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ be any efficiently computable function such that $\Delta(1^{n})\leq 2^{n^{(1-\epsilon)}}$ , and let $t\colon{\mathbb{N}}\to{\mathbb{N}}$ be any polynomial such that $t(n)\geq n^{1+\epsilon}$ . Then the following are equivalent:

$\blacksquare$

${\sf ExactBBLearn}|_{{\sf CS}^{t}}\notin{\sf ioFBPP}$
$\blacksquare$

$\Delta$ - ${\sf BBLearn}|_{{\sf CS}^{t}}\notin{\sf ioFBPP}$

Open Problems

We leave as an intriguing open problem the question of whether white-box learning of polynomial, or even logarithmic-degree polynomials, also can be collapsed down to the case of constant-degree functions (and thus to linear functions); if this were possible, it would show that PKE, in essence, inherently requires the structure of the LWE problem.

Additionally, as discussed above, even when just restricting attention to learning linear functions, our learning problem generalizes LWE. It would appear that despite this, cryptographic applications of LWE (e.g., to obtain fully homomorphic encryption [14, 10]) nevertheless may still be possible from this generalized version; we leave an exploration of this for future work.

Finally, it is an intriguing open problem to relate black-box and white-box learning. By our results, doing so is equivalent to relating public-key and secret-key encryption. We note that relating black-box and white-box learning is interesting even just for the case of linear functions (which by our results is equivalent to $O(1)$ -degree polynomials). Indeed, Regev’s construction of a PKE [30] can be thought of a reduction from black-box learning to white-box learning of linear functions for a specific distribution; it is possible that a similar reduction may be applicable more generally.

1.2 Proof Overview

We here provide a detailed proof overview for the proof of Theorem 1. For simplicity, we will show the equivalence between PKE and the worst-case hardness of the exact version, ${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ . We start with the construction of PKE based on the hardness of ${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ .

Weak PKE

First, we use the well-known fact that a PKE is simply a two-rounds key-agreement protocol. Moreover, by the Key-agreement Amplification Theorem of Holenstein [19] and an application of the Goldreich-Levin theorem [15], to obtain (full-fledged) PKE, it suffices to obtain a weak form of two-rounds key-agreemnt, which we simply refer to as Weak PKE defined as follows: There exist some $\epsilon=1/{\mathrm{poly}}$ such that agreement between A and B happens with probability $1-\epsilon$ . Security requires that Eve cannot guess the key (output by Alice) with probability better than $1-20\epsilon$ .

The Weak PKE protocol

We will next define the weak PKE protocol. We note that this construction resembles the universal key-agreement construction from [16], but with some crucial difference that enable our security proof. The parties A and B on input $n$ perform the following steps:

$\blacksquare$

Sample random program: A samples a random length $\lambda\in[2n]$ , and a random program $\Pi$ of length $\lambda$ .
$\blacksquare$

Run random program: Next, A runs the program $\Pi$ for at most $t(n)$ steps to get an output, and interprets the output as a pair of circuits $P\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}\to\mathopen{}\mathclose% {{\left\{0,1}}\right\}^{k}\times\mathopen{}\mathclose{{\left\{0,1}}\right\}^{\ell}$ and $C\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{k}\to\mathopen{}\mathclose% {{\left\{0,1}}\right\}^{\ell}$ . (Think of $P$ as the sampler for a white-box learning instance, and of $C$ as a potential solution to the problem.) If the output of $\Pi$ is not a valid encoding of such a pair, A sets $P=P_{0}$ and $C=C_{0}$ for two fixed circuits $P_{0},C_{0}$ that always output $0$ .
$\blacksquare$

Agreement estimation: A estimates the “agreement probability” of $P$ and $C$ (i.e., checking whether $C$ indeed is a good solution): it samples $n^{20}$ random inputs $w_{1},\dots,w_{n^{20}}$ for $P$ , and computes $(x_{i},s_{i})=P(w_{i})$ . It then lets $\widehat{\alpha}={\mathrm{Pr}}_{i\leftarrow[n^{20}]}\mathopen{}\mathclose{{% \left[C(x_{i})=s_{i}}}\right]$ . If $\widehat{\alpha}\leq 1-n^{-9}$ , A reset the pair $(P,C)=(P_{0},C_{0})$ .
$\blacksquare$

First message: A sends (the “sampler”) $P$ to B.
$\blacksquare$

Second message: B applies $P$ on a random input $w$ , and computes $(x,s)=P(w)$ . It then sends $x$ to A.
$\blacksquare$

Outputs: A outputs $C(x)$ and Bob outputs $s$ .

Agreement

We claim that with probability $1-1/n^{8}$ , Alice and Bob will agree (i.e., the final outputs are the same). Note that if $(P,C)=(P_{0},C_{0})$ , Alice and Bob always agree. Moreover, let $\alpha={\mathrm{Pr}}_{(x,s)\leftarrow P(U_{r})}\mathopen{}\mathclose{{\left[C(% x)=s}}\right]$ . Then, the probability of Alice and Bob to agree given that Alice uses $(P,C)$ as the circuits in the protocol, is exactly $\alpha$ . We observe that by the Chernoff bound, the probability that $\widehat{\alpha}$ is far from $\alpha$ is small, and thus Alice uses $(P,C)$ only when $\alpha$ is larger than $1-n^{-8}$ .

Security

We claim that ${\sf Eve}$ that can guess $s$ with probability $1-n^{-7}$ can be used to solve
${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ . In more detail, consider such ${\sf Eve}$ , and let $P$ be an input for
${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ . The idea is to construct an algorithm ${\sf Learner}$ , that on input $P$ outputs a circuit $C$ , such that $C(x)$ simply simulates ${\sf Eve}$ on the messages $P$ and $x$ . $C$ then outputs ${\sf Eve}$ ’s output. For simplicity, in the following we assume that ${\sf Eve}$ (and thus ${\sf Learner}$ ) is deterministic. (We note that this is a non-black-box reduction: We are using the code of Eve to generate $C$ – in particular, we are including the code of Eve into this circuit.⁹⁹9This particular non-black usage of Eve is not inherent. We could have considered a different formalization of the learning theory problem which simply requires the attacker to succeed on a randomly sampled instance. Subsequent parts of the argument, however, will use non-black-box access to Eve more inherently.)

Let ${\bf P}$ be a random variable distributed according to the same distribution of the first message in the above protocol. By assumption, we have that

{\mathrm{Pr}}\mathopen{}\mathclose{{\left[C^{\prime}={\sf Learner}({\bf P});(x% ,s)\leftarrow{\bf P}(U_{r});C^{\prime}(x)=s}}\right]\geq 1-n^{-7}.

It follows by a simple averaging argument that

{\mathrm{Pr}}_{{\bf P},C^{\prime}={\sf Learner}({\bf P})}\mathopen{}\mathclose% {{\left[{\mathrm{Pr}}_{w}\mathopen{}\mathclose{{\left[(x,s)\leftarrow{\bf P}(w% );C^{\prime}(x)=s}}\right]\geq 2/3}}\right]\geq 1-3n^{-7}.

Namely, ${\sf Learner}$ solves ${\sf ExactWBLearn}$ with probability at least $1-3n^{-7}$ over the distribution of ${\bf P}$ . We next use ideas from [26, 6] to show this implies that ${\sf Learner}$ solves ${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ in the worst case.

Indeed, assume that ${\sf Learner}$ fails on some instance $P$ of ${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ . By the promise of the problem, there exists a circuit $\widehat{C}$ such that $cd^{t}(P,\widehat{C})\leq 2\log n$ , and $\widehat{C}$ agrees with $P$ with probability at least $1-n^{-10}$ . Let $\ell={\mathrm{K}}^{t}(P,\widehat{C})$ . Our goal is to show that ${\mathrm{K}}(P,\widehat{C})<\ell-2\log n$ , which is a contradiction.

Toward this goal, let Let ${\cal S}_{\ell}$ be the set of all pairs of circuits $(P^{\prime},\widehat{C^{\prime}})$ with ${\mathrm{K}}^{t}(P^{\prime},\widehat{C^{\prime}})=\ell$ that agree with probability at least $1-n^{-10}$ , and on which ${\sf Learner}$ fails, so that $(P,\widehat{C})\in{\cal S}_{\ell}$ . Fix $(P^{\prime},\widehat{C^{\prime}})\in{\cal S}_{\ell}$ , and let $\Pi$ be the length $\ell$ program that outputs $(P^{\prime},\widehat{C^{\prime}})$ in time $t$ . Observe that the probability of A to sample $(P^{\prime},\widehat{C^{\prime}})$ in the first step of the above protocol is at least its probability to sample the program $\Pi$ , which is $1/2n\cdot 2^{-\ell}$ . Since $(P^{\prime},\widehat{C^{\prime}})$ agree with high probability, the equality test it the third step of the protocol will pass with high probability, and thus $A$ will send $P^{\prime}$ to B with probability at least $1/4n\cdot 2^{-\ell}$ . In other words,

{\mathrm{Pr}}\mathopen{}\mathclose{{\left[{\bf P}=P^{\prime}}}\right]\geq 1/4n% \cdot 2^{-\ell}

for every $(P^{\prime},\widehat{C^{\prime}})\in{\cal S}_{\ell}$ , and thus

{\mathrm{Pr}}\mathopen{}\mathclose{{\left[({\bf P},\cdot)\in{\cal S}_{\ell}}}% \right]\geq\mathopen{}\mathclose{{\left|{\cal S}_{\ell}}}\right|\cdot 1/4n% \cdot 2^{-\ell}

(here we say that $(P,\cdot)\in{\cal S}_{\ell}$ if there is some $C$ such that $(P,C)\in{\cal S}_{\ell}$ ).

On the other hand, by definition of ${\cal S}_{\ell}$ , ${\sf Learner}$ fails on every $(P^{\prime},\widehat{C^{\prime}})\in{\cal S}_{\ell}$ . Since ${\sf Learner}$ fails with probability at most $3n^{-7}$ on ${\bf P}$ , it must holds that

3n^{-7}\geq{\mathrm{Pr}}\mathopen{}\mathclose{{\left[({\bf P},\cdot)\in{\cal S% }_{\ell}}}\right].

Combining the above, we get that

\mathopen{}\mathclose{{\left|{\cal S}_{\ell}}}\right|\leq 12n^{-6}\cdot 2^{% \ell}.

We can now use the bound on ${\cal S}_{\ell}$ to bound the Kolmogorov complexity of $(P,\widehat{C})$ . To describe $(P,\widehat{C})$ , it is enough to describe the set ${\cal S}_{\ell}$ , and the index of the pair $(P,\widehat{C})$ in this set. That is,

{\mathrm{K}}(P,\widehat{C})\leq{\mathrm{K}}({\cal S}_{\ell})+\log\mathopen{}% \mathclose{{\left|{\cal S}_{\ell}}}\right|+O(\log\log n)\leq{\mathrm{K}}({\cal S% }_{\ell})+\ell-6\log n+O(\log\log n).

We conclude the proof with the observation that to describe ${\cal S}_{\ell}$ it is enough to describe $n$ (which can be done using $\log n+O(\log\log n)$ bits), $\ell$ ( $\log n$ bits) and ${\sf Eve}$ (that can be described with constant many bits¹⁰¹⁰10This non-black-box usage of Eve (which is taken from [26, 6]) is seemingly inherent to our proof technique. Note that we are here relying on the fact that Eve is a uniform algorithm, but as we discuss in the formal section, the argument can be extended to work also in the non-uniform setting.). Thus, ${\mathrm{K}}({\cal S}_{\ell})\leq 3\log n$ , and we get that

{\mathrm{K}}(P,\widehat{C})<\ell-2\log n,

as we wanted to show.

Hardness of ${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ from PKE

We next show that the existence of PKE implies the hardness of ${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ . We now sketch the proof. Let $\mathsf{Gen}$ be the algorithm the generate pair $(pk,sk)$ of public and secret key, and let $\mathsf{Enc},\mathsf{Dec}$ be the encryption and decryption protocols. For a random pair of keys $(pk,sk)\leftarrow\mathsf{Gen}(r)$ , we construct two circuits, $P(w,s)=(x=\mathsf{Enc}_{pk}(s;w),s)$ , and $\widehat{C}(x)=\mathsf{Dec}_{sk}(x)$ .

By the security of the PKE scheme, it follows that with high probability over the randomness of $\mathsf{Gen}$ , it is hard to learn the function of $C$ , as the circuit $P$ only uses the public key, and the function $C$ computes is the decryption of a random encryption. This already implies that ${\sf ExactWBLearn}$ is hard, but we still need to show that $P$ is inside the promise of the problem ${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ .

It follows by the correctness of the PKE scheme that $\widehat{C}$ computes the function that is sampled by $P$ . Thus, to be in the promise of ${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ , we only need that $(P,\widehat{C})$ has small computational depth. This, however, is not necessarily the case. To solve this, we simply pad $\widehat{C}$ with the randomness used by $\mathsf{Gen}$ . That is, we let $\widehat{C^{\prime}}$ be a functionally equivalent circuit to $\widehat{C}$ , with $r$ encoded to it, where $r$ is such that $\mathsf{Gen}(r)=(pk,sk)$ . It follows that when $t$ is large enough, ${\mathrm{K}}^{t}(P,\widehat{C^{\prime}})\leq|r|+O(1)$ (as we can describe them by simply describing the randomness and the algorithm $\mathsf{Gen}$ ). On the other hand, ${\mathrm{K}}(P,\widehat{C^{\prime}})\geq{\mathrm{K}}(r)$ (since $r$ can be obtained from $\widehat{C^{\prime}}$ ), which is at least $\mathopen{}\mathclose{{\left|r}}\right|-O(1)$ with high probability. Together, we conclude that with high probability over the randomness of $\mathsf{Gen}$ , the circuits $(P,\widehat{C^{\prime}})$ are in the promise of ${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ .

Comparison with [6]

We remark that at a high-level, our construction of a two-message key-agreement protocol shares many similaries with the key-agreement protocol developing in [6], relying on the hardness of an interactive notion of time-bounded Kolmogorov complexity, conditionned on an analog of computational shallowness. They central difference is that the protocol in [6] requires at least 3 rounds due to the use of an equility check protocol to determine whether Alice and Bob managed to agree on a key. In contrast, our protocol does not rely on such an equality check step (and thus it can be executed in only two rounds, which is crucial to get PKE); indeed, we replaced the equality protocol with a step where Alice on her own can determine whether the message she sends will enable agreement to happen. Of course, the reason why we can do this is that we are reducing security from a different problem.

Other features of the protocol are quite similar; this is because we also rely on the worst-case hardness of a problem “conditioning on computationally shallow instances”. Indeed, as described above, our security proof shares many features with those of [25, 6].

Comparison with Universal Constructions

Universal constructions of PKE are known (see [16]); that is, constructions having the property that they are secure if (any) PKE exist. We emphasize that while the details of the protocol from [16] are somewhat different, the “agreement estimation” step performed there is very similar to what we do. Furthermore, we can interpret our protocol as an alternative (variant) universal PKE protocol in which Alice chooses a random (key-generation) program $\mathsf{Gen}$ and executes it to get an encryption scheme $\mathsf{Enc}_{pk}$ and description scheme $\mathsf{Dec}_{sk}$ , with the keys $p k, s k$ hardcoded to the scheme.¹¹¹¹11In the protocol of [16], Alice would instead choose random programs for Alice and Bob to run; we do not know how to prove the security of such a protocol under our assumption. Alice then estimates the agreement probability of the encryption, and if it is high enough, she sends the encryption scheme as the public-key, and uses the decryption scheme as the private-key. If PKE exists, with a noticeable probability over the choice of $\mathsf{Gen}$ , this scheme will be secure. We emphasize that since we base the security of our protocol (and thus also the above universal one) on the hardness of the white-box learning problem, it enables an approach for measuring the concrete security of the protocol by relating it to the security of the learning theory problem.

2 Preliminaries

2.1 Notations

All logarithms are taken in base $2$ . We use calligraphic letters to denote sets and distributions, bold uppercase for random variables, and lowercase for values and functions. Let ${\mathrm{poly}}$ stand for the set of all polynomials. Let ppt stand for probabilistic poly-time, and n.u.-poly-time stand for non-uniform poly-time. An n.u.-poly-time algorithm $\mathsf{A}$ is equipped with a (fixed) poly-size advice string set $\mathopen{}\mathclose{{\left\{z_{n}}}\right\}_{n\in{\mathbb{N}}}$ (that we typically omit from the notation), and we let $\mathsf{A}_{n}$ stand for $\mathsf{A}$ equipped with the advice $z_{n}$ (used for inputs of length $n$ ). For a randomized algorithm $\mathsf{A}$ , we denote by $\mathsf{A}(\cdot;r)$ the algorithm $\mathsf{A}$ with fixed randomness $r\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}$ . Let $\operatorname{neg}$ stand for a negligible function. Given a vector $v\in\Sigma^{n}$ , let $v_{i}$ denote its $i^{\rm th}$ entry, let $v_{<i}=(v_{1},\dots,v_{i-1})$ and $v_{\leq i}=(v_{1},\dots,v_{i})$ . Similarly, for a set ${\cal I}\subseteq[n]$ , let $v_{\cal I}$ be the ordered sequence $(v_{i})_{i\in{\cal I}}$ . For $x,y\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}$ , we use $x||y$ to denote the concatenation of $x$ and $y$ . For a set ${\cal S}\subseteq\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}$ , we use ${\cal S}||y$ to denote the set $\mathopen{}\mathclose{{\left\{x||y\colon x\in{\cal S}}}\right\}$ .

2.2 Distributions and Random Variables

When unambiguous, we will naturally view a random variable as its marginal distribution. The support of a finite distribution ${\cal P}$ is defined by $\operatorname{Supp}({\cal P}):=\mathopen{}\mathclose{{\left\{x\colon{\mathrm{% Pr}}_{{\cal P}}\mathopen{}\mathclose{{\left[x}}\right]>0}}\right\}$ . For a (discrete) distribution ${\cal P}$ , let $x\leftarrow{\cal P}$ denote that $x$ was sampled according to ${\cal P}$ . Similarly, for a set ${\cal S}$ , let $x\leftarrow{\cal S}$ denote that $x$ is drawn uniformly from ${\cal S}$ . For $m\in{\mathbb{N}}$ , we use ${\bf U}_{m}$ to denote a uniform random variable over $\mathopen{}\mathclose{{\left\{0,1}}\right\}^{m}$ (that is independent from other random variables in consideration). The statistical distance (also known as, variation distance) of two distributions ${\cal P}$ and ${\cal Q}$ over a discrete domain ${\cal X}$ is defined by $\mathsf{\textsc{SD}}({{\cal P}},{{\cal Q}}):=\max_{{\cal S}\subseteq{\cal X}}% \mathopen{}\mathclose{{\left|{\cal P}({\cal S})-{\cal Q}({\cal S})}}\right|=% \frac{1}{2}\sum_{x\in{\cal S}}\mathopen{}\mathclose{{\left|{\cal P}(x)-{\cal Q% }(x)}}\right|$ .

The following lemma is proven in the full version of this paper.

Lemma 8.

There exists an efficient oracle-aided algorithm $\mathsf{A}$ such that the following holds. Let $({\bf X},{\bf Y})$ be a pair of jointly distributed random variables over $\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}\times{\mathopen{}\mathclose{{% \left\{0,1}}\right\}^{n}}$ , and let ${\bf R}$ be a uniform independent random variable over ${\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}}$ . Let $\mathsf{E}$ be an algorithm such that ${\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathsf{E}({\bf X},{\bf R})=\langle{% \bf Y},{\bf R}\rangle}}\right]\geq 1-\epsilon$ , for $0\leq\epsilon$ . Then ${\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathsf{A}^{\mathsf{E}}(1^{n},{\bf X% })={\bf Y}}}\right]\geq 1-8\epsilon-\operatorname{neg}(n)$ .

2.3 Circuits

In this paper we consider circuits over the De-Morgan Basis, which contains the following gates: $\land$ (“and” gate with fan-in $2$ ), $\lor$ (“or” gate with fan-in $2$ ), and $\lnot$ (“not” gate with fan-in one). The size of a circuit $C$ , is the number of gates in $C$ .

We consider encoding of a circuit as a string over $\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}$ in the following natural way: Given a circuit $C$ , we first encode the length of $C$ using a prefix-free encoding, and then for every gate $g$ , according to a topological order, we encode its type (input, output, $\land$ , $\lor$ , or $\lnot$ ), and the (up to two) gates wired into $g$ .

Observe that every circuit $C$ of size $s$ can be encoded to a string of length $O(s\log s)$ . Moreover, given the encoding and an input $x$ to $C$ , $C(x)$ can be computed efficiently (in polynomial time in the encoding length).

We identify a string in $\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}$ with the circuit it encodes. For example, for $C\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}$ we use $C(x)$ to denote the value of the circuit encoded by $C$ applied on the input $x$ . We denote by $\mathopen{}\mathclose{{\left|C}}\right|$ the length of the encoding of $C$ .

Finally, by encoding first the size of $C$ , we assume that for every encoding of a circuit $C$ , and for every $z\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}$ , the string $C||z$ encodes the same circuit as $C$ .¹²¹²12We actually only use the fact that we can encode $z$ into the circuit. This can be done by adding dummy gates that do not change the output of $C$ , where each gate $g_{i}$ is either $\land$ or $\lor$ according to the $i$ th bit of $z$ .

2.4 Entropy

For a random variable ${\bf X}$ , let $\operatorname{H}({\bf X})={\mathrm{E}}[\log\frac{1}{{\mathrm{Pr}}[{\bf X}=x]}]$ denote the (Shannon) entropy of ${\bf X}$ , and let $\operatorname{H_{\infty}}({\bf X})=\min_{x\in Supp({\bf X})}\log\frac{1}{{% \mathrm{Pr}}[{\bf X}=x]}$ denote the min-entropy of ${\bf X}$ .

\min_{x\in\operatorname{Supp}({\bf X})}\log\frac{1}{{\mathrm{Pr}}\mathopen{}% \mathclose{{\left[{\bf X}=x}}\right]}.

For a random variable ${\bf X}$ and an event $E$ , we use $\operatorname{H_{\infty}}({\bf X}\mid E)$ to denote the min-entropy of the distribution ${\bf X}|_{E}$ . We will use the following facts.

Fact 9.

Let ${\bf X}$ and ${\bf Y}$ be independent random variables. Then $\operatorname{H_{\infty}}({\bf X},{\bf Y})=\operatorname{H_{\infty}}({\bf X})+% \operatorname{H_{\infty}}({\bf Y})$ .

2.5 Complexity Classes

We define the complexity classes FBPP and ${\sf ioP}/{\mathrm{poly}}$ .

Definition 10 (Infinitely-often FBPP ( ${\sf ioFBPP}$ )).

A binary relation ${\cal R}\subseteq\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}\times% \mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}$ is in ${\sf ioFBPP}$ if there exists ppt algorithms $\mathsf{A}$ such that the following holds for infinitely many $n$ ’s:

For every $x\in{\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}}$ such that there exists $y\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}$ with $(x,y)\in{\cal R}$ ,

{\mathrm{Pr}}\mathopen{}\mathclose{{\left[(x,\mathsf{A}(x,1^{k}))\in{\cal R}}}% \right]\geq 1-1/k,

for every $k>0$ . ${\cal R}$ is in ${\sf ioP}/{\mathrm{poly}}$ if the above holds with respect to $n.u.-poly-time$ algorithms $\mathsf{A}$ .

2.6 One-Way Function

Definition 11 (One-way function).

A polynomial-time computable function $f:\{0,1\}^{*}\rightarrow\{0,1\}^{*}$ is called a one-way function if for every polynomial-time algorithm $\mathsf{A}$ ,

{\mathrm{Pr}}_{x\leftarrow\{0,1\}^{n}}\mathopen{}\mathclose{{\left[\mathsf{A}(% 1^{n},f(x))\in f^{-1}(f(x))}}\right]=\operatorname{neg}(n)

Definition 12 (Weak one-way function).

Let $m\in{\mathrm{poly}}$ be a polynomial-time computable function. A polynomial-time computable function $f:\{0,1\}^{m(n)}\rightarrow\{0,1\}^{*}$ is called $\alpha$ -weak one-way function if for every polynomial-time algorithm $\mathsf{A}$ , for every large enough $n$ ,

{\mathrm{Pr}}_{x\leftarrow\{0,1\}^{m(n)}}\mathopen{}\mathclose{{\left[\mathsf{% A}(1^{n},f(x))\in f^{-1}(f(x))}}\right]\leq 1-\alpha(n)

$f$ is a weak one-way function if it is $1/p$ -weak one way function, for some $p\in{\mathrm{poly}}$ .

Theorem 13 (Weak to strong OWFs, [36]).

One-way functions exist if and only if weak-one way functions exist.

2.7 Public-Key Encryption

Definition 14 (Public-key encryption scheme (PKE)).

A triplet of randomized, efficiently computable functions $(\mathsf{Gen},\mathsf{Enc},\mathsf{Dec})$ is a $(\alpha(n),\beta(n))$ -public-key encryption scheme (PKE) if the following holds:

$\blacksquare$

Correctness: For every large enough $n\in{\mathbb{N}}$ and any $m\in\mathopen{}\mathclose{{\left\{0,1}}\right\}$ ,

${\mathrm{Pr}}_{(sk,pk)\leftarrow\mathsf{Gen}(1^{n})}\mathopen{}\mathclose{{% \left[\mathsf{Dec}(sk,\mathsf{Enc}(pk,m))=m}}\right]\geq 1-\alpha(n)$
$\blacksquare$

Security: For every PPT ${\sf Eve}$ , for every large enough $n\in{\mathbb{N}}$ ,

${\mathrm{Pr}}_{(sk,pk)\leftarrow\mathsf{Gen}(1^{n}),m\leftarrow\mathopen{}% \mathclose{{\left\{0,1}}\right\}}\mathopen{}\mathclose{{\left[{\sf Eve}(pk,% \mathsf{Enc}(pk,m))=m}}\right]\leq 1/2+\beta(n).$

Such a scheme is a PKE if it is $(1/n^{c},1/n^{c})$ -PKE for every $c\in{\mathbb{N}}$ .

The following lemma shows that it is possible to amplify an $1$ -bit weak key-agreement protocol into a key-agreement. This lemma is a simple case of the more general result of [19].

Lemma 15 (PKE amplification, [19]).

The following holds for every constants $c_{1}>c_{2}$ . Assume there exists an $(n^{-c_{1}},1/2-n^{-c_{2}})$ -PKE. Then, there exists a PKE.

We also define weak-PKE.

Definition 16 (Weak Public-key encryption scheme (weak-PKE)).

For an efficiently computable function $d\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ , a triplet of randomized, efficiently computable functions $(\mathsf{Gen},\mathsf{Enc},\mathsf{Dec})$ is a $(\alpha(n),\beta(n),\gamma(n))$ -weak-public-key encryption scheme (weak-PKE) if the following holds:

$\blacksquare$

For every $n\in{\mathbb{N}}$ , given $p k$ and randomness $r$ , $\mathsf{Enc}(pk;r)$ outputs a message $m(pk;r)$ and an output $o(pk;r)\in{\mathbb{N}}$ .
$\blacksquare$

Correctness: For every large enough $n\in{\mathbb{N}}$

${\mathrm{Pr}}_{(sk,pk)\leftarrow\mathsf{Gen}(1^{n}),r}\mathopen{}\mathclose{{% \left[\mathopen{}\mathclose{{\left|\mathsf{Dec}(sk,m(pk,r))-o(pk,r)}}\right|% \leq d(1^{n})}}\right]\geq 1-\alpha(n)$
$\blacksquare$

Security: For every PPT ${\sf Eve}$ , for every large enough $n\in{\mathbb{N}}$ ,

${\mathrm{Pr}}_{(sk,pk)\leftarrow\mathsf{Gen}(1^{n}),r}\mathopen{}\mathclose{{% \left[\mathopen{}\mathclose{{\left|{\sf Eve}(pk,m(pk,r))-o(pk,r)}}\right|\leq% \gamma(n)\cdot d(1^{n})}}\right]\leq\beta(n).$

In the full version of this paper we prove the following lemma, stating that weak-PKE can be used to construct PKE.

Lemma 17 (Weak-PKE amplification).

The following holds for every constants $c_{1}>c_{2}$ . Assume there exists an $(n^{-c_{1}}/2,1-10n^{-c_{2}},2n^{c_{1}})$ -weak-PKE. Then, there exists a PKE.

2.8 Kolmogorov Complexity and Computational Depth

Roughly speaking, the $t$ -time-bounded Kolmogorov complexity, ${\mathrm{K}}^{t}(x)$ , of a string $x\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}$ is the length of the shortest program $\Pi=(M,y)$ such that, when simulated by an universal Turing machine, $\Pi$ outputs $x$ in $t(|x|)$ steps. Here, a program $\Pi$ is simply a pair of a Turing Machine $M$ and an input $y$ , where the output of $P$ is defined as the output of $M(y)$ . When there is no running time bound (i.e., the program can run in an arbitrary number of steps), we obtain the notion of Kolmogorov complexity.

In the following, fix universal TM $\mathsf{U}$ with polynomial simulation overhead, and let $\mathsf{U}(\Pi,1^{t})$ denote the output of $\Pi$ when emulated on $\mathsf{U}$ for $t$ steps. We now define the notion of Kolmogorov complexity with respect to the universal TM $\mathsf{U}$ .

Definition 18.

Let $t$ be a polynomial. For all $x\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}$ , define the $t$ -bounded Kolmogorov complexity of $x$

{\mathrm{K}}^{t}(x)=\min_{\Pi\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*% }}\{|\Pi|:\mathsf{U}(\Pi,1^{t(|x|)})=x\}

where $|\Pi|$ is referred to as the description length of $\Pi$ . When there is no time bound, the Kolmogorov complexity of $x$ is defined as

{\mathrm{K}}(x)=\min_{\Pi\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}}\{% |\Pi|:\exists t\in{\mathbb{N}}\mathsf{U}(\Pi,1^{t})=x\}.

The computational depth of $x$ [4], denoted by $cd^{t,\infty}(x)$ , is defined to be

cd^{t,\infty}(x)={\mathrm{K}}^{t}(x)-{\mathrm{K}}(x).

We use ${\sf K}(x,y)$ to denote the Kolmogorov complexity of some generic self-delimiting encoding of the pair $x, y$ . Recall that we use ${\sf K}(x||y)$ to denote the complexity of the concatenation of $x$ and $y$ . We will use the following well-known fact:

Fact 19.

For every $x,y\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{*}$ ,

{\sf K}(x,y)\leq{\sf K}(x)+{\sf K}(y)+\log({\sf K}(x))+2\log\log({\sf K}(x))+O% (1).

We will also use the following bound on the Kolmogorov complexity of strings sampled from distributions with high min-entropy.

Lemma 20.

For every $n\in{\mathbb{N}}$ , and every distribution ${\cal D}$ , it holds that

{\mathrm{Pr}}_{x\leftarrow{\cal D}}\mathopen{}\mathclose{{\left[{\sf K}(x)\geq% \operatorname{H_{\infty}}({\cal D})-\log n}}\right]\geq 1-1/n.

Proof.

There are at most $2^{\operatorname{H_{\infty}}({\cal D})-\log n}$ strings $x$ with ${\sf K}(x)<\operatorname{H_{\infty}}({\cal D})-\log n$ . Thus,

{\mathrm{Pr}}_{x\leftarrow{\cal D}}\mathopen{}\mathclose{{\left[{\sf K}(x)<% \operatorname{H_{\infty}}({\cal D})-\log n}}\right]\leq 2^{\operatorname{H_{% \infty}}({\cal D})-\log n}\cdot 2^{-\operatorname{H_{\infty}}({\cal D})}=1/n.\

$\hfill\blacktriangleleft$

We will also use the well-known Chernoff bound in our proof.

Fact 21 (Hoeffding’s inequality).

Let ${\bf A}_{1},...,{\bf A}_{n}$ be independent random variables s.t. ${\bf A}_{i}\in\mathopen{}\mathclose{{\left\{0,1}}\right\}$ . Let $\widehat{{\bf A}}=1/n\cdot\Sigma_{i=1}^{n}{\bf A}_{i}$ and $\mu={\mathrm{E}}\mathopen{}\mathclose{{\left[\widehat{{\bf A}}}}\right]$ . For every $\epsilon\in[0,1]$ it holds that:

\displaystyle{\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{% \left|\widehat{{\bf A}}-\mu}}\right|\geq\epsilon}}\right]\leq 2\cdot e^{-% \epsilon^{2}\cdot n}.

3 White-Box Distributional Learning

Let $P\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}\to\mathopen{}\mathclose% {{\left\{0,1}}\right\}^{k}\times\mathopen{}\mathclose{{\left\{0,1}}\right\}^{\ell}$ be a circuit that, given $r$ bits of randomness, samples labeled instances $(x,s)$ . In the following we view $s$ as a binary representation of a number in ${\mathbb{N}}$ (and respectively all the operations below are over ${\mathbb{N}}$ , and we use $\mathopen{}\mathclose{{\left|\cdot}}\right|$ to denote the absolute value). We define the set of all circuits that approximate $P$ with high probability,

{\sf Comp}_{\epsilon}^{\Delta}(P)=\mathopen{}\mathclose{{\left\{C\colon% \mathopen{}\mathclose{{\left\{0,1}}\right\}^{k}\to\mathopen{}\mathclose{{\left% \{0,1}}\right\}^{\ell}\colon{\mathrm{Pr}}_{(x,s)\leftarrow P({\bf U}_{r})}% \mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left|C(x)-s}}\right|\leq% \Delta}}\right]\geq 1-\epsilon}}\right\}.

We define the following white-box learning problem WBLearn:

Definition 22 ( $\Delta$ -WBLearn).

For a function $\Delta\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ , let $\Delta$ - ${\sf WBLearn}$ be the following learning problem:

$\blacksquare$

Input: Circuit $P\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}$ , with the promise that there exists a circuit $\widehat{C}\in{\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}}$ such that $\widehat{C}\in{\sf Comp}^{\Delta(1^{n})/n^{10}}_{n^{-10}}(P)$ .
$\blacksquare$

Output: Circuit $C\in{\sf Comp}^{\Delta(1^{n})}_{1/3}(P)$

In this work we are focusing on inputs $P$ for which the circuit $\widehat{C}$ that agree with $P$ with high probability, is such that the description of $\widehat{C}$ and $P$ is computational shallow. Formally, for a time function $t\colon{\mathbb{N}}\to{\mathbb{N}}$ , we denote by $\Delta$ - ${\sf WBLearn}|_{{\sf CS}^{t}}$ the problem $\Delta$ - ${\sf WBLearn}$ with the additional promise that $cd^{t}(P,\widehat{C})\leq 2\log n$ .

We use ${\sf ExactWBLearn}$ and ${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ to denote $\Delta$ - ${\sf WBLearn}$ and $\Delta$ - ${\sf WBLearn}|_{{\sf CS}^{t}}$ when $\Delta(1^{n})=0$ . Note that ${\sf ExactWBLearn}$ and ${\Delta\textit{-}{\sf WBLearn}|_{{\sf CS}^{t}}}$ are incomparable: While in ${\Delta\textit{-}{\sf WBLearn}|_{{\sf CS}^{t}}}$ we only need to find a circuit that approximates $P$ , the promise in ${\sf ExactWBLearn}$ is stronger. Yet, there is a simple reduction from ${\sf ExactWBLearn}$ to
${\Delta\textit{-}{\sf WBLearn}|_{{\sf CS}^{t}}}$ .

Lemma 23.

For every $\Delta\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ such that $\Delta\in 2^{o(n/\log n)}$ , it holds that

{\sf ExactWBLearn}\leq_{p}\Delta\textit{-}{\sf WBLearn}.

Similarly, for any such $\Delta$ and a function $t\colon{\mathbb{N}}\to{\mathbb{N}}$ , there exists $t^{\prime}\colon{\mathbb{N}}\to{\mathbb{N}}$ , such that

{\sf ExactWBLearn}|_{{\sf CS}^{t}}\leq_{p}\Delta\textit{-}{\sf WBLearn}|_{{\sf CS% }^{t^{\prime}}}.

Proof.

We start with the first reduction ${\sf ExactWBLearn}\leq_{p}\Delta\textit{-}{\sf WBLearn}$ . Given a circuit $P\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}\to\mathopen{}\mathclose% {{\left\{0,1}}\right\}^{k}\times\mathopen{}\mathclose{{\left\{0,1}}\right\}^{\ell}$ of length $n$ , the reduction outputs a circuit $P^{\prime}\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}\to\mathopen{}% \mathclose{{\left\{0,1}}\right\}^{k}\times\mathopen{}\mathclose{{\left\{0,1}}% \right\}^{\ell+n}$ , which is equivalent to $P$ , with additional $n$ output gates that always output $0$ . That is, $P^{\prime}(w)=(x,s^{\prime})$ , where $P(w)=(x,s)$ and $s^{\prime}=2^{n}\cdot s$ . As we only added $n$ gates to $P$ , $P^{\prime}$ can be encoded using $O(n\log n)$ bits. Using padding we assume that $\mathopen{}\mathclose{{\left|P^{\prime}}}\right|\in\Theta(n\log n)$ .

For correctness, observe that if there exists $\widehat{C}\in{\sf Comp}^{0}_{n^{-10}}(P)$ , then there exists such $\widehat{C^{\prime}}\in{\sf Comp}^{0}_{n^{-10}}(P^{\prime})$ (where $\widehat{C^{\prime}}$ is defined similarly to $P^{\prime}$ using $\widehat{C}$ ). Moreover, an approximation of the output $s^{\prime}$ of $P^{\prime}$ within a distance of $2^{n}/4$ is equivalent to the exact output $s$ of $P$ . Indeed, given an $2^{n}/4$ -approximation of $s^{\prime}$ we can find $s$ by simply dividing $s^{\prime}$ by $2^{n}$ and rounding to the closest integer.

Finally, to see the second reduction, it is enough to show that $cd^{t^{\prime}}(P^{\prime},\widehat{C^{\prime}})\leq 2\log\mathopen{}% \mathclose{{\left|P^{\prime}}}\right|$ . Since $P^{\prime},\widehat{C^{\prime}}$ can be efficiently constructed given $P,\widehat{C}$ , and similarly $P,\widehat{C}$ can be efficiently constructed given $P^{\prime},\widehat{C^{\prime}}$ , it holds that for large enough (polynomial time $t^{\prime}$ , $cd^{t^{\prime}}(P^{\prime},\widehat{C^{\prime}})\leq cd^{t}(P,\widehat{C})+O(1% )\leq 2\log n+O(1)\leq 2\log n+2\log\log n\leq 2\log\mathopen{}\mathclose{{% \left|P^{\prime}}}\right|$ . $\hfill\blacktriangleleft$

We prove the following theorem.

Theorem 24.

Let $\epsilon>0$ be a constant and let $\Delta\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ be any efficiently computable function such that $\Delta(1^{n})\leq 2^{n^{(1-\epsilon)}}$ , and let $t\colon{\mathbb{N}}\to{\mathbb{N}}$ be any polynomial such that $t(n)\geq n^{1+\epsilon}$ . Then the following are equivalent:

$\blacksquare$

PKE exists
$\blacksquare$

$\Delta$ - ${\sf WBLearn}|_{{\sf CS}^{t}}\notin{\sf ioFBPP}$

As a corollary, we get a result of independent interest relating exact and approximate white-box learning:

Corollary 25.

Let $\epsilon>0$ be a constant and let $\Delta\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ be any efficiently computable function such that $\Delta(1^{n})\leq 2^{n^{(1-\epsilon)}}$ , and let $t\colon{\mathbb{N}}\to{\mathbb{N}}$ be any polynomial such that $t(n)\geq n^{1+\epsilon}$ . Then the following are equivalent:

$\blacksquare$

${\sf ExactWBLearn}|_{{\sf CS}^{t}}\notin{\sf ioFBPP}$
$\blacksquare$

$\Delta$ - ${\sf WBLearn}|_{{\sf CS}^{t}}\notin{\sf ioFBPP}$

Theorem 24 follows by the following two theorems.

Theorem 26.

Let $t(n)$ be a polynomial and $\Delta\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ an efficiently computable function. Then if ${\Delta\textit{-}{\sf WBLearn}|_{{\sf CS}^{t}}}\notin{\sf ioFBPP}$ , PKE exist.

Theorem 27.

Assume there exists a PKE. Then for any constant $\epsilon>0$ and any $t(n)\geq n^{1+\epsilon}$ and any efficiently computable $\Delta\colon\mathopen{}\mathclose{{\left\{1}}\right\}^{*}\to{\mathbb{N}}$ such that $\Delta(1^{n})\leq 2^{n^{(1-\epsilon)}}$ , ${\Delta\textit{-}{\sf WBLearn}|_{{\sf CS}^{t}}}\notin{\sf ioFBPP}$ .

Theorem 26 is proven in Section 4, and Theorem 27 is proven in the full version of this paper.

4 Worst-Case hardness of ${\Delta\textit{-}{\sf WBLearn}|_{{\sf CS}^{t}}}$ $\implies$ KE

In this section we prove Theorem 26, that states that the worst-case hardness of ${\sf WBLearn}|_{{\cal Q}_{t}}$ implies the existence of public-key encryption scheme. In the following, let $C_{0}\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}\to\mathopen{}\mathclose% {{\left\{0,1}}\right\}$ be the circuit that always outputs $0$ , and $P_{0}\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}\to\mathopen{}\mathclose% {{\left\{0,1}}\right\}\times\mathopen{}\mathclose{{\left\{0,1}}\right\}$ be the circuit that always outputs $(0,0)$ .

To prove the above theorem, fix $t=t(n)$ and $\Delta=\Delta(1^{n})$ , and consider the following scheme $(\mathsf{Gen},\mathsf{Enc},\mathsf{Dec})$ :

Algorithm 28 ( $\mathsf{Gen}$ ).

Parameter: function $t\colon{\mathbb{N}}\to{\mathbb{N}}$ .

Input: $1^{n}$ .

Operation:

1.

Sample $\lambda\leftarrow[3n]$ and $\Pi\leftarrow\mathopen{}\mathclose{{\left\{0,1}}\right\}^{\lambda}$ .
2.

Run $\Pi$ for $t(2n)$ steps to get circuits $C\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{k}\rightarrow\mathopen{}% \mathclose{{\left\{0,1}}\right\}^{\ell},P\colon\mathopen{}\mathclose{{\left\{0% ,1}}\right\}^{r}\rightarrow\mathopen{}\mathclose{{\left\{0,1}}\right\}^{k}% \times\mathopen{}\mathclose{{\left\{0,1}}\right\}^{\ell}$ for some $k,r,\ell$ . If the output of $\Pi$ is not two such circuits, set $C=C_{0}$ , $P=P_{0}$ , $r=k=\ell=1$ .
3.

Randomly sample $(x_{1},s_{1}),\dots,(x_{n^{20}},s_{n^{20}})\leftarrow P(U_{r})$ , and compute

$\widehat{\alpha}={\mathrm{Pr}}_{i\leftarrow[n^{20}]}\mathopen{}\mathclose{{% \left[\mathopen{}\mathclose{{\left|C(x_{i})-s_{i}}}\right|\leq\Delta(1^{n})/n^% {10}}}\right].$

If $\widehat{\alpha}<1-2n^{-8}$ , reset $C=C_{0}$ and $P=P_{0}$ .
4.

Output $(k,\ell,C)$ as the secret key, and $(r,k,\ell,P)$ as the public key.

.

Algorithm 29 ( $\mathsf{Enc}$ ).

Input: public-key $(r\in{\mathbb{N}},k\in{\mathbb{N}},\ell\in{\mathbb{N}},P\colon\mathopen{}% \mathclose{{\left\{0,1}}\right\}^{r}\to\mathopen{}\mathclose{{\left\{0,1}}% \right\}^{k}\times\mathopen{}\mathclose{{\left\{0,1}}\right\}^{\ell})$ .

Operation:

1.

Sample randomness $z\leftarrow\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}$ .
2.

Compute $P(z)$ to get $(x,s)$ .
3.

Output $m=x$ and $o=s$ .

.

Algorithm 30 ( $\mathsf{Dec}$ ).

Input: secret-key $(k\in{\mathbb{N}},\ell\in{\mathbb{N}},C\colon\mathopen{}\mathclose{{\left\{0,1% }}\right\}^{k}\to\mathopen{}\mathclose{{\left\{0,1}}\right\})$ , cipher $x\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{k}$ .

Operation:

1.

Compute $C(x)=s^{\prime}$ .
2.

Output $s^{\prime}$ .

.

Observe that the size of the circuits $C$ and $P$ sampled by $\mathsf{Gen}(1^{n})$ is at most $t(2n)$ . Thus, all of the above algorithms can be implemented efficiently. Below we bound the correctness and the security of the above scheme. For every $n\in{\mathbb{N}}$ , let $({\bf K}_{n},{\bf L}_{n},{\bf C}_{n})$ and $({\bf R}_{n},{\bf K}_{n},{\bf L}_{n},{\bf P}_{n})$ be the random variables distributed according to the secret and public keys $(k,\ell,C),(r,k,\ell,P)$ in a random execution of $\mathsf{Gen}(1^{n})$ . Let ${\bf M}_{n}$ and ${\bf O}_{n}$ be the output of $\mathsf{Enc}({\bf R}_{n},{\bf K}_{n},{\bf L}_{n},{\bf P}_{n})$ in a random execution.

4.1 Correctness

We start by analyzing the correctness probability of the scheme.

Lemma 31.

For every $n\in{\mathbb{N}}$ , it holds that

{\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left|\mathsf% {Dec}(({\bf K}_{n},{\bf L}_{n},{\bf C}_{n}),{\bf M_{n}})-{\bf O}_{n}}}\right|% \leq\Delta(1^{n})/n^{10}}}\right]\geq 1-n^{-7}.

To prove Lemma 31 we will use the following simple claim, which is immediate from the Hoffeding bound.

Claim 32.

Let $C\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{k}\to\mathopen{}\mathclose% {{\left\{0,1}}\right\}^{\ell}$ and $P\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}\to\mathopen{}\mathclose% {{\left\{0,1}}\right\}^{k}\times\mathopen{}\mathclose{{\left\{0,1}}\right\}^{\ell}$ be two circuits. Let ${\bf A}$ be a random variable distributed according to the following process:

Sample $(x_{1},s_{1})\dots,s(x_{n^{20}},s_{n^{20}})\leftarrow P({\bf U}_{r})$ , and let ${\bf A}={\mathrm{Pr}}_{i\leftarrow[n^{20}]}\mathopen{}\mathclose{{\left[% \mathopen{}\mathclose{{\left|C(x_{i})-s_{i}}}\right|\leq\Delta}}\right]$ .
Let $\mu={\mathrm{Pr}}_{(x,s)\leftarrow P({\bf U}_{r})}\mathopen{}\mathclose{{\left% [\mathopen{}\mathclose{{\left|C(x)-s}}\right|\leq\Delta}}\right]$ . Then ${\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left|{\bf A}% -\mu}}\right|\geq n^{-9}}}\right]\leq 2^{-n}$ .

Proof.

Immediate by 21. $\hfill\vartriangleleft$

Proof of Lemma 31.

When $\mathsf{Gen}(1^{n})$ outputs $C_{0}$ and $P_{0}$ , the scheme has perfect correctness. Moreover, when the secret and public key are $(k,\ell,C)$ and $(r,k,\ell,P)$ , it holds that

{\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left|\mathsf% {Dec}((k,\ell,C),{\bf M}))-{\bf O}}}\right|\leq\Delta(1^{n})/n^{10}}}\right]={% \mathrm{Pr}}_{(x,s)\leftarrow P({\bf U}_{r})}\mathopen{}\mathclose{{\left[% \mathopen{}\mathclose{{\left|C(x)-s}}\right|\leq\Delta(1^{n})/n^{10}}}\right].

Thus,

	$\displaystyle{\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{% \left\|\mathsf{Dec}(({\bf K}_{n},{\bf L}_{n},{\bf C}_{n}),{\bf M}_{n})-{\bf O}_% {n}}}\right\|\leq\Delta(1^{n})/n^{10}}}\right]$
	$\displaystyle\geq(1-3n^{-8})\cdot{\mathrm{Pr}}_{{\bf R}_{n},{\bf K}_{n},{\bf P% }_{n},{\bf C}_{n}}\mathopen{}\mathclose{{\left[{\mathrm{Pr}}_{(x,s)\leftarrow{% \bf P}_{n}({\bf U}_{{\bf R}_{n}})}\mathopen{}\mathclose{{\left[\mathopen{}% \mathclose{{\left\|{\bf C}_{n}(x)-s}}\right\|\leq\Delta(1^{n})/n^{10}}}\right]% \geq 1-3n^{-8}}}\right]$
	$\displaystyle\geq 1-3n^{-8}-{\mathrm{Pr}}_{{\bf R}_{n},{\bf K}_{n},{\bf P}_{n}% ,{\bf C}_{n}}\mathopen{}\mathclose{{\left[{\mathrm{Pr}}_{(x,s)\leftarrow{\bf P% }_{n}({\bf U}_{{\bf R}_{n}})}\mathopen{}\mathclose{{\left[\mathopen{}% \mathclose{{\left\|{\bf C}_{n}(x)-s}}\right\|\leq\Delta(1^{n})/n^{10}}}\right]<1% -3n^{-8}}}\right].$

The lemma now follows since by Lemma 31, the probability of circuits $P, C$ with

{\mathrm{Pr}}_{(x,s)\leftarrow P({\bf U}_{r})}\mathopen{}\mathclose{{\left[% \mathopen{}\mathclose{{\left|{\bf C}_{n}(x)-s}}\right|\leq\Delta(1^{n})/n^{10}% }}\right]<1-3n^{-8}

to pass the test in Step 3 is at most $2^{-n}$ . $\hfill\blacktriangleleft$

4.2 Security

We next bound the leakage of the scheme.

Lemma 33.

Assume there exists an algorithm ${\sf Eve}$ such that

{\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left|{\sf Eve% }(1^{n},({\bf R}_{n},{\bf K}_{n},{\bf L}_{n},{\bf P}_{n}),{\bf M_{n}})-{\bf O}% _{n}}}\right|\leq\Delta(1^{n})}}\right]\geq 1-n^{-6}

for infinitely many $n$ ’s. Then ${\Delta\textit{-}{\sf WBLearn}|_{{\sf CS}^{t}}}\in{\sf ioBPP}$ .

In the following, let ${\sf Eve}$ be an algorithm that uses $r_{\sf Eve}(n)$ bits of randomness and guesses ${\bf M}$ with probability at least $1-n^{-6}$ . Recall that for $w\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r_{\sf Eve}(n)}$ , ${\sf Eve}(1^{n},(r,k,\ell,P),m;w)$ denotes the execution of ${\sf Eve}$ when its randomness is fixed to be $w$ .

Algorithm 34.

Input: $P\in\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}$ .

Operation:

1.

Let $r,k,\ell$ be such that $P\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}\to\mathopen{}\mathclose% {{\left\{0,1}}\right\}^{k}\times\mathopen{}\mathclose{{\left\{0,1}}\right\}^{\ell}$ .
2.

Sample $w\leftarrow\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r_{\sf Eve}(n)}$ uniformly at random.
3.

Construct a circuit $C^{\prime}$ such that $C^{\prime}(x)={\sf Eve}(1^{n},(r,k,\ell,P),x;w)$ .
4.

Return $C^{\prime}$ .

.

We prove the following lemma.

Lemma 35.

Assume that

{\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left|{\sf Eve% }(1^{n},({\bf R}_{n},{\bf K}_{n},{\bf L}_{n},{\bf P}_{n}),{\bf M}_{n})-{\bf O}% _{n}}}\right|\leq\Delta(1^{n})}}\right]\geq 1-n^{-6}

for infinitely many $n$ ’s. Then the following holds for infinitely many $n$ ’s. For every input $P\in{\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}}\cap{\cal Q}^{\Delta}_{t}$ , 34 outputs $C\in{\sf Comp}^{\Delta}_{1/4}(P)$ with probability at least $1/2$ .

We prove Lemma 35 below, but first we use it to prove Lemma 33.

Proof of Lemma 33.

Assume there exists an algorithm ${\sf Eve}$ such that

{\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left|{\sf Eve% }(1^{n},({\bf R}_{n},{\bf K}_{n},{\bf L}_{n},{\bf P}_{n}),{\bf M}_{n})-{\bf O}% _{n}}}\right|\leq\Delta(1^{n})}}\right]\geq 1-n^{-6}

for infinitely many $n$ ’s.

By Lemma 35, for infinitely many $n$ ’s, 34 outputs $C\in{\sf Comp}^{\Delta}_{1/4}(P)$ with probability at least $1/2$ , for every $P\in{\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}}\cap{\cal Q}^{\Delta}_{t}$ . We want to construct an algorithm Sol that, given $P$ and $1^{1/\delta}$ , outputs $C\in{\sf Comp}_{1/3}(P)$ with probability at least $1-\delta$ , for every such $n$ .

Let Sol be the algorithm that, given input $(P\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}\to\mathopen{}% \mathclose{{\left\{0,1}}\right\}^{k}\times\mathopen{}\mathclose{{\left\{0,1}}% \right\}^{\ell},1^{1/\delta})$ , first run 34 on $P$ for $2\mathopen{}\mathclose{{\left\lceil\log 1/\delta}}\right\rceil$ times, to get circuits $C_{1},\dots,C_{2\mathopen{}\mathclose{{\left\lceil\log 1/\delta}}\right\rceil}$ . Then, for every circuit $C_{i}$ , Sol samples $(x_{1},s_{1}),\dots,(x_{(100\mathopen{}\mathclose{{\left\lceil\log 1/\delta}}% \right\rceil)^{2}},s_{(100\mathopen{}\mathclose{{\left\lceil\log 1/\delta}}% \right\rceil)^{2}})\leftarrow P({\bf U}_{r})$ , and computes

p_{i}={\mathrm{Pr}}_{j\leftarrow[(100\mathopen{}\mathclose{{\left\lceil\log 1/% \delta}}\right\rceil)^{2}]}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{% {\left|C_{i}(x_{j})-s_{j}}}\right|\leq\Delta(1^{n})}}\right].

Finally, Sol outputs the circuit $C_{i}$ for the index $i$ with maximal value of $p_{i}$ .

We now analyze the success probability of Sol. Using 21, for every $i$ , the probability that $\mathopen{}\mathclose{{\left|p_{i}-{\mathrm{Pr}}_{(x,s)\leftarrow P({\bf U}_{r% })}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left|C_{i}(x)=s}}% \right|\leq\Delta(1^{n})}}\right]}}\right|\geq 1/25$ is at most

2^{-4(\mathopen{}\mathclose{{\left\lceil\log 1/\delta}}\right\rceil)^{2}+1}% \leq 1/(\delta(4\mathopen{}\mathclose{{\left\lceil\log 1/\delta}}\right\rceil)).

Thus, by the union bound, the probability that $\mathopen{}\mathclose{{\left|p_{i}-{\mathrm{Pr}}_{(x,s)\leftarrow P({\bf U}_{r% })}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left|C_{i}(x)=s}}% \right|\leq\Delta(1^{n})}}\right]}}\right|\geq 1/25$ for some $i$ is at most $\delta/2$ . Next, by the success probability of 34, with probability at least

1-(1-1/2)^{2\log 1/\delta}\geq 1-\delta/2,

at least one of the circuits $C_{1},\dots,C_{2\log 1/\delta}$ is in ${\sf Comp}^{\Delta}_{1/4}(P)\subseteq{\sf Comp}^{\Delta}_{1/3}(P)$ . Let $i^{*}$ be the index of such a circuit. Then, with probability at least $1-\delta/2-\delta/2=1-\delta$ , such $i^{*}$ exists, and $p_{i^{*}}$ is at least $(3/4-1/25)$ , while for every $i$ with $C_{i}\notin{\sf Comp}^{\Delta}_{1/3}(P)$ , $p_{i}$ is at most

(2/3+1/25)<(3/4-1/25)\leq p_{i^{*}},

which implies that the output of Sol is in ${\sf Comp}^{\Delta}_{1/3}(P)$ . $\hfill\blacktriangleleft$

4.3 Proving Lemma 35

To prove Lemma 35, we start with the following claim.

Claim 36.

Let $P\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}\to\mathopen{}\mathclose% {{\left\{0,1}}\right\}^{k}\times\mathopen{}\mathclose{{\left\{0,1}}\right\}^{\ell}$ be a circuit, and assume that

{\mathrm{Pr}}_{(x,s)\leftarrow P({\bf U}_{r})}\mathopen{}\mathclose{{\left[% \mathopen{}\mathclose{{\left|{\sf Eve}(1^{n},(r,k,\ell,P),x)-s}}\right|\leq% \Delta(n)}}\right]\geq 9/10.

Then, on input $P$ , 34 outputs $C\in{\sf Comp}^{\Delta}_{1/4}(P)$ with probability at least $1/2$ .

Proof of 36.

Let $P\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}\to\mathopen{}\mathclose% {{\left\{0,1}}\right\}^{k}\times\mathopen{}\mathclose{{\left\{0,1}}\right\}^{\ell}$ be a circuit such that

{\mathrm{Pr}}_{(x,s)\leftarrow P({\bf U}_{r})}\mathopen{}\mathclose{{\left[% \mathopen{}\mathclose{{\left|{\sf Eve}(1^{n},(r,k,\ell,P),x)-s}}\right|\leq% \Delta(n)}}\right]\geq 9/10.

Then, by definition it holds that

{\mathrm{E}}_{w\leftarrow\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r_{\sf Eve% }(n)}}\mathopen{}\mathclose{{\left[{\mathrm{Pr}}_{(x,s)\leftarrow P({\bf U}_{r% })}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left|{\sf Eve}(1^{n},(% r,k,\ell,P),x)-s}}\right|>\Delta(n)}}\right]}}\right]\leq 1/10.

Using Markov’s inequality, we gets that

{\mathrm{Pr}}_{w\leftarrow\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r_{\sf Eve% }(n)}}\mathopen{}\mathclose{{\left[{\mathrm{Pr}}_{(x,s)\leftarrow P({\bf U}_{r% })}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left|{\sf Eve}(1^{n},(% r,k,\ell,P),x)-s}}\right|>\Delta(n)}}\right]>1/4}}\right]\leq 1/2,

which implies that the circuit $C^{\prime}={\sf Eve}(1^{n},(r,k,\ell,P),\cdot;w)$ is in ${\sf Comp}^{\Delta}_{1/4}(P)$ with probability at least $1/2$ over the choice of $w$ , as we wanted to show. $\hfill\vartriangleleft$

Given 36, we are now ready to prove Lemma 35.

Proof of Lemma 35.

Assume that ${\sf Eve}$ is such that

{\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left|{\sf Eve% }(1^{n},({\bf R}_{n},{\bf K}_{n},{\bf L}_{n},{\bf P}_{n}),{\bf M}_{n})-{\bf O}% _{n}}}\right|\leq\Delta(1^{n})}}\right]\geq 1-n^{-6}

for infinitely many $n$ ’s. In the following, fix such large enough $n\in{\mathbb{N}}$ . We show that

\displaystyle{\mathrm{Pr}}_{(x,s)\leftarrow P({\bf U}_{r})}\mathopen{}% \mathclose{{\left[\mathopen{}\mathclose{{\left|{\sf Eve}(1^{n},(r,k,\ell,P),x)% -s}}\right|\leq\Delta(n)}}\right]\geq 9/10

(1)

for every $P\colon\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}\to\mathopen{}\mathclose% {{\left\{0,1}}\right\}^{k}\times\mathopen{}\mathclose{{\left\{0,1}}\right\}^{\ell}$ with $P\in{\cal Q}^{\Delta}_{t}\cap{\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}}$ . The proof then follows by 36. To see the above, let ${\bf Pk}=({\bf R}_{n},{\bf K}_{n},{\bf L}_{n},{\bf P}_{n}))$ , and notice that

		$\displaystyle{\mathrm{Pr}}_{\begin{subarray}{c}{\bf R}_{n},{\bf K}_{n},{\bf L}% _{n},{\bf P}_{n},{\bf C}_{n}\\ {\bf Pk}=({\bf R}_{n},{\bf K}_{n},{\bf L}_{n},{\bf P}_{n})\end{subarray}}% \mathopen{}\mathclose{{\left[{\mathrm{Pr}}_{{\bf M}_{n},{\bf O}_{n},{\bf W}% \leftarrow\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}_{\sf Eve}}\mathopen{% }\mathclose{{\left[\mathopen{}\mathclose{{\left\|{\sf Eve}(1^{n},{\bf Pk},{\bf M% }_{n};{\bf W})-{\bf O}_{n}}}\right\|\leq\Delta(n)}}\right]<9/10}}\right]$		(2)
		$\displaystyle\leq 10n^{-6}.$

Indeed, it holds that

	$\displaystyle 1-n^{-6}$	$\displaystyle\leq{\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}% \mathclose{{\left\|{\sf Eve}(1^{n},{\bf Pk},{\bf M}_{n})-{\bf O}_{n}}}\right\|% \leq\Delta(n)}}\right]$
		$\displaystyle\leq{\mathrm{Pr}}_{{\bf Pk}}\mathopen{}\mathclose{{\left[{\mathrm% {Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left\|{\sf Eve}(1^{n}% ,{\bf Pk},{\bf M}_{n})-{\bf O}_{n}}}\right\|\leq\Delta(n)}}\right]\geq 9/10}}\right]$
		$\displaystyle\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ % \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode% \nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ +9/10\cdot{\mathrm{Pr}}_{{% \bf Pk}}\mathopen{}\mathclose{{\left[{\mathrm{Pr}}\mathopen{}\mathclose{{\left% [\mathopen{}\mathclose{{\left\|{\sf Eve}(1^{n},{\bf Pk},{\bf M}_{n})-{\bf O}_{n% }}}\right\|\leq\Delta(n)}}\right]<9/10}}\right]$
		$\displaystyle=(1-{\mathrm{Pr}}_{{\bf Pk}}\mathopen{}\mathclose{{\left[{\mathrm% {Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left\|{\sf Eve}(1^{n}% ,{\bf Pk},{\bf M}_{n})-{\bf O}_{n}}}\right\|\leq\Delta(n)}}\right]<9/10}}\right])$
		$\displaystyle\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ % \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode% \nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ +9/10\cdot{\mathrm{Pr}}_{{% \bf Pk}}\mathopen{}\mathclose{{\left[{\mathrm{Pr}}\mathopen{}\mathclose{{\left% [\mathopen{}\mathclose{{\left\|{\sf Eve}(1^{n},{\bf Pk},{\bf M}_{n})-{\bf O}_{n% }}}\right\|\leq\Delta(n)}}\right]<9/10}}\right]$
		$\displaystyle=1-1/10\cdot{\mathrm{Pr}}_{{\bf Pk}}\mathopen{}\mathclose{{\left[% {\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left\|{\sf Eve% }(1^{n},{\bf Pk},{\bf M}_{n})-{\bf O}_{n}}}\right\|\leq\Delta(n)}}\right]<9/10}% }\right]$

which implies that Equation 2 holds. Next, we use Equation 2 and the upper bound on the computational depth of instances in ${\cal Q}^{\Delta}_{t}$ , to show that Equation 1 holds for every $P\in{\cal Q}^{\Delta}_{t}$ . To do so, fix $P\in{\cal Q}^{\Delta}_{t}\cap{\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}}$ and let $C\in{\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}}\cap{\sf Comp}^{\Delta/n^% {10}}_{n^{-10}}(P)$ with $cd^{t,\infty}(C,P)\leq 2\log n$ be the circuit promised by the definition of ${\cal Q}^{\Delta}_{t}$ . Assume towards a contradiction that

{\mathrm{Pr}}_{(x,s)\leftarrow P({\bf U}_{r})}\mathopen{}\mathclose{{\left[% \mathopen{}\mathclose{{\left|{\sf Eve}(1^{n},(r,k,\ell,P),x)-s}}\right|\leq% \Delta(n)}}\right]<9/10.

We want to upperbound ${\sf K}(P,C)$ , to get a lower bound on the computational depth of $(P,C)$ . To this end, let ${\cal S}$ be the set of all pairs $(P^{\prime},C^{\prime})\in{\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}}% \times{\mathopen{}\mathclose{{\left\{0,1}}\right\}^{n}}$ , such that $C^{\prime}\in{\sf Comp}^{\Delta/n^{10}}_{n^{-10}}(P^{\prime})$ , ${\sf K}^{t}(P^{\prime},C^{\prime})={\sf K}^{t}(P,C)$ , and on which ${\sf Eve}$ fails to approximate $s$ with probability more than $1/10$ .

By our assumption on $(P,C)$ , it holds that $(P,C)\in{\cal S}$ . We next bound the size of ${\cal S}$ . First, we claim that for every $(P^{\prime},C^{\prime})\in{\cal S}$ ,

\displaystyle{\mathrm{Pr}}\mathopen{}\mathclose{{\left[{\bf P}_{n}=P^{\prime},% {\bf C}_{n}=C^{\prime}}}\right]\geq 1/6n\cdot 2^{-{\sf K}^{t}(C,P)}.

(3)

Indeed, by definition there exists a program $\Pi$ of length ${\sf K}^{t}(C,P)$ that outputs $(C^{\prime},P^{\prime})$ in at most $t(2n)$ steps. Thus, $\mathsf{Gen}$ samples $\Pi$ with probability at least $1/3n\cdot 2^{-{\sf K}^{t}(C,P)}$ . Next, by 32, the test in Step 3 of $\mathsf{Gen}$ passes with probability at least $1-2^{-n}>1/2$ , since by definition of ${\sf Comp}^{\Delta/n^{10}}_{n^{-10}}$ , ${\mathrm{Pr}}_{(x,s)\leftarrow P^{\prime}({\bf U}_{r}}\mathopen{}\mathclose{{% \left[\mathopen{}\mathclose{{\left|C^{\prime}(x)-s}}\right|\leq\Delta(1^{n})/n% ^{10}}}\right]-n^{-9}\geq 1-n^{-10}-n^{-9}\geq 1-2n^{-7}$ . In this case that the test passes, $P^{\prime}$ and $C^{\prime}$ are the output of $\mathsf{Gen}$ , and thus Equation 3 holds.

By Equation 3 and the definition of ${\cal S}$ , we get that

	$\displaystyle{\mathrm{Pr}}_{\begin{subarray}{c}{\bf R}_{n},{\bf K}_{n},{\bf L}% _{n},{\bf P}_{n},{\bf C}_{n}\\ {\bf Pk}=({\bf R}_{n},{\bf K}_{n},{\bf L}_{n},{\bf P}_{n})\end{subarray}}% \mathopen{}\mathclose{{\left[{\mathrm{Pr}}_{{\bf M}_{n},{\bf O}_{n},{\bf W}% \leftarrow\mathopen{}\mathclose{{\left\{0,1}}\right\}^{r}_{\sf Eve}}\mathopen{% }\mathclose{{\left[\mathopen{}\mathclose{{\left\|{\sf Eve}(1^{n},{\bf Pk},{\bf M% }_{n};{\bf W})-{\bf O}_{n}}}\right\|\leq\Delta(n)}}\right]<9/10}}\right]$
	$\displaystyle\geq\mathopen{}\mathclose{{\left\|{\cal S}}}\right\|\cdot 1/6n\cdot 2% ^{-{\sf K}^{t}(C,P)}.$

Combining the above with Equation 2 yields that

\mathopen{}\mathclose{{\left|{\cal S}}}\right|\leq\frac{10n^{-6}}{1/6n\cdot 2^% {-{\sf K}^{t}(C,P)}}\leq 2^{{\sf K}^{t}(C,P)+\log n-6\log n+6}.

Observe that ${\cal S}$ can be (inefficiently) computed given $n,{\sf K}^{t}(C,P)$ and ${\sf Eve}$ . Thus, to encode $(P,C)$ , it is enough to encode ${\cal S}$ and the index of $(P,C)$ in ${\cal S}$ (according to the lexicographic order). We conclude that,

	$\displaystyle{\sf K}(P,C)$	$\displaystyle\leq{\sf K}(n,\lambda,{\sf Eve})+2\log({\sf K}(n,\lambda,{\sf Eve% }))+\log\mathopen{}\mathclose{{\left\|{\cal S}}}\right\|+O(1)$
		$\displaystyle\leq 2\log n+4\log\log n+{\sf K}^{t}(C,P)-5\log n+O(1)$
		$\displaystyle<{\sf K}^{t}(C,P)-2\log n,$

where the last inequality holds for every large enough $n$ . By the above, ${\sf K}^{t}(C,P)-{\sf K}(P,C)>2\log n$ , in contradiction to the choice of $(P,C)$ . This yields that Equation 1 holds for every $P\in{\cal Q}_{t}$ , as we wanted to show. $\hfill\blacktriangleleft$

4.4 Proving Theorem 26

We are now ready to use Lemma 15 in order to prove Theorem 26.

Proof of Theorem 26.

By Lemmas 31 and 33, $(\mathsf{Gen},\mathsf{Enc},\mathsf{Dec})$ is a $(n^{-7},1-n^{-6},n^{10})$ -weak-PKE (for $d(1^{n})=\Delta(1^{n})/n^{10}$ ), and thus it is also a $(n^{-6.9}/2,1-10n^{-6.1},2n^{6.9})$ -weak-PKE. Thus, by Lemma 15, $(\mathsf{Gen},\mathsf{Enc},\mathsf{Dec})$ can be amplified into a PKE. $\hfill\blacktriangleleft$

$\blacktriangleright$ Remark 37 (The non-uniform setting).

A similar theorem can be proven when assuming that ${\Delta\textit{-}{\sf WBLearn}|_{{\sf CS}^{t}}}\notin{\sf ioP}/{\mathrm{poly}}$ , and when the PKE is secure against non-uniform adversaries. In this case, we assume that ${\sf Eve}$ is a non-uniform algorithm that breaks the PKE protocol, and want to construct a non-uniform (randomized) algorithm that decides ${\sf WBLearn}|_{{\cal Q}_{t}}$ .

The issue with the above proof is that we cannot simply use ${\sf Eve}$ to bound the Kolmogorov complexity of $(C,P)$ as done in the proof of Lemma 33, as ${\sf Eve}$ does not have constant size. However, we can find ${\sf Eve}$ using a small Turing machine: Let $M$ be the (inefficient) Turing machine that, given a constant $c$ such that $n^{c}$ is a bound on the size of ${\sf Eve}$ , and an input for ${\sf Eve}$ , first find the circuit $E^{\prime}_{n}$ of size at most $n^{c}$ that maximize the advantage in predicting $M$ given an encryption of $M$ by $\mathsf{Enc}$ , and then execute $E^{\prime}$ on the input. Observe that $M$ has prediction advantage at least as the advantage of ${\sf Eve}$ . The theorem now follows using the same proof, by replacing ${\sf Eve}$ in the proof of Lemma 33 with $M$ , and replacing ${\sf Eve}$ in 34 with $E^{\prime}=\mathopen{}\mathclose{{\left\{E^{\prime}_{n}}}\right\}_{n\in{% \mathbb{N}}}$ .

References

[1] Miklós Ajtai and Cynthia Dwork. A public-key cryptosystem with worst-case/average-case equivalence. In stoc29, pages 284–293, 1997. See also ECCC TR96-065. doi:10.1145/258533.258604.
[2] Michael Alekhnovich. More on average case vs approximation complexity. In 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings., pages 298–307. IEEE, 2003. doi:10.1109/SFCS.2003.1238204.
[3] Luis Antunes and Lance Fortnow. Worst-case running times for average-case algorithms. In 2009 24th Annual IEEE Conference on Computational Complexity, pages 298–303. IEEE, 2009. doi:10.1109/CCC.2009.12.
[4] Luis Antunes, Lance Fortnow, Dieter Van Melkebeek, and N Variyam Vinodchandran. Computational depth: concept and applications. Theoretical Computer Science, 354(3):391–404, 2006. doi:10.1016/J.TCS.2005.11.033.
[5] Benny Applebaum, Boaz Barak, and Avi Wigderson. Public-key cryptography from different assumptions. In Proceedings of the forty-second ACM symposium on Theory of computing, pages 171–180, 2010. doi:10.1145/1806689.1806715.
[6] Marshall Ball, Yanyi Liu, Noam Mazor, and Rafael Pass. Kolmogorov comes to cryptomania: On interactive kolmogorov complexity and key-agreement. In 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pages 458–483. IEEE, 2023. doi:10.1109/FOCS57990.2023.00034.
[7] Avrim Blum, Merrick Furst, Michael Kearns, and Richard J Lipton. Cryptographic primitives based on hard learning problems. In Annual International Cryptology Conference, pages 278–291. Springer, 1993. doi:10.1007/3-540-48329-2_24.
[8] Andrej Bogdanov, Miguel Cueto Noval, Charlotte Hoffmann, and Alon Rosen. Public-key encryption from homogeneous clwe. In Theory of Cryptography: 20th International Conference, TCC 2022, Chicago, IL, USA, November 7–10, 2022, Proceedings, Part II, pages 565–592. Springer, 2022. doi:10.1007/978-3-031-22365-5_20.
[9] Zvika Brakerski, Adeline Langlois, Chris Peikert, Oded Regev, and Damien Stehlé. Classical hardness of learning with errors. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 575–584, 2013. doi:10.1145/2488608.2488680.
[10] Zvika Brakerski and Vinod Vaikuntanathan. Efficient fully homomorphic encryption from (standard) lwe. Journal of the ACM, 43(2):831–871, 2014.
[11] Gregory J. Chaitin. On the simplicity and speed of programs for computing infinite sets of natural numbers. J. ACM, 16(3):407–422, 1969. doi:10.1145/321526.321530.
[12] Whitfield Diffie and Martin E. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, pages 644–654, 1976. doi:10.1109/TIT.1976.1055638.
[13] Taher ElGamal. A public key cryptosystem and a signature scheme based on discrete logarithms. In Annual International Cryptology Conference (CRYPTO), pages 10–18, 1984.
[14] Craig Gentry. Fully homomorphic encryption using ideal lattices. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 169–178, 2009. doi:10.1145/1536414.1536440.
[15] Oded Goldreich and Leonid A. Levin. A hard-core predicate for all one-way functions. In Proceedings of the twenty-first annual ACM symposium on Theory of computing (STOC), pages 25–32, 1989. doi:10.1145/73007.73010.
[16] Danny Harnik, Joe Kilian, Moni Naor, Omer Reingold, and Alon Rosen. On robust combiners for oblivious transfer and other primitives. In Advances in Cryptology–EUROCRYPT 2005: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Aarhus, Denmark, May 22-26, 2005. Proceedings 24, pages 96–113. Springer, 2005. doi:10.1007/11426639_6.
[17] J. Hartmanis. Generalized kolmogorov complexity and the structure of feasible computations. In 24th Annual Symposium on Foundations of Computer Science (sfcs 1983), pages 439–445, 1983. doi:10.1109/SFCS.1983.21.
[18] Shuichi Hirahara and Mikito Nanashima. Learning in pessiland via inductive inference. In 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pages 447–457. IEEE, 2023. doi:10.1109/FOCS57990.2023.00033.
[19] Thomas Holenstein. Strengthening key agreement using hard-core sets. PhD thesis, ETH Zurich, 2006. doi:10.3929/ETHZ-A-005205852.
[20] Russell Impagliazzo and Leonid A. Levin. No better ways to generate hard NP instances than picking uniformly at random. In focs31, pages 812–821, 1990. doi:10.1109/FSCS.1990.89604.
[21] Michael Kearns and Leslie Valiant. Cryptographic limitations on learning boolean formulae and finite automata. Journal of the ACM (JACM), 41(1):67–95, 1994. doi:10.1145/174644.174647.
[22] Ker-I Ko. On the notion of infinite pseudorandom sequences. Theor. Comput. Sci., 48(3):9–33, 1986. doi:10.1016/0304-3975(86)90081-2.
[23] A. N. Kolmogorov. Three approaches to the quantitative definition of information. International Journal of Computer Mathematics, 2(1-4):157–168, 1968.
[24] Yanyi Liu, Noam Mazor, and Rafael Pass. On white-box learning and public-key encryption. Cryptology ePrint Archive, Paper 2024/1931, 2024. URL: https://eprint.iacr.org/2024/1931.
[25] Yanyi Liu and Rafael Pass. On one-way functions and kolmogorov complexity. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 1243–1254. IEEE, 2020. doi:10.1109/FOCS46700.2020.00118.
[26] Yanyi Liu and Rafael Pass. On one-way functions and the worst-case hardness of time-bounded kolmogorov complexity. Cryptology ePrint Archive, 2023.
[27] Robert J McEliece. A public-key cryptosystem based on algebraic. Coding Thv, 4244:114–116, 1978.
[28] Chris Peikert. Public-key cryptosystems from the worst-case shortest vector problem. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 333–342, 2009.
[29] Michael O Rabin. Digitalized signatures and public-key functions as intractable as factorization. Technical report, Massachusetts Inst of Tech Cambridge Lab for Computer Science, 1979.
[30] Oded Regev. On lattices, learning with errors, random linear codes, and cryptography. Journal of the ACM (JACM), 56(6):1–40, 2009. doi:10.1145/1568318.1568324.
[31] Ronald L Rivest, Adi Shamir, and Leonard Adleman. A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2):120–126, 1978. doi:10.1145/359340.359342.
[32] Peter W Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM review, 41(2):303–332, 1999. doi:10.1137/S0036144598347011.
[33] Michael Sipser. A complexity theoretic approach to randomness. In Proceedings of the fifteenth annual ACM symposium on Theory of computing, pages 330–335, 1983. doi:10.1145/800061.808762.
[34] R.J. Solomonoff. A formal theory of inductive inference. part i. Information and Control, 7(1):1–22, 1964. doi:10.1016/S0019-9958(64)90223-2.
[35] Leslie G Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, 1984. doi:10.1145/1968.1972.
[36] Andrew C. Yao. Theory and applications of trapdoor functions. In Annual Symposium on Foundations of Computer Science (FOCS), pages 80–91, 1982. doi:10.1109/SFCS.1982.45.

[bib.bib1] [1] Miklós Ajtai and Cynthia Dwork. A public-key cryptosystem with worst-case/average-case equivalence. In stoc29, pages 284–293, 1997. See also ECCC TR96-065. doi:10.1145/258533.258604.

[bib.bib2] [2] Michael Alekhnovich. More on average case vs approximation complexity. In 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings., pages 298–307. IEEE, 2003. doi:10.1109/SFCS.2003.1238204.

[bib.bib3] [3] Luis Antunes and Lance Fortnow. Worst-case running times for average-case algorithms. In 2009 24th Annual IEEE Conference on Computational Complexity, pages 298–303. IEEE, 2009. doi:10.1109/CCC.2009.12.

[bib.bib4] [4] Luis Antunes, Lance Fortnow, Dieter Van Melkebeek, and N Variyam Vinodchandran. Computational depth: concept and applications. Theoretical Computer Science, 354(3):391–404, 2006. doi:10.1016/J.TCS.2005.11.033.

[bib.bib5] [5] Benny Applebaum, Boaz Barak, and Avi Wigderson. Public-key cryptography from different assumptions. In Proceedings of the forty-second ACM symposium on Theory of computing, pages 171–180, 2010. doi:10.1145/1806689.1806715.

[bib.bib6] [6] Marshall Ball, Yanyi Liu, Noam Mazor, and Rafael Pass. Kolmogorov comes to cryptomania: On interactive kolmogorov complexity and key-agreement. In 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pages 458–483. IEEE, 2023. doi:10.1109/FOCS57990.2023.00034.

[bib.bib7] [7] Avrim Blum, Merrick Furst, Michael Kearns, and Richard J Lipton. Cryptographic primitives based on hard learning problems. In Annual International Cryptology Conference, pages 278–291. Springer, 1993. doi:10.1007/3-540-48329-2_24.

[bib.bib8] [8] Andrej Bogdanov, Miguel Cueto Noval, Charlotte Hoffmann, and Alon Rosen. Public-key encryption from homogeneous clwe. In Theory of Cryptography: 20th International Conference, TCC 2022, Chicago, IL, USA, November 7–10, 2022, Proceedings, Part II, pages 565–592. Springer, 2022. doi:10.1007/978-3-031-22365-5_20.

[bib.bib9] [9] Zvika Brakerski, Adeline Langlois, Chris Peikert, Oded Regev, and Damien Stehlé. Classical hardness of learning with errors. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 575–584, 2013. doi:10.1145/2488608.2488680.

[bib.bib10] [10] Zvika Brakerski and Vinod Vaikuntanathan. Efficient fully homomorphic encryption from (standard) lwe. Journal of the ACM, 43(2):831–871, 2014.

[bib.bib11] [11] Gregory J. Chaitin. On the simplicity and speed of programs for computing infinite sets of natural numbers. J. ACM, 16(3):407–422, 1969. doi:10.1145/321526.321530.

[bib.bib12] [12] Whitfield Diffie and Martin E. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, pages 644–654, 1976. doi:10.1109/TIT.1976.1055638.

[bib.bib13] [13] Taher ElGamal. A public key cryptosystem and a signature scheme based on discrete logarithms. In Annual International Cryptology Conference (CRYPTO), pages 10–18, 1984.

[bib.bib14] [14] Craig Gentry. Fully homomorphic encryption using ideal lattices. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 169–178, 2009. doi:10.1145/1536414.1536440.

[bib.bib15] [15] Oded Goldreich and Leonid A. Levin. A hard-core predicate for all one-way functions. In Proceedings of the twenty-first annual ACM symposium on Theory of computing (STOC), pages 25–32, 1989. doi:10.1145/73007.73010.

[bib.bib16] [16] Danny Harnik, Joe Kilian, Moni Naor, Omer Reingold, and Alon Rosen. On robust combiners for oblivious transfer and other primitives. In Advances in Cryptology–EUROCRYPT 2005: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Aarhus, Denmark, May 22-26, 2005. Proceedings 24, pages 96–113. Springer, 2005. doi:10.1007/11426639_6.

[bib.bib17] [17] J. Hartmanis. Generalized kolmogorov complexity and the structure of feasible computations. In 24th Annual Symposium on Foundations of Computer Science (sfcs 1983), pages 439–445, 1983. doi:10.1109/SFCS.1983.21.

[bib.bib18] [18] Shuichi Hirahara and Mikito Nanashima. Learning in pessiland via inductive inference. In 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pages 447–457. IEEE, 2023. doi:10.1109/FOCS57990.2023.00033.

[bib.bib19] [19] Thomas Holenstein. Strengthening key agreement using hard-core sets. PhD thesis, ETH Zurich, 2006. doi:10.3929/ETHZ-A-005205852.

[bib.bib20] [20] Russell Impagliazzo and Leonid A. Levin. No better ways to generate hard NP instances than picking uniformly at random. In focs31, pages 812–821, 1990. doi:10.1109/FSCS.1990.89604.

[bib.bib21] [21] Michael Kearns and Leslie Valiant. Cryptographic limitations on learning boolean formulae and finite automata. Journal of the ACM (JACM), 41(1):67–95, 1994. doi:10.1145/174644.174647.

[bib.bib22] [22] Ker-I Ko. On the notion of infinite pseudorandom sequences. Theor. Comput. Sci., 48(3):9–33, 1986. doi:10.1016/0304-3975(86)90081-2.

[bib.bib23] [23] A. N. Kolmogorov. Three approaches to the quantitative definition of information. International Journal of Computer Mathematics, 2(1-4):157–168, 1968.

[bib.bib24] [24] Yanyi Liu, Noam Mazor, and Rafael Pass. On white-box learning and public-key encryption. Cryptology ePrint Archive, Paper 2024/1931, 2024. URL: https://eprint.iacr.org/2024/1931.

[bib.bib25] [25] Yanyi Liu and Rafael Pass. On one-way functions and kolmogorov complexity. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 1243–1254. IEEE, 2020. doi:10.1109/FOCS46700.2020.00118.

[bib.bib26] [26] Yanyi Liu and Rafael Pass. On one-way functions and the worst-case hardness of time-bounded kolmogorov complexity. Cryptology ePrint Archive, 2023.

[bib.bib27] [27] Robert J McEliece. A public-key cryptosystem based on algebraic. Coding Thv, 4244:114–116, 1978.

[bib.bib28] [28] Chris Peikert. Public-key cryptosystems from the worst-case shortest vector problem. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 333–342, 2009.

[bib.bib29] [29] Michael O Rabin. Digitalized signatures and public-key functions as intractable as factorization. Technical report, Massachusetts Inst of Tech Cambridge Lab for Computer Science, 1979.

[bib.bib30] [30] Oded Regev. On lattices, learning with errors, random linear codes, and cryptography. Journal of the ACM (JACM), 56(6):1–40, 2009. doi:10.1145/1568318.1568324.

[bib.bib31] [31] Ronald L Rivest, Adi Shamir, and Leonard Adleman. A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2):120–126, 1978. doi:10.1145/359340.359342.

[bib.bib32] [32] Peter W Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM review, 41(2):303–332, 1999. doi:10.1137/S0036144598347011.

[bib.bib33] [33] Michael Sipser. A complexity theoretic approach to randomness. In Proceedings of the fifteenth annual ACM symposium on Theory of computing, pages 330–335, 1983. doi:10.1145/800061.808762.

[bib.bib34] [34] R.J. Solomonoff. A formal theory of inductive inference. part i. Information and Control, 7(1):1–22, 1964. doi:10.1016/S0019-9958(64)90223-2.

[bib.bib35] [35] Leslie G Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, 1984. doi:10.1145/1968.1972.

[bib.bib36] [36] Andrew C. Yao. Theory and applications of trapdoor functions. In Annual Symposium on Foundations of Computer Science (FOCS), pages 80–91, 1982. doi:10.1109/SFCS.1982.45.

	$\displaystyle 1-n^{-6}$	$\displaystyle\leq{\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}% \mathclose{{\left\|{\sf Eve}(1^{n},{\bf Pk},{\bf M}_{n})-{\bf O}_{n}}}\right\|% \leq\Delta(n)}}\right]$
		$\displaystyle\leq{\mathrm{Pr}}_{{\bf Pk}}\mathopen{}\mathclose{{\left[{\mathrm% {Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left\|{\sf Eve}(1^{n}% ,{\bf Pk},{\bf M}_{n})-{\bf O}_{n}}}\right\|\leq\Delta(n)}}\right]\geq 9/10}}\right]$
		$\displaystyle\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ % \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode% \nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ +9/10\cdot{\mathrm{Pr}}_{{% \bf Pk}}\mathopen{}\mathclose{{\left[{\mathrm{Pr}}\mathopen{}\mathclose{{\left% [\mathopen{}\mathclose{{\left\|{\sf Eve}(1^{n},{\bf Pk},{\bf M}_{n})-{\bf O}_{n% }}}\right\|\leq\Delta(n)}}\right]<9/10}}\right]$
		$\displaystyle=(1-{\mathrm{Pr}}_{{\bf Pk}}\mathopen{}\mathclose{{\left[{\mathrm% {Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left\|{\sf Eve}(1^{n}% ,{\bf Pk},{\bf M}_{n})-{\bf O}_{n}}}\right\|\leq\Delta(n)}}\right]<9/10}}\right])$
		$\displaystyle\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ % \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode% \nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ +9/10\cdot{\mathrm{Pr}}_{{% \bf Pk}}\mathopen{}\mathclose{{\left[{\mathrm{Pr}}\mathopen{}\mathclose{{\left% [\mathopen{}\mathclose{{\left\|{\sf Eve}(1^{n},{\bf Pk},{\bf M}_{n})-{\bf O}_{n% }}}\right\|\leq\Delta(n)}}\right]<9/10}}\right]$
		$\displaystyle=1-1/10\cdot{\mathrm{Pr}}_{{\bf Pk}}\mathopen{}\mathclose{{\left[% {\mathrm{Pr}}\mathopen{}\mathclose{{\left[\mathopen{}\mathclose{{\left\|{\sf Eve% }(1^{n},{\bf Pk},{\bf M}_{n})-{\bf O}_{n}}}\right\|\leq\Delta(n)}}\right]<9/10}% }\right]$

On White-Box Learning and Public-Key Encryption

Abstract

Keywords and phrases:

Funding:

Copyright and License:

2012 ACM Subject Classification:

Related Version:

DOI:

Event:

Editors:

Series and Publisher:

1 Introduction

White-box Learning

Hardness of Learning v.s. (Public-Key) Cryptography

1.1 Our Results

Characterizing PKE through Hardness of White-Box Learning

Theorem 1.

On the Use of Computational Depth

Relating Exact and Approximate White-Box Learning

Corollary 2.

Bounded-Degree Learning and LWE

Theorem 3.

Lemma 4 (Informal).

Lemma 5.

Black-box Learning

Theorem 6.

Corollary 7.

Open Problems

1.2 Proof Overview

Weak PKE

The Weak PKE protocol

Agreement

Security

Hardness of 𝗘𝘅𝗮𝗰𝘁𝗪𝗕𝗟𝗲𝗮𝗿𝗻|𝗖𝗦𝒕 from PKE

Comparison with [6]

Comparison with Universal Constructions

2 Preliminaries

2.1 Notations

2.2 Distributions and Random Variables

Lemma 8.

2.3 Circuits

2.4 Entropy

Fact 9.

2.5 Complexity Classes

Definition 10 (Infinitely-often FBPP (𝗂𝗈𝖥𝖡𝖯𝖯)).

2.6 One-Way Function

Definition 11 (One-way function).

Definition 12 (Weak one-way function).

Theorem 13 (Weak to strong OWFs, [36]).

2.7 Public-Key Encryption

Definition 14 (Public-key encryption scheme (PKE)).

Lemma 15 (PKE amplification, [19]).

Definition 16 (Weak Public-key encryption scheme (weak-PKE)).

Lemma 17 (Weak-PKE amplification).

2.8 Kolmogorov Complexity and Computational Depth

Definition 18.

Fact 19.

Lemma 20.

Proof.

Fact 21 (Hoeffding’s inequality).

3 White-Box Distributional Learning

Definition 22 (Δ-WBLearn).

Lemma 23.

Proof.

Theorem 24.

Corollary 25.

Theorem 26.

Theorem 27.

4 Worst-Case hardness of 𝚫⁢-⁢𝗪𝗕𝗟𝗲𝗮𝗿𝗻|𝗖𝗦𝒕 ⟹ KE

Algorithm 28 (𝖦𝖾𝗇).

Algorithm 29 (𝖤𝗇𝖼).

Algorithm 30 (𝖣𝖾𝖼).

4.1 Correctness

Lemma 31.

Claim 32.

Proof.

Proof of Lemma 31.

4.2 Security

Lemma 33.

Algorithm 34.

Hardness of ${\sf ExactWBLearn}|_{{\sf CS}^{t}}$ from PKE

Definition 10 (Infinitely-often FBPP ( ${\sf ioFBPP}$ )).

Definition 22 ( $\Delta$ -WBLearn).

4 Worst-Case hardness of ${\Delta\textit{-}{\sf WBLearn}|_{{\sf CS}^{t}}}$ $\implies$ KE

Algorithm 28 ( $\mathsf{Gen}$ ).

Algorithm 29 ( $\mathsf{Enc}$ ).

Algorithm 30 ( $\mathsf{Dec}$ ).

$\blacktriangleright$ Remark 37 (The non-uniform setting).