
On White-Box Learning and Public-Key Encryption

Yanyi Liu, Cornell Tech, New York, NY, USA
Noam Mazor, Tel Aviv University, Israel
Rafael Pass, Cornell Tech, New York, NY, USA; Technion, Haifa, Israel; Tel Aviv University, Israel
Abstract

We consider a generalization of the Learning With Errors problem, referred to as the white-box learning problem: You are given the code of a sampler that with high probability produces samples of the form (y, f(y) + ε), where ε is small and f is computable by a polynomial-size circuit, and the computational task consists of outputting a polynomial-size circuit C that, with probability, say, 1/3 over a new sample y drawn according to the same distribution, approximates f(y) (i.e., |C(y) − f(y)| is small). This problem can be thought of as a generalization of the Learning with Errors problem (LWE) from linear functions f to polynomial-size computable functions.

We demonstrate that worst-case hardness of the white-box learning problem, conditioned on the instances satisfying a notion of computational shallowness (a concept from the study of Kolmogorov complexity), not only suffices to get public-key encryption, but is also necessary; as such, this yields the first problem whose worst-case hardness characterizes the existence of public-key encryption. Additionally, our results highlight to what extent LWE "overshoots" the task of public-key encryption.

We complement these results by noting that worst-case hardness of the same problem, but restricting the learner to only get black-box access to the sampler, characterizes one-way functions.

Keywords and phrases:
Public-Key Encryption, White-Box Learning
Funding:
Yanyi Liu: Research partly supported by NSF CNS-2149305.
Noam Mazor: Research partly supported by NSF CNS-2149305, AFOSR Award FA9550-23-1-0312 and AFOSR Award FA9550-23-1-0387 and ISF Award 2338/23.
Rafael Pass: Supported in part by NSF Award CNS 2149305, AFOSR Award FA9550-23-1-0387, AFOSR Award FA9550-23-1-0312 and AFOSR Award FA9550-24-1-0267 and ISF Award 2338/23. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Government, or the AFOSR.
Copyright and License:
© Yanyi Liu, Noam Mazor, and Rafael Pass; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Computational complexity and cryptography
Related Version:
Full Version: https://eprint.iacr.org/2024/1931 [24]
Editors:
Raghu Meka

1 Introduction

Public-Key Encryption (PKE) [12, 31] is one of the most central cryptographic primitives enabling secure communication on the Internet: it is the primitive that enables entities that have not physically met to engage in confidential conversations and collaborations.

In contrast to private-key primitives, such as symmetric-key encryption and digital signatures, that can be securely built from the minimal primitive of one-way functions (and for which many candidate problems are known), we only know of a handful of candidate hard problems from which public-key encryption can be constructed. More specifically, these include (a) number-theoretic problems based on either factoring [31, 29] or discrete logarithms [12, 13], (b) coding-theory based problems [27], (c) lattice problems such as finding shortest/closest vectors in lattices [1, 30, 8], and (d) noisy linear-algebra based problems [2, 5]. Out of these, the number-theoretic problems can be efficiently solved by quantum algorithms [32], and the coding-theory, lattice and noisy linear-algebra problems are all very related – in essence, they can all be viewed as different instances of solving noisy systems of linear equations (on either particular natural distributions, or even in the worst case, when restricting attention to systems of equations satisfying the condition that appropriate solutions exist; in more detail, we require worst-case hardness to hold when conditioning on instances that define a lattice where the shortest vector is long compared to the amount of noise).

The main purpose of this paper is to provide an assumption that generalizes all these assumptions (i.e., is implied by all of them), yet suffices for obtaining secure PKE. Indeed, the main result of this paper is that hardness of a notion of white-box learning achieves this goal.

White-box Learning

Perhaps the most common noisy linear-algebra based assumption is the hardness of the Learning With Errors (LWE) problem [30], which, in essence, stipulates the hardness of recovering a vector 𝐱 given (𝐀, 𝐀𝐱 + 𝐞), where 𝐀 is a matrix, 𝐞 is some "small" noise vector, and all arithmetic is modulo some prime p. In more detail, we typically require a stronger condition: not only that it is hard to recover 𝐱, but also that it is hard to compute a value "close" to 𝐚ᵀ𝐱 for a random vector 𝐚. In other words, we can think of 𝐱 as the description of a function f_𝐱(𝐚) = 𝐚ᵀ𝐱 that we are trying to improperly approximately learn given noisy samples – thus the name "learning with errors".

In fact, there is also a different way to think about the LWE problem that will be useful for our purposes (which follows from the construction of Regev's PKE [30]): We are given the code of a sampler P_𝐱 that provides samples of the form (𝐚, f_𝐱(𝐚) + e), and the goal is to approximate f_𝐱(𝐚) on a fresh sample 𝐚 drawn according to the same distribution. (The reason for this is that given (𝐀, 𝐀𝐱 + 𝐞), we can generate noisy samples of f_𝐱 by taking linear combinations of the rows; see [30] and the full version of this paper for more details.) We refer to this alternative way of thinking of LWE as an instance of (improper) white-box learning, where, more generally, we are given the code of a sampler P that generates noisy samples of the form (y, f(y) + ε) and the goal is to approximate f(y) on a fresh sample y drawn according to the same distribution, using a polynomial-size circuit. In essence, this problem generalizes the LWE problem from (improperly) learning linear functions from noisy samples to (improperly) learning a polynomial-size circuit from noisy samples, given white-box access to the sampler. (As we will discuss in more detail later, it is also more general than the LWE problem in that the LWE sampler has a particular form, whereas here we may consider more general classes of samplers.) The white-box access feature can be viewed as a generalization of Valiant's classic PAC-learning model [35] to a setting where the learner not only gets random samples, but also gets the code of the sampler.
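To make the re-randomization idea behind this view concrete, the following Python sketch turns a toy LWE instance (A, b = Ax + e mod q) into a sampler that outputs fresh noisy samples (a, ⟨a, x⟩ + e' mod q) by taking random 0/1 combinations of the rows. The parameters and the choice of 0/1 combinations are illustrative assumptions of ours, not the paper's (or Regev's) exact construction.

```python
import random

def lwe_instance(n, m, q, noise=1):
    """Sample a toy LWE instance (A, b = A x + e mod q) together with the secret x."""
    x = [random.randrange(q) for _ in range(n)]
    A = [[random.randrange(q) for _ in range(n)] for _ in range(m)]
    b = [(sum(a * xi for a, xi in zip(row, x)) + random.randint(-noise, noise)) % q
         for row in A]
    return A, b, x

def make_sampler(A, b, q):
    """White-box sampler: each call outputs a fresh noisy sample (a, <a, x> + e')
    obtained as a random 0/1 combination of the rows of (A, b)."""
    def P():
        coeffs = [random.randint(0, 1) for _ in range(len(A))]
        a = [sum(c * row[j] for c, row in zip(coeffs, A)) % q for j in range(len(A[0]))]
        z = sum(c * bi for c, bi in zip(coeffs, b)) % q
        return a, z
    return P

A, b, x = lwe_instance(n=8, m=20, q=97)
P = make_sampler(A, b, q=97)
a, z = P()
# z agrees with <a, x> mod 97 up to the accumulated noise (at most 20 here).
print(z, sum(ai * xi for ai, xi in zip(a, x)) % 97)
```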

In more detail, given a sampler circuit P that samples "labeled instances" (y, z) (where we think of z as a, perhaps noisy, label for y), let Comp_ε^Δ(P) denote the set of circuits f that with probability 1 − ε over (y, z) ← P satisfy |z − f(y)| ≤ Δ (when interpreting both z and f(y) as integers). For a function Δ: {1}* → ℕ, let Δ-WBLearn denote the following learning problem:

  • Input: A circuit P ∈ {0,1}^n, with the promise that there exists a circuit Ĉ ∈ {0,1}^n such that Ĉ ∈ Comp_{n^{−10}}^{Δ(1^n)/n^{10}}(P).

  • Output: A circuit C ∈ Comp_{1/3}^{Δ(1^n)}(P). (We remark that for efficient algorithms, outputting a circuit that approximates a function f is essentially equivalent to approximating the value of f(y) on a random sample y. Indeed, the circuit implementation of an algorithm for the latter task is a valid output for the former task.)

In other words, we are given a sampler that with very high probability (1 − n^{−10}) outputs labelled samples where the label is very close (within Δ(1^n)/n^{10}) to being correct, and the goal is, given the code of the sampler P, to simply find a circuit C that with probability 2/3 approximates the label within Δ(1^n) (i.e., with a factor n^{10} larger error). We use ExactWBLearn to denote Δ-WBLearn when Δ(1^n) = 0.

Hardness of Learning vs. (Public-Key) Cryptography

Roughly speaking, the main result of this paper is to show that under certain restrictions on the samplers P (which come from the study of time-bounded Kolmogorov complexity [23, 34, 11, 22, 17, 33] and in particular the notion of computational depth [4]), this generalization of the LWE problem not only suffices to realize a PKE, but is also necessary. In more detail, worst-case hardness of Δ-WBLearn under these conditions will characterize the existence of public-key encryption. As such, our results yield insight into the extent to which LWE "overshoots" the task of public-key encryption.

We highlight that we are not the first ones to consider connections between learning theory and cryptography; however, as far as we know, all earlier connections were between the hardness of learning and private-key cryptography (i.e., the notion of one-way functions), as opposed to public-key cryptography. Indeed, classic results from [21, 7] demonstrate the equivalence of the hardness of a notion of average-case PAC learning of polynomial-size circuits (i.e., black-box learning) and one-way functions. (The results of [20] can be thought of as characterizing one-way functions through a different type of learning.) In contrast, our focus here is on PKE, and instead of considering black-box learning, we consider white-box learning. An additional difference is that we consider worst-case hardness, as opposed to average-case hardness, of the learning-theory problem. As was recently shown in [18], this issue can be overcome (in the context of characterizing one-way functions) through the use of an (alternative) notion of computational depth (more details on the relationship with this work below). To put our result in context, we additionally show that exactly the same problem that we demonstrate characterizes PKE also characterizes one-way functions once modified to only provide the learner black-box access to the sampler.

1.1 Our Results

Towards explaining our results in more detail, let us first recall the notion of computational depth [4]. Let the t-computational depth of x, denoted cd^t(x), be defined as cd^t(x) = K^t(x) − K(x), where K(x) denotes the Kolmogorov complexity of x and K^t(x) denotes the t-bounded Kolmogorov complexity of x. That is, cd^t(x) measures how much more x could be compressed if the decompression algorithm may be computationally unbounded, as opposed to running in time bounded by t(|x|).

We will focus on instances P of the Δ-WBLearn problem having the property that (P, Ĉ) is "computationally shallow", where Ĉ is a circuit that agrees with P with high probability: let Δ-WBLearn|CS^t denote Δ-WBLearn with the additional promise that cd^t(P, Ĉ) ≤ 2 log n. (We note that there is nothing special about the constant 2; it can be anything that is strictly larger than 1.)

Characterizing PKE through Hardness of White-Box Learning

Our main result is that the hardness of this problem characterizes the existence of public-key encryption:

Theorem 1.

Let ε > 0 be a constant, let Δ: {1}* → ℕ be an efficiently computable function such that Δ(1^n) ≤ 2^{n^{1−ε}}, and let t: ℕ → ℕ be a polynomial such that t(n) ≥ n^{1+ε}. Then the following are equivalent:

  • PKE exists.

  • Δ-WBLearn|CS^t ∉ ioFBPP.

Computational depth is typically thought of as a measure of the "unnaturality" of strings: strings with low computational depth are considered "natural" and those with high computational depth are considered "unnatural". (The reason why low computational depth captures "natural strings" is as follows: random strings are known to have low computational depth; furthermore, known results (c.f. slow growth laws [18]) show (at least under derandomization assumptions) that one needs a long running time to even find a string with high computational depth. So strings with high computational depth are rare and "hard to find", which is why they can be thought of as "unnatural".) Given this interpretation, our characterization of public-key encryption is thus in terms of the worst-case hardness of white-box learning for "natural" instances.

As far as we know, this thus yields the first problem whose worst-case hardness not only suffices for public-key encryption (as does, e.g., [30]) but also is necessary.

On the Use of Computational Depth

We note that Antunes and Fortnow [3] elegantly used computational depth to connect the worst-case hardness of a problem, restricted to instances with small computational depth, to errorless average-case hardness over sampleable distributions; errorless hardness, however, is not sufficient for cryptographic applications. Nevertheless, inspired by the work of [3], worst-case hardness conditioned on instances with small computational depth was used in [26] (and, independently, using a variant of this notion, in [18]) to characterize one-way functions; additionally, an (interactive) variant of such a notion was also implicitly used in [6] to characterize key-exchange protocols. Our techniques are similar to those employed in [26, 6], but instead of applying them to study the hardness of a time-bounded Kolmogorov complexity problem (following [25]), we here apply them to study a learning-theory problem (namely, white-box learning).

We note that learning-theory problems conditioned on small computational depth were recently used in [18] to characterize one-way functions, but our techniques here are more similar to [26, 6]. In particular, [18] does not actually use the standard notion of computational depth but instead defines a new alternative variant; in contrast, we here rely on just the standard notion.

Relating Exact and Approximate White-Box Learning

Note that in Theorem 1 the equivalence holds for any (sufficiently small) choice of Δ; as such, we directly get as a corollary the equivalence of exact and approximate white-box learning:

Corollary 2.

Let ε > 0 be a constant, let Δ: {1}* → ℕ be any efficiently computable function such that Δ(1^n) ≤ 2^{n^{1−ε}}, and let t: ℕ → ℕ be any polynomial such that t(n) ≥ n^{1+ε}. Then the following are equivalent:

  • ExactWBLearn|CS^t ∉ ioFBPP

  • Δ-WBLearn|CS^t ∉ ioFBPP

Bounded-Degree Learning and LWE

While in the Δ-WBLearn problem we allow the function we are trying to learn to be any polynomial-size circuit, we may also consider a restricted version of the problem, denoted Δ-WBLearn_q^d, where we restrict attention to functions that can be computed by a degree-d polynomial, and where arithmetic is now over ℤ_q.

We first remark that our main theorem can next be generalized (basically using padding) to show that it suffices to use learning of degree-nϵ polynomials to characterize PKE:

Theorem 3.

Let ε > 0 be a constant, let Δ, q: {1}* → ℕ be efficiently computable functions with Δ(1^n) ≤ q(1^n)/4 and q(1^n) ≤ 2^{n^{1−ε}}, and let t: ℕ → ℕ be a polynomial such that t(n) ≥ n^{1+ε}. The following are equivalent:

  • PKE exists.

  • Δ-WBLearn|CS^t ∉ ioFBPP.

  • Δ-WBLearn_q^n|CS^t ∉ ioFBPP.

  • Δ-WBLearn_q^{n^ε}|CS^t ∉ ioFBPP.

Additionally, as informally discussed above, we note that the hardness of the LWE problem implies hardness of the Δ-WBLearn_q^1 problem (i.e., white-box learning of linear functions).

Lemma 4 (Informal).

Assuming the hardness of LWE, there exists a polynomial t such that Δ-WBLearn_q^1|CS^t ∉ ioFBPP, where Δ = q/4.

Lemma 4 thus shows that white-box learning of linear functions is at least as weak an assumption as LWE. At first sight, one might expect that a converse result also holds due to the known worst-case to average-case reductions for the LWE problem [30, 28, 9], and thus that LWE is equivalent to white-box learning of linear functions. However, the problem with proving the converse direction is that the known worst-case to average-case reductions for LWE only work when the LWE instance defines a lattice where the shortest vector is long compared to the amount of noise. Instances sampled from P may not necessarily satisfy this promise, and thus it is not clear how to use an LWE oracle to solve white-box learning of linear functions in general. Thus, hardness of even just Δ-WBLearn_q^1|CS^t is seemingly a weaker (and therefore more general) assumption than LWE. (Of course, it may be that a stronger worst-case to average-case reduction can be established for the LWE problem, in which case equivalence would hold.)

We finally investigate what happens in the regime of "intermediate-degree" polynomials. We remark that, using standard linearization techniques, the constant-degree problem is equivalent to the case of degree 1 (a code sketch of the linearization step follows the lemma):

Lemma 5.

For every constant d and functions t, q, Δ: ℕ → ℕ, there exist t′, q′, Δ′: ℕ → ℕ such that

Δ-WBLearn_q^d|CS^t ≤_p Δ′-WBLearn_{q′}^1|CS^{t′},

and

Δ-WBLearn_q^d ≤_p Δ′-WBLearn_{q′}^1.
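To illustrate the standard linearization step behind Lemma 5, the sketch below maps each monomial of degree at most d to a fresh coordinate, so that a degree-d polynomial over ℤ_q becomes a linear function of the expanded feature vector. The helper names and the degree-2 example are illustrative assumptions of ours, not the paper's notation.

```python
from itertools import combinations_with_replacement

def monomials(n, d):
    """All multisets of variable indices of size <= d (the empty multiset is the constant term)."""
    return [m for k in range(d + 1) for m in combinations_with_replacement(range(n), k)]

def linearize(y, d):
    """Expand an input y in Z_q^n into the vector of all its degree-<=d monomials."""
    feats = []
    for m in monomials(len(y), d):
        v = 1
        for i in m:
            v *= y[i]
        feats.append(v)
    return feats

# Example: the degree-2 polynomial 3*y0*y1 + 2*y2 + 5 over Z_7 becomes the linear
# function <w, linearize(y, 2)> mod 7 for a suitable coefficient vector w.
q, y = 7, [2, 3, 4]
w = {(): 5, (2,): 2, (0, 1): 3}
coeffs = [w.get(m, 0) for m in monomials(len(y), 2)]
lin = sum(c * f for c, f in zip(coeffs, linearize(y, 2))) % q
direct = (3 * y[0] * y[1] + 2 * y[2] + 5) % q
assert lin == direct
```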

Black-box Learning

Finally, to put our results in context, we consider the standard PAC learning model [35] where the learner only gets access to samples: let Δ-BBLearn denote the same problem as Δ-WBLearn, with the exception that the learner gets oracle access (i.e., black-box access) to the sampler (as opposed to white-box access). This notion is equivalent to the notion of improper Δ-approximate PAC learning of polynomial-size circuits (and, when Δ = 0, simply to improper PAC learning).

As before, let Δ-BBLearn|CS^t denote the problem Δ-BBLearn with the additional promise that cd^t(P, Ĉ) ≤ 2 log n, and let ExactBBLearn and ExactBBLearn|CS^t denote Δ-BBLearn and Δ-BBLearn|CS^t when Δ(1^n) = 0.

The following theorem can be viewed as the worst-case analog of the classic result of [21, 7] characterizing one-way functions through the hardness of average-case PAC learning.

Theorem 6.

Let ε > 0 be a constant, let Δ: {1}* → ℕ be any efficiently computable function such that Δ(1^n) ≤ 2^{n^{1−ε}}, and let t: ℕ → ℕ be any polynomial such that t(n) ≥ n^{1+ε}. Then the following are equivalent:

  • One-way functions exist

  • Δ-BBLearn|CS^t ∉ ioFBPP

As mentioned, [18] also recently obtained a worst-case characterization of one-way functions through a learning problem, using a notion of computational depth. The problems, however, are somewhat different. As opposed to [18], we here consider the standard PAC learning problem (whereas they consider a more general learning problem), and condition on the standard notion of low computational depth (whereas they condition on a new notion of low computational depth that they introduce). (Of course, on a conceptual level, these results are similar; the key point we are trying to make here is that exactly the same learning problem characterizes either one-way functions or PKE, depending on whether the learner gets black-box or white-box access to the sampler.)

We remark that our Theorem 6 not only differs from [21, 7] in the worst-case condition, but also generalizes those results in the sense that we handle the hardness of Δ-approximate learning for any Δ. As a consequence, we again get an equivalence of approximate and exact black-box learning:

Corollary 7.

Let ε > 0 be a constant, let Δ: {1}* → ℕ be any efficiently computable function such that Δ(1^n) ≤ 2^{n^{1−ε}}, and let t: ℕ → ℕ be any polynomial such that t(n) ≥ n^{1+ε}. Then the following are equivalent:

  • ExactBBLearn|CS^t ∉ ioFBPP

  • Δ-BBLearn|CS^t ∉ ioFBPP

Open Problems

We leave as an intriguing open problem the question of whether white-box learning of polynomial-degree, or even logarithmic-degree, polynomials can also be collapsed down to the case of constant-degree functions (and thus to linear functions); if this were possible, it would show that PKE, in essence, inherently requires the structure of the LWE problem.

Additionally, as discussed above, even when just restricting attention to learning linear functions, our learning problem generalizes LWE. It would appear that, despite this, cryptographic applications of LWE (e.g., to obtain fully homomorphic encryption [14, 10]) may nevertheless still be possible from this generalized version; we leave an exploration of this for future work.

Finally, it is an intriguing open problem to relate black-box and white-box learning. By our results, doing so is equivalent to relating public-key and secret-key encryption. We note that relating black-box and white-box learning is interesting even just for the case of linear functions (which by our results is equivalent to O(1)-degree polynomials). Indeed, Regev's construction of a PKE [30] can be thought of as a reduction from black-box learning to white-box learning of linear functions for a specific distribution; it is possible that a similar reduction may be applicable more generally.

1.2 Proof Overview

We here provide a detailed overview of the proof of Theorem 1. For simplicity, we will show the equivalence between PKE and the worst-case hardness of the exact version, ExactWBLearn|CS^t. We start with the construction of PKE based on the hardness of ExactWBLearn|CS^t.

Weak PKE

First, we use the well-known fact that a PKE is simply a two-round key-agreement protocol. Moreover, by the key-agreement amplification theorem of Holenstein [19] and an application of the Goldreich-Levin theorem [15], to obtain (full-fledged) PKE it suffices to obtain a weak form of two-round key-agreement, which we simply refer to as weak PKE, defined as follows: there exists some ε = 1/poly such that agreement between A and B happens with probability 1 − ε. Security requires that Eve cannot guess the key (output by Alice) with probability better than 1 − 20ε.

The Weak PKE protocol

We will next define the weak PKE protocol. We note that this construction resembles the universal key-agreement construction from [16], but with some crucial differences that enable our security proof. The parties A and B, on input 1^n, perform the following steps:

  • Sample random program: A samples a random length λ ← [2n], and a random program Π of length λ.

  • Run random program: Next, A runs the program Π for at most t(n) steps to get an output, and interprets the output as a pair of circuits P: {0,1}^r → {0,1}^k × {0,1}^ℓ and C: {0,1}^k → {0,1}^ℓ. (Think of P as the sampler for a white-box learning instance, and of C as a potential solution to the problem.) If the output of Π is not a valid encoding of such a pair, A sets P = P_0 and C = C_0 for two fixed circuits P_0, C_0 that always output 0.

  • Agreement estimation: A estimates the "agreement probability" of P and C (i.e., checks whether C is indeed a good solution): it samples n^{20} random inputs w_1, …, w_{n^{20}} for P, and computes (x_i, s_i) = P(w_i). It then lets α̂ = Pr_{i←[n^{20}]}[C(x_i) = s_i]. If α̂ ≤ 1 − n^{−9}, A resets the pair (P, C) = (P_0, C_0).

  • First message: A sends (the “sampler”) P to B.

  • Second message: B applies P to a random input w, computing (x, s) = P(w). It then sends x to A.

  • Outputs: A outputs C(x) and B outputs s.

Agreement

We claim that with probability 1 − 1/n^8, Alice and Bob will agree (i.e., their final outputs are the same). Note that if (P, C) = (P_0, C_0), Alice and Bob always agree. Moreover, let α = Pr_{(x,s)←P(U_r)}[C(x) = s]. Then the probability that Alice and Bob agree, given that Alice uses (P, C) as the circuits in the protocol, is exactly α. We observe that by the Chernoff bound, the probability that α̂ is far from α is small, and thus Alice uses (P, C) only when α is larger than 1 − n^{−8}.

Security

We claim that an Eve that can guess s with probability 1 − n^{−7} can be used to solve ExactWBLearn|CS^t. In more detail, consider such an Eve, and let P be an input for ExactWBLearn|CS^t. The idea is to construct an algorithm Learner that, on input P, outputs a circuit C such that C(x) simply simulates Eve on the messages P and x, and then outputs Eve's output. For simplicity, in the following we assume that Eve (and thus Learner) is deterministic. (We note that this is a non-black-box reduction: we are using the code of Eve to generate C; in particular, we are including the code of Eve in this circuit. This particular non-black-box usage of Eve is not inherent: we could have considered a different formalization of the learning-theory problem which simply requires the attacker to succeed on a randomly sampled instance. Subsequent parts of the argument, however, will use non-black-box access to Eve more inherently.)

Let 𝐏 be a random variable distributed according to the same distribution of the first message in the above protocol. By assumption, we have that

Pr[C = Learner(𝐏); (x, s) ← 𝐏(U_r) : C(x) = s] ≥ 1 − n^{−7}.

It follows by a simple averaging argument that

Pr_{𝐏, C = Learner(𝐏)}[ Pr_w[(x, s) ← 𝐏(w) : C(x) = s] ≥ 2/3 ] ≥ 1 − 3n^{−7}.

Namely, 𝖫𝖾𝖺𝗋𝗇𝖾𝗋 solves 𝖤𝗑𝖺𝖼𝗍𝖶𝖡𝖫𝖾𝖺𝗋𝗇 with probability at least 13n7 over the distribution of 𝐏. We next use ideas from [26, 6] to show this implies that 𝖫𝖾𝖺𝗋𝗇𝖾𝗋 solves 𝖤𝗑𝖺𝖼𝗍𝖶𝖡𝖫𝖾𝖺𝗋𝗇|𝖢𝖲t in the worst case.

Indeed, assume that Learner fails on some instance P of ExactWBLearn|CS^t. By the promise of the problem, there exists a circuit Ĉ such that cd^t(P, Ĉ) ≤ 2 log n, and Ĉ agrees with P with probability at least 1 − n^{−10}. Let ℓ = K^t(P, Ĉ). Our goal is to show that K(P, Ĉ) < ℓ − 2 log n, which is a contradiction.

Toward this goal, let 𝒮 be the set of all pairs of circuits (P′, Ĉ′) with K^t(P′, Ĉ′) = ℓ that agree with probability at least 1 − n^{−10}, and on which Learner fails, so that (P, Ĉ) ∈ 𝒮. Fix (P′, Ĉ′) ∈ 𝒮, and let Π be the length-ℓ program that outputs (P′, Ĉ′) in time t. Observe that the probability that A samples (P′, Ĉ′) in the first step of the above protocol is at least its probability of sampling the program Π, which is 1/(2n · 2^ℓ). Since (P′, Ĉ′) agree with high probability, the agreement-estimation test in the third step of the protocol will pass with high probability, and thus A will send P′ to B with probability at least 1/(4n · 2^ℓ). In other words,

Pr[𝐏 = P′] ≥ 1/(4n · 2^ℓ)

for every (P′, Ĉ′) ∈ 𝒮, and thus

Pr[(𝐏, ·) ∈ 𝒮] ≥ |𝒮| · 1/(4n · 2^ℓ)

(here we say that (P′, ·) ∈ 𝒮 if there is some C such that (P′, C) ∈ 𝒮).

On the other hand, by the definition of 𝒮, Learner fails on every (P′, Ĉ′) ∈ 𝒮. Since Learner fails with probability at most 3n^{−7} on 𝐏, it must hold that

3n^{−7} ≥ Pr[(𝐏, ·) ∈ 𝒮].

Combining the above, we get that

|𝒮| ≤ 12 n^{−6} · 2^ℓ.

We can now use the bound on |𝒮| to bound the Kolmogorov complexity of (P, Ĉ). To describe (P, Ĉ), it is enough to describe the set 𝒮 and the index of the pair (P, Ĉ) in this set. That is,

K(P, Ĉ) ≤ K(𝒮) + log|𝒮| + O(log log n) ≤ K(𝒮) + ℓ − 6 log n + O(log log n).

We conclude the proof with the observation that to describe 𝒮 it is enough to describe n (which can be done using log n + O(log log n) bits), ℓ (log n bits), and Eve (which can be described with constantly many bits; this non-black-box usage of Eve, which is taken from [26, 6], is seemingly inherent to our proof technique; note that we are here relying on the fact that Eve is a uniform algorithm, but as we discuss in the formal section, the argument can be extended to work also in the non-uniform setting). Thus, K(𝒮) ≤ 3 log n, and we get that

K(P, Ĉ) < ℓ − 2 log n,

as we wanted to show.

Hardness of ExactWBLearn|CS^t from PKE

We next show that the existence of PKE implies the hardness of ExactWBLearn|CS^t; we now sketch the proof. Let Gen be the algorithm that generates a pair (pk, sk) of public and secret keys, and let Enc, Dec be the encryption and decryption algorithms. For a random pair of keys (pk, sk) = Gen(r), we construct two circuits, P(w, s) = (x = Enc_pk(s; w), s) and Ĉ(x) = Dec_sk(x).

By the security of the PKE scheme, it follows that with high probability over the randomness of Gen, it is hard to learn the function computed by Ĉ, as the circuit P only uses the public key, and the function Ĉ computes is the decryption of a random encryption. This already implies that ExactWBLearn is hard, but we still need to show that P is inside the promise of the problem ExactWBLearn|CS^t.

It follows by the correctness of the PKE scheme that Ĉ computes the function that is sampled by P. Thus, to be in the promise of ExactWBLearn|CS^t, we only need that (P, Ĉ) has small computational depth. This, however, is not necessarily the case. To solve this, we simply pad Ĉ with the randomness used by Gen. That is, we let Ĉ′ be a circuit functionally equivalent to Ĉ, with r encoded into it, where r is such that Gen(r) = (pk, sk). It follows that when t is large enough, K^t(P, Ĉ′) ≤ |r| + O(1) (as we can describe them by simply describing the randomness and the algorithm Gen). On the other hand, K(P, Ĉ′) ≥ K(r) (since r can be obtained from Ĉ′), which is at least |r| − O(1) with high probability. Together, we conclude that with high probability over the randomness of Gen, the circuits (P, Ĉ′) are in the promise of ExactWBLearn|CS^t.
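The following Python sketch illustrates this direction, building the sampler P and the padded decryption circuit from a generic PKE given as three callables (gen, enc, dec). The bit-length parameter and the way the randomness r is attached as inert payload are illustrative assumptions of ours, not the paper's encoding.

```python
import random

def wblearn_instance_from_pke(gen, enc, dec, n):
    """Build a white-box learning instance from a PKE (sketch).

    gen(r) -> (pk, sk), enc(pk, s, w) -> ciphertext, dec(sk, x) -> s are assumed
    callables; r, s, w are integers standing in for bit strings of length n."""
    r = random.getrandbits(n)          # randomness of Gen
    pk, sk = gen(r)

    def P(w, s):
        # The sampler: a labeled sample is (encryption of s, the plaintext s).
        return enc(pk, s, w), s

    def C_hat(x):
        # The promised solution: decryption with the secret key.
        return dec(sk, x)

    # "Pad" C_hat with r so that (P, C_hat) has small computational depth:
    # attaching r keeps C_hat functionally equivalent while making the pair
    # essentially incompressible (K >= |r| - O(1) w.h.p.) yet easy to produce
    # in time t from r and the code of Gen (K^t <= |r| + O(1)).
    C_hat.padding = r
    return P, C_hat
```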

Comparison with [6]

We remark that, at a high level, our construction of a two-message key-agreement protocol shares many similarities with the key-agreement protocol developed in [6], which relies on the hardness of an interactive notion of time-bounded Kolmogorov complexity, conditioned on an analog of computational shallowness. The central difference is that the protocol in [6] requires at least 3 rounds due to the use of an equality-check protocol to determine whether Alice and Bob managed to agree on a key. In contrast, our protocol does not rely on such an equality-check step (and thus it can be executed in only two rounds, which is crucial to get PKE); indeed, we replaced the equality protocol with a step where Alice, on her own, can determine whether the message she sends will enable agreement to happen. Of course, the reason why we can do this is that we are reducing security from a different problem.

Other features of the protocol are quite similar; this is because we also rely on the worst-case hardness of a problem “conditioning on computationally shallow instances”. Indeed, as described above, our security proof shares many features with those of [25, 6].

Comparison with Universal Constructions

Universal constructions of PKE are known (see [16]); that is, constructions having the property that they are secure if (any) PKE exists. We emphasize that while the details of the protocol from [16] are somewhat different, the "agreement estimation" step performed there is very similar to what we do. Furthermore, we can interpret our protocol as an alternative (variant) universal PKE protocol in which Alice chooses a random (key-generation) program Gen and executes it to get an encryption algorithm Enc_pk and a decryption algorithm Dec_sk, with the keys pk, sk hardcoded into them. (In the protocol of [16], Alice would instead choose random programs for Alice and Bob to run; we do not know how to prove the security of such a protocol under our assumption.) Alice then estimates the agreement probability of the scheme, and if it is high enough, she sends the encryption algorithm as the public key, and uses the decryption algorithm as the private key. If PKE exists, then with noticeable probability over the choice of Gen, this scheme will be secure. We emphasize that since we base the security of our protocol (and thus also the above universal one) on the hardness of the white-box learning problem, it enables an approach for measuring the concrete security of the protocol by relating it to the security of the learning-theory problem.

2 Preliminaries

2.1 Notations

All logarithms are taken in base 2. We use calligraphic letters to denote sets and distributions, bold uppercase for random variables, and lowercase for values and functions. Let poly stand for the set of all polynomials. Let ppt stand for probabilistic poly-time, and n.u.-poly-time stand for non-uniform poly-time. An n.u.-poly-time algorithm A is equipped with a (fixed) poly-size advice string set {z_n}_{n∈ℕ} (that we typically omit from the notation), and we let A_n stand for A equipped with the advice z_n (used for inputs of length n). For a randomized algorithm A, we denote by A(·; r) the algorithm A with fixed randomness r ∈ {0,1}^*. Let neg stand for a negligible function. Given a vector v ∈ Σ^n, let v_i denote its ith entry, let v_{<i} = (v_1, …, v_{i−1}) and v_{≤i} = (v_1, …, v_i). Similarly, for a set ℐ ⊆ [n], let v_ℐ be the ordered sequence (v_i)_{i∈ℐ}. For x, y ∈ {0,1}^*, we use x||y to denote the concatenation of x and y. For a set 𝒮 ⊆ {0,1}^*, we use 𝒮||y to denote the set {x||y : x ∈ 𝒮}.

2.2 Distributions and Random Variables

When unambiguous, we will naturally view a random variable as its marginal distribution. The support of a finite distribution 𝒫 is defined by Supp(𝒫) := {x : Pr_𝒫[x] > 0}. For a (discrete) distribution 𝒫, let x ← 𝒫 denote that x was sampled according to 𝒫. Similarly, for a set 𝒮, let x ← 𝒮 denote that x is drawn uniformly from 𝒮. For m ∈ ℕ, we use U_m to denote a uniform random variable over {0,1}^m (that is independent from other random variables in consideration). The statistical distance (also known as variation distance) of two distributions 𝒫 and 𝒬 over a discrete domain 𝒳 is defined by SD(𝒫, 𝒬) := max_{𝒮⊆𝒳} |𝒫(𝒮) − 𝒬(𝒮)| = (1/2) Σ_{x∈𝒳} |𝒫(x) − 𝒬(x)|.

The following lemma is proven in the full version of this paper.

Lemma 8.

There exists an efficient oracle-aided algorithm A such that the following holds. Let (X, Y) be a pair of jointly distributed random variables over {0,1}^* × {0,1}^n, and let R be a uniform independent random variable over {0,1}^n. Let E be an algorithm such that Pr[E(X, R) = ⟨Y, R⟩] ≥ 1 − ε, for ε ≥ 0. Then Pr[A^E(1^n, X) = Y] ≥ 1 − 8ε − neg(n).

2.3 Circuits

In this paper we consider circuits over the De Morgan basis, which contains the following gates: ∧ ("and" gate with fan-in 2), ∨ ("or" gate with fan-in 2), and ¬ ("not" gate with fan-in 1). The size of a circuit C is the number of gates in C.

We consider an encoding of a circuit as a string over {0,1} in the following natural way: Given a circuit C, we first encode the size of C using a prefix-free encoding, and then, for every gate g according to a topological order, we encode its type (input, output, ∧, ∨, or ¬) and the (up to two) gates wired into g.

Observe that every circuit C of size s can be encoded as a string of length O(s log s). Moreover, given the encoding and an input x to C, C(x) can be computed efficiently (in time polynomial in the encoding length).

We identify a string in {0,1}^* with the circuit it encodes. For example, for C ∈ {0,1}^n we use C(x) to denote the value of the circuit encoded by C applied on the input x. We denote by |C| the length of the encoding of C.

Finally, since we encode the size of C first, we may assume that for every encoding of a circuit C and every z ∈ {0,1}^*, the string C||z encodes the same circuit as C. (We actually only use the fact that we can encode z into the circuit. This can be done by adding dummy gates that do not change the output of C, where each gate g_i is either ∧ or ∨ according to the ith bit of z.)
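As a concrete (and purely illustrative) reading of this representation, the sketch below stores a De Morgan circuit as a topologically ordered list of gates and evaluates it; the exact serialization (prefix-free length, bit layout) is omitted, and the data format is an assumption of ours, not the paper's.

```python
# Each gate is a tuple: ("input", i), ("not", g), ("and", g1, g2), or ("or", g1, g2),
# listed in topological order; the last gate is taken to be the output.
def eval_circuit(gates, x):
    vals = []
    for gate in gates:
        kind = gate[0]
        if kind == "input":
            vals.append(x[gate[1]])
        elif kind == "not":
            vals.append(1 - vals[gate[1]])
        elif kind == "and":
            vals.append(vals[gate[1]] & vals[gate[2]])
        elif kind == "or":
            vals.append(vals[gate[1]] | vals[gate[2]])
    return vals[-1]

# Example: XOR of two bits, (x0 AND NOT x1) OR (NOT x0 AND x1).
xor = [("input", 0), ("input", 1), ("not", 0), ("not", 1),
       ("and", 0, 3), ("and", 2, 1), ("or", 4, 5)]
assert [eval_circuit(xor, [a, b]) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 1, 1, 0]
```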

2.4 Entropy

For a random variable X, let H(X) = E_{x←X}[log(1/Pr[X = x])] denote the (Shannon) entropy of X, and let

H_∞(X) = min_{x∈Supp(X)} log(1/Pr[X = x])

denote the min-entropy of X. For a random variable X and an event E, we use H_∞(X | E) to denote the min-entropy of the conditional distribution X|_E. We will use the following fact.

Fact 9.

Let X and Y be independent random variables. Then H_∞(X, Y) = H_∞(X) + H_∞(Y).

2.5 Complexity Classes

We define the complexity classes ioFBPP and ioP/poly.

Definition 10 (Infinitely-often FBPP (𝗂𝗈𝖥𝖡𝖯𝖯)).

A binary relation ℛ ⊆ {0,1}^* × {0,1}^* is in ioFBPP if there exists a ppt algorithm A such that the following holds for infinitely many n's: for every x ∈ {0,1}^n such that there exists y ∈ {0,1}^* with (x, y) ∈ ℛ,

Pr[(x, A(x, 1^k)) ∈ ℛ] ≥ 1 − 1/k

for every k > 0. ℛ is in ioP/poly if the above holds with respect to n.u.-poly-time algorithms A.

2.6 One-Way Function

Definition 11 (One-way function).

A polynomial-time computable function f: {0,1}^* → {0,1}^* is called a one-way function if for every polynomial-time algorithm A,

Pr_{x←{0,1}^n}[A(1^n, f(x)) ∈ f^{−1}(f(x))] = neg(n).
Definition 12 (Weak one-way function).

Let m ∈ poly be a polynomial-time computable function. A polynomial-time computable function f: {0,1}^{m(n)} → {0,1}^* is called an α-weak one-way function if for every polynomial-time algorithm A and every large enough n,

Pr_{x←{0,1}^{m(n)}}[A(1^n, f(x)) ∈ f^{−1}(f(x))] ≤ 1 − α(n).

f is a weak one-way function if it is a 1/p-weak one-way function for some p ∈ poly.

Theorem 13 (Weak to strong OWFs, [36]).

One-way functions exist if and only if weak-one way functions exist.

2.7 Public-Key Encryption

Definition 14 (Public-key encryption scheme (PKE)).

A triplet of randomized, efficiently computable functions (𝖦𝖾𝗇,𝖤𝗇𝖼,𝖣𝖾𝖼) is a (α(n),β(n))-public-key encryption scheme (PKE) if the following holds:

  • Correctness: For every large enough n and any m ∈ {0,1},

    Pr_{(sk,pk)←Gen(1^n)}[Dec(sk, Enc(pk, m)) = m] ≥ 1 − α(n).

  • Security: For every ppt Eve, for every large enough n,

    Pr_{(sk,pk)←Gen(1^n), m←{0,1}}[Eve(pk, Enc(pk, m)) = m] ≤ 1/2 + β(n).

Such a scheme is a PKE if it is a (1/n^c, 1/n^c)-PKE for every c ∈ ℕ.

The following lemma shows that it is possible to amplify a 1-bit weak key-agreement protocol into a (full-fledged) key agreement. This lemma is a special case of the more general result of [19].

Lemma 15 (PKE amplification, [19]).

The following holds for all constants c_1 > c_2: assume there exists an (n^{−c_1}, 1/2 − n^{−c_2})-PKE. Then there exists a PKE.

We also define weak-PKE.

Definition 16 (Weak Public-key encryption scheme (weak-PKE)).

For an efficiently computable function d: {1}* → ℕ, a triplet of randomized, efficiently computable functions (Gen, Enc, Dec) is an (α(n), β(n), γ(n))-weak public-key encryption scheme (weak-PKE) if the following holds:

  • For every n, given pk and randomness r, Enc(pk; r) outputs a message m(pk; r) and an output o(pk; r).

  • Correctness: For every large enough n,

    Pr_{(sk,pk)←Gen(1^n), r}[|Dec(sk, m(pk; r)) − o(pk; r)| ≤ d(1^n)] ≥ 1 − α(n).

  • Security: For every ppt Eve, for every large enough n,

    Pr_{(sk,pk)←Gen(1^n), r}[|Eve(pk, m(pk; r)) − o(pk; r)| ≤ γ(n) · d(1^n)] ≤ β(n).

In the full version of this paper we prove the following lemma, stating that weak-PKE can be used to construct PKE.

Lemma 17 (Weak-PKE amplification).

The following holds for all constants c_1 > c_2: assume there exists an (n^{−c_1}/2, 1 − 10n^{−c_2}, 2n^{c_1})-weak-PKE. Then there exists a PKE.

2.8 Kolmogorov Complexity and Computational Depth

Roughly speaking, the t-time-bounded Kolmogorov complexity K^t(x) of a string x ∈ {0,1}^* is the length of the shortest program Π = (M, y) such that, when simulated by a universal Turing machine, Π outputs x within t(|x|) steps. Here, a program Π is simply a pair of a Turing machine M and an input y, where the output of Π is defined as the output of M(y). When there is no running-time bound (i.e., the program can run for an arbitrary number of steps), we obtain the notion of Kolmogorov complexity.

In the following, fix universal TM 𝖴 with polynomial simulation overhead, and let 𝖴(Π,1t) denote the output of Π when emulated on 𝖴 for t steps. We now define the notion of Kolmogorov complexity with respect to the universal TM 𝖴.

Definition 18.

Let t be a polynomial. For all x ∈ {0,1}^*, define the t-bounded Kolmogorov complexity of x as

K^t(x) = min_{Π∈{0,1}^*} {|Π| : U(Π, 1^{t(|x|)}) = x},

where |Π| is referred to as the description length of Π. When there is no time bound, the Kolmogorov complexity of x is defined as

K(x) = min_{Π∈{0,1}^*} {|Π| : ∃t ∈ ℕ, U(Π, 1^t) = x}.

The computational depth of x [4], denoted cd^t(x), is defined to be

cd^t(x) = K^t(x) − K(x).

We use K(x, y) to denote the Kolmogorov complexity of some generic self-delimiting encoding of the pair (x, y). Recall that we use K(x||y) to denote the complexity of the concatenation of x and y. We will use the following well-known fact:

Fact 19.

For every x, y ∈ {0,1}^*,

K(x, y) ≤ K(x) + K(y) + log(K(x)) + 2 log log(K(x)) + O(1).

We will also use the following bound on the Kolmogorov complexity of strings sampled from distributions with high min-entropy.

Lemma 20.

For every n ∈ ℕ and every distribution 𝒟, it holds that

Pr_{x←𝒟}[K(x) ≥ H_∞(𝒟) − log n] ≥ 1 − 1/n.

Proof.

There are at most 2^{H_∞(𝒟) − log n} strings x with K(x) < H_∞(𝒟) − log n. Thus,

Pr_{x←𝒟}[K(x) < H_∞(𝒟) − log n] ≤ 2^{H_∞(𝒟) − log n} · 2^{−H_∞(𝒟)} = 1/n.

We will also use the following well-known concentration bound in our proof.

Fact 21 (Hoeffding’s inequality).

Let A_1, …, A_n be independent random variables with A_i ∈ {0,1}. Let Â = (1/n) · Σ_{i=1}^n A_i and μ = E[Â]. For every ε ∈ [0,1] it holds that:

Pr[|Â − μ| ≥ ε] ≤ 2e^{−ε²n}.

3 White-Box Distributional Learning

Let P: {0,1}^r → {0,1}^k × {0,1}^ℓ be a circuit that, given r bits of randomness, samples labeled instances (x, s). In the following we view s as the binary representation of a number in ℤ (and, respectively, all the operations below are over ℤ, and we use |·| to denote absolute value). We define the set of all circuits that approximate P with high probability:

Comp_ε^Δ(P) = {C: {0,1}^k → {0,1}^ℓ : Pr_{(x,s)←P(U_r)}[|C(x) − s| ≤ Δ] ≥ 1 − ε}.

We define the following white-box learning problem WBLearn:

Definition 22 (Δ-WBLearn).

For a function Δ: {1}* → ℕ, let Δ-WBLearn be the following learning problem:

  • Input: A circuit P ∈ {0,1}^n, with the promise that there exists a circuit Ĉ ∈ {0,1}^n such that Ĉ ∈ Comp_{n^{−10}}^{Δ(1^n)/n^{10}}(P).

  • Output: A circuit C ∈ Comp_{1/3}^{Δ(1^n)}(P).

In this work we focus on inputs P for which the circuit Ĉ that agrees with P with high probability is such that the description of (P, Ĉ) is computationally shallow. Formally, for a time function t: ℕ → ℕ, we denote by Δ-WBLearn|CS^t the problem Δ-WBLearn with the additional promise that cd^t(P, Ĉ) ≤ 2 log n.

We use ExactWBLearn and ExactWBLearn|CS^t to denote Δ-WBLearn and Δ-WBLearn|CS^t when Δ(1^n) = 0. Note that ExactWBLearn and Δ-WBLearn|CS^t are incomparable: while in Δ-WBLearn|CS^t we only need to find a circuit that approximates P, the promise on the instances of ExactWBLearn is stronger. Yet, there are simple reductions from the exact problems to the approximate ones:

Lemma 23.

For every Δ: {1}* → ℕ such that Δ ≤ 2^{o(n/log n)}, it holds that

ExactWBLearn ≤_p Δ-WBLearn.

Similarly, for any such Δ and a function t: ℕ → ℕ, there exists t′: ℕ → ℕ such that

ExactWBLearn|CS^t ≤_p Δ-WBLearn|CS^{t′}.
Proof.

We start with the first reduction, ExactWBLearn ≤_p Δ-WBLearn. Given a circuit P: {0,1}^r → {0,1}^k × {0,1}^ℓ of length n, the reduction outputs a circuit P′: {0,1}^r → {0,1}^k × {0,1}^{ℓ+n}, which is identical to P except for n additional output gates that always output 0. That is, P′(w) = (x, s′), where P(w) = (x, s) and s′ = 2^n · s. As we only added n gates to P, P′ can be encoded using O(n log n) bits. Using padding, we may assume that |P′| = Θ(n log n).

For correctness, observe that if there exists Ĉ ∈ Comp_{n^{−10}}^0(P), then there exists a corresponding Ĉ′ ∈ Comp_{n^{−10}}^0(P′) (where Ĉ′ is defined from Ĉ in the same way P′ is defined from P). Moreover, an approximation of the output s′ of P′ within distance 2^n/4 is equivalent to the exact output s of P (note that, by the assumption on Δ and since |P′| = Θ(n log n), we have Δ(1^{|P′|}) ≤ 2^{o(n)} ≤ 2^n/4 for large enough n). Indeed, given a 2^n/4-approximation of s′, we can recover s by simply dividing by 2^n and rounding to the closest integer.

Finally, for the second reduction, it is enough to show that cd^{t′}(P′, Ĉ′) ≤ 2 log|P′|. Since (P′, Ĉ′) can be efficiently constructed given (P, Ĉ), and similarly (P, Ĉ) can be efficiently constructed given (P′, Ĉ′), it holds that for a large enough polynomial t′,

cd^{t′}(P′, Ĉ′) ≤ cd^t(P, Ĉ) + O(1) ≤ 2 log n + O(1) ≤ 2 log n + 2 log log n ≤ 2 log|P′|.
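The reduction is simple enough to state as code; the sketch below applies it to samplers given as Python callables (rather than encoded circuits), which is an illustrative simplification of ours.

```python
def pad_sampler(P, n):
    """Exact -> approximate reduction: scale labels by 2^n (append n zero bits)."""
    def P_prime(w):
        x, s = P(w)
        return x, s * (2 ** n)
    return P_prime

def recover_exact_label(approx_label, n):
    """A 2^n/4-approximation of the scaled label determines the original label exactly."""
    return round(approx_label / (2 ** n))

# Example: any value within 2^n/4 of 2^n * s rounds back to s.
n, s = 16, 37
assert recover_exact_label(s * 2 ** n + 2 ** n // 4 - 1, n) == s
assert recover_exact_label(s * 2 ** n - 2 ** n // 4 + 1, n) == s
```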

We prove the following theorem.

Theorem 24.

Let ε > 0 be a constant, let Δ: {1}* → ℕ be any efficiently computable function such that Δ(1^n) ≤ 2^{n^{1−ε}}, and let t: ℕ → ℕ be any polynomial such that t(n) ≥ n^{1+ε}. Then the following are equivalent:

  • PKE exists

  • Δ-WBLearn|CS^t ∉ ioFBPP

As a corollary, we get a result of independent interest relating exact and approximate white-box learning:

Corollary 25.

Let ε > 0 be a constant, let Δ: {1}* → ℕ be any efficiently computable function such that Δ(1^n) ≤ 2^{n^{1−ε}}, and let t: ℕ → ℕ be any polynomial such that t(n) ≥ n^{1+ε}. Then the following are equivalent:

  • ExactWBLearn|CS^t ∉ ioFBPP

  • Δ-WBLearn|CS^t ∉ ioFBPP

Theorem 24 follows by the following two theorems.

Theorem 26.

Let t(n) be a polynomial and Δ: {1}* → ℕ an efficiently computable function. If Δ-WBLearn|CS^t ∉ ioFBPP, then PKE exists.

Theorem 27.

Assume there exists a PKE. Then for any constant ε > 0, any polynomial t(n) ≥ n^{1+ε}, and any efficiently computable Δ: {1}* → ℕ such that Δ(1^n) ≤ 2^{n^{1−ε}}, it holds that Δ-WBLearn|CS^t ∉ ioFBPP.

Theorem 26 is proven in Section 4, and Theorem 27 is proven in the full version of this paper.

4 Worst-Case Hardness of Δ-WBLearn|CS^t Implies PKE

In this section we prove Theorem 26, which states that the worst-case hardness of Δ-WBLearn|CS^t implies the existence of a public-key encryption scheme. Throughout this section, we write 𝒬_Δ^t for the set of instances satisfying the promise of Δ-WBLearn|CS^t. In the following, let C_0: {0,1} → {0,1} be the circuit that always outputs 0, and P_0: {0,1} → {0,1} × {0,1} be the circuit that always outputs (0, 0).

To prove the above theorem, fix t=t(n) and Δ=Δ(1n), and consider the following scheme (𝖦𝖾𝗇,𝖤𝗇𝖼,𝖣𝖾𝖼):

Algorithm 28 (𝖦𝖾𝗇).

Parameter: a function t: ℕ → ℕ.

Input: 1^n.

Operation:

  1. Sample λ ← [3n] and Π ← {0,1}^λ.

  2. Run Π for t(2n) steps to get circuits C: {0,1}^k → {0,1}^ℓ and P: {0,1}^r → {0,1}^k × {0,1}^ℓ for some k, r, ℓ ∈ ℕ. If the output of Π is not two such circuits, set C = C_0, P = P_0, and r = k = ℓ = 1.

  3. Randomly sample (x_1, s_1), …, (x_{n^{20}}, s_{n^{20}}) ← P(U_r), and compute

     α̂ = Pr_{i←[n^{20}]}[|C(x_i) − s_i| ≤ Δ(1^n)/n^{10}].

     If α̂ < 1 − 2n^{−8}, reset C = C_0 and P = P_0.

  4. Output (k, ℓ, C) as the secret key, and (r, k, ℓ, P) as the public key.

Algorithm 29 (𝖤𝗇𝖼).

Input: public key (r, k, ℓ, P: {0,1}^r → {0,1}^k × {0,1}^ℓ).

Operation:

  1. Sample randomness z ← {0,1}^r.

  2. Compute P(z) to get (x, s).

  3. Output m = x and o = s.

Algorithm 30 (𝖣𝖾𝖼).

Input: secret key (k, ℓ, C: {0,1}^k → {0,1}^ℓ), ciphertext x ∈ {0,1}^k.

Operation:

  1. Compute C(x) = s.

  2. Output s.
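For intuition, here is a minimal Python sketch of Algorithms 28-30 in which circuits are replaced by Python callables and the "sample and run a random program" step is stubbed out by a hypothetical callback sample_candidate_pair(n); the number of test samples and all other concrete choices are scaled-down placeholders of ours, not the parameters used in the analysis.

```python
import random

def gen(n, sample_candidate_pair, Delta, trials=1000):
    """Key generation (sketch of Algorithm 28). sample_candidate_pair(n) is an
    assumed helper that returns (C, P, r) -- a candidate solution, a sampler,
    and the sampler's randomness length -- or None for a malformed program."""
    C0, P0 = (lambda x: 0), (lambda w: (0, 0))        # the all-zero circuits
    cand = sample_candidate_pair(n)
    C, P, r = cand if cand is not None else (C0, P0, 1)
    # Agreement estimation: empirically check Pr[|C(x) - s| <= Delta / n^10].
    hits = 0
    for _ in range(trials):
        x, s = P(random.getrandbits(r))
        hits += (abs(C(x) - s) <= Delta / n ** 10)
    if hits / trials < 1 - 2 * n ** -8:
        C, P, r = C0, P0, 1
    return C, (r, P)                                   # (secret key, public key)

def enc(pk):
    r, P = pk
    x, s = P(random.getrandbits(r))
    return x, s                                        # message m = x, output o = s

def dec(sk, x):
    return sk(x)
```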

Observe that the size of the circuits C and P sampled by 𝖦𝖾𝗇(1n) is at most t(2n). Thus, all of the above algorithms can be implemented efficiently. Below we bound the correctness and the security of the above scheme. For every n, let (𝐊n,𝐋n,𝐂n) and (𝐑n,𝐊n,𝐋n,𝐏n) be the random variables distributed according to the secret and public keys (k,,C),(r,k,,P) in a random execution of 𝖦𝖾𝗇(1n). Let 𝐌n and 𝐎n be the output of 𝖤𝗇𝖼(𝐑n,𝐊n,𝐋n,𝐏n) in a random execution.

4.1 Correctness

We start by analyzing the correctness probability of the scheme.

Lemma 31.

For every n, it holds that

Pr[|Dec((K_n, L_n, C_n), M_n) − O_n| ≤ Δ(1^n)/n^{10}] ≥ 1 − n^{−7}.

To prove Lemma 31 we will use the following simple claim, which is immediate from the Hoeffding bound.

Claim 32.

Let C: {0,1}^k → {0,1}^ℓ and P: {0,1}^r → {0,1}^k × {0,1}^ℓ be two circuits. Let A be a random variable distributed according to the following process: sample (x_1, s_1), …, (x_{n^{20}}, s_{n^{20}}) ← P(U_r), and let A = Pr_{i←[n^{20}]}[|C(x_i) − s_i| ≤ Δ]. Let μ = Pr_{(x,s)←P(U_r)}[|C(x) − s| ≤ Δ]. Then Pr[|A − μ| ≥ n^{−9}] ≤ 2^{−n}.

Proof.

Immediate from Fact 21.

Proof of Lemma 31.

When Gen(1^n) outputs C_0 and P_0, the scheme has perfect correctness. Moreover, when the secret and public keys are (k, ℓ, C) and (r, k, ℓ, P), it holds that

Pr[|Dec((k, ℓ, C), M_n) − O_n| ≤ Δ(1^n)/n^{10}] = Pr_{(x,s)←P(U_r)}[|C(x) − s| ≤ Δ(1^n)/n^{10}].

Thus,

Pr[|Dec((K_n, L_n, C_n), M_n) − O_n| ≤ Δ(1^n)/n^{10}]
  ≥ (1 − 3n^{−8}) · Pr_{R_n,K_n,P_n,C_n}[ Pr_{(x,s)←P_n(U_{R_n})}[|C_n(x) − s| ≤ Δ(1^n)/n^{10}] ≥ 1 − 3n^{−8} ]
  ≥ 1 − 3n^{−8} − Pr_{R_n,K_n,P_n,C_n}[ Pr_{(x,s)←P_n(U_{R_n})}[|C_n(x) − s| ≤ Δ(1^n)/n^{10}] < 1 − 3n^{−8} ].

The lemma now follows since, by Claim 32, the probability that circuits P, C with

Pr_{(x,s)←P(U_r)}[|C(x) − s| ≤ Δ(1^n)/n^{10}] < 1 − 3n^{−8}

pass the test in Step 3 is at most 2^{−n}.

4.2 Security

We next bound the leakage of the scheme.

Lemma 33.

Assume there exists an algorithm Eve such that

Pr[|Eve(1^n, (R_n, K_n, L_n, P_n), M_n) − O_n| ≤ Δ(1^n)] ≥ 1 − n^{−6}

for infinitely many n's. Then Δ-WBLearn|CS^t ∈ ioFBPP.

In the following, let Eve be an algorithm that uses r_Eve(n) bits of randomness and approximates O_n (given the public key and M_n) with probability at least 1 − n^{−6}. Recall that for w ∈ {0,1}^{r_Eve(n)}, Eve(1^n, (r, k, ℓ, P), m; w) denotes the execution of Eve when its randomness is fixed to w.

Algorithm 34.

Input: P ∈ {0,1}^n.

Operation:

  1. Let r, k, ℓ be such that P: {0,1}^r → {0,1}^k × {0,1}^ℓ.

  2. Sample w ← {0,1}^{r_Eve(n)} uniformly at random.

  3. Construct a circuit C such that C(x) = Eve(1^n, (r, k, ℓ, P), x; w).

  4. Return C.
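In Python terms, Algorithm 34 is simply a closure that hard-wires the instance P and a fixed random tape w into the adversary; the sketch below assumes Eve is given as a callable and is illustrative only.

```python
import random

def learner(P_desc, eve, n, r_eve):
    """Sketch of Algorithm 34: turn the eavesdropper into a candidate solution circuit.

    P_desc = (r, k, l, P) describes the sampler; eve(n, pk, m, w) is the assumed
    adversary interface; the returned callable plays the role of the circuit C."""
    w = random.getrandbits(r_eve)          # fix Eve's random tape once

    def C(x):
        return eve(n, P_desc, x, w)        # C(x) = Eve(1^n, (r, k, l, P), x; w)

    return C
```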

We prove the following lemma.

Lemma 35.

Assume that

Pr[|Eve(1^n, (R_n, K_n, L_n, P_n), M_n) − O_n| ≤ Δ(1^n)] ≥ 1 − n^{−6}

for infinitely many n's. Then the following holds for infinitely many n's: for every input P ∈ {0,1}^n ∩ 𝒬_Δ^t, Algorithm 34 outputs C ∈ Comp_{1/4}^Δ(P) with probability at least 1/2.

We prove Lemma 35 below, but first we use it to prove Lemma 33.

Proof of Lemma 33.

Assume there exists an algorithm Eve such that

Pr[|Eve(1^n, (R_n, K_n, L_n, P_n), M_n) − O_n| ≤ Δ(1^n)] ≥ 1 − n^{−6}

for infinitely many n's.

By Lemma 35, for infinitely many n's, Algorithm 34 outputs C ∈ Comp_{1/4}^Δ(P) with probability at least 1/2, for every P ∈ {0,1}^n ∩ 𝒬_Δ^t. We want to construct an algorithm Sol that, given P and 1^{1/δ}, outputs C ∈ Comp_{1/3}^Δ(P) with probability at least 1 − δ, for every such n.

Let Sol be the algorithm that, given input (P: {0,1}^r → {0,1}^k × {0,1}^ℓ, 1^{1/δ}), first runs Algorithm 34 on P for 2 log(1/δ) times, to get circuits C_1, …, C_{2 log(1/δ)}. Then, for every circuit C_i, Sol samples (x_1, s_1), …, (x_{(100 log 1/δ)^2}, s_{(100 log 1/δ)^2}) ← P(U_r), and computes

p_i = Pr_{j←[(100 log 1/δ)^2]}[|C_i(x_j) − s_j| ≤ Δ(1^n)].

Finally, Sol outputs the circuit C_i for the index i with the maximal value of p_i.
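Here is a Python sketch of the amplification step Sol, with Algorithm 34 abstracted as a callable learner_once(P) and the sampler P given as a callable; the sample counts follow the description above, but all names and interfaces are illustrative assumptions of ours.

```python
import math
import random

def sol(P, r, learner_once, delta, Delta):
    """Run the single-shot learner several times and keep the empirically best circuit."""
    reps = max(1, math.ceil(2 * math.log2(1 / delta)))
    tests = max(1, math.ceil((100 * math.log2(1 / delta)) ** 2))
    best_C, best_p = None, -1.0
    for _ in range(reps):
        C = learner_once(P)                        # one run of Algorithm 34
        hits = 0
        for _ in range(tests):
            x, s = P(random.getrandbits(r))
            hits += (abs(C(x) - s) <= Delta)
        p = hits / tests                           # empirical agreement p_i
        if p > best_p:
            best_C, best_p = C, p
    return best_C
```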

We now analyze the success probability of Sol. Using Fact 21, for every i, the probability that |p_i − Pr_{(x,s)←P(U_r)}[|C_i(x) − s| ≤ Δ(1^n)]| ≥ 1/25 is at most

2^{−4(log 1/δ)^2 + 1} ≤ δ/(4 log 1/δ).

Thus, by the union bound, the probability that |p_i − Pr_{(x,s)←P(U_r)}[|C_i(x) − s| ≤ Δ(1^n)]| ≥ 1/25 for some i is at most δ/2. Next, by the success probability of Algorithm 34, with probability at least

1 − (1/2)^{2 log 1/δ} ≥ 1 − δ/2,

at least one of the circuits C_1, …, C_{2 log 1/δ} is in Comp_{1/4}^Δ(P) ⊆ Comp_{1/3}^Δ(P). Let i* be the index of such a circuit. Then, with probability at least 1 − δ/2 − δ/2 = 1 − δ, such an i* exists and p_{i*} is at least 3/4 − 1/25, while for every i with C_i ∉ Comp_{1/3}^Δ(P), p_i is at most

2/3 + 1/25 < 3/4 − 1/25 ≤ p_{i*},

which implies that the output of Sol is in Comp_{1/3}^Δ(P).

4.3 Proving Lemma 35

To prove Lemma 35, we start with the following claim.

Claim 36.

Let P: {0,1}^r → {0,1}^k × {0,1}^ℓ be a circuit, and assume that

Pr_{(x,s)←P(U_r)}[|Eve(1^n, (r, k, ℓ, P), x) − s| ≤ Δ(1^n)] ≥ 9/10.

Then, on input P, Algorithm 34 outputs C ∈ Comp_{1/4}^Δ(P) with probability at least 1/2.

Proof of Claim 36.

Let P: {0,1}^r → {0,1}^k × {0,1}^ℓ be a circuit such that

Pr_{(x,s)←P(U_r)}[|Eve(1^n, (r, k, ℓ, P), x) − s| ≤ Δ(1^n)] ≥ 9/10.

Then, by definition it holds that

E_{w←{0,1}^{r_Eve(n)}}[ Pr_{(x,s)←P(U_r)}[|Eve(1^n, (r, k, ℓ, P), x; w) − s| > Δ(1^n)] ] ≤ 1/10.

Using Markov's inequality, we get that

Pr_{w←{0,1}^{r_Eve(n)}}[ Pr_{(x,s)←P(U_r)}[|Eve(1^n, (r, k, ℓ, P), x; w) − s| > Δ(1^n)] > 1/4 ] ≤ 1/2,

which implies that the circuit C = Eve(1^n, (r, k, ℓ, P), ·; w) is in Comp_{1/4}^Δ(P) with probability at least 1/2 over the choice of w, as we wanted to show.

Given Claim 36, we are now ready to prove Lemma 35.

Proof of Lemma 35.

Assume that Eve is such that

Pr[|Eve(1^n, (R_n, K_n, L_n, P_n), M_n) − O_n| ≤ Δ(1^n)] ≥ 1 − n^{−6}

for infinitely many n's. In the following, fix such a large enough n. We show that

Pr_{(x,s)←P(U_r)}[|Eve(1^n, (r, k, ℓ, P), x) − s| ≤ Δ(1^n)] ≥ 9/10    (1)

for every P: {0,1}^r → {0,1}^k × {0,1}^ℓ with P ∈ 𝒬_Δ^t ∩ {0,1}^n. The proof then follows by Claim 36. To see the above, let Pk = (R_n, K_n, L_n, P_n), and notice that

Pr_{R_n,K_n,L_n,P_n,C_n}[ Pr_{M_n,O_n,W←{0,1}^{r_Eve(n)}}[|Eve(1^n, Pk, M_n; W) − O_n| ≤ Δ(1^n)] < 9/10 ] ≤ 10n^{−6}.    (2)

Indeed, it holds that

1 − n^{−6} ≤ Pr[|Eve(1^n, Pk, M_n) − O_n| ≤ Δ(1^n)]
  ≤ Pr_{Pk}[ Pr[|Eve(1^n, Pk, M_n) − O_n| ≤ Δ(1^n)] ≥ 9/10 ]
    + (9/10) · Pr_{Pk}[ Pr[|Eve(1^n, Pk, M_n) − O_n| ≤ Δ(1^n)] < 9/10 ]
  = (1 − Pr_{Pk}[ Pr[|Eve(1^n, Pk, M_n) − O_n| ≤ Δ(1^n)] < 9/10 ])
    + (9/10) · Pr_{Pk}[ Pr[|Eve(1^n, Pk, M_n) − O_n| ≤ Δ(1^n)] < 9/10 ]
  = 1 − (1/10) · Pr_{Pk}[ Pr[|Eve(1^n, Pk, M_n) − O_n| ≤ Δ(1^n)] < 9/10 ],

which implies that Equation (2) holds. Next, we use Equation (2) and the upper bound on the computational depth of instances in 𝒬_Δ^t to show that Equation (1) holds for every P ∈ 𝒬_Δ^t. To do so, fix P ∈ 𝒬_Δ^t ∩ {0,1}^n and let C ∈ {0,1}^n ∩ Comp_{n^{−10}}^{Δ/n^{10}}(P) with cd^t(C, P) ≤ 2 log n be the circuit promised by the definition of 𝒬_Δ^t. Assume towards a contradiction that

Pr_{(x,s)←P(U_r)}[|Eve(1^n, (r, k, ℓ, P), x) − s| ≤ Δ(1^n)] < 9/10.

We want to upper bound K(P, C), to get a lower bound on the computational depth of (P, C). To this end, let 𝒮 be the set of all pairs (P′, C′) ∈ {0,1}^n × {0,1}^n such that C′ ∈ Comp_{n^{−10}}^{Δ/n^{10}}(P′), K^t(P′, C′) = K^t(P, C), and on which Eve fails to approximate s with probability more than 1/10.

By our assumption on (P, C), it holds that (P, C) ∈ 𝒮. We next bound the size of 𝒮. First, we claim that for every (P′, C′) ∈ 𝒮,

Pr[P_n = P′, C_n = C′] ≥ 1/(6n · 2^{K^t(C,P)}).    (3)

Indeed, by definition there exists a program Π of length K^t(C, P) that outputs (C′, P′) in at most t(2n) steps. Thus, Gen samples Π with probability at least 1/(3n · 2^{K^t(C,P)}). Next, by Claim 32, the test in Step 3 of Gen passes with probability at least 1 − 2^{−n} > 1/2, since by the definition of Comp_{n^{−10}}^{Δ/n^{10}},

Pr_{(x,s)←P′(U_r)}[|C′(x) − s| ≤ Δ(1^n)/n^{10}] − n^{−9} ≥ 1 − n^{−10} − n^{−9} ≥ 1 − 2n^{−8}.

In the case that the test passes, P′ and C′ are the output of Gen, and thus Equation (3) holds.

By Equation (3) and the definition of 𝒮, we get that

Pr_{R_n,K_n,L_n,P_n,C_n}[ Pr_{M_n,O_n,W←{0,1}^{r_Eve(n)}}[|Eve(1^n, Pk, M_n; W) − O_n| ≤ Δ(1^n)] < 9/10 ] ≥ |𝒮| · 1/(6n · 2^{K^t(C,P)}).

Combining the above with Equation (2) yields that

|𝒮| ≤ 10n^{−6} · 6n · 2^{K^t(C,P)} ≤ 2^{K^t(C,P) + log n − 6 log n + 6}.

Observe that 𝒮 can be (inefficiently) computed given n, λ := K^t(C, P), and Eve. Thus, to encode (P, C), it is enough to encode 𝒮 and the index of (P, C) in 𝒮 (according to the lexicographic order). We conclude that

K(P, C) ≤ K(n, λ, Eve) + 2 log(K(n, λ, Eve)) + log|𝒮| + O(1)
  ≤ 2 log n + 4 log log n + K^t(C, P) − 5 log n + O(1)
  < K^t(C, P) − 2 log n,

where the last inequality holds for every large enough n. By the above, K^t(C, P) − K(P, C) > 2 log n, in contradiction to the choice of (P, C). This yields that Equation (1) holds for every P ∈ 𝒬_Δ^t ∩ {0,1}^n, as we wanted to show.

4.4 Proving Theorem 26

We are now ready to use Lemma 17 in order to prove Theorem 26.

Proof of Theorem 26.

By Lemmas 31 and 33, (Gen, Enc, Dec) is an (n^{−7}, 1 − n^{−6}, n^{10})-weak-PKE (for d(1^n) = Δ(1^n)/n^{10}), and thus it is also an (n^{−6.9}/2, 1 − 10n^{−6.1}, 2n^{6.9})-weak-PKE. Thus, by Lemma 17, (Gen, Enc, Dec) can be amplified into a PKE.

 Remark 37 (The non-uniform setting).

A similar theorem can be proven when assuming that Δ-WBLearn|CS^t ∉ ioP/poly, and when the PKE is secure against non-uniform adversaries. In this case, we assume that Eve is a non-uniform algorithm that breaks the PKE protocol, and want to construct a non-uniform (randomized) algorithm that solves Δ-WBLearn|CS^t.

The issue with the above proof is that we cannot simply use Eve to bound the Kolmogorov complexity of (C, P) as done in the proof of Lemma 33, since Eve does not have constant size. However, we can find Eve using a small Turing machine: let M be the (inefficient) Turing machine that, given a constant c such that n^c is a bound on the size of Eve, and an input for Eve, first finds the circuit E_n of size at most n^c that maximizes the advantage in predicting the output o given the public key and the message m produced by Enc, and then executes E_n on the input. Observe that M has prediction advantage at least that of Eve. The theorem now follows using the same proof, by replacing Eve in the proof of Lemma 33 with M, and replacing Eve in Algorithm 34 with E = {E_n}_{n∈ℕ}.

References

  • [1] Miklós Ajtai and Cynthia Dwork. A public-key cryptosystem with worst-case/average-case equivalence. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing (STOC), pages 284–293, 1997. See also ECCC TR96-065. doi:10.1145/258533.258604.
  • [2] Michael Alekhnovich. More on average case vs approximation complexity. In 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings., pages 298–307. IEEE, 2003. doi:10.1109/SFCS.2003.1238204.
  • [3] Luis Antunes and Lance Fortnow. Worst-case running times for average-case algorithms. In 2009 24th Annual IEEE Conference on Computational Complexity, pages 298–303. IEEE, 2009. doi:10.1109/CCC.2009.12.
  • [4] Luis Antunes, Lance Fortnow, Dieter Van Melkebeek, and N Variyam Vinodchandran. Computational depth: concept and applications. Theoretical Computer Science, 354(3):391–404, 2006. doi:10.1016/J.TCS.2005.11.033.
  • [5] Benny Applebaum, Boaz Barak, and Avi Wigderson. Public-key cryptography from different assumptions. In Proceedings of the forty-second ACM symposium on Theory of computing, pages 171–180, 2010. doi:10.1145/1806689.1806715.
  • [6] Marshall Ball, Yanyi Liu, Noam Mazor, and Rafael Pass. Kolmogorov comes to cryptomania: On interactive kolmogorov complexity and key-agreement. In 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pages 458–483. IEEE, 2023. doi:10.1109/FOCS57990.2023.00034.
  • [7] Avrim Blum, Merrick Furst, Michael Kearns, and Richard J Lipton. Cryptographic primitives based on hard learning problems. In Annual International Cryptology Conference, pages 278–291. Springer, 1993. doi:10.1007/3-540-48329-2_24.
  • [8] Andrej Bogdanov, Miguel Cueto Noval, Charlotte Hoffmann, and Alon Rosen. Public-key encryption from homogeneous clwe. In Theory of Cryptography: 20th International Conference, TCC 2022, Chicago, IL, USA, November 7–10, 2022, Proceedings, Part II, pages 565–592. Springer, 2022. doi:10.1007/978-3-031-22365-5_20.
  • [9] Zvika Brakerski, Adeline Langlois, Chris Peikert, Oded Regev, and Damien Stehlé. Classical hardness of learning with errors. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 575–584, 2013. doi:10.1145/2488608.2488680.
  • [10] Zvika Brakerski and Vinod Vaikuntanathan. Efficient fully homomorphic encryption from (standard) LWE. SIAM Journal on Computing, 43(2):831–871, 2014.
  • [11] Gregory J. Chaitin. On the simplicity and speed of programs for computing infinite sets of natural numbers. J. ACM, 16(3):407–422, 1969. doi:10.1145/321526.321530.
  • [12] Whitfield Diffie and Martin E. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, pages 644–654, 1976. doi:10.1109/TIT.1976.1055638.
  • [13] Taher ElGamal. A public key cryptosystem and a signature scheme based on discrete logarithms. In Annual International Cryptology Conference (CRYPTO), pages 10–18, 1984.
  • [14] Craig Gentry. Fully homomorphic encryption using ideal lattices. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 169–178, 2009. doi:10.1145/1536414.1536440.
  • [15] Oded Goldreich and Leonid A. Levin. A hard-core predicate for all one-way functions. In Proceedings of the twenty-first annual ACM symposium on Theory of computing (STOC), pages 25–32, 1989. doi:10.1145/73007.73010.
  • [16] Danny Harnik, Joe Kilian, Moni Naor, Omer Reingold, and Alon Rosen. On robust combiners for oblivious transfer and other primitives. In Advances in Cryptology–EUROCRYPT 2005: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Aarhus, Denmark, May 22-26, 2005. Proceedings 24, pages 96–113. Springer, 2005. doi:10.1007/11426639_6.
  • [17] J. Hartmanis. Generalized kolmogorov complexity and the structure of feasible computations. In 24th Annual Symposium on Foundations of Computer Science (sfcs 1983), pages 439–445, 1983. doi:10.1109/SFCS.1983.21.
  • [18] Shuichi Hirahara and Mikito Nanashima. Learning in pessiland via inductive inference. In 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pages 447–457. IEEE, 2023. doi:10.1109/FOCS57990.2023.00033.
  • [19] Thomas Holenstein. Strengthening key agreement using hard-core sets. PhD thesis, ETH Zurich, 2006. doi:10.3929/ETHZ-A-005205852.
  • [20] Russell Impagliazzo and Leonid A. Levin. No better ways to generate hard NP instances than picking uniformly at random. In Proceedings of the 31st Annual Symposium on Foundations of Computer Science (FOCS), pages 812–821, 1990. doi:10.1109/FSCS.1990.89604.
  • [21] Michael Kearns and Leslie Valiant. Cryptographic limitations on learning boolean formulae and finite automata. Journal of the ACM (JACM), 41(1):67–95, 1994. doi:10.1145/174644.174647.
  • [22] Ker-I Ko. On the notion of infinite pseudorandom sequences. Theor. Comput. Sci., 48(3):9–33, 1986. doi:10.1016/0304-3975(86)90081-2.
  • [23] A. N. Kolmogorov. Three approaches to the quantitative definition of information. International Journal of Computer Mathematics, 2(1-4):157–168, 1968.
  • [24] Yanyi Liu, Noam Mazor, and Rafael Pass. On white-box learning and public-key encryption. Cryptology ePrint Archive, Paper 2024/1931, 2024. URL: https://eprint.iacr.org/2024/1931.
  • [25] Yanyi Liu and Rafael Pass. On one-way functions and kolmogorov complexity. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 1243–1254. IEEE, 2020. doi:10.1109/FOCS46700.2020.00118.
  • [26] Yanyi Liu and Rafael Pass. On one-way functions and the worst-case hardness of time-bounded kolmogorov complexity. Cryptology ePrint Archive, 2023.
  • [27] Robert J. McEliece. A public-key cryptosystem based on algebraic coding theory. DSN Progress Report, 42-44:114–116, 1978.
  • [28] Chris Peikert. Public-key cryptosystems from the worst-case shortest vector problem. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 333–342, 2009.
  • [29] Michael O Rabin. Digitalized signatures and public-key functions as intractable as factorization. Technical report, Massachusetts Inst of Tech Cambridge Lab for Computer Science, 1979.
  • [30] Oded Regev. On lattices, learning with errors, random linear codes, and cryptography. Journal of the ACM (JACM), 56(6):1–40, 2009. doi:10.1145/1568318.1568324.
  • [31] Ronald L Rivest, Adi Shamir, and Leonard Adleman. A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2):120–126, 1978. doi:10.1145/359340.359342.
  • [32] Peter W Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM review, 41(2):303–332, 1999. doi:10.1137/S0036144598347011.
  • [33] Michael Sipser. A complexity theoretic approach to randomness. In Proceedings of the fifteenth annual ACM symposium on Theory of computing, pages 330–335, 1983. doi:10.1145/800061.808762.
  • [34] R.J. Solomonoff. A formal theory of inductive inference. part i. Information and Control, 7(1):1–22, 1964. doi:10.1016/S0019-9958(64)90223-2.
  • [35] Leslie G Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, 1984. doi:10.1145/1968.1972.
  • [36] Andrew C. Yao. Theory and applications of trapdoor functions. In Annual Symposium on Foundations of Computer Science (FOCS), pages 80–91, 1982. doi:10.1109/SFCS.1982.45.