Witness Encryption and NP-Hardness of Learning
Abstract
We study connections between two fundamental questions from computer science theory. (1) Is witness encryption possible for NP [11]? That is, given an instance x of an NP-complete language L, can one encrypt a secret message with security contingent on the ability to provide a witness for x? (2) Is computational learning (in the sense of [46, 30]) hard for NP? That is, is there a polynomial-time reduction from instances of L to instances of learning?
Our main contribution is that certain formulations of NP-hardness of learning characterize the existence of witness encryption for NP. More specifically, we show:
-
witness encryption for a language L in NP is equivalent to a half-Levin reduction from L to the Computational Gap Learning problem of [2],
where a half-Levin reduction is the same as a Levin reduction but is required to preserve witnesses in only one direction, and the Computational Gap Learning problem formalizes agnostic learning as a decision problem. We show versions of the statement above for witness encryption secure against non-uniform and uniform adversaries. We also show that the combination of witness encryption for NP with ciphertexts of logarithmic length and a certain circuit lower bound is equivalent to NP-hardness of a generalized promise version of the Minimum Circuit Size Problem (MCSP).
We complement the above with a number of unconditional NP-hardness results for agnostic PAC learning. Extending a result of [16] to the standard setting of boolean circuits, we show NP-hardness of “semi-proper” learning. Namely:
-
for some polynomial p, it is NP-hard to agnostically learn circuits of size s by circuits of size p(s).
Looking beyond the computational model of standard boolean circuits enables us to prove NP-hardness of improper learning (i.e., without a restriction on the size of the hypothesis returned by the learner). We obtain such results for:
-
learning circuits with oracle access to a given randomly sampled string, and
-
learning RAM programs.
In particular, we show that a variant of Ko's MINLT problem [31] for RAM programs is NP-hard with parameters corresponding to the setting of improper learning. We view these results as partial progress toward the ultimate goal of showing NP-hardness of learning boolean circuits in an improper setting.
Lastly, we give some consequences of NP-hardness of learning for private- and public-key cryptography. Improving a main result of [2], we show that if improper agnostic PAC learning is NP-hard under a randomized non-adaptive reduction (with some restrictions), then NP ⊄ BPP implies the existence of i.o. one-way functions. In contrast, if the Computational Gap Learning problem is NP-hard under a half-Levin reduction, then NP ⊄ BPP implies the existence of i.o. public-key encryption.
Keywords and phrases: agnostic PAC learning, witness encryption, NP-hardness
2012 ACM Subject Classification: Theory of computation → Computational complexity and cryptography
Editors: Srikanth Srinivasan
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction and Background
It has long been recognized that learning and cryptography are two sides of the same coin. As Valiant already observed in his paper introducing PAC learning, its easiness would preclude the existence of pseudorandom functions [46]. In 1993, building on a work of Impagliazzo and Levin [24], Blum, Furst, Kearns, and Lipton sharpened Valiant's observation by showing that private-key cryptography can be broken if and only if PAC learning is easy in an average-case setting [6]. A number of more recent results have developed such connections further (e.g., [41, 39, 18]).
In the present work, we turn to more structured kinds of hardness of learning (namely, NP-hardness) and cryptography (namely, witness encryption), which we show to be likewise equivalent.
NP-hardness of PAC learning
Learning theory asks whether an efficient entity can hope to learn general truths about reality from necessarily limited input from experience. Valiant formalized this idea in 1984 with his definition of Probably Approximately Correct (PAC) Learning [46]. In this framework, a learner for some representation class C gets access to labeled examples (x, f(x)) for some unknown concept f in C, with x randomly sampled according to an arbitrary distribution D. The learner is asked to produce, with high probability, a hypothesis h that approximates f well over D. One can distinguish between “proper” and “improper” settings: in proper learning, the learner should produce a hypothesis that also belongs to C, whereas in improper learning, there is no restriction on the complexity of the hypothesis. One can also consider various “semi-proper” settings, in which the hypothesis must belong to some specific concept class containing C. Lastly, one can relax the model of learning to be “agnostic” in that the labeled examples do not necessarily reflect the values of a function f in C, and the learner is only asked to do as well as the best function in C does [30].
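As a toy illustration (ours, not from the paper): over a finite hypothesis class, the agnostic guarantee of competing with the best in-class function is met information-theoretically by empirical risk minimization; the entire difficulty studied here is whether this can be done efficiently for rich classes. A minimal sketch over a hypothetical four-function class on one bit:

```python
# Agnostic learning by empirical risk minimization (ERM) over a tiny,
# hypothetical hypothesis class: the four boolean functions of one bit.
# The labels need not be consistent with any single hypothesis; the learner
# must only compete with the best function in the class.

HYPOTHESES = {
    "zero": lambda z: 0,      # constant 0
    "one":  lambda z: 1,      # constant 1
    "id":   lambda z: z,      # identity
    "neg":  lambda z: 1 - z,  # negation
}

def erm(samples):
    """Return the name of the hypothesis with the fewest disagreements
    on the labeled samples (z, b)."""
    def empirical_error(h):
        return sum(h(z) != b for z, b in samples)
    return min(HYPOTHESES, key=lambda name: empirical_error(HYPOTHESES[name]))
```

On the noisy sample set [(0,0), (1,1), (1,1), (1,0)], ERM returns "id", which errs on one example, matching the best achievable in the class.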
A long line of work has established NP-hardness of PAC learning for various concept classes (e.g., [43, 1, 16, 32]). Early results obtained NP-hardness of proper learning of restricted circuit classes. Over time, progress has pushed toward NP-hardness of “less proper” learning of less restricted circuit classes. However, NP-hardness of learning unrestricted boolean circuits in the improper setting remains elusive. By this, we refer to the problem of learning the concept class of circuits of size p(n), for some fixed polynomial p, by a hypothesis of arbitrary polynomial size. Proving NP-hardness in this setting seems especially challenging in light of a 2008 result of Applebaum et al.:
Theorem 1 ([2]).
Suppose there is a randomized non-adaptive honest reduction R (roughly, “honest” means that the input length of the learning instance is polynomially related to the length of the input to the reduction; see Definition 28) and a polynomial p such that, for every constant c, R reduces an NP-complete language to agnostically PAC learning circuits of size p(n) by circuits of size n^c. Then, if NP is hard on average, infinitely-often one-way functions exist.
Of course, there are various ways in which one could formulate reductions to PAC learning. The aforementioned paper [2] describes two (of many) possibilities. The statement of Theorem 1 refers to one kind of NP-hardness reduction described there, in which a randomized oracle machine, on an input of length n, makes non-adaptive queries to its oracle, each query being a poly(n)-size circuit D sampling example-label pairs (z, b). The oracle returns, for each query, a polynomial-size circuit having high agreement over D (i.e., agreement probability close to 1), which the reduction may use in any way.
In the other kind of NP-hardness reduction that [2] describes, the reduction only uses its learning oracle to decide if a small circuit consistent with D exists. The relevant decision problem introduced by [2] is the Computational Gap Learning problem. The authors show that if this problem is NP-hard under a deterministic many-one reduction, then every language in NP has a statistical zero-knowledge argument.
Remark 2.
The results in [2] indicate that it is a challenge to show NP-hardness of agnostic PAC learning, as it would prove breakthrough results in cryptography, which, although conjectured to be true, seem beyond our reach at the moment. However, we should note that the assumption in Theorem 1 is actually quite strong in its requirement that the same reduction must work for every constant c. Theorem 1 also requires that the queries made by the reduction sample pairs (z, b) with the length of z polynomially related to the length of the given instance. It is possible that PAC learning is NP-hard in the improper setting without either of these conditions being met. For example, if a different reduction exists for each c (perhaps taking longer time for larger c), then the easiness of improper learning would still imply the easiness of NP.
Witness encryption for NP
First introduced in a 2013 paper of Garg, Gentry, Sahai, and Waters [11], witness encryption (WE) is a cryptographic primitive in which security is contingent on the ability to provide a proof of an answer to a hard problem. As Garg et al. put it,
What if we don’t really care if [the receiver] knows a secret key, but we do care if he knows a solution to a crossword puzzle that we saw in the Times? [11]
More formally, the functionality of a witness encryption scheme is based on the structure of a particular language L in NP. An instance x of L is made public, and anyone can encrypt a secret message using the witness encryption scheme along with x. It is guaranteed that if x belongs to L, then anyone with a witness proving x ∈ L can decrypt, but if x does not belong to L, then no one can do so.
Witness encryption is intimately connected to another cryptographic primitive known as indistinguishability obfuscation (iO) [3], which has been studied extensively (e.g., [33, 10, 23, 37]) and has many candidate constructions (see [28]). iO is known to imply WE, though the converse is not known [10]. (We give a sketch of the proof that iO implies WE in Appendix A.) WE is also closely related to secret sharing, another important cryptographic tool [4, 34]. In fact, the candidate construction of WE in [11] also yields a Rudich-type secret sharing scheme. As with much of cryptography, we have no unconditional proof that WE is possible.
WE alone does not imply the existence of one-way functions, since it is trivially possible if P = NP. However, WE along with one-way functions becomes very powerful, together yielding public-key encryption, as demonstrated in [11]. By combining this result with two subsequent works, one obtains the following.
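To see why WE alone carries no cryptographic hardness, here is the trivial scheme available whenever membership in L is decidable in polynomial time (a hypothetical toy language stands in for L): Enc checks membership itself and releases the message only on yes-instances, so ciphertexts of no-instances carry no information.

```python
def in_L(x: int) -> bool:
    # Hypothetical stand-in for a language decidable in polynomial time:
    # the even numbers.
    return x % 2 == 0

def enc(x: int, message_bit: int) -> int:
    # If x is in L, a witness exists, so hiding the message is not required;
    # if x is not in L, output a fixed value independent of the message.
    return message_bit if in_L(x) else 0

def dec(ciphertext: int, witness) -> int:
    # For yes-instances the ciphertext *is* the message; the witness is unused.
    return ciphertext
```

Correctness holds on yes-instances, and for no-instances enc(x, 0) and enc(x, 1) are identical, giving perfect security. This is exactly why WE for NP is trivial if P = NP.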
Theorem 3 ([11, 33, 19]; see [35]).
Assume NP ⊄ P/poly. If non-uniformly secure witness encryption exists for NP, then public-key encryption is possible.
In this paper, we show that the Computational Gap Learning problem is NP-hard if and only if witness encryption exists for NP. We prove statements along these lines for a few different settings: WE secure against PPT adversaries, WE secure against polynomial-size circuits, and WE with logarithmic-length ciphertexts; the latter roughly characterizes the NP-hardness of a promise version of the Minimum Circuit Size Problem [29].
In contrast to the aforementioned results of [2] (such as Theorem 1), not only do we show that proving NP-hardness of learning is a challenge, as it would imply a breakthrough cryptographic construction (namely, WE), but we also show the converse: progress in cryptography (showing that WE is possible) would imply that agnostic PAC learning is NP-hard. Hence, if one believes in cryptography such as iO or WE, then one should also believe in the NP-hardness of improper agnostic learning, even in the extremely restrictive sense of a deterministic, many-one, half-Levin reduction to the Computational Gap Learning problem. We interpret our results as evidence that improper learning is likely NP-hard.
As a complement, we prove a number of unconditional NP-hardness results for agnostic PAC learning. For one, we obtain NP-hardness of agnostic learning for unrestricted boolean circuits in a semi-proper setting that improves on the previous state of the art. In other settings, we obtain NP-hardness of improper agnostic learning. Though we have made some partial progress toward showing NP-hardness unconditionally, we feel that it remains a promising direction for further research.
Lastly, we give some applications of our results for the possibility of private- and public-key cryptography. We improve Theorem 1 to conditionally obtain infinitely-often one-way functions from worst-case hardness of NP. In contrast, if learning is NP-hard in a more restricted sense (namely, a half-Levin reduction to the Computational Gap Learning problem), then one obtains infinitely-often public-key encryption assuming NP ⊄ BPP. Along the way, we prove Theorem 3 under the weaker uniform assumption NP ⊄ BPP. (We believe that this last statement follows from a combination of techniques used in prior work ([11, 33, 19]; see [35]), but we have not seen the uniform version stated. In any case, we offer an alternative proof that does not rely on properties of statistical zero-knowledge arguments.)
1.1 Our Results
Connections between witness encryption and NP-hardness of learning
We first discuss a set of results that characterize witness encryption in terms of NP-hardness reductions to PAC learning.
Witness encryption for an NP language L consists of a pair of algorithms (Enc, Dec) such that, for an instance x of L and a secret bit b, Enc outputs a ciphertext string using some randomness. It is guaranteed that, if x ∈ L and w is a witness for the membership of x under a fixed witness relation for L, then Dec deterministically recovers the bit b given the ciphertext and w. On the other hand, if x ∉ L, then the encryptions of 0 and of 1 are indistinguishable (within a negligible difference) by polynomial-size circuits (if the witness encryption is non-uniformly secure) or by PPT algorithms (if it is uniformly secure). See Definition 39.
Our first result characterizes non-uniformly secure witness encryption in terms of a deterministic many-one reduction to a promise problem known as the Computational Gap Learning problem, introduced in a work of Applebaum et al. [2]. Inputs to this problem are distributions D over string-label pairs (z, b), where D is represented by a polynomial-size sampling circuit. For a size parameter s and agreement parameters α > β, a distribution D is a yes-instance if there is a circuit C of size at most s such that C(z) = b with probability at least α over (z, b) drawn from D; D is a no-instance if every circuit C of size at most s satisfies C(z) = b with probability at most β over (z, b) drawn from D.
Note that a many-one reduction to the Computational Gap Learning problem is quite a restrictive notion of reducibility to agnostic PAC learning: the reduction only uses its learning oracle to decide if a small circuit exists for the given distribution. As discussed earlier, one can imagine other kinds of reduction to learning that make more robust use of a learner: for example, actually examining the circuits returned by the learning oracle.
We further restrict the definition of a many-one reduction to the Computational Gap Learning problem by requiring that it be a half-Levin reduction. This is a term that we introduce here for a kind of reduction that is intermediate between a many-one reduction and a Levin reduction. A half-Levin reduction from A to B is a pair of polynomial-time machines (R, W) such that R is a many-one reduction from A to B, and W transforms witnesses for x ∈ A into witnesses for R(x) ∈ B. In contrast to a standard Levin reduction, we do not require a third algorithm transforming B-witnesses into A-witnesses. We emphasize that the half-Levin reductions considered in this work are deterministic unless stated otherwise.
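As a concrete toy example (ours, not from the paper), the textbook reduction from CLIQUE to INDEPENDENT-SET is half-Levin (indeed fully Levin): the instance map complements the graph, and the witness map is the identity, since a k-clique in G is exactly a k-independent-set in the complement.

```python
from itertools import combinations

def instance_map(instance):
    """R: map a CLIQUE instance (V, E, k) to the INDEPENDENT-SET instance
    on the complement graph, with the same threshold k."""
    V, E, k = instance
    E = {frozenset(e) for e in E}
    complement = {frozenset(p) for p in combinations(sorted(V), 2)} - E
    return (V, complement, k)

def witness_map(instance, clique):
    """W: a k-clique of G is itself a k-independent-set of the complement."""
    return clique

def is_independent_set(instance, S):
    # Verifier for the target problem: S has size >= k and spans no edge.
    V, E, k = instance
    E = {frozenset(e) for e in E}
    return len(S) >= k and all(
        frozenset(p) not in E for p in combinations(sorted(S), 2)
    )
```

The pair (instance_map, witness_map) preserves witnesses in one direction only, which is all a half-Levin reduction demands; no map from independent sets back to cliques is required.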
We call a reduction to the Computational Gap Learning problem honest if, on inputs of length n, it outputs a distribution supported over strings whose length is polynomially related to n.
We are now ready to state our main result. In general, for corresponding settings of parameters, we show that a half-Levin reduction from a language L to the Computational Gap Learning problem is equivalent to a witness encryption scheme for L whose running time, security (the size of the circuits against which it is secure), and advantage parameter roughly match the size and agreement parameters of the learning problem; see Lemmas 52 and 54. For the special case of non-uniformly secure witness encryption, we get the following.
Theorem 4.
Consider any language L in NP. The following are equivalent.
1. Non-uniformly secure witness encryption exists for L;
2. there exist a polynomial p and a pair of machines (R, W) such that, for all polynomials q, (R, W) is an honest half-Levin reduction from L to the Computational Gap Learning problem with yes-side circuit size p and no-side circuit size q.
We also note that the above characterization is related to a very recent result of Liu, Mazor, and Pass, which shows that witness encryption for an NP language L is equivalent to the existence of a laconic special-honest verifier zero-knowledge argument for L [36]. We refer to that paper for the definition of this kind of protocol, though we mention that the term “laconic” refers to a bound on the total length of the prover's messages. As a corollary of [36] and our Theorem 4 above, we obtain the following.
Corollary 5.
Consider any language L in NP. The following are all equivalent.
1. Non-uniformly secure witness encryption exists for L;
2. there exist a polynomial p and a pair of machines (R, W) such that, for all polynomials q, (R, W) is an honest half-Levin reduction from L to the Computational Gap Learning problem with yes-side circuit size p and no-side circuit size q;
3. there exists a laconic special-honest verifier zero-knowledge argument for L.
One may also be interested in the possibility that NP has witness encryption schemes secure against PPT algorithms but not against polynomial-size circuits. We therefore present a kind of reduction to learning that is equivalent to uniformly secure witness encryption. This kind of reduction is intermediate between a half-Levin reduction to the Computational Gap Learning problem and the more general kind of reduction described in [2], wherein the reduction may actually inspect the hypotheses returned by the learner and use them in any way.
More specifically, in our case, the reduction makes a single query to the search version of the Computational Gap Learning problem, getting back a small circuit consistent with D if D is a yes-instance. We still require the reduction to be half-Levin. Furthermore, we require the reduction to be class-specific black-box (see Definition 32). The standard notion of a class-specific black-box reduction from a language A to a language B was defined by [13] as a way to formalize an intermediate case between treating an algorithm for B as an oracle (“black box”) and requiring source code for an algorithm implemented in some complexity class (“white box”); see Definitions 30 and 31.
Theorem 6.
Consider any language L in NP. The following are equivalent.
1. Uniformly secure witness encryption exists for L;
2. there exist a polynomial p and a pair of machines (R, W) such that, for all polynomials q and t, (R, W) is an honest class-specific black-box (with respect to the class determined by t) half-Levin reduction from L to the search version of the Computational Gap Learning problem with yes-side circuit size p and no-side circuit size q.
So far, we have only discussed learning in the “polynomial regime”, where the distribution over pairs (z, b) is samplable in polynomial time. However, the original PAC learning framework of Valiant makes no such restriction: a PAC learner should be able to learn over arbitrary distributions [46]. We thus consider reductions to learning that make further use of this capacity. In particular, imagine a polynomial-time reduction that, on input length n, queries a learner for a function on O(log n)-length inputs. In this case, it is possible to ask the learner to work on inefficiently samplable distributions, and a truth-table of the function in question can be given to the learner explicitly. In this setup, the queried distributions represent well-defined functions, which may not be the case in the polynomial-regime setup above. In this sense, we get closer to standard PAC learning rather than agnostic PAC learning.
This notion of learning in the “exponential regime” is closely related to MCSP, which asks, given a truth-table T and a size threshold s, whether there exists a circuit of size s consistent with T. We generalize the problem to the context of learning by providing, along with T, a probability distribution D represented as a table. We call this problem “Gap Distributional MCSP”. Specifically, an instance (T, D) is a yes-instance if there exists a circuit of size at most s that is consistent with T, and a no-instance if, for every circuit of size at most s', the probability over D that the circuit agrees with T is small. Note that this definition naturally generalizes the average-case version of MCSP defined in [45] (and also studied in [7]), where the definition is the same except that the distribution is fixed to uniform.
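To make the objective concrete, the following sketch (with hypothetical toy inputs of our choosing) evaluates the quantity that Gap Distributional MCSP constrains: the probability, over the given distribution, that a candidate circuit agrees with the truth table.

```python
def distributional_agreement(truth_table, dist, circuit):
    """truth_table: list of output bits indexed by the input (an n-bit string
    read as an integer); dist: matching list of probabilities; circuit: a
    function from input index to a bit (standing in for a boolean circuit).
    Returns Pr_{x ~ dist}[circuit(x) = truth_table[x]]."""
    assert len(truth_table) == len(dist) and abs(sum(dist) - 1.0) < 1e-9
    return sum(p for x, (b, p) in enumerate(zip(truth_table, dist))
               if circuit(x) == b)
```

A yes-instance demands some size-s circuit with agreement 1 (consistency with T); a no-instance caps the agreement of every size-s' circuit.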
We show that NP-hardness of learning in this sense is equivalent to witness encryption for L in which the ciphertexts output by Enc have only logarithmic length, along with a circuit lower bound for L. Below, we consider both the class of n-input boolean functions computable by circuits of a given size in the worst case, and the class of n-input boolean functions computable by circuits of a given size on average, i.e., with high probability over inputs sampled according to a distribution D.
Theorem 7.
Consider any language L, together with a suitable setting of parameters (a ciphertext length, a decryption time, a security level, and an advantage parameter). The following are equivalent.
1.
- Witness encryption for L with ciphertexts of logarithmic length, the given decryption time, and the given security and advantage parameters exists; and
- L is not computable on average by circuits of the corresponding size with respect to some efficiently computable distribution family D.
2.
There is a half-Levin reduction (R, W) mapping length-n instances of L to instances of Gap Distributional MCSP on truth tables of polynomial length, with the corresponding size and agreement parameters, where W runs in bounded time.
Note that we obtain correspondences between the output length of the reduction and the ciphertext length of the witness encryption, the size parameter of Gap Distributional MCSP and the running time of Dec, the no-side size parameter and the security of the witness encryption, and the agreement-gap parameter of Gap Distributional MCSP and the advantage parameter of the witness encryption.
Unconditional NP-hardness of semi-proper learning for boolean circuits
As discussed in the introduction, NP-hardness of proper agnostic PAC learning for polynomial-size circuits is known. However, NP-hardness of improper learning is a major open question that remains elusive. In this section, we advance toward NP-hardness of “less proper” agnostic PAC learning for polynomial-size circuits.
First, we show that, for polynomials s and s' with s' sufficiently larger than s, the Computational Gap Learning problem with yes-side circuit size s and no-side circuit size s' is NP-hard under a randomized many-one reduction. In particular,
Theorem 8.
For some constant c, and for any polynomials s and s' such that s'(n) ≤ s(n)^c for all sufficiently large n, there is a randomized many-one reduction from an NP-complete problem to the Computational Gap Learning problem with yes-side circuit size s and no-side circuit size s'.
As a straightforward consequence of Theorem 8, we obtain NP-hardness of agnostic PAC learning in the semi-proper setting of learning circuits of a given size by circuits of polynomially larger size.
Corollary 9.
For some polynomial p, it is NP-hard under a randomized one-query reduction to agnostically PAC learn circuits of size s by circuits of size p(s) over a flat P-samplable distribution.
This improves on a prior work of Hirahara [16], which contains two main results. The first shows that it is NP-hard to agnostically learn programs of a given size by programs of slightly larger size. This is accomplished by reduction from the “minimum monotone satisfying assignment” problem, MMSA, which is known to be NP-hard to approximate by way of a PCP theorem [8, 9]. We note that Hirahara's proof makes use of efficient secret sharing schemes, a cryptographic primitive closely related to witness encryption and known to exist unconditionally (with statistical security) for access structures that are monotone formulae [27, 5] (see [4, 11]).
The second main result of [16] shows that partial MCSP, a partial function version of MCSP, is likewise NP-hard. This proof requires even more sophisticated techniques in order to produce a truth-table on logarithmic-length inputs. In particular, Hirahara reduces from a gap version of the “collective minimum (weight) monotone satisfying assignment” problem, which generalizes MMSA. Using the techniques of that proof and padding the inputs to have polynomial length, one can obtain NP-hardness of learning boolean circuits, but with a smaller gap.
Theorem 10 (Implicit in [16]).
For any constant c, there exist a constant ε and a polynomial p such that it is NP-hard under a randomized one-query reduction to agnostically PAC learn circuits of size p by circuits of somewhat larger size (a smaller gap than in Corollary 9) over a flat P-samplable distribution.
However, those ideas do not extend to obtain the larger size gap for circuits, rather than programs, that we achieve in Corollary 9.
Remark 11.
We note that Theorem 8 is in fact proved by a randomized half-Levin reduction from an NP-complete problem to the Computational Gap Learning problem! That is, there is a polynomial-time machine W such that, with high probability over the randomness r of the reduction, if D is the output of the reduction of Theorem 8 run on a yes-instance x with randomness r, and w is a witness for x, then W(x, w, r) outputs a circuit certifying that D is a yes-instance. However, we unfortunately do not know how to extend the results of the previous section to obtain non-trivial witness encryption from a randomized half-Levin reduction, particularly since the amount of randomness required is greater than the no-side size parameter (which translates to the security of the witness encryption scheme). In any case, our results show that if the reduction of Theorem 8 can be suitably derandomized (with the running time of W less than the no-side size parameter), then one would unconditionally obtain non-trivial witness encryption.
Unconditional NP-hardness of improper learning for other concept classes
Though in the case of boolean circuits we are unable to obtain NP-hardness with a gap greater than a fixed polynomial, we show in this section that for some related concept classes, improper learning is NP-hard. We demonstrate such results for learning oracle circuits and learning RAM programs.
In the following, for a fixed distribution family O = {O_n}, where each O_n is supported over strings of a fixed polynomial length and is samplable uniformly in polynomial time, the oracle version of the Computational Gap Learning problem is a generalization in which all circuits and samplers get access to an oracle f sampled randomly from O_n. A distribution D is a yes-instance if there exists an oracle circuit C of size at most s such that, for every f in the support of O_n, C with oracle f labels samples from D correctly with high probability; and D is a no-instance if, with high probability over f drawn from O_n, every oracle circuit of size at most s' has low agreement with D.
We prove the following NP-hardness of this oracle learning problem, from which we will derive further corollaries.
Theorem 12.
There exist a distribution family O, an NP-complete language L, and a polynomial p such that, for all polynomials q, there is a choice of the oracle length parameter for which L reduces to the oracle version of the Computational Gap Learning problem, with yes-side circuit size p and no-side circuit size q, under an honest half-Levin reduction.
Theorem 12 is proved using a framework developed by Huang, Ilango, and Ren for “witness encryption in oracle worlds” [21]. More specifically, drawing on the candidate witness encryption scheme proposed in [11], the authors show that there exist an NP-complete language L and a distribution family O such that, with high probability over oracles f sampled randomly from O, witness encryption for L exists with respect to f. That is, Enc, Dec, and adversaries all get oracle access to f.
We may then apply the techniques of one direction of our main equivalence, Theorem 4, to obtain NP-hardness of learning with respect to O. Crucially, in [21], an oracle with truth-table length N gives rise to a witness encryption scheme secure against non-uniform adversaries of size polynomially related to N. So, to achieve NP-hardness with an arbitrary polynomial gap, one only requires oracle truth-tables of polynomial length.
As a consequence of Theorem 12, we show that agnostic learning is NP-hard in an “oracle-PAC” model that we introduce here, which generalizes standard agnostic PAC learning of boolean circuits. In this model, a learner on input length n is explicitly given the whole truth table of an oracle function f sampled randomly from some distribution O_n. The learner is asked to output an oracle circuit that, with oracle f, approximates the target function almost as well as the best small f-oracle circuit; see Definition 23.
Corollary 13.
For every constant c and every sufficiently large polynomial s, it is NP-hard under a randomized one-query reduction to agnostically oracle-PAC learn oracle circuits of size s by oracle circuits of size s^c.
We stress that the above holds under a standard polynomial-time reduction that does not itself require any oracle.
As a second consequence of the NP-hardness of the oracle learning problem, we show that it is NP-hard to agnostically learn RAM programs in an improper setting. In particular, we focus on a decision problem based on the MINLT problem defined in [31]. In Ko's definition, the input to MINLT consists of pairs (x_1, b_1), ..., (x_m, b_m) along with size and running time parameters represented in unary. The input is a yes-instance if and only if there exists a program of size and running time within the given bounds mapping x_i to b_i for all i. Ko exhibited an oracle relative to which MINLT is not NP-hard [31].
Hirahara, using non-relativizing techniques, showed that a certain gap generalization of MINLT is in fact NP-hard [16]. In this promise problem, the pairs are represented implicitly by a sampling circuit (as in the definition of the Computational Gap Learning problem), and there is a gap, e.g., in program size, between yes-instances and no-instances. It is easy to see that NP-hardness of this problem (Theorem 1.1 in [16]) implies NP-hardness of Ko's MINLT.
We formulate our definition of the RAM version of this gap problem analogously to the definition in [16], but considering programs that are granted random access to their inputs. Namely, the input is a sampler D over pairs (z, b) along with size and time parameters. The input is a yes-instance if there exists a program P of size and running time within the yes-side bounds such that P(z) = b with high probability over (z, b) drawn from D, and a no-instance if every program of size and running time within the no-side bounds satisfies P(z) = b with low probability. Here, P(z) denotes running P with random access to its input z.
We obtain the following NP-hardness of this RAM version. Note that the parameters we obtain imply NP-hardness of learning RAM programs in an improper setting.
Corollary 14.
For all polynomials s and t, the RAM version of the gap problem, with no-side program size s and running time t, is NP-hard under a randomized many-one reduction.
In contrast, [16] considers programs that are not granted random access to their inputs; the gap in program size achieved there is small, though the no-instances extend to time-unbounded programs.
Applications for the possibility of cryptography
Finally, we prove a few results demonstrating what cryptography is possible assuming worst-case hardness of NP along with various assumptions concerning witness encryption and NP-hardness of learning.
Assuming a randomized non-adaptive reduction from an NP-complete problem to improper agnostic learning as described in [2] (see Definition 28), we show that the worst-case assumption NP ⊄ BPP is sufficient to imply one-way functions secure infinitely often against uniform PPT adversaries. This improves a main result of [2] (Theorem 1 in the introduction), which requires average-case hardness of NP. In terms of Impagliazzo's five worlds [22], the result of [2] conditionally excludes Pessiland. In contrast, we conditionally exclude both Heuristica and Pessiland.
Theorem 15.
Suppose there is a randomized non-adaptive honest reduction R and a polynomial p such that, for every constant c, R reduces an NP-complete language to agnostically PAC learning circuits of size p(n) by circuits of size n^c. Then, unless NP ⊆ BPP, there exist infinitely-often one-way functions.
As discussed in the introduction, Garg et al. show that WE for an NP-complete language, along with a one-way function, is sufficient for public-key encryption [11]. Combining this result, our main equivalence in Theorem 4, and Theorem 15, we obtain that witness encryption simultaneously excludes Heuristica, Pessiland, and Minicrypt.
Theorem 16.
Suppose uniformly secure witness encryption exists for NP. Then, unless NP ⊆ BPP, there exists public-key encryption secure infinitely often against polynomial-time adversaries.
We believe that this statement can also be proved by combining prior works [33, 19, 11] (see also [35]), though we have not seen the uniform version stated (i.e., the version with the assumption NP ⊄ BPP rather than NP ⊄ P/poly). In any case, we provide an alternative proof that does not rely on properties of zero-knowledge, such as the “SZK/OWF” characterization from [42].
Theorem 16 and our main equivalence in Theorem 4 together imply that if learning is NP-hard under a half-Levin reduction to the Computational Gap Learning problem working for arbitrary polynomial size parameters, then one can rule out Minicrypt in addition to Heuristica and Pessiland. Thus, we obtain a stark distinction between the two kinds of reduction to learning defined in [2]. Namely, assuming worst-case hardness of NP, a randomized non-adaptive reduction to agnostic learning yields one-way functions, whereas a deterministic many-one reduction to the Computational Gap Learning problem, provided it is half-Levin, yields public-key encryption.
Theorem 17.
Suppose there exist a polynomial p and a pair of machines (R, W) such that, for all polynomials q, (R, W) is an honest half-Levin reduction from an NP-complete language to the Computational Gap Learning problem with yes-side circuit size p and no-side circuit size q. Then, unless NP ⊆ BPP, there exists public-key encryption secure infinitely often against polynomial-time adversaries.
1.2 Related Work
As mentioned above, a very recent work of Liu, Mazor, and Pass shows that witness encryption for NP is equivalent to “laconic special-honest verifier zero-knowledge arguments” for NP [36]. Our main result shows that half-Levin reductions from NP to the Computational Gap Learning problem are equivalent to both. This sharpens the result of Applebaum et al. showing that NP-hardness of the Computational Gap Learning problem under a many-one (not necessarily half-Levin) reduction implies statistical zero-knowledge arguments for NP [2].
Huang, Ilango, and Ren [21] used an assumption of the existence of witness encryption secure against subexponential-size circuits to conclude NP-hardness of approximating time-bounded conditional Kolmogorov complexity, in a certain parameter regime, with essentially optimal gap parameters. This is similar to one direction of our Theorem 4, which shows an equivalence between the existence of witness encryption (secure against non-uniform adversaries) and NP-hardness of a learning problem (under restricted reductions). We note that in our framework, proving such a connection between witness encryption and NP-hardness of learning is much simpler than in the setting considered by [21].
Our Theorem 12 relies on a framework for “witness encryption in oracle worlds” developed in [21] (building on ideas from [11]). The authors of [21] obtain NP-hardness of the Minimum Oracle Circuit Size Problem with a large circuit size gap. This is somewhat like an “exponential regime” analogue of our oracle-PAC learning model, but we stress that, in [21], an algorithm for the Minimum Oracle Circuit Size Problem needs to work in the worst case over oracle truth-tables. In our case, a learner only needs to work with high probability over oracle truth-tables sampled from some efficiently samplable distribution.
An earlier example of an equivalence between NP-hardness of a meta-complexity problem and the existence of a secure cryptographic primitive is a result of Hirahara [17], which shows an equivalence between the existence of one-way functions and NP-hardness of approximating distributional time-bounded Kolmogorov complexity under a certain kind of restricted reduction.
An earlier example of an equivalence between hardness of PAC learning and the existence of a cryptographic primitive is due to Nanashima [39]. In that work, Nanashima shows that auxiliary-input one-way functions exist if and only if a certain average-case, or “heuristic”, variant of PAC learning is hard for efficient algorithms. In contrast, we focus on -hardness of worst-case agnostic PAC learning.
1.3 Overview of Techniques
Proof of Theorem 4
We start with an informal overview of the proof of our main equivalence, Theorem 4. In both directions, the constructions employed and the proofs of correctness are straightforward. We first show how to obtain a half-Levin reduction from to from a -secure witness encryption scheme for . In particular, given an -instance , the reduction outputs a distribution as follows.
Sample . Let . Output .
Note that we can assume that the output length of is at least without loss of generality, so the reduction described here is honest.
In the case that with witness , is a yes-instance of , since the circuit defined by recovers the bit from with probability 1 over . This is true by the correctness of . Also note that, since is a uniform polynomial-time algorithm, the circuit can be constructed in polynomial time given and . Thus, our reduction is half-Levin. In the case that , we observe that if there were a circuit of polynomial size such that
then by definition of ,
violating the security of . So, must be a no-instance of .
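To make this direction concrete, here is a minimal executable sketch of the reduction. All names here are ours and purely illustrative, and `toy_enc` is a toy stand-in with none of the security properties of real witness encryption; it only makes the sketch runnable.

```python
import random

def cgl_sampler_from_we(x, we_enc, rand_bits=16):
    """Sketch of the reduction: map an instance x to a sampler for a
    distribution over (ciphertext, label) pairs, as described in the text."""
    def sample():
        b = random.randrange(2)            # secret bit to be encrypted
        r = random.getrandbits(rand_bits)  # encryption randomness
        return we_enc(x, b, r), b          # (example, label) pair
    return sample

def toy_enc(x, b, r):
    """Toy stand-in for Enc: mask b with a pad derived from (x, r).
    (Real witness encryption would tie decryption to a witness for x.)"""
    pad = (hash((x, r)) >> 3) & 1
    return (r, pad ^ b)
```

On a yes-instance, a small circuit hard-wiring a witness recovers the label from the ciphertext, so the sampler is a yes-instance of the learning problem.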
Now we show how to obtain a -secure witness encryption scheme from a half-Levin reduction to . We define a witness encryption scheme as follows. Let be an instance of , and if , let be a witness. Let be a secret bit to encrypt.
computes , samples and then outputs .
computes and then outputs .
The correctness of follows from the fact that, for , is a witness for being a yes-instance of : that is, a circuit of polynomial size such that
It follows that . The security of follows from Yao’s argument that a distinguisher implies a next-bit predictor. Specifically, for , from a circuit of size distinguishing from , we would obtain a circuit of size predicting (with inverse polynomial advantage) from . By the honesty of the reduction, we have . This violates the fact that is a no-instance of for every polynomial , a contradiction.
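The converse construction admits an equally small sketch. Here `reduction` and `witness_map` are hypothetical stand-ins for the two machines of the half-Levin reduction, and the toy instantiation at the bottom exists only so the sketch executes:

```python
import random

def we_from_reduction(reduction, witness_map):
    """Sketch: build (enc, dec) from a half-Levin reduction to CGL.
    reduction(x) returns a sampler over (example, label) pairs;
    witness_map(x, w) returns the labeling circuit guaranteed on yes-instances."""
    def enc(x, b):
        sample = reduction(x)
        while True:                  # sample an example carrying the label b
            z, label = sample()
            if label == b:
                return z
    def dec(x, w, z):
        return witness_map(x, w)(z)  # apply the witness circuit to the ciphertext
    return enc, dec

# Toy instantiation: the "reduction" masks the label with a pad f(x, r).
def toy_reduction(x):
    def sample():
        b = random.randrange(2)
        r = random.getrandbits(8)
        return (r, b ^ ((hash((x, r)) >> 2) & 1)), b
    return sample

def toy_witness_map(x, w):
    return lambda z: z[1] ^ ((hash((x, z[0])) >> 2) & 1)
```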
Proof of Theorem 6
To modify the argument above for the case of -secure witness encryption, we note that, in the first direction discussed, the security of WE is only violated if the circuit can be constructed uniformly in polynomial time. Therefore, we consider -black-box reductions to search-. That is, roughly, the reduction is only guaranteed to work when its oracle is a PPT algorithm, and we require the oracle to actually construct a small circuit with inverse polynomial advantage over the queried distribution if one exists. For the other direction, we employ the observation that, from a uniform PPT distinguisher algorithm , one can construct a next-bit predictor uniformly.
Proof of Theorem 7
For the case of in Theorem 7, a challenge is that an arbitrary witness encryption scheme may not yield a distribution that represents a well-defined function. On one hand, if , then there is guaranteed to be a unique bit such that is in the support of , by the correctness of the witness encryption scheme. However, if , this may not be true. Luckily, since is supported over logarithmic-length strings, we can check in polynomial time whether each has a unique label. If not, we use our assumption of a circuit lower bound for to output a no-instance of . In the other direction, from a (deterministic) half-Levin reduction from to , we obtain a circuit lower bound for by applying the reduction to a trivial no-instance of , as in [29].
Proof of Theorem 8 and corollary
Our proof of -hardness of with parameters corresponding to semi-proper learning works within the framework developed in [16]. It differs from Theorem 7.2 in that paper mainly in the “completeness” argument. Referring to the notation in the proof of that theorem and of our Lemma 64, we need to construct a circuit of size not exceeding .
For one thing, in the proof of [16], using generators to encode the high-complexity truth-tables , the number of input gates to the circuit alone would exceed the desired threshold. Therefore, we use Nisan-Wigderson generators rather than generators to allow for a short random seed. (A similar idea is used elsewhere in [16].)
More importantly, we apply a recent result of Holmgren and Rothblum showing that “multiselection” of indices from a string of length can be done by a circuit of size at most [20]. See Lemma 61. So, we can make queries to truth-tables , each of length , with total circuit size at most . This allows us to compute values of the amplified functions and recover the secret labels without exceeding the desired size threshold. Note that Uhlig’s Theorem as in [16] cannot accommodate queries.
Proof of Theorem 12 and corollaries
To obtain -hardness of , we apply an argument analogous to the first direction of our main equivalence, along with the fact that, for some distribution family , witness encryption for exists unconditionally (with high probability) with respect to oracles sampled from . This was demonstrated in [21]. More specifically, the authors show the following.
Lemma 18 ([21] – informal; see Lemma 41 and Definition 40).
For some -complete language , distribution , and constant , there is a witness encryption scheme for such that, for all , with high probability over (with ), is secure against -oracle adversaries of size .
Our idea is to apply the foregoing “oracle witness encryption” along with our main technique for proving that witness encryption implies -hardness of . We modify the definition of accordingly so that all circuits and samplers have access to an oracle randomly sampled from . See Definition 25. Crucially, to achieve -hardness of for a constant , we require witness encryption with security against adversaries of size , and thus we require an oracle with truth-table length only .
Corollary 13 follows from Theorem 12 analogously to Corollary 9 and Theorem 8, as discussed above: a straightforward reduction from to oracle-PAC learning.
To prove Corollary 14, we give a randomized reduction from to the problem . Given an instance of , the reduction starts by sampling a random truth-table . Then, it defines an instance of which always outputs the entire truth-table along with a string sampled according to ; one can think of as sampling pairs with . Clearly, this results in a non-oracle instance of . In the proof of correctness, we also rely on the fact that a program of size and running time at most can be simulated by a circuit of size (and vice versa). Since the oracles obtained from [21] require length polynomially greater than the size of the adversary they are secure against, we need to consider machines running in time less than the length of their inputs. Given these observations and the definitions of and , the corollary follows easily.
Proof of Theorem 15
We prove Theorem 15 with an argument from [2] as a starting point. (We include a proof below for the reader’s convenience; see Lemma 72.) Let be a randomized non-adaptive reduction from an -complete language to agnostic learning over . Assume that i.o. one-way functions do not exist. Then, for any and auxiliary-input function , there is a machine that distributionally inverts with high probability over auxiliary inputs . We think of instances of as auxiliary inputs for the function that, roughly, simulates to obtain a sampling circuit over string/label pairs and then samples from . Applebaum et al. show that there is a polynomial-time-computable hypothesis , using a distributional inverter for , having near-maximum agreement over . Running the reduction and answering oracle queries with hypotheses defined in this way, we obtain
where “”, informally, refers to average-case heuristics that may err by incorrectly outputting 1 or 0.
We next use the fact that the inversion of polynomial-time-computable functions is verifiable. That is, if a machine fails to invert on some instance, it can be defined to “realize this” in polynomial time. More concretely, the machine above can be defined to output on input if it fails to invert often on some particular , using empirical estimation. From this observation, we further obtain
where refers to heuristics that do not err but may output rather than 1 or 0 with inverse polynomial probability. Lastly, we apply a recent result of [12], which, from , yields worst-case easiness of agnostic PAC learning over distributions in . We then run the reduction for a second time, answering oracle queries with the worst-case agnostic learner guaranteed by [12], to obtain .
Proofs of Theorem 16 and Theorem 17
To obtain Theorem 16, start by assuming the existence of witness encryption for an -complete language. Then, apply the characterization given by Theorem 4 to obtain a half-Levin reduction from to . Since reduces to agnostic PAC learning, we may apply Theorem 15 as described above to exclude Heuristica and Pessiland. To summarize, informally,
WE ⟹ half-Levin reduction to CGL (Theorem 4) ⟹ exclusion of Heuristica and Pessiland (Theorem 15)
where refers to a half-Levin reduction and refers to a randomized non-adaptive (ie. “truth-table”) reduction. Furthermore, Garg et al. show that witness encryption for , together with the existence of a one-way function, yields public-key encryption [11]. This, together with the above, conditionally yields public-key encryption. To summarize,
WE + one-way function ⟹ public-key encryption ([11])
2 Preliminaries
Definition 19.
We say that a distribution is flat if it is equal to the uniform distribution over its support.
Definition 20.
For a distribution family , , and , denotes the class of function families such that, for all sufficiently large , there exists a circuit of size at most such that
2.1 PAC Learning Problems
Definition 21 (Optimal hypotheses).
For , a function , a distribution family supported over , a parameter , and a hypothesis , we say that is -optimal for if
For , an oracle function , and a hypothesis , we say that is -optimal for if
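For intuition, optimality over a finite explicit distribution can be checked by brute force. The snippet below is our own unpacking of the definition, with a distribution given explicitly as `((z, b), probability)` pairs; it is not part of the paper's formalism.

```python
def agreement(h, dist):
    """Pr_{(z,b) ~ dist}[h(z) = b] for an explicit finite distribution
    given as a list of ((z, b), probability) pairs."""
    return sum(p for (z, b), p in dist if h(z) == b)

def is_eps_optimal(h, dist, hypothesis_class, eps):
    """h is eps-optimal if its agreement is within eps of the best
    agreement achievable in the comparison class (brute force)."""
    best = max(agreement(g, dist) for g in hypothesis_class)
    return agreement(h, dist) >= best - eps
```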
Definition 22 (Agnostic PAC learning circuits).
Consider . We say that an algorithm agnostically PAC learns by if, for every , , , and joint distribution family where each is supported over , the following holds. With probability at least , outputs an -optimal hypothesis for of size at most .
is given the parameters , and in unary.
We say that agnostically PAC learns in the improper setting if the above holds except that there is no restriction on the complexity of (beyond the trivial bound that its size cannot exceed the running time of ).
Definition 23 (Agnostic Oracle-PAC learning circuits).
The definition is analogous to Definition 22, except we also consider distribution families for sampling oracle truth-tables, where is supported over and uniformly samplable in time . We say that an algorithm agnostically oracle-PAC learns by if, for any such , for every and , the following holds with probability at least over .
For any joint distribution family where each is supported over , with probability at least over the randomness of , outputs an -optimal hypothesis for of circuit size at most .
is also given the parameters , and in unary.
Applebaum et al. define the following decision version of the PAC learning problem for circuits.
Definition 24 (Computational gap-learning problem (CGL) [2]).
The computational gap-learning problem with size parameters and security parameter , denoted , is the promise problem defined as follows. (We note that [2] define for a given completeness parameter, ie. a yes-instance has such that ; in this work, we only consider .) The input consists of a distribution over pairs represented as a circuit of size . (In this work, we abuse notation by taking to refer both to the distribution and the circuit sampling it.)
-
is a yes-instance if there exists a circuit of size at most such that
-
is a no-instance if for every circuit of size at most ,
Definition 25 (Oracle computational gap-learning problem ()).
Let be a polynomial-time-samplable distribution family such that each is supported over truth-tables of oracle functions . The -oracle computational gap-learning problem with size parameters , security parameter , and oracle parameter , denoted , is defined analogously to except all circuits (including the sampler ) have access to an oracle randomly drawn from . More specifically,
-
is a yes-instance if there exists an oracle circuit of size at most such that, for every oracle in the support of ,
-
is a no-instance if, with probability at least over , for every oracle circuit of size at most ,
Definition 26 ().
with gap parameter and soundness parameter , denoted , is the decision problem defined as follows. The input consists of a distribution over pairs represented as a circuit of size . The input also includes a threshold parameter and a running time parameter represented in unary.
-
is a yes-instance of if there exists a -time program of size at most such that for all in the support of .
-
is a no-instance of if for every -time program of size at most ,
2.2 Reductions to Learning
Here we define a form of reduction that is intermediate between the standard notions of a Karp reduction and a Levin reduction.
Definition 27 (Half-Levin reduction).
Consider languages with witness relations . A half-Levin reduction from to consists of polynomial-time machines and with the following properties.
-
For any -instance , ;
-
For any , it holds that .
A number of the results in this paper involve half-Levin reductions to , which belongs to (a promise version of) but not necessarily . Note that the definition of a half-Levin reduction can be extended to in the natural way. Specifically, in a half-Levin reduction from to , the machine , given some and witness , is required to produce a circuit of size at most such that
where is the output of . In addition, we say that a reduction to is honest if, on an input of length , it outputs a distribution supported over .
Definition 28 (Reduction to agnostic PAC learning circuits [2]).
For polynomials and and , a reduction from a language to agnostic PAC learning by over with failure probability consists of a PPT machine , which on input , makes queries that are joint distributions over for some , each represented by a sampling circuit. The oracle provides a hypothesis , represented as a -size circuit, as a response to each query. Based on the hypotheses and its own computation, accepts or rejects the input .
It is guaranteed that, for some inverse polynomial , if the oracle returns an -optimal hypothesis of size for each query , then with probability at least , .
If the failure probability is omitted, we take it to be .
If , we call the reduction honest.
Definition 29 (Reduction to agnostic oracle-PAC learning circuits).
We define reductions to agnostic oracle-PAC learning analogously to Definition 28. Specifically, the reduction samples random and then queries pairs to its oracle. The guarantee of the reduction holds if the oracle responds with an -optimal hypothesis of size at most for each query .
For our results concerning -secure witness encryption, we will require non-black-box notions of reduction. We start by recalling the standard definition of a class-specific black-box reduction.
Definition 30 (Class-specific black-box reduction [13]).
For a complexity class and languages and , a PPT oracle machine is a -black-box reduction from to if, for any oracle deciding , decides .
We will actually require a slightly less strict kind of reduction: rather than insisting everywhere, to be correct on a given input , the reduction will only require that an oracle agree with on the queries actually made given that input . For our purposes, we only need to consider the case of a reduction that generates its queries deterministically.
Definition 31 (Instance-wise class-specific black-box reduction).
Consider a complexity class , languages and , and a non-adaptive oracle machine that generates its queries deterministically but may use randomness elsewhere. is an instance-wise -black-box reduction from to if the following holds. For any oracle and instance of , if are the oracle queries made by on input and for all , then correctly decides .
We now explicitly define our required notion of an instance-wise -black-box half-Levin reduction to the search version of . Note that we may drop the qualifier “instance-wise” elsewhere in the presentation. Also note that the definition is extended from those above to the case of a reduction to a search problem.
Definition 32 (Instance-wise -black-box half-Levin reduction to ).
For polynomials and and a language , an instance-wise -black-box half-Levin reduction from to is an oracle machine as follows. On an input , deterministically makes one query, : a joint distribution supported on for some polynomial , represented by a sampling circuit.
(Instance-wise -black-box) Consider an input for some , and suppose the oracle is a PPT algorithm meeting the following condition:
Given the query made by on input , if is not a no-instance of , then with probability at least over its internal randomness, returns a hypothesis of size at most such that
Then, the following are guaranteed for such .
-
if , is a yes-instance of , and accepts with probability ;
-
if , is a no-instance of , and rejects with probability at least .
(Half-Levin) Further, there exists a polynomial-time computable function such that, for any input , if is a yes-instance of with witness , then outputs a witness for : that is, a circuit of size at most such that .
The lemma below shows that a reduction to implies a reduction to agnostic PAC learning as in Definition 28.
Lemma 33.
For a language , , and polynomials with for , suppose there is a randomized many-one reduction from to with failure probability . Then, there is a randomized poly-time reduction from to agnostic PAC learning by with failure probability , where for .
Proof.
Let be the assumed reduction to . Define a reduction as follows:
On input an instance of and a choice of randomness , simulate to produce a polynomial-size sampling circuit supported over . Query to the oracle, obtaining a hypothesis in return. If the size of is greater than , reject. By sampling times from , obtain an empirical estimate of . If the value obtained is less than , reject. Otherwise, accept.
We will show that constitutes a reduction to agnostic PAC learning by .
Consider an -instance . First suppose . Then, with probability at least over , is a yes-instance of . In this case,
Assume that the oracle returns an -optimal hypothesis of size : that is,
Then, by a Chernoff bound, with probability at least , accepts. Overall, for , accepts with probability greater than .
Now suppose is a no-instance of . In this case, with probability at least over , is a no-instance of : that is, for every circuit of size at most ,
Thus, cannot return a hypothesis with success better than over . So rejects with probability at least . Overall, for , accepts with probability at most , as desired.
By a very similar argument, we obtain the following for the oracle-PAC setting. We omit the proof here.
Lemma 34.
For a language , , , with for , and polynomials with for , suppose there is a randomized many-one reduction from to with failure probability . Then, there is a randomized poly-time reduction from to agnostic oracle-PAC learning by with failure probability , where for .
2.3 Meta-complexity
The following is a natural generalization of to the distributional version, where in addition to the truth table of a function , one is also given an explicit distribution over (as a table of probabilities), and is asked to distinguish “easy” ’s that can be well approximated by small boolean circuits from “hard” ’s that are almost impossible to approximate even by much larger boolean circuits.
Definition 35 ().
The gap Distributional Minimum Circuit Size problem with size parameters and advantage parameter , denoted , is the decision problem defined as follows. The input consists of the truth table of a function (represented as a truth-table of length bits) and a probability distribution (represented as a table of length bits).
-
is a yes-instance if there exists a circuit of size at most such that
-
is a no-instance if for every circuit of size at most ,
2.4 Secret Sharing
See [4] for a survey.
Definition 36 (Access Structures Represented by Monotone Formulae).
An access structure is a monotone collection of non-empty subsets of . For , we say that is authorized if it belongs to and unauthorized otherwise. We say that a monotone formula represents if exactly when , where is the characteristic vector of .
Definition 37 (Efficient Secret Sharing Scheme).
An efficient secret sharing scheme for an access structure represented by a monotone formula consists of a pair of poly-time algorithms and with the following properties. takes three inputs: the first is the formula , the second is a “secret” , and the third is a choice of “randomness” . also takes three inputs: the formula , a subset of “shares” , and the set .
-
Correctness: Authorized parties can recover the secret. Consider any and , let be the output of , and let be an authorized set. Then .
-
Privacy: Unauthorized parties cannot learn anything about the secret, in an information theoretic sense. Let be an unauthorized set. Then, for and , the random variable is statistically independent from the random variable .
Lemma 38 ([27, 5]).
For every access structure , there exists an efficient secret sharing scheme as in Definition 37. The circuit size required to simulate Rec is at most .
Proof sketch.
We will describe the construction given in [5]. Let be a monotone formula representing . Starting at the root of the tree representation of , the Share algorithm assigns values to the nodes in a recursive manner as follows. For a secret , assign to the root, and instantiate all shares for to be empty. Then:
-
If the current gate is and the current value is , assign to both children.
-
If the current gate is and the current value is , randomly choose . Assign to one child and to the other.
-
If the current gate is a variable and the current value is , include in the share .
Note that the amount of randomness required is at most . The Rec algorithm proceeds from the leaves of the tree back to the root. Start by assigning shares to leaves using the set . Then:
-
If the current gate is and both children have value , assign to the parent.
-
If the current gate is and the children have values and , assign to the parent.
-
If the current gate is the root, return the value assigned.
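The proof sketch above translates directly into code. Here is a minimal implementation, our own and for intuition only, with formulas as nested tuples `('var', i)`, `('and', f, g)`, `('or', f, g)` and one-bit secrets:

```python
import secrets

def share(formula, secret, shares=None):
    """Share a one-bit secret down the formula tree, as in the proof sketch:
    an OR gate copies its value to both children, an AND gate splits it into
    two XOR-shares, and a leaf appends the value to that party's share."""
    if shares is None:
        shares = {}
    op = formula[0]
    if op == 'var':
        shares.setdefault(formula[1], []).append(secret)
    elif op == 'or':
        share(formula[1], secret, shares)
        share(formula[2], secret, shares)
    else:  # 'and'
        r = secrets.randbelow(2)
        share(formula[1], r, shares)
        share(formula[2], secret ^ r, shares)
    return shares

def rec(formula, shares, authorized, idx=None):
    """Recover bottom-up: leaves of unauthorized parties yield None, an AND
    gate XORs its children, an OR gate takes whichever child is defined.
    Traversal order matches share(), so repeated variables stay aligned."""
    if idx is None:
        idx = {}
    op = formula[0]
    if op == 'var':
        p = formula[1]
        k = idx.get(p, 0)
        idx[p] = k + 1
        return shares[p][k] if p in authorized else None
    a = rec(formula[1], shares, authorized, idx)
    b = rec(formula[2], shares, authorized, idx)
    if op == 'or':
        return a if a is not None else b
    return a ^ b if a is not None and b is not None else None
```

For the access structure represented by (x0 ∧ x1) ∨ x2, the sets {0, 1} and {2} recover the secret, while {0} alone sees only a uniformly random bit.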
2.5 Witness Encryption
Definition 39 (Witness Encryption [11]).
Consider a language with witness relation , a class of function families , a security function , and a length function . A -secure -length witness encryption scheme for consists of polynomial-time algorithms and as follows.
-
Given an -instance , a message bit , and randomness , outputs a ciphertext of length at most .
-
Given a ciphertext and an instance/witness pair , deterministically outputs a bit .
Moreover, and have the properties below.
-
Correctness: For all sufficiently large , any , , and such that ,
-
Security: For any , sufficiently large , and ,
If is omitted, we assume it is some negligible function. If is omitted, we allow the output length of to be any polynomial.
Definition 40 (Witness Encryption in Oracle Worlds [21]).
Consider a language with witness relation , a class of function families , a security function , and a family of distributions , where each is supported over . A -secure witness encryption scheme for with respect to is defined as in Definition 39, except the algorithms and both take an oracle, and they satisfy the following properties.
-
Correctness: For all sufficiently large , in the support of , , , and such that ,
-
Security: For all sufficiently large , with probability over , for any and any ,
The following result from Huang, Ilango, and Ren was originally stated with security against oracle programs of polynomial size and query complexity; we observe that the same holds for polynomial-size circuits.
Lemma 41 ([21]).
There exist an -complete language , a constant , and a family of distributions , where each is supported over and samplable in time , such that there is a -secure witness encryption scheme for with respect to .
2.6 Hardness of Approximation
Definition 42 ().
For , denotes the following promise problem.
The input consists of a monotone formula over variables and of length at most as a binary string. Let denote .
Lemma 43 ([8, 9, 16]).
There exist with , , and a polynomial with for as follows. For every language , there is a half-Levin reduction from to .
Definition 44 ( [16]).
For , denotes the following promise problem. The input consists of a collection of monotone formulas over variables , where each has length at most as a binary string, along with a weight function . For an assignment , let denote .
2.7 One-Way Functions and Inversion
Definition 46 (-invertible (auxiliary-input) functions).
For , a function , a PPT algorithm , and , we say that is a -inverter for if
In the case of an auxiliary-input function , where , we say that is a -inverter for if
Definition 47 (Infinitely-often one-way function).
A polynomial-time computable function family is an infinitely-often one-way function if, for every PPT algorithm , there exists an inverse polynomial such that, for infinitely many , is not a -inverter for . (It is known that a weak one-way function is equivalent to a standard (strong) one-way function [47], so we will not distinguish between the two in this work.)
Definition 48 (Auxiliary-input one-way function).
A polynomial-time computable auxiliary-input function family is an auxiliary-input one-way function if, for every PPT algorithm , there exists an inverse polynomial such that, for infinitely many , is not a -inverter for .
Definition 49 (Statistical Distance).
The statistical distance between two probability distributions and over domain , denoted , is defined as
or equivalently,
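Both forms are easy to compute for explicit finite distributions. The snippet below is ours, with distributions given as dicts from outcomes to probabilities, and simply unpacks the two equivalent definitions:

```python
def sd_l1(P, Q):
    """Statistical distance as half the L1 distance (first form)."""
    support = set(P) | set(Q)
    return 0.5 * sum(abs(P.get(x, 0.0) - Q.get(x, 0.0)) for x in support)

def sd_sup(P, Q):
    """Equivalent form: the maximum of P(T) - Q(T) over events T,
    attained at T = {x : P(x) > Q(x)}."""
    T = [x for x in set(P) | set(Q) if P.get(x, 0.0) > Q.get(x, 0.0)]
    return sum(P.get(x, 0.0) for x in T) - sum(Q.get(x, 0.0) for x in T)
```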
Definition 50 (Distributionally invertible auxiliary-input functions).
Consider an auxiliary-input function computable uniformly in polynomial time given and an input . The function is said to be distributionally invertible if, for every constant , there is a PPT algorithm such that
where . We refer to the machine as an -distributional inverter for .
Lemma 51 (Strong inversion implies distributional inversion [25]).
For every polynomial-time computable auxiliary-input function and constant , there is a constant such that the following holds. If there exists an -inverter for , then there exists an -distributional inverter for .
3 Proofs of Main Results
3.1 Connections between WE and learning
3.1.1 Proof of Theorem 4
Lemma 52.
Consider a language . For , suppose a -secure -length witness encryption scheme exists for . Let the polynomial be such that upper bounds the running times of and on -instances of length . Let , , and .
Then, there is a half-Levin reduction from to mapping -instances of length to distributions supported over .
Proof.
Given an -instance , the reduction to will output a distribution as follows.
Sample . Let . Output .
First consider the case that . Let be a witness for . Let be the circuit that, with and hard-wired, given as input, outputs . Note that the size of is less than . Moreover, since is efficiently computable given and , the reduction is a half-Levin reduction. By the correctness of the witness encryption scheme, we have
Thus, is a yes-instance of .
Now consider the case that . Toward a contradiction, suppose there exists a circuit of size at most such that
Then,
which implies that
contradicting the security of the witness encryption scheme. Thus, is a no-instance of .
This completes the proof of the lemma.
Observe that a -secure encryption scheme is -secure for all polynomials and . Thus, we obtain the corollary below, which gives the first direction of Theorem 4.
Corollary 53.
Consider any language , and suppose a -secure witness encryption scheme exists for . Then, there exist a polynomial and a pair of machines such that, for all polynomials , is an honest half-Levin reduction from to .
Lemma 54.
Consider any polynomials , any , and any language . Suppose there is a half-Levin reduction from to , where on -instances of length , outputs distributions supported over in time , and runs in time .
Then, there is a -secure -length witness encryption scheme for , where , , runs in time , and runs in time .
Proof.
We define a witness encryption scheme for as follows.
For and , computes , samples and then outputs .
computes and then outputs .
We first show the correctness of . Suppose is a yes-instance of with witness . Then, by the correctness of the half-Levin reduction, is a witness for ; that is, is a circuit of size at most such that
Then, for any and any random choice of made by , it holds that , so correctly outputs .
We now argue for security. Suppose is a no-instance of . Then is a no-instance of ; that is, for any circuit of size at most ,
Let be any circuit of size at most . Toward a contradiction, suppose the following. (We have removed the absolute value signs without loss of generality: a similar argument can be applied if the left-hand side is at most .)
By definition of , the above can be written as
Let denote Yao’s next-bit predictor applied to . In particular,
On input , samples a random bit and outputs iff .
By a standard argument, it holds for some fixed choice of that
Finally, observe that with randomness fixed as above can be implemented as a circuit of size at most . This contradicts the assumption that is a no-instance of . We conclude that is -secure.
This completes the proof of the lemma.
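Yao's distinguisher-to-predictor step used above can be sketched as follows. This is a toy in which the distinguisher is perfect, so the predictor is too; in the actual proof one only obtains an inverse-polynomial advantage.

```python
import random

def predictor_from_distinguisher(D):
    """Yao's next-bit predictor: on input z, guess a random bit, keep the
    guess iff the distinguisher D accepts (z, guess), and flip it otherwise."""
    def P(z):
        guess = random.randrange(2)
        return guess if D(z, guess) else 1 - guess
    return P

# Toy: "real" pairs satisfy z = b, so D accepts exactly the correctly
# labeled pairs, and the resulting predictor always outputs z.
perfect_D = lambda z, b: z == b
P = predictor_from_distinguisher(perfect_D)
```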
Since a witness encryption scheme that is -secure for all polynomials and is -secure, we obtain the corollary below, which gives the second direction of Theorem 4.
Corollary 55.
Consider any language . Suppose there exist a polynomial and a pair of machines such that, for all polynomials , is an honest half-Levin reduction from to . Then, there is a -secure witness encryption scheme for .
3.1.2 Proof of Theorem 6
Lemma 56.
Consider any language , and suppose a -secure witness encryption scheme exists for .
Then, there exist a polynomial and a pair of machines such that, for all polynomials and , is an honest -black-box half-Levin reduction from to .
Proof.
Consider a language with witness relation for some constant and an -instance . Let (, ) be a -secure -length witness encryption scheme for . Let the polynomial be such that upper bounds both and the running time of on -instances of length . As in Lemma 52, we consider the following distribution .
Sample . Let and output .
Our reduction will make one query to its oracle: namely, . After receiving a hypothesis in return, will sample from and simulate to obtain an empirical estimate of . will accept iff the estimate of this value is greater than .
Let be a PPT algorithm, an instance of , and arbitrary polynomials. Assume that meets the condition in Definition 32 on : namely, letting be the query made by , if is not a no-instance of , then with probability at least over its internal randomness, returns a hypothesis with agreement at least over .
First consider the case that . Let be a witness for . Let be the circuit that, with and hard-wired, given as input, outputs . Note that the size of is at most . Without loss of generality, assume . By the correctness of the witness encryption scheme, we have
This implies that is a yes-instance of . Moreover, by the assumption on , outputs a hypothesis with agreement at least over with probability at least , so the reduction accepts with probability at least overall.
Now consider the case that . Toward a contradiction, suppose that with probability at least , returns a hypothesis such that
Let be the algorithm that, on input , constructs the distribution from as above, simulates on to obtain a hypothesis , and then outputs . Then,
By the reasoning in Lemma 52 and the definition of , this implies that
contradicting the security of the witness encryption scheme on input . So, with probability greater than , returns a hypothesis with agreement over less than , in which case the reduction rejects with probability . Overall, the reduction rejects with probability greater than , as desired. Finally, by the assumption on , must be a no-instance of .
Finally, note that the reduction is half-Levin, since a circuit simulating can always be constructed from an instance and witness in polynomial time. The correctness of the -black-box half-Levin reduction follows, completing the proof of the lemma.
Lemma 57.
Consider any language . Suppose there exist a polynomial and a pair of machines such that, for all polynomials , is a -black-box half-Levin reduction from to . Then, -secure witness encryption exists for .
Proof.
Let be a -black-box half-Levin reduction from to , and let the polynomial be such that maps -instances of length to distributions supported over . We define a witness encryption scheme for as in Lemma 54. In particular:
For and , computes the query made by , samples and then outputs .
computes and then outputs .
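Schematically, the scheme defined above can be sketched as follows. All names here (`reduction_query`, `sample_from`, `witness_to_circuit`) are hypothetical stand-ins for, respectively, the query made by the reduction, the sampler for its distribution, and the witness-to-circuit map guaranteed by the half-Levin property; this is a sketch of the structure, not the paper's formal construction.

```python
def encrypt(instance, bit, reduction_query, sample_from):
    """Enc: compute the learning query determined by the instance, then
    release one sampled example; the secret bit is its hidden label."""
    query = reduction_query(instance)
    return sample_from(query, bit)

def decrypt(ciphertext, instance, witness, witness_to_circuit):
    """Dec: the half-Levin witness map yields a small circuit that labels
    every example in the support correctly; evaluate it on the ciphertext."""
    circuit = witness_to_circuit(instance, witness)
    return circuit(ciphertext)
```

Correctness of decryption then reduces to the completeness of the half-Levin reduction, and security to its soundness, as argued below.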
We will first show the correctness of . Suppose is a yes-instance of with witness . By the definition of a half-Levin reduction to , is a circuit of size at most such that
Then, for any ,
as desired.
We will now show the security of . Toward a contradiction, suppose that for some PPT algorithm and polynomial , for infinitely many and no-instances of ,
Let the polynomial be such that upper bounds the running time of on inputs of length , and let .
Consider the PPT algorithm that, on input , outputs a circuit describing as in Lemma 54, fixing the bit that maximizes success probability over , which determines by sampling from and taking empirical estimates. By the same reasoning as in Lemma 54, with probability at least , outputs a hypothesis of size at most such that
Thus, for infinitely many inputs , meets the condition on the oracle of an instance-wise -black-box reduction to -: namely, given the query made by , returns a hypothesis with non-trivial success probability over . However, is not a no-instance of for the no-instance of . This contradicts the correctness of .
3.1.3 Proof of Theorem 7
Lemma 58.
Consider a language . For and a constant , assume that -secure -length witness encryption exists for . Let the polynomial be such that upper bounds the running time of on -instances of length . Let , , and . Assume, for some -computable distribution family , that .
Then, there is a half-Levin reduction from to mapping -instances of length to truth-tables of length , where runs in time at most .
Proof.
The reduction will first define a distribution as in the proof of Lemma 52: namely,
Sample . Let . Output .
Let denote the marginal distribution of over .
The reduction then proceeds as follows. If, for all , there is a unique such that , constructs a truth-table such that
and outputs . If, for any , both and are in , then outputs , where and . This function exists by our assumption, and both it and a table representing can be constructed uniformly in time . Note that is a no-instance of .
In the case that , for all , there is a unique such that ; this follows from the correctness of the witness encryption scheme. Thus, using a circuit for as in the proof of Lemma 52, we get that is a yes-instance of . Let be the function that, given and a witness for , outputs a circuit describing . It follows that is a half-Levin reduction. Note that runs in time at most .
In the case that , if both , then correctly outputs a no-instance of . Now supposing that the labels associated with strings are unique, let be any circuit of size . Toward a contradiction, suppose
As in Lemma 52, this implies
contradicting the security of the witness encryption scheme. Thus, is a no-instance of .
This completes the proof of the lemma.
Lemma 59.
Consider a constant and . Suppose that there is a half-Levin reduction from to , where maps -instances of length to truth-tables of length , and runs in time at most on -instances of length . Let , , and .
Then, there is a -secure -decryption-time -length witness encryption scheme for . Moreover, there exists an -computable distribution family such that .
Proof.
We argue as in the proof of Lemma 54. In particular, the witness encryption scheme for is defined as follows.
For and , computes , samples and then outputs .
computes and then outputs .
Note that the output length of is , and that the running time of is at most .
For , by the same reasoning as in Lemma 54, we have that is -secure.
To see the second conclusion of the lemma, let be a trivial no-instance of , constructible uniformly in time . Let . By the correctness of the reduction and the definition of , we have that, for all circuits of size ,
Also note that tables representing both and are uniformly constructible in time , as desired.
3.2 NP-hardness of learning polynomial-size circuits
We start with the following auxiliary lemmas.
Lemma 60 ([40, 44]).
For all sufficiently large with , for some there are sets constructible in time such that and
for all .
Lemma 61 ([20]).
For , consider the “multiselector” function that, given a string and indices , outputs .
For all , there is a logspace-uniform boolean circuit of size computing .
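Functionally, the multiselector simply reads the requested positions of its input; the nontrivial content of the lemma is that this can be done by a logspace-uniform circuit of near-linear size. A plain (non-circuit) reference implementation of the function itself:

```python
def multiselect(x, indices):
    """Given a string x and a sequence of indices, output the
    corresponding symbols of x, in order."""
    return "".join(x[i] for i in indices)
```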
Lemma 62 ([26]; see [16] and [15]).
There exist constants as follows. Consider a function and a parameter . There is a function such that the truth-table of can be computed from that of in time given , and has the following properties.
-
Local encodability: There is a non-adaptive -oracle circuit of size at most computing , making at most queries to .
-
Decodability: For any oracle and sufficiently large time bound ,
The following is the “algorithmic information extraction” lemma of [16] modified to give a set defined in terms of rather than . It is easy to see from the proof in [16] that can be defined in this way.
Lemma 63.
Consider an algorithm running in time , parameters , and for . Let
be a family of -sized subsets of . Define
where for ,
Define a set
Then, for any advice string ,
where is such that for and for . The probabilities above are over and .
Lemma 64.
For some polynomial and constant , for any polynomials with , , and for , there exists a polynomial-time machine satisfying the following. For a polynomial with for , let
and consider an advice string such that, for every subset ,
Then, given as input and an instance of over variables, outputs a circuit sampling a flat distribution supported over as follows.
-
If is a yes-instance of , then there exists a circuit of size at most such that
-
If is a no-instance of , then for every circuit of size at most
it holds that
Proof.
Let be a given instance of , where is over variables. Let be as in Lemma 62. Consider an arbitrary polynomial such that . Define . Let satisfy the condition stated in the lemma.
The reduction .
Identifying each with a function , let be the encoded function as in Lemma 62 with . Let be the sets of size guaranteed by Lemma 60 with .
For any choice of , , and , define , and define the string as
padded to have length . The output of the reduction is a circuit sampling the following distribution :
For uniformly random , , and , output .
Note that has a -size sampling circuit and that the marginal distribution of over is flat.
Completeness.
To see the completeness of the reduction, suppose , and let where is an assignment of Hamming weight at most satisfying .
We will construct a circuit of size at most such that for all in the support of . takes as input a string with and for , padded to have length . uses encoding circuits as in Lemma 62 to compute the values
by making at most total queries to . answers these queries with the string hard-wired along with a multiselector circuit as in Lemma 61. then XORs the outputs of the encoding circuits with the appropriate inputs to obtain an authorized set of shares . Finally, applies with and hard-wired to obtain the secret .
consists of input gates, gates hard-wiring , gates for multiselection, encoding circuits each of size , gates to XOR, and gates for Rec. This amounts to at most gates in total.
Thus, there exists a circuit of size at most such that
Soundness.
We now argue for the soundness of the reduction. Suppose . For a fixed circuit of size , define a program as follows. takes as advice a string , where , and accepts its input if and only if
Note that runs in time at most . Now, define a set as in Lemma 63, applying the lemma with for . In particular,
By Lemma 63,
(1)
where and are as defined in that lemma.
We claim that is not too large. Note that for , by the decodability statement of Lemma 62, for some polynomial ,
(definition of )
Thus, for ,
It follows that
so is not an authorized set.
Now observe that
(definition of )
(Eq. (3.2))
where the last line follows from the security of . We conclude that for any circuit of size at most ,
This completes the proof of the lemma.
Lemma 65 (Claim 7.3 of [16]).
For a uniformly random string , it holds with probability that for all subsets ,
Theorem 66 (Restatement of Theorem 8).
For some constant and with for , for any polynomials such that for , there is a randomized many-one reduction from to .
Proof.
By Lemma 43, there is a reduction from to for some
polynomial , and . Let be the constant of Lemma 64, and define the constant such that . For arbitrary polynomials , define and for . We will apply Lemma 64 with parameters , , , and .
More specifically, our reduction will first apply to obtain an instance of on variables, and then it will sample a random string . It will then apply as in Lemma 64 to obtain an instance of supported over , where
where we define .
3.3 NP-hardness of improper learning in other settings
Theorem 67 (Restatement of Theorem 12).
There exist a distribution , an -complete language , and a polynomial such that, for all polynomials , there is a function such that for and reduces to under an honest half-Levin reduction.
Proof.
Let be the distribution family and the -complete language of Lemma 41 with witness relation . That is, each is supported over truth-tables of functions , and there is an oracle witness encryption scheme for with respect to secure against programs of size and query complexity and with advantage for some constant .
Let be such that upper-bounds the running time of on -instances of length , and define . Let and be given, and define for .
We now give a reduction from to . Given an -instance for some , the reduction outputs as follows.
Given access to an oracle , sample , and let . Output .
First suppose . Let be such that . Consider the oracle circuit that, with hard-wired, given input and oracle , simulates and returns its output . Note that has size at most . By the correctness guarantee of , for every in the support of and every in the support of , it holds that . Therefore, in this case, the reduction outputs a yes-instance of .
Now suppose , and let be any oracle circuit of size at most . Note that
so is secure against . Suppose for some oracle that
By the same reasoning as in Lemma 52,
By the security of the witness encryption scheme, this occurs with probability at most over . Thus, in this case, the reduction outputs a no-instance of .
Corollary 68 (Restatement of Corollary 13).
For some polynomial , for any constant , it is -hard under a randomized one-query reduction to agnostically oracle-PAC learn by .
Theorem 12 also implies -hardness of learning RAMs in the improper setting, as indicated below.
Corollary 69 (Restatement of Corollary 14).
For all polynomials and , the problem is -hard under a randomized many-one reduction.
Proof.
We show that as in Theorem 12 reduces to . The main idea is to include the whole truth-table of an oracle sampled from in all the inputs to a RAM.
Let the polynomials and be given. Let be as in Theorem 12. Let , and let be as in Theorem 12 applied with and . Let be the distribution guaranteed by Theorem 12, and recall that is supported over truth-tables of length , where .
The reduction, given an instance of , samples and then defines the following distribution :
Sample , let , and output .
The reduction outputs .
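The "include the whole truth-table in every input" step can be sketched as below. This is a hedged illustration, not the formal reduction: `sample_oracle_example` is a hypothetical stand-in for the sampler of the oracle-learning distribution, and the truth table is represented as a string prefix.

```python
def lift_distribution(truth_table, sample_oracle_example):
    """Convert a distribution over labeled examples for an oracle learner
    into a distribution over RAM inputs by prepending the oracle's entire
    truth table, so a RAM program can recover any oracle answer by random
    access into its own input."""
    def sample():
        z, label = sample_oracle_example()
        return truth_table + z, label
    return sample
```

Random access is essential here: a RAM program can read any single bit of the prepended table in unit time, mirroring an oracle query, without the size blow-up a circuit would incur.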
First suppose that is a yes-instance of . In this case, there is an oracle circuit of size at most such that, for all in the support of and all in the support of , . Note that can be simulated by a program of size and running time at most . We obtain that for all ,
so is a yes-instance of .
Now suppose that is a no-instance of . Consider any random-access machine of size and running time at most . Note that can be simulated by an oracle circuit of size at most , where we may let the “” part of the input be a standard (non-oracle) input to . Then, since is a no-instance of , with probability at least over ,
This implies, for such oracles , that
Thus, with probability at least over the randomness of the reduction,
is a no-instance of , as desired.
3.4 Excluding Pessiland, Heuristica, and Minicrypt
We show that if the hardness of learning can be based on the hardness of (under a non-adaptive reduction), then so can the existence of OWFs. Applying our main results connecting witness encryption and the -hardness of learning, we obtain Theorems 16 and 17 as consequences.
3.4.1 Proof of Theorem 15
We will make use of the following lemmas.
Lemma 70.
Let be a polynomial-time computable auxiliary input function. If infinitely-often one-way functions do not exist, then for every distribution family and constant , there exists a PPT machine such that for all sufficiently large ,
Moreover, if does not -invert for some , then for any ,
Proof.
Assume that i.o. one-way functions do not exist. Let be a sampler for . For a constant , let be an -inverter for the function defined as . Define the PPT machine as follows.
On input , sample uniformly random strings . Compute the fraction of such that fails to output a pre-image of its input under . If is less than , output . Otherwise, output .
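The sampling-and-thresholding test above can be sketched as follows, with `inverter` and `f` as hypothetical stand-ins for the assumed inverter and the auxiliary-input function, and the threshold a placeholder for the paper's (omitted) parameter.

```python
def failure_fraction(inverter, f, inputs):
    """Fraction of sampled inputs on which a candidate inverter fails to
    return a preimage of f(input)."""
    failures = 0
    for x in inputs:
        y = f(x)
        if f(inverter(y)) != y:      # returned value is not a preimage of y
            failures += 1
    return failures / len(inputs)

def looks_invertible(inverter, f, inputs, threshold=0.25):
    # Declare the function "easy to invert" iff the empirical failure
    # rate is below the threshold.
    return failure_fraction(inverter, f, inputs) < threshold
```

Note that the test checks only that a preimage is returned, not that it equals the original input; this matches the standard notion of inverting a function.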
We first claim that, with probability at least over , is an inverter for . Toward a contradiction, suppose the claim does not hold. That is, with probability greater than over , only inverts with success probability . But then the overall probability of inverting , for uniformly random and , is at most
contradicting our assumption on . Moreover, by a Chernoff bound, the probability that is greater than is less than . By the definition of and a union bound, this proves the first part of the lemma.
For the second part, suppose that does not -invert . Then the expected value of in the definition of is at least . By a Chernoff bound, the probability that is less than is less than . This completes the proof of the lemma.
We will require the following lemma, which states that if is easy on average for randomized algorithms, then agnostic PAC learning over efficiently samplable distributions is possible.
Lemma 71 ([12]).
If , then there is a PPT machine that, for any polynomial , for all sufficiently large , agnostically learns in the improper setting over distributions in .
It remains to exhibit a reduction from to inverting an auxiliary input one-way function, assuming -hardness of learning. The following lemma is as in [2]. We include a proof for completeness.
Lemma 72 ([2]).
Suppose there is a randomized non-adaptive honest reduction and a polynomial such that, for every constant , reduces to agnostically PAC learning by . Then there exist a poly-time computable auxiliary input function family , a constant , and a PPT oracle machine satisfying the following: for any , , and PPT machine that -distributionally inverts ,
Proof.
Let be the assumed reduction from to agnostic PAC learning, let the polynomial denote its running time, and let the inverse polynomial be such that requires an -optimal hypothesis in response to each query. Define . Define an auxiliary-input function
where (henceforth denoted ) is the query of on input and randomness , sampling a distribution . In particular, for , represents a random sample . Note that all such samplers can be simulated uniformly in polynomial time given , by the efficiency of .
Let be a PPT machine that distributionally inverts . In particular, we require the success probability of to be such that, with probability over , for every , -distributionally inverts .
For , define hypotheses as follows:
On input , invoke to find a pre-image under of , and then return ,
and define a machine (taking oracle access to ) as follows:
On input and randomness , simulate , obtaining queries . Answer the query of with the hypothesis , which evaluates on independent choices of randomness (used by ) and then outputs the majority output . After answering all the queries this way and completing the simulation of , accept iff accepts.
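The majority-vote answering step can be sketched as follows; `randomized_predict` is a hypothetical stand-in for one run of the hypothesis with fresh internal randomness.

```python
from collections import Counter

def majority_hypothesis(randomized_predict, z, trials):
    """Answer a query point by taking the majority output over several
    independent runs of a randomized predictor, amplifying its per-input
    success probability."""
    outputs = [randomized_predict(z) for _ in range(trials)]
    return Counter(outputs).most_common(1)[0][0]
```

As the Chernoff-bound analysis below makes precise, if each run is correct with probability bounded away from one half, the majority over polynomially many trials is correct except with exponentially small probability.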
For , let be a fixed function (of arbitrary complexity) that maximizes
We will argue that predicts almost as well as over .
For any , let the random variable denote the conditional distribution given that , for . For simplicity, start by considering the hypotheses and , which are the same as and except taking an oracle that “perfectly” distributionally inverts . That is,
for . Observe that for , outputs of are distributed exactly according to . In particular, for and ,
(def. of )
(2)
Now, for and strings in the support of , define
That is, is the label most likely to be associated with by , and is the corresponding conditional probability. Note that must always output this optimal label , so is the success probability of on input . Moreover, by Eq. (2), for every ,
(3)
We now fix and and break the subsequent analysis into two cases: either predicts the label of with probability substantially bounded away from , or not.
Case I: .
For , let be the Bernoulli random variable that equals if in the iteration of , and otherwise. Recall that iff the former event holds in a majority of trials . Let
and note that, by Eq. (3),
Then, by a Chernoff bound,
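For concreteness, a standard form of the Chernoff/Hoeffding bound invoked here is the following, where $p$ and $k$ stand in for the (omitted) per-trial success probability and number of trials; the paper's exact parameters are instantiated in the surrounding expressions:

$$\Pr\left[\sum_{i=1}^{k} X_i \le \frac{k}{2}\right] \le \exp\!\big(-2k\,(p - \tfrac{1}{2})^2\big), \qquad X_i \sim \mathrm{Bernoulli}(p),\ p > \tfrac{1}{2},$$

so the majority vote errs with probability exponentially small in the number of trials.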
Case II: .
Now,
Thus, in both Cases I and II, it holds that
Moreover,
for . Overall, by a union bound and the definition of ,
as desired.
We are now ready to finish the proof of Theorem 15, restated below.
Theorem 73 (Restatement of Theorem 15).
Suppose there is a randomized non-adaptive honest reduction and a polynomial such that, for every constant , reduces to agnostically PAC learning by . Then, unless , there exist infinitely-often one-way functions.
Proof.
Assume the non-existence of i.o. one-way functions. Let be the assumed randomized non-adaptive reduction from to agnostic learning running in polynomial time . Let the inverse polynomial be such that requires -optimal hypotheses in response to each query, and let the constant be as in Lemma 72. Combining Lemma 70 with Lemma 51, for every distribution family , there exists a PPT machine such that, for all sufficiently large ,
and if does not -invert for some , then for any ,
Along with Lemma 72, this implies . Let be the PPT agnostic learner for guaranteed by Lemma 71, with accuracy and confidence parameters chosen such that for all .
Now consider the PPT machine defined as follows.
On input and randomness , simulate , obtaining queries , where each samples a distribution . For each such , let be the output of on input . Answer the query of with . Accept iff accepts.
By Lemma 71, for all sufficiently large , for each , with probability over the randomness of , returns an -optimal hypothesis for .
By a union bound and the definition of ,
We conclude that , as desired.
3.4.2 Proofs of Theorems 16 and 17
We will need the following lemma. For the reader’s convenience, we give a sketch of the proof in Appendix B. Though not originally stated for the infinitely-often setting, it is easily seen to follow from the same argument.
Lemma 74 ([11]).
Suppose -secure witness encryption exists for . Then, if infinitely-often one-way functions exist, there exists public-key encryption secure infinitely often against polynomial-time adversaries.
4 Concluding Remarks
We have shown that the existence of secure witness encryption is equivalent to -hardness of a certain learning problem () under restricted deterministic reductions. This may be taken as evidence that computational learning is indeed -hard, if one believes in cryptography. At the same time, this result also suggests that one may want to consider more general kinds of reductions when trying to prove -hardness of learning, as establishing the existence of secure witness encryption seems hard. On the other hand, actually proving -hardness of learning under restricted deterministic reductions would be a breakthrough: it would imply that public-key cryptography can be based on the worst-case assumption that .
There are many interesting research directions to explore in the context of cryptography and -hardness of meta-complexity problems. For example, Mazor and Pass [38] have recently shown that the existence of secure Indistinguishability Obfuscation (a cryptographic primitive that implies Witness Encryption) would mean that a polynomial-gap version of cannot be shown -complete under Levin reductions, unless . Can one obtain similar results for some computational learning problems? What if one assumes only secure Witness Encryption? Another possible research direction is to find out whether existing tools from -hardness of learning could unconditionally yield new non-trivial forms of witness encryption (e.g., with security against restricted circuit classes).
Lastly, of course, the -hardness of improper learning for remains open. We have made some partial progress, and we have demonstrated cryptographic consequences of -hardness under certain strong formulations, but we are optimistic that further progress can be made.
References
- [1] Michael Alekhnovich, Samuel R. Buss, Shlomo Moran, and Toniann Pitassi. Minimum propositional proof length is NP-hard to linearly approximate. J. Symb. Log., 66(1):171–191, 2001. doi:10.2307/2694916.
- [2] Benny Applebaum, Boaz Barak, and David Xiao. On basing lower-bounds for learning on worst-case assumptions. In 49th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2008, October 25-28, 2008, Philadelphia, PA, USA, pages 211–220. IEEE Computer Society, 2008. doi:10.1109/FOCS.2008.35.
- [3] Boaz Barak, Oded Goldreich, Russell Impagliazzo, Steven Rudich, Amit Sahai, Salil P. Vadhan, and Ke Yang. On the (im)possibility of obfuscating programs. J. ACM, 59(2):6:1–6:48, 2012. doi:10.1145/2160158.2160159.
- [4] Amos Beimel. Secret-sharing schemes: A survey. In Yeow Meng Chee, Zhenbo Guo, San Ling, Fengjing Shao, Yuansheng Tang, Huaxiong Wang, and Chaoping Xing, editors, Coding and Cryptology – Third International Workshop, IWCC 2011, Qingdao, China, May 30-June 3, 2011. Proceedings, volume 6639 of Lecture Notes in Computer Science, pages 11–46. Springer, 2011. doi:10.1007/978-3-642-20901-7_2.
- [5] Josh Cohen Benaloh and Jerry Leichter. Generalized secret sharing and monotone functions. In Shafi Goldwasser, editor, Advances in Cryptology – CRYPTO ’88, 8th Annual International Cryptology Conference, Santa Barbara, California, USA, August 21-25, 1988, Proceedings, volume 403 of Lecture Notes in Computer Science, pages 27–35. Springer, 1988. doi:10.1007/0-387-34799-2_3.
- [6] Avrim Blum, Merrick Furst, Michael Kearns, and Richard J. Lipton. Cryptographic primitives based on hard learning problems. In Douglas R. Stinson, editor, Advances in Cryptology — CRYPTO’ 93, pages 278–291, Berlin, Heidelberg, 1994. Springer Berlin Heidelberg.
- [7] Marco L. Carmosino, Russell Impagliazzo, Valentine Kabanets, and Antonina Kolokolova. Agnostic learning from tolerant natural proofs. In Klaus Jansen, José D. P. Rolim, David Williamson, and Santosh S. Vempala, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2017, August 16-18, 2017, Berkeley, CA, USA, volume 81 of LIPIcs, pages 35:1–35:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPICS.APPROX-RANDOM.2017.35.
- [8] Irit Dinur, Prahladh Harsha, and Guy Kindler. Polynomially low error PCPs with polyloglog n queries via modular composition. In Rocco A. Servedio and Ronitt Rubinfeld, editors, Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 267–276. ACM, 2015. doi:10.1145/2746539.2746630.
- [9] Irit Dinur and Shmuel Safra. On the hardness of approximating label-cover. Inf. Process. Lett., 89(5):247–254, 2004. doi:10.1016/J.IPL.2003.11.007.
- [10] Sanjam Garg, Craig Gentry, Shai Halevi, Mariana Raykova, Amit Sahai, and Brent Waters. Candidate indistinguishability obfuscation and functional encryption for all circuits. SIAM J. Comput., 45(3):882–929, 2016. doi:10.1137/14095772X.
- [11] Sanjam Garg, Craig Gentry, Amit Sahai, and Brent Waters. Witness encryption and its applications. In Dan Boneh, Tim Roughgarden, and Joan Feigenbaum, editors, Symposium on Theory of Computing Conference, STOC’13, Palo Alto, CA, USA, June 1-4, 2013, pages 467–476. ACM, 2013. doi:10.1145/2488608.2488667.
- [12] Halley Goldberg, Valentine Kabanets, Zhenjian Lu, and Igor C. Oliveira. Probabilistic kolmogorov complexity with applications to average-case complexity. In Shachar Lovett, editor, 37th Computational Complexity Conference, CCC 2022, July 20-23, 2022, Philadelphia, PA, USA, volume 234 of LIPIcs, pages 16:1–16:60. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.CCC.2022.16.
- [13] Dan Gutfreund and Amnon Ta-Shma. Worst-case to average-case reductions revisited. In Moses Charikar, Klaus Jansen, Omer Reingold, and José D. P. Rolim, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, 10th International Workshop, APPROX 2007, and 11th International Workshop, RANDOM 2007, Princeton, NJ, USA, August 20-22, 2007, Proceedings, volume 4627 of Lecture Notes in Computer Science, pages 569–583. Springer, 2007. doi:10.1007/978-3-540-74208-1_41.
- [14] Johan Håstad, Russell Impagliazzo, Leonid A. Levin, and Michael Luby. A pseudorandom generator from any one-way function. SIAM J. Comput., 28(4):1364–1396, 1999. doi:10.1137/S0097539793244708.
- [15] Shuichi Hirahara. Non-disjoint promise problems from meta-computational view of pseudorandom generator constructions. In Shubhangi Saraf, editor, 35th Computational Complexity Conference, CCC 2020, July 28-31, 2020, Saarbrücken, Germany (Virtual Conference), volume 169 of LIPIcs, pages 20:1–20:47. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPICS.CCC.2020.20.
- [16] Shuichi Hirahara. NP-hardness of learning programs and partial MCSP. In 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022, Denver, CO, USA, October 31 – November 3, 2022, pages 968–979. IEEE, 2022. doi:10.1109/FOCS54457.2022.00095.
- [17] Shuichi Hirahara. Capturing one-way functions via NP-hardness of meta-complexity. In Barna Saha and Rocco A. Servedio, editors, Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, Orlando, FL, USA, June 20-23, 2023, pages 1027–1038. ACM, 2023. doi:10.1145/3564246.3585130.
- [18] Shuichi Hirahara and Mikito Nanashima. Learning in pessiland via inductive inference. In 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pages 447–457, 2023. doi:10.1109/FOCS57990.2023.00033.
- [19] Shuichi Hirahara and Mikito Nanashima. One-way functions and zero knowledge. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, editors, Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024, pages 1731–1738. ACM, 2024. doi:10.1145/3618260.3649701.
- [20] Justin Holmgren and Ron Rothblum. Linear-size boolean circuits for multiselection. In Rahul Santhanam, editor, 39th Computational Complexity Conference, CCC 2024, July 22-25, 2024, Ann Arbor, MI, USA, volume 300 of LIPIcs, pages 11:1–11:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPICS.CCC.2024.11.
- [21] Yizhi Huang, Rahul Ilango, and Hanlin Ren. NP-hardness of approximating meta-complexity: A cryptographic approach. In Barna Saha and Rocco A. Servedio, editors, Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, Orlando, FL, USA, June 20-23, 2023, pages 1067–1075. ACM, 2023. doi:10.1145/3564246.3585154.
- [22] Russell Impagliazzo. A personal view of average-case complexity. In Proceedings of the Tenth Annual Structure in Complexity Theory Conference, Minneapolis, Minnesota, USA, June 19-22, 1995, pages 134–147. IEEE Computer Society, 1995. doi:10.1109/SCT.1995.514853.
- [23] Russell Impagliazzo, Valentine Kabanets, and Ilya Volkovich. Synergy between circuit obfuscation and circuit minimization. In Nicole Megow and Adam D. Smith, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2023, September 11-13, 2023, Atlanta, Georgia, USA, volume 275 of LIPIcs, pages 31:1–31:21. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.APPROX/RANDOM.2023.31.
- [24] Russell Impagliazzo and Leonid A. Levin. No better ways to generate hard NP instances than picking uniformly at random. In 31st Annual Symposium on Foundations of Computer Science, St. Louis, Missouri, USA, October 22-24, 1990, Volume II, pages 812–821. IEEE Computer Society, 1990. doi:10.1109/FSCS.1990.89604.
- [25] Russell Impagliazzo and Michael Luby. One-way functions are essential for complexity based cryptography (extended abstract). In 30th Annual Symposium on Foundations of Computer Science, Research Triangle Park, North Carolina, USA, 30 October – 1 November 1989, pages 230–235. IEEE Computer Society, 1989. doi:10.1109/SFCS.1989.63483.
- [26] Russell Impagliazzo and Avi Wigderson. P = BPP if E requires exponential circuits: Derandomizing the XOR lemma. In Frank Thomson Leighton and Peter W. Shor, editors, Proceedings of the Twenty-Ninth Annual ACM Symposium on the Theory of Computing, El Paso, Texas, USA, May 4-6, 1997, pages 220–229. ACM, 1997. doi:10.1145/258533.258590.
- [27] Mitsuru Ito, Akira Saito, and Takao Nishizeki. Multiple assignment scheme for sharing secret. J. Cryptol., 6(1):15–20, 1993. doi:10.1007/BF02620229.
- [28] Aayush Jain, Huijia Lin, and Amit Sahai. Indistinguishability obfuscation from well-founded assumptions. Commun. ACM, 67(3):97–105, 2024. doi:10.1145/3611095.
- [29] Valentine Kabanets and Jin-yi Cai. Circuit minimization problem. In F. Frances Yao and Eugene M. Luks, editors, Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, May 21-23, 2000, Portland, OR, USA, pages 73–79. ACM, 2000. doi:10.1145/335305.335314.
- [30] Michael J. Kearns, Robert E. Schapire, and Linda Sellie. Toward efficient agnostic learning. Mach. Learn., 17(2-3):115–141, 1994. doi:10.1007/BF00993468.
- [31] Ker-I Ko. On the complexity of learning minimum time-bounded turing machines. SIAM J. Comput., 20(5):962–986, 1991. doi:10.1137/0220059.
- [32] Caleb Koch, Carmen Strassle, and Li-Yang Tan. Properly learning decision trees with queries is NP-hard. In 64th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2023, Santa Cruz, CA, USA, November 6-9, 2023, pages 2383–2407. IEEE, 2023. doi:10.1109/FOCS57990.2023.00146.
- [33] Ilan Komargodski, Tal Moran, Moni Naor, Rafael Pass, Alon Rosen, and Eylon Yogev. One-way functions and (im)perfect obfuscation. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18-21, 2014, pages 374–383. IEEE Computer Society, 2014. doi:10.1109/FOCS.2014.47.
- [34] Ilan Komargodski, Moni Naor, and Eylon Yogev. Secret-sharing for NP. In Palash Sarkar and Tetsu Iwata, editors, Advances in Cryptology – ASIACRYPT 2014 – 20th International Conference on the Theory and Application of Cryptology and Information Security, Kaoshiung, Taiwan, R.O.C., December 7-11, 2014, Proceedings, Part II, volume 8874 of Lecture Notes in Computer Science, pages 254–273. Springer, 2014. doi:10.1007/978-3-662-45608-8_14.
- [35] Yanyi Liu, Noam Mazor, and Rafael Pass. A note on zero-knowledge for NP and one-way functions. Electron. Colloquium Comput. Complex., TR24-095, 2024. URL: https://eccc.weizmann.ac.il/report/2024/095.
- [36] Yanyi Liu, Noam Mazor, and Rafael Pass. On witness encryption and laconic zero-knowledge arguments. Electron. Colloquium Comput. Complex., TR24-194, 2024. URL: https://eccc.weizmann.ac.il/report/2024/194.
- [37] Zhenjian Lu, Noam Mazor, Igor C. Oliveira, and Rafael Pass. Lower bounds on the overhead of indistinguishability obfuscation. Electron. Colloquium Comput. Complex., TR24-146, 2024. URL: https://eccc.weizmann.ac.il/report/2024/146.
- [38] Noam Mazor and Rafael Pass. Gap MCSP is not (Levin) NP-complete in Obfustopia. In Rahul Santhanam, editor, 39th Computational Complexity Conference, CCC 2024, July 22-25, 2024, Ann Arbor, MI, USA, volume 300 of LIPIcs, pages 36:1–36:21. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPICS.CCC.2024.36.
- [39] Mikito Nanashima. A theory of heuristic learnability. In Mikhail Belkin and Samory Kpotufe, editors, Proceedings of Thirty Fourth Conference on Learning Theory, volume 134 of Proceedings of Machine Learning Research, pages 3483–3525. PMLR, August 2021. URL: https://proceedings.mlr.press/v134/nanashima21a.html.
- [40] Noam Nisan and Avi Wigderson. Hardness vs randomness. J. Comput. Syst. Sci., 49(2):149–167, 1994. doi:10.1016/S0022-0000(05)80043-1.
- [41] Igor C. Oliveira and Rahul Santhanam. Conspiracies between learning algorithms, circuit lower bounds, and pseudorandomness. In Ryan O’Donnell, editor, 32nd Computational Complexity Conference, CCC 2017, July 6-9, 2017, Riga, Latvia, volume 79 of LIPIcs, pages 18:1–18:49. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPICS.CCC.2017.18.
- [42] Shien Jin Ong and Salil P. Vadhan. Zero knowledge and soundness are symmetric. In Moni Naor, editor, Advances in Cryptology – EUROCRYPT 2007, 26th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Barcelona, Spain, May 20-24, 2007, Proceedings, volume 4515 of Lecture Notes in Computer Science, pages 187–209. Springer, 2007. doi:10.1007/978-3-540-72540-4_11.
- [43] Leonard Pitt and Leslie G. Valiant. Computational limitations on learning from examples. J. ACM, 35(4):965–984, 1988. doi:10.1145/48014.63140.
- [44] Ran Raz, Omer Reingold, and Salil P. Vadhan. Extracting all the randomness and reducing the error in Trevisan's extractors. J. Comput. Syst. Sci., 65(1):97–128, 2002. doi:10.1006/JCSS.2002.1824.
- [45] Rahul Santhanam. Pseudorandomness and the minimum circuit size problem. In Thomas Vidick, editor, 11th Innovations in Theoretical Computer Science Conference, ITCS 2020, January 12-14, 2020, Seattle, Washington, USA, volume 151 of LIPIcs, pages 68:1–68:26. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPICS.ITCS.2020.68.
- [46] Leslie G. Valiant. A theory of the learnable. Commun. ACM, 27(11):1134–1142, 1984. doi:10.1145/1968.1972.
- [47] Andrew Chi-Chih Yao. Theory and applications of trapdoor functions (extended abstract). In 23rd Annual Symposium on Foundations of Computer Science, Chicago, Illinois, USA, 3-5 November 1982, pages 80–91. IEEE Computer Society, 1982. doi:10.1109/SFCS.1982.45.
Appendix A WE from Indistinguishability Obfuscation
In this section, we give a sketch of the construction showing that indistinguishability obfuscation for $\mathsf{P/poly}$ implies witness encryption for $\mathsf{NP}$.
Definition 75 (Indistinguishability obfuscation (iO)).
A PPT algorithm $\mathcal{O}$ is an indistinguishability obfuscator for a circuit class $\mathcal{C}$ if the following conditions are met.
- Correctness: for all $C \in \mathcal{C}$ and all inputs $x$, letting $\widetilde{C} \leftarrow \mathcal{O}(C)$, it holds that $\widetilde{C}(x) = C(x)$.
- Security: for any efficient adversary $\mathcal{A}$ and any circuits $C_0, C_1 \in \mathcal{C}$ of the same size computing the same function, $\mathcal{O}(C_0)$ and $\mathcal{O}(C_1)$ are indistinguishable to $\mathcal{A}$.
Theorem 76 ([10]).
For any $L \in \mathsf{NP}$, iO for $\mathsf{P/poly}$ implies WE for $L$.
Proof sketch.
Let $V$ be a poly-time verifier for $L$. For an instance $x$ of $L$ and a bit $b$, define the circuit
$$C_{x,b}(w) = \begin{cases} b & \text{if } V(x, w) = 1,\\ 0 & \text{otherwise.}\end{cases}$$
Consider the witness encryption scheme $(\mathsf{Enc}, \mathsf{Dec})$ defined as follows:
- $\mathsf{Enc}(x, b) = \mathcal{O}(C_{x,b})$;
- $\mathsf{Dec}(ct, w) = ct(w)$.
By the correctness of the iO, if $V(x, w) = 1$, then for $ct \leftarrow \mathsf{Enc}(x, b)$ it holds that $ct(w) = C_{x,b}(w) = b$. Therefore, the correctness of WE is satisfied.
We leave the proof of security to the reader, but we note that if $x \notin L$, then $C_{x,0}$ and $C_{x,1}$ compute the same (identically zero) function.
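To make the correctness argument concrete, here is a minimal executable sketch of the construction above. It is an illustration only: the "obfuscator" is just the identity map (so nothing is hidden), and the choice of toy NP language, its SHA-256-based verifier, and all names are hypothetical stand-ins introduced for this sketch.

```python
import hashlib

# Toy NP language (a hypothetical choice for this sketch):
# L = { x : there exists w with SHA-256(w) = x }.
def V(x: bytes, w: bytes) -> bool:
    """Poly-time verifier for the toy language."""
    return hashlib.sha256(w).digest() == x

def make_circuit(x: bytes, b: int):
    """C_{x,b}: outputs b on a valid witness for x, and 0 otherwise."""
    return lambda w: b if V(x, w) else 0

def iO(circuit):
    """Placeholder obfuscator: the identity map. A real iO would output
    an unintelligible but functionally equivalent circuit; here we only
    exercise the correctness of the reduction."""
    return circuit

def WE_enc(x: bytes, b: int):
    return iO(make_circuit(x, b))   # Enc(x, b) = iO(C_{x,b})

def WE_dec(ct, w: bytes) -> int:
    return ct(w)                    # Dec(ct, w) = ct(w)

w = b"witness"
x = hashlib.sha256(w).digest()      # x is in L, with witness w
ct = WE_enc(x, 1)
assert WE_dec(ct, w) == 1           # a valid witness recovers the bit
assert WE_dec(ct, b"wrong") == 0    # an invalid witness only sees 0
```

Note that the security direction is exactly where the identity "obfuscator" fails: when $x \notin L$ the two circuits agree everywhere, and only a genuine iO makes their obfuscations indistinguishable.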
Appendix B Public-key Encryption from WE and OWFs
In this section, we give a sketch of the construction showing that witness encryption for $\mathsf{NP}$, together with the existence of a one-way function, implies public-key encryption.
Definition 77 (Public-key encryption).
A public-key encryption scheme is a triple of algorithms $(\mathsf{Gen}, \mathsf{Enc}, \mathsf{Dec})$ meeting the following conditions.
- Correctness: for all sufficiently large $n$ and any bit $b$,
$$\Pr_{(pk, sk) \leftarrow \mathsf{Gen}(1^n)}\left[\mathsf{Dec}(sk, \mathsf{Enc}(pk, b)) = b\right] = 1.$$
- Security: for all sufficiently large $n$ and any efficient adversary $\mathcal{A}$,
$$\left|\Pr\left[\mathcal{A}(pk, \mathsf{Enc}(pk, 0)) = 1\right] - \Pr\left[\mathcal{A}(pk, \mathsf{Enc}(pk, 1)) = 1\right]\right| \leq \mathrm{negl}(n),$$
where the probability is over $(pk, sk) \leftarrow \mathsf{Gen}(1^n)$ and the randomness of $\mathsf{Enc}$ and $\mathcal{A}$.
Theorem 78 ([11]).
Suppose that WE exists for every language in $\mathsf{NP}$ and that a one-way function exists. Then, public-key encryption exists.
Proof sketch.
By [14], we may assume the existence of a PRG $G \colon \{0,1\}^n \to \{0,1\}^{2n}$ for all sufficiently large $n$. Consider the language
$$L = \left\{ z \in \{0,1\}^{2n} : \exists\, s \in \{0,1\}^n \text{ such that } G(s) = z \right\},$$
and let $(\mathsf{WE.Enc}, \mathsf{WE.Dec})$ be a witness encryption scheme for $L$. Define a public-key encryption scheme as follows.
- $\mathsf{Gen}(1^n)$: sample $s \leftarrow \{0,1\}^n$ and let $z = G(s)$. Let $pk = z$ and $sk = s$.
- $\mathsf{Enc}(pk, b)$: return $\mathsf{WE.Enc}(pk, b)$.
- $\mathsf{Dec}(sk, ct)$: return the output of $\mathsf{WE.Dec}(ct, sk)$.
By the correctness of the witness encryption, for any pair $(pk, sk)$ in the support of $\mathsf{Gen}(1^n)$, $sk$ is a witness for $pk \in L$; therefore $\mathsf{WE.Dec}(\mathsf{WE.Enc}(pk, b), sk) = b$. This shows the correctness of the PKE scheme.
We leave the proof of security to the reader, but we note that by the security of $G$, $pk = G(s)$ is indistinguishable from a uniformly random $z \leftarrow \{0,1\}^{2n}$, which lies outside $L$ with probability at least $1 - 2^{-n}$, at which point the security of the witness encryption applies.
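The construction can be exercised end-to-end with a toy stand-in for each primitive. The sketch below is an illustration of correctness only: the "PRG" $G$, the tiny seed length, and the brute-force witness encryption are all hypothetical choices for this sketch, and the WE encryptor here finds a witness by exhaustive search over seeds, which is feasible only at these parameters (a real witness encryption scheme encrypts without ever knowing a witness).

```python
import hashlib
import secrets

N = 2  # seed length in bytes; tiny so the brute force below is feasible

def G(s: bytes) -> bytes:
    """Toy length-doubling "PRG": N-byte seed -> 2N-byte output."""
    return hashlib.sha256(b"prg" + s).digest()[:2 * N]

def WE_enc(z: bytes, b: int) -> int:
    """Brute-force witness encryption for L = { G(s) : s in {0,1}^n }.
    If z is in L, pad b with a key that any witness holder can recompute;
    if z is outside L, output an independent random bit, hiding b."""
    for i in range(256 ** N):
        s = i.to_bytes(N, "big")
        if G(s) == z:
            return b ^ (hashlib.sha256(b"key" + s).digest()[0] & 1)
    return secrets.randbits(1)

def WE_dec(ct: int, z: bytes, s: bytes) -> int:
    assert G(s) == z  # s must be a witness for z in L
    return ct ^ (hashlib.sha256(b"key" + s).digest()[0] & 1)

def Gen():
    """Gen(1^n): pk = G(s), sk = s."""
    s = secrets.token_bytes(N)
    return G(s), s

def Enc(pk: bytes, b: int) -> int:
    return WE_enc(pk, b)

def Dec(pk: bytes, sk: bytes, ct: int) -> int:
    return WE_dec(ct, pk, sk)

pk, sk = Gen()
for b in (0, 1):
    assert Dec(pk, sk, Enc(pk, b)) == b  # PKE correctness
```

The design mirrors the proof: correctness holds because $sk$ is always a witness for $pk \in L$, while the security intuition is that a public key indistinguishable from a random $2n$-bit string is, with overwhelming probability, outside $L$.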
