Hardness Amplification for Real-Valued Functions
Abstract
Given an integer-valued function that is mildly hard to compute on instances drawn from some distribution over , we show that the function is strongly hard to compute on instances drawn from the product distribution . We also show the same for the task of approximately computing real-valued functions . Our theorems immediately imply hardness self-amplification for several natural problems, including Max-Clique, Max-SAT, Approximate SAT, and Entropy Estimation.
Keywords and phrases: Average-case complexity, hardness amplification
2012 ACM Subject Classification: Theory of computation → Computational complexity and cryptography
Funding: Both authors are supported by the National Research Foundation, Singapore, under its NRF Fellowship programme, award No. NRF-NRFF14-2022-0010.
Editors: Srikanth Srinivasan
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Hardness amplification is the process of taking a computational problem and a distribution of instances over which is mildly hard, and constructing a problem and distribution over which is much harder. There has been extensive work in the past studying hardness amplification for various computational tasks, such as computing Boolean functions [26, 17, 12, 19, 18, 5], inverting efficient functions [26, 11], distinguishing between distributions [9], deciding languages contained in complexity classes like [21, 15, 23, 14, 6], [24], [20, 8], or [4, 13, 7], solving optimization problems [10], and for specific interesting or structured problems [3, 16, 1], etc.
In this paper, we study hardness amplification for the task of evaluating integer- or real-valued functions. We first describe our setting and results, and then demonstrate various corollaries of our theorems that motivate studying such problems.
Evaluating Functions
In our first result, we consider functions , where denotes the set of integers , and the computational task is to compute given an input . Here, this notation denotes a set only; no modulo operations are involved. Suppose we are given such an and a distribution on over which computing is -hard – meaning that no small circuit can correctly compute with probability greater than when is drawn from . Our objective is to construct a function and distribution such that computing is -hard over for some small . Further, we would like to have the same type as – to take bit-strings as input and produce integers from a bounded range as output.
For Boolean functions, Yao’s XOR Lemma [26] shows that such amplification can be achieved by having be the XOR of multiple instances of ; and similar results are also known if is the Recursive Majority-of-3 of such instances [21]. We show that, in the case of integer-valued functions, having be the sum (over the integers) of multiple instances of achieves the same. For , denote by the function that takes inputs , and outputs the sum .
Theorem 1.1 (Simplification of Theorem 3.1).
Suppose a function is -hard to compute over a distribution for circuits of size . Then, for , the function is -hard to compute over the product distribution for circuits of size , as long as , where and are some universal constants.
In a typical application of this theorem (see Section 1.1 for examples), one might take to be a small constant, to be some polynomially large value in , assume -hardness for being any arbitrary polynomial in , and set to be but still some polynomial in . The theorem would then imply that is -hard for all polynomial-sized circuits.
To place the -hardness we obtain in context, observe that it is not possible to show that summation generically amplifies such a function to hardness less than . Consider, for example, a hypothetical function that takes values in , is easy to compute on fraction of inputs, and on the remaining is optimally hard (so it is not possible to do better than random guessing). This function is -hard. Given random inputs from the hard distribution, the hardness comes only from about of the inputs. And simply guessing randomly on each of these will yield the correct value for the sum of their outputs with probability .
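To illustrate this barrier concretely, the following is a minimal Monte Carlo sketch (in Python, with illustrative parameter choices of our own) of the strategy just described: an adversary that knows the function on the easy inputs and guesses uniformly at random on the hard ones. The printed heuristic of order 1/sqrt(delta*k) is our own illustration, not part of any formal statement.

```python
import math
import random

def sum_guess_success(delta, k, trials=20_000):
    """Estimate the probability that guessing uniformly at random on the roughly
    delta*k hard inputs (whose true values are modeled as unbiased bits) still
    yields the correct value for the overall sum."""
    hits = 0
    for _ in range(trials):
        m = sum(random.random() < delta for _ in range(k))        # number of hard inputs
        true_sum = sum(random.random() < 0.5 for _ in range(m))   # true hard contributions
        guess_sum = sum(random.random() < 0.5 for _ in range(m))  # uniformly random guesses
        hits += (guess_sum == true_sum)
    return hits / trials

delta, k = 0.1, 100
print(sum_guess_success(delta, k))           # empirical success probability
print(1 / math.sqrt(math.pi * delta * k))    # ~ Theta(1/sqrt(delta*k)) heuristic
```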
Approximating Functions
In our second result, we extend the above hardness amplification to the task of approximately evaluating bounded real-valued functions. Here we consider functions , and the task is, for some approximation parameter , to compute some value in the range given input . We show that summation again amplifies hardness, though with slightly different dependence on the various parameters, and also now depending on the .
Theorem 1.2 (Simplification of Theorem 4.1).
Suppose a function is -hard to -approximate over a distribution for circuits of size . Then, the function is -hard to -approximate over the product distribution for circuits of size , as long as , where and are some universal constants.
Here too, in the applications we show, parameters are set as described for Theorem 1.1 earlier, with .
Paper Outline
In the rest of this section, we describe various corollaries of the above theorems (Section 1.1) and provide an overview of the proofs of these theorems (Section 1.2). In Section 2, we set up the definitions, conventions, and hardcore lemmas needed in the rest of the paper. In Sections 3 and 4, we state and prove more comprehensive versions of Theorems 1.1 and 1.2, respectively.
1.1 Corollaries
Functions mapping bit-strings to integers or real numbers, even within limited ranges, are quite general and capture a variety of natural problems whose complexity is of significant interest – essentially any problem whose solution for an instance is a bounded integer or real number. For such problems, our results roughly say that computing the sum of solutions to instances is a much harder problem. Such a statement is not particularly meaningful in general, but things become much more interesting if the problem also happens to admit an additively homomorphic self-reduction.
That is, suppose there is an efficient algorithm such that for any inputs , we have . In this case, the problem of computing the sum of the solutions of instances can be reduced back to solving the problem itself on a single instance. Then, our results can be used to show that mild hardness of implies strong hardness of itself, possibly on a different distribution over instances (in what is sometimes referred to as hardness self-amplification). And this requirement is weak enough that many natural and important problems have such self-reductions. Below, we show three examples, each of which is qualitatively distinct from the others.
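Before turning to the examples, the shape of this reduction can be captured in a minimal sketch, where `combine` stands for the hypothetical additively homomorphic algorithm and `solve` for a solver of the original problem; both names are placeholders of our own.

```python
def sum_of_solutions(instances, combine, solve):
    """If solve(combine(x_1, ..., x_k)) = solve(x_1) + ... + solve(x_k), then a
    single call to the solver on the combined instance yields the sum of the
    individual solutions."""
    return solve(combine(instances))
```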
Optimization Problems
Various natural optimization problems can be cast in terms of computing a polynomially bounded integer-valued function of the input, and further be shown to possess simple additively homomorphic self-reductions. For example, consider the Max-Clique problem where, given (the adjacency matrix of) a graph , the task is to compute the size of its largest clique. Given graphs , we can create a new graph consisting of one copy of each , with edges between every pair of vertices that are not from the same graph. The size of the maximum clique in this composite graph is simply the sum of the maximums in all the ’s.
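As a sanity check of this construction, here is a small brute-force sketch (adjacency matrices as 0/1 lists; the tiny graphs are chosen only for illustration) verifying that the maximum clique of the join equals the sum of the individual maximum cliques.

```python
from itertools import combinations

def join_graphs(adj_mats):
    """Build the 'join' of several graphs: one copy of each input graph,
    plus every edge between vertices from different copies."""
    sizes = [len(A) for A in adj_mats]
    offsets = [sum(sizes[:i]) for i in range(len(sizes))]
    n = sum(sizes)
    J = [[0] * n for _ in range(n)]
    for g, A in enumerate(adj_mats):
        off = offsets[g]
        for u in range(sizes[g]):
            for v in range(sizes[g]):
                J[off + u][off + v] = A[u][v]                      # copy original edges
    for g1 in range(len(sizes)):
        for g2 in range(len(sizes)):
            if g1 != g2:
                for u in range(sizes[g1]):
                    for v in range(sizes[g2]):
                        J[offsets[g1] + u][offsets[g2] + v] = 1    # connect distinct copies
    return J

def max_clique_size(A):
    """Brute-force maximum clique size (fine for the tiny graphs used here)."""
    n = len(A)
    for k in range(n, 0, -1):
        for S in combinations(range(n), k):
            if all(A[u][v] for u, v in combinations(S, 2)):
                return k
    return 0

# A triangle and a single edge: maximum cliques of size 3 and 2.
G1 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
G2 = [[0, 1], [1, 0]]
J = join_graphs([G1, G2])
print(max_clique_size(J), max_clique_size(G1) + max_clique_size(G2))  # both 5
```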
Another example is the problem, where given a CNF formula , the task is to find the maximum number of clauses satisfied by any assignment to its variables. Given formulas on disjoint sets of variables, the maximum number of clauses of the formula that can be satisfied is simply the sum of the maximums of all the ’s. Note that both of these problems also happen to be -hard. We get the following from Theorem 1.1, following the arguments above.
Corollary 1.3.
The following holds for any problem . If there is a family of distributions on which is -hard, then there is a family of distributions on which is -hard (where is the instance size). Further, if the former family is efficiently sampleable, then so is the latter.
In these corollaries, we consider the asymptotic hardness of these problems rather than for a fixed input size, which is why we need to consider a family of distributions – one distribution for each value of the instance size parameter – rather than a single distribution. And by efficiently sampleable, we mean sampleable by a polynomial-sized family of circuits. Similarly, when we just say -hard, we mean -hard for families of circuits of size any polynomial in the instance size parameter . More detailed and careful statements of all the corollaries are presented in the full version, along with sketches of their proofs. These are all asymptotic statements, and so involve applying our theorems (which are stated for arbitrary input lengths and circuit sizes) for all members of families of functions and circuits.
Hardness amplification for optimization problems, including the above examples, was studied in [10]. However, they considered the task of actually finding the maximum clique, the maximally satisfying assignment, etc., and their results are incomparable to the corollary above.
Entropy Estimation
A natural problem (that has incidentally been of some significance in cryptography [25]) is that of estimating the Shannon entropy of a distribution given its sampling algorithm (say, as a circuit). Given a distribution over , its entropy is some real number in the range , and Shannon entropy is also conveniently additive: for any independent random variables and , we have . We cannot use Theorem 1.1 here because the entropy is not integer-valued, but we can use Theorem 1.2 to show hardness amplification for the task of approximately computing the entropy of a given distribution.
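A quick numerical check of this additivity, for two small independent distributions of our own choosing:

```python
import math
from itertools import product

def shannon_entropy(dist):
    """Shannon entropy (in bits) of a distribution given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

X = {0: 0.25, 1: 0.75}
Y = {"a": 0.5, "b": 0.3, "c": 0.2}
XY = {(x, y): px * py for (x, px), (y, py) in product(X.items(), Y.items())}

print(shannon_entropy(XY))                          # entropy of the product distribution
print(shannon_entropy(X) + shannon_entropy(Y))      # sum of the marginal entropies (equal)
```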
In this case, we can in fact go further. A related decision problem, called the Entropy Difference problem, is known to be complete for the complexity class , which consists of problems that possess statistical zero-knowledge proofs [22]. In this problem, given sampling algorithms for two distributions and , and promised that their entropies are separated by a gap of at least , the task is to tell which distribution has larger entropy. If this problem is even mildly average-case hard over some distribution of instances , then the task of computing the entropy of distributions to within is mildly average-case hard for the distribution given by sampling as above and randomly outputting one of the two distributions. These observations, together with Theorem 1.2, give us the following.
Corollary 1.4.
If there is a problem in that is -hard over some family of distributions, then there exists a family of distributions on which -approximating Shannon entropy is -hard. Further, if the former family of distributions is efficiently sampleable, then so is the latter.
Approximate Counting
Given as input the description of a Non-deterministic Turing Machine and an input for it, define to be the number of accepting paths in the execution of given input . This function captures the defining problem of the complexity class . The same can be done with the -complete problem , which is the problem of counting the number of satisfying assignments to a given Boolean formula.
In both these cases, however, the function can take exponentially many values in the instance size, and so trying to apply our theorems would not give meaningful bounds. However, if we instead consider the function , this function is real-valued and lies in a polynomially bounded range (as long as the formula is guaranteed to be satisfiable). Further, additive approximations to are equivalent to multiplicative approximations to . Theorem 1.2 now implies the following.
Corollary 1.5.
Suppose there is a family of distributions over satisfiable Boolean formulas on which multiplicatively approximating the number of satisfying assignments to within a factor of is -hard. Then there is a family of distributions on which the same task is -hard (where is the instance size). Further, if the former is efficiently sampleable, then so is the latter.
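The translation between additive approximation of the logarithm of the count and multiplicative approximation of the count itself is elementary; the following sketch, with made-up numbers, spells it out.

```python
import math

true_count = 12345                       # hypothetical number of satisfying assignments
eps = 0.5                                # additive accuracy for log2 of the count

est_log = math.log2(true_count) + 0.4    # some estimate within +/- eps of log2 of the count
est_count = 2 ** est_log                 # the corresponding count estimate

# Additive error eps on the logarithm is exactly a multiplicative factor 2**eps on the count.
assert abs(est_log - math.log2(true_count)) <= eps
assert true_count / 2 ** eps <= est_count <= true_count * 2 ** eps
print(est_count, true_count * 2 ** eps)
```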
1.2 Technical Overview
In this section, we give an overview of our analysis of the hardness amplification of summation. We focus on hardness amplification of evaluating integer-valued functions (Theorem 1.1).
Hardness of -valued functions
We will start by showing something even simpler – hardness amplification for functions that only take two different values. Our techniques here are inspired by ideas in the proof of existing hardness amplification theorems for problems in [21].
Suppose that there is a function , and a corresponding distribution over , on which the function is strongly average-case hard; that is, there is some small such that for any circuit of size at most ,
For simplicity, we assume that is balanced over . That is,
For some , consider the function , where . In the following, we will show that the average-case hardness of improves polynomially in . In particular, for any circuit of size at most approximately , we have:
(1)
Before we show this, let us understand the best hardness we can hope to show. A simple algorithm for computing is to just always output the value that is most likely to take. Since is assumed to be balanced over , this value would be .
If the function had been optimally -hard, then this would also be the best possible algorithm for . What we have is that is -hard for some small , indicating that is still almost optimally hard. We essentially show that, in this case, the above algorithm is still nearly the best possible.
We start by observing that the hardness of implies the indistinguishability of the distributions conditioned on different outputs – and . That is, for any circuit of size at most ,
(2)
For function , consider the performance of any circuit for computing . We claim that, for any and any in the output range of , the following holds:
(3)
To prove it, assume
(4)
The distribution is equivalent to the distribution sampled as follows:
1. Independently sample , and independently .
2. Sample a uniformly random permutation from all possible permutations over coordinates.
3. Output .
Using this observation and the linearity of expectation, (4) implies that there must exist a fixed and a permutation , such that
where denotes . Then, a circuit can be constructed by taking and permutation as non-uniform advice and working as follows: on the input , outputs iff outputs . The size of is approximately the size of , and we have
which contradicts the hardness of as captured by (2).
Based on (3) and a simple telescoping argument, we further obtain, for any and any in the output range of ,
(5)
We can now bound the probability that a circuit can correctly compute the function as:
where the second line follows from the fact that is balanced over , the third line follows from (5), the fourth line from the maximality of the central binomial coefficient, the fifth line from computing the sum of the series there, and the last line from the fact that the events in the probability expressions are disjoint.
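The quantity behind the central-binomial step is the best success probability for guessing the value of a Binomial(m, 1/2) random variable; the short numerical check below (parameters are illustrative) confirms the standard fact that it scales like sqrt(2/(pi*m)).

```python
import math

def best_guess_probability(m):
    """max_j C(m, j) / 2^m, attained at j = m // 2: the best probability of
    guessing the value of a Binomial(m, 1/2) random variable."""
    return math.comb(m, m // 2) / 2 ** m

for m in (10, 100, 1000):
    print(m, best_guess_probability(m), math.sqrt(2 / (math.pi * m)))
```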
The above approach – bounding the probability of computing the sum of independent instances of a function for which it is strongly hard to decide between two possible outputs – is at the core of the proofs of our results.
Reducing to two outputs
Since our amplification approach is based on the strong indistinguishability of distributions over the pre-image sets of two outputs, for any evaluation problem we will identify such a pair of indistinguishable pre-image sets based on the assumption that the evaluation problem is hard. For simplicity, we will take the distribution over which the problem is hard to be the uniform distribution over .
Consider a function that is -hard for circuits of size . For every , , we define a computational problem in which we only consider correctness on inputs whose output belongs to . We formalize this problem by defining the relations over :
- If , then if and only if ;
- If , then for every .
For any , denote by the set of such that . We show that the hardness of implies that there must exist some and some distribution over their pre-image sets for which it is hard to distinguish whether is or . More precisely, we show that there exists a pair such that is -hard for circuits of size .
Amplifying using hardcore sets
Pick a relation that has such hardness. As its hardness essentially comes from the hardness of deciding between two possible outputs and , we can extend existing proofs of the hardcore lemma for Boolean functions (e.g. that of [17]) to obtain a hardcore set (strictly speaking, what we obtain are hardcore distributions, but we treat these as sets in this overview for simplicity) for . This is a set of density at least such that the relation is -hard over random inputs from for circuits of size roughly . Further, we can ensure that all inputs are such that , and, with only a small loss, we can also ensure that this set is balanced between the outputs and .
The rest of the argument is quite standard. Given inputs sampled uniformly at random from , with high probability roughly at least a fraction of these will fall in . Looking at just these inputs, we are essentially back in the case discussed at the beginning of this overview – that of a function that has two possible outputs, with inputs being sampled from a hard distribution balanced between these outputs. Applying the amplification arguments there to this subset of inputs, we get from (1) that computing the sum of the ’s for the ’s that fall in is roughly -hard for circuits of size roughly . In order to compute the sum of all the ’s, the sum corresponding to the above inputs needs to be computed. So this hardness carries over to computing as well. Setting now gives us Theorem 1.1.
2 Preliminaries
2.1 Notations
For , denote the set by (note that this is just a set, not the ring of integers modulo ). For , we use to denote the interval and use round brackets for open intervals and square brackets for closed intervals.
For distributions over the same domain , for any integers , the symbol stands for the following distribution: sample from independently and sample from independently, sample a permutation over entries uniformly randomly, and output .
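Operationally, this distribution can be sampled as in the following sketch, where `sample_D0` and `sample_D1` are stand-ins for samplers of the two underlying distributions.

```python
import random

def sample_shuffled_product(sample_D0, sample_D1, i, j):
    """Draw i samples from the first distribution and j from the second,
    independently, then apply a uniformly random permutation to the i + j entries."""
    draws = [sample_D0() for _ in range(i)] + [sample_D1() for _ in range(j)]
    random.shuffle(draws)      # uniformly random permutation of the entries
    return tuple(draws)

print(sample_shuffled_product(lambda: 0, lambda: 1, 2, 3))  # e.g. (1, 0, 1, 1, 0) in random order
```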
For and some alphabet , given functions and , denote the function that outputs over input by . For a relation , for any , we define .
We represent real numbers using strings. In each case, the range of relevant real numbers is some that will be clear from the context, and the string is to be interpreted as a fixed-point representation of numbers in that range. That is, for any , is evaluated as , where and is represented by .
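As an illustration only (the specific scaling below is an assumption on our part, not necessarily the exact convention used here), a fixed-point encoding of a real number in a known range can be realized as follows.

```python
def decode_fixed_point(bits, lo, hi):
    """Interpret a bit-string as a fixed-point number in [lo, hi]: the integer
    value of the string is scaled linearly into the interval."""
    return lo + (hi - lo) * int(bits, 2) / (2 ** len(bits) - 1)

def encode_fixed_point(x, lo, hi, length):
    """Inverse of decode_fixed_point, up to rounding."""
    frac = (x - lo) / (hi - lo)
    return format(round(frac * (2 ** length - 1)), f"0{length}b")

s = encode_fixed_point(0.625, 0.0, 1.0, 8)
print(s, decode_fixed_point(s, 0.0, 1.0))
```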
2.2 Average-Case Hardness
The average-case hardness of a problem is defined with respect to a distribution over its input domain. We start by formally defining the hardness of functions and relations.
Definition 2.1 (Hardness of Evaluating Functions).
For any , , consider a function , where is an output domain that can be encoded by and . For a distribution over , is called -hard on for circuits of size if, for any circuit of size at most , we have
Definition 2.2 (Hardness of Satisfying Relations).
For any , , consider a relation , where is an output domain that can be encoded by and . For a distribution over , is called -hard on for circuits of size , if for any circuit of size at most ,
Definition 2.3.
For a function , the approximation problem with distance is denoted by a relation , where
Similarly, the closed approximation is defined by
Definition 2.4 (Hardness of Approximating Functions).
For any , and , consider a function . For a distribution over , is called -hard to approximate on with accuracy and distance for circuits of size , if the relation is -hard on for circuits of size , where
and the real value is encoded by a binary string of length .
2.3 Hardcore Lemmas
Impagliazzo’s hardcore lemma [17] implies the existence of a strongly hard subset within an instance space where the Boolean function is only mildly hard on average. In this section, we first extend the hardcore lemma to a more general setting, for relations with a closure property under majority. Then, we will demonstrate how to transform a hard distribution into a balanced one, to facilitate our subsequent proofs of hardness amplification.
The following definition of density is utilized to measure the flatness of the hardcore distribution obtained, ensuring that it can be reintegrated into the original distribution [2, Chapter 19].
Definition 2.5 (Relative Density).
For and distributions on , is called -dense with respect to , if for any , we have
The closure property under majority is highly useful for identifying the hardcore distribution of functions or relations. Several prior studies have relied on this closure property to prove the existence of hardcore distributions [17, 19, 18, 5]. We begin by formally defining the majority combiner.
Definition 2.6 (Majority Combiner).
For a relation and , a circuit is called a majority combiner for relation over coordinates, if for any and any , such that , we have .
The key idea behind Impagliazzo’s hardcore lemma is that, if for any distribution with a certain density there always exists a circuit of slightly smaller size that solves the problem with probability more than , for some , then we can construct a larger circuit by taking the majority vote of multiple carefully chosen circuits, which results in a high probability of agreement with the function , leading to a contradiction.
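For intuition, here are two combiners of the kind Definition 2.6 asks for, sketched in Python: plain majority vote for relations with two admissible answers, and the median for interval (approximation) relations, mirroring the middle-point combiner used later for the real-valued case. Both have the property that if a strict majority of the candidate answers is correct, so is the combined answer.

```python
from collections import Counter
from statistics import median

def majority_vote(candidates):
    """If strictly more than half of the candidates equal the (unique) correct
    answer among two possibilities, the most common value is that answer."""
    return Counter(candidates).most_common(1)[0][0]

def median_combine(candidates):
    """If strictly more than half of the candidates lie in an interval [a, b],
    then so does their median."""
    return median(candidates)

print(majority_vote([3, 3, 7, 3, 7]))        # -> 3
print(median_combine([0.9, 1.1, 1.0, 5.0]))  # -> 1.05, inside [0.9, 1.1]
```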
Lemma 2.7 (Extended Hardcore Lemma).
For , consider a relation ; define . Consider any distribution over . Consider any , , and large enough . Let be an integer. If there exists a majority combiner for relation over coordinates of size at most , and is -hard on for circuits of size , then there is a distribution over which is -dense with respect to , such that is -hard on for circuits of size .
Remark 2.8.
The proof of Lemma 2.7 follows [17, 2] via the Min-Max theorem with some subtle adjustments, which can be found in the full version of this work. The best-known parameter for the small circuit size in the lemma is , whereas the bound we present here is . In fact, the proof in [5], which uses a multiplicative weight update method, can be directly adapted to our extended setting. For simplicity, we provide a proof for a slightly weaker bound, which suffices for our purpose.
2.4 Balancing Hardcore Distributions
Definition 2.9 (Balanced Distribution).
For a domain , given a distribution on and , a relation is called balanced on around , if
Adapting the approach in [21], we derive the following lemma, which will be useful in our later arguments. We refer the reader to the full version of this paper for the proof.
Lemma 2.10 (Balanced Hardcore).
For , and , for a relation and distributions over , suppose that there exist , , such that
(1) For any , exactly one of the following holds: or .
(2) Letting denote the distribution and denote the distribution , has -density with respect to .
If is a -dense distribution with respect to and is -hard on for circuits of size , then there is a distribution with density with respect to , on which is balanced around and -hard for circuits of size .
3 Evaluating Integer-Valued Functions
We now present our hardness amplification result for evaluating integer-valued functions. Given a function that is somewhat hard to evaluate on average, we show that the function possesses strong average-case hardness, where represents the summation function over coordinates. We state our main theorem below.
Theorem 3.1.
For , and a distribution over , for any large enough , consider a function that is -hard on for circuits of size , define a function as follows:
Then, for , for large enough , is -hard on for circuits of size , where
For sufficiently small , the dominant term of is . Taking large enough enables us to establish hardness amplification. To prove the theorem, we will first construct the hardcore distribution for the function , show that summation effectively amplifies the hardness on this hard distribution, and then generalize the result to the original distribution.
Intuitively, the hardcore distribution of any Boolean-valued function is straightforward, as the output is restricted to either yes or no. However, for integer-valued functions, the structure of a hard set is inherently more complicated. We characterize the hardcore of integer-valued functions by defining a set of new problems: each considers only two values in the output domain, and an input is considered relevant only if its output matches one of those two values. We show that there is a pair of values such that the resulting problem is hard. Then, we extract a hardcore from this hard problem, which possesses a good structure corresponding to the original function.
Lemma 3.2.
For , and a distribution over , consider a function . For any and , define the relation as follows:
- If , then if and only if ;
- If , then for any .
For , if is -hard on for circuits of size , there exists a pair of , such that is -hard on for circuits of size .
This lemma suggests that if the function is hard to evaluate on average, then there must exist output values , such that distinguishing their pre-images is also hard on average. We defer the proof to Section 3.1, and our hardcore construction is shown below.
Lemma 3.3 (Hardcore for Integer-Valued Functions).
For , and a distribution , consider a function , which is -hard on for circuits of size . If is sufficiently large, there exist and a -dense distribution (with respect to ), on which is balanced around and -hard for circuits of size .
Proof.
Suppose is -hard on for circuits of size . By Lemma 3.2, if , there exists a pair of , such that is -hard on for circuits of size . This hardness implies that ; hence the distribution is -dense with respect to (and the same holds for ).
For the relation , the majority gate is a natural choice for the combiner. For some , let be an integer. If the size of a majority-of- circuit is less than , then by Lemma 2.7 there is a -dense hardcore distribution (with respect to ) over , on which is -hard for circuits of size . By Lemma 2.10, we can construct a distribution with density (with respect to ), on which is balanced around and -hard for circuits of size .
This hardcore distribution exhibits a desirable structure, in that the hardness of evaluating the function on it can be captured by indistinguishability. In the following, we consider inputs sampled from the hardcore distribution. A natural way to amplify the average-case hardness of the function is to construct the function . If is hard on a distribution over which there are only two possible outputs, then the best algorithm with restricted running time cannot perform significantly better than random guessing. Consequently, a nearly optimal way to guess the value of on is to output the value with the highest probability of occurring.
Lemma 3.4.
For , and , consider a hardcore distribution , on which function is balanced around and -hard for circuits of size . Define the function as follows:
For and any other distribution over , for sufficiently large , specifically , the function is -hard on for circuits of size , where
This lemma demonstrates that summation can effectively amplify the hardness over the hardcore distribution; its proof is presented in Section 3.2. However, the hardcore lemma only ensures the existence of a hard distribution without guaranteeing efficient sampling. Therefore, to derive a more meaningful hardness result, we will eventually focus on the hardness over the original distribution.
One direct approach is to embed this distribution into the original one with some probability mass parameterized by its relative density. When sampling instances from the original distribution a sufficient number of times, the number of instances sampled from the hard one will concentrate around the expected value. Based on this observation, we proceed to prove our main theorem.
Theorem 3.1. [Restated, see original statement.]
For , and a distribution over , for any large enough , consider a function that is -hard on for circuits of size , define a function as follows:
Then, for , for large enough , is -hard on for circuits of size , where
Proof.
For a function , which is -hard on for circuits of size , by Lemma 3.3, there exists and a -dense distribution with respect to , such that is balanced around and -hard on for circuits of size .
has density , then there exists a distribution over , such that . For , we have
Therefore, for any large enough , and any circuit of size , by Lemma 3.4,
where . The last inequality is obtained by the Chernoff bound, where the first term is equivalent to the probability that a binomial distribution with parameter takes a value less than .
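For concreteness, the following sketch compares the exact lower tail of a binomial with the multiplicative Chernoff bound Pr[Bin(k, delta) <= delta*k/2] <= exp(-delta*k/8); the parameters and the delta*k/2 threshold are illustrative choices on our part, not the exact quantities in the proof.

```python
import math
from math import comb

def exact_lower_tail(k, delta, threshold):
    """Exact Pr[ Bin(k, delta) <= threshold ]."""
    return sum(comb(k, i) * delta ** i * (1 - delta) ** (k - i)
               for i in range(threshold + 1))

k, delta = 500, 0.1
threshold = int(delta * k / 2)                 # half the expected number of hardcore samples
print(exact_lower_tail(k, delta, threshold))   # exact tail probability
print(math.exp(-delta * k / 8))                # Chernoff upper bound exp(-delta*k/8)
```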
3.1 Proof of Lemma 3.2
Lemma 3.2. [Restated, see original statement.]
For , and a distribution over , consider a function . For any and , define the relation as follows:
- If , then if and only if ;
- If , then for any .
For , if is -hard on for circuits of size , there exists a pair of , such that is -hard on for circuits of size .
Proof.
Assume that, for any , , there is a circuit of size , such that
Let . Then, we can construct a circuit as follows (a pseudocode sketch is given after the list):
1. Input: .
2. For : if for every , holds, return .
3. Return .
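The selection loop above can be sketched as follows, where `decide(v1, v2, x)` stands in for the pairwise circuit that guesses the value of the function on x under the promise that it lies in {v1, v2}; the names and the default return value are illustrative, not the paper's notation.

```python
def evaluate_from_pairwise_deciders(x, values, decide):
    """For each candidate value v, check that every pairwise decider involving v
    answers v; if all of them agree, return v. If every decider involving the
    true value is correct, this loop returns that value."""
    for v in values:
        if all(decide(v, w, x) == v for w in values if w != v):
            return v
    return values[0]   # arbitrary default when no candidate is consistent

# Toy demo: the function x mod 3 with a perfect pairwise decider.
f = lambda x: x % 3
decide = lambda v1, v2, x: f(x) if f(x) in (v1, v2) else v1
print(evaluate_from_pairwise_deciders(7, [0, 1, 2], decide))   # -> 1
```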
During this process, compares each output of with , which means there are comparisons of bits, so the circuit complexity of is at most , where is a constant. If , the size of is at most .
By taking the union bound, we have
For any input , there exists at most one , such that for every , outputs . If not, we have , which leads to a contradiction.
If every outputs a correct value in , then there exists a unique such that for every , and thus computes correctly.
which contradicts the fact that is -hard for circuits of size .
3.2 Proof of Lemma 3.4
Lemma 3.4. [Restated, see original statement.]
For , and , consider a hardcore distribution , on which function is balanced around and -hard for circuits of size . Define the function as follows:
For and any other distribution over , for sufficiently large , specifically , the function is -hard on for circuits of size , where
Proof.
Consider a hardcore distribution and a function , such that is balanced around and -hard on for circuits of size . Let denote the distribution and denote ; we have . For any , denote by . Then, for any and any permutation of entries, we claim the following fact.
Claim 3.5.
If , for any circuit of size at most and any , , we have
By Claim 3.5 (which is proven below), combined with the triangle inequality, for any ,
Let . For any circuit of size at most ,
Hence, for any distribution over ,
3.2.1 Proof of Claim 3.5
Claim 3.5. [Restated, see original statement.]
If , for any circuit of size at most and any , , we have
Proof.
Without loss of generality, suppose there exists a circuit of size at most and , , such that
Then, there must exist a tuple and a permutation of entries, such that
where denotes .
Construct a circuit implementing as follows: it takes an input of length , along with the fixed and as non-uniform advice, and outputs if and only if outputs , and outputs otherwise. Then,
Therefore, the probability that correctly computes is
which contradicts the fact that is -hard on .
4 Approximating Real-Valued Functions
In this section, we extend our results to real-valued functions. To avoid any precision loss introduced by encoding, we assume that for any open interval of length , there is a value that can be encoded by in the interval. Therefore, for any approximation considered below, we always assume .
Theorem 4.1.
For , , , and a distribution over , consider a function that is -hard to approximate on with accuracy and distance , for circuits of size . For any large enough , define a function as follows:
Then, for , for any large enough , is -hard to approximate on with accuracy and distance , , for circuits of size , where
When is small enough, will be dominated by . For large enough , we can effectively obtain a function that is strongly average-case hard to approximate by taking the summation of multiple copies of .
Lemma 4.2.
For , , , and a distribution over , consider a real-valued function . For any , let , which denotes the radius of the partitioned intervals. For any , define the relation as follows:
- If or , then if and only if ;
- If and , then for any .
For any integer , for large enough , if is -hard to approximate on with accuracy and distance for circuits of size , there exist , , such that is -hard on , for circuits of size .
The proof is deferred to Section 4.1. Following the approach used in the integer case, we proceed with the construction of the hardcore distribution. This hard distribution maintains a good structure, on which is balanced and mapped to or , for some fixed , which implies that the closed approximation of with distance is balanced on around .
Lemma 4.3 (Hardcore for Real-Valued Functions).
For , , , and a distribution over , consider a function , which is -hard to approximate with accuracy and distance for circuits of size . For any integer , let , there exist , , and a -dense distribution (with respect to ), on which the closed approximation of with distance is balanced around and the approximation of with accuracy and distance is -hard for circuits of size .
Proof.
Suppose is -hard on for circuits of size . By Lemma 4.2, for integer , , if is sufficiently large, there exist , , such that (as defined in Lemma 4.2) is -hard on for circuits of size .
For , the majority combiner can be constructed by taking the middle point of the outputs. By Lemma 2.7, when is large enough, we have a -dense distribution (with respect to ) over , on which is -hard for circuits of size , and the approximation of with accuracy and distance is likewise hard on . If , then , and the intervals and are disjoint. Then, by Lemma 2.10, we can construct a distribution with density , on which the closed approximation of with distance is balanced around and the approximation of with accuracy and distance is -hard for circuits of size .
Analogously, we prove that summation suffices to achieve amplification.
Lemma 4.4.
For , and , , for large enough , let , consider a hardcore distribution and , , on which the closed approximation of function with distance is balanced around and is -hard to approximate with accuracy and distance for circuits of size . For any integer , define a function as follows:
For and any other distribution over , for any large enough , function is -hard to approximate with accuracy and distance on for circuits of size , where and
Theorem 4.1. [Restated, see original statement.]
For , , , and a distribution over , consider a function that is -hard to approximate on with accuracy and distance , for circuits of size . For any large enough , define a function as follows:
Then, for , for any large enough , is -hard to approximate on with accuracy and distance , , for circuits of size , where
Proof.
For and a function , which is -hard for circuits of size , by Lemma 4.3, for , letting , there exist , , and a -dense distribution with respect to , such that the closed approximation of with distance is balanced around and is -hard to approximate with accuracy and distance on for circuits of size .
Since has relative density , there exists a distribution over , such that . For , we have
Therefore, for any large enough , for any circuit of size , by Lemma 4.4, we have
where . Letting , we have
Then, ,
4.1 Proof of Lemma 4.2
Lemma 4.2. [Restated, see original statement.]
For , , , and a distribution over , consider a real-valued function . For any , let , which denotes the radius of the partitioned intervals. For any , define the relation as follows:
- If or , then if and only if ;
- If and , then for any .
For any integer , for large enough , if is -hard to approximate on with accuracy and distance for circuits of size , there exist , , such that is -hard on , for circuits of size .
Proof.
Suppose that for any , , there is a circuit of size , such that
For simplicity, let . Taking a combination of circuits , construct a new circuit as follows:
1. On input , for any in ascending order:
   - Compute ;
   - If outputs a value in for every , return , which is the maximum value among all the outputs given by .
2. Return if no such exists.
If is large enough, specifically , the size of is approximately . The performance of the circuit is stated as follows.
Claim 4.5.
For any , if for every distinct , .
By Claim 4.5, we have
The second inequality is obtained by taking the union bound. It contradicts the fact that is -hard to approximate with accuracy and distance .
In the following, we will prove that the relation () is potentially hard only if . Assume that the distance between the two centers satisfies ; construct a circuit that outputs a value in , regardless of its input. Since the interval length is
there exists a value in this interval that can be encoded in .
Consider any input : if , then ; if , then ; otherwise, any output is in . In each case, the output is guaranteed to satisfy .
Therefore, there exist , , such that is -hard for circuits of size .
4.1.1 Proof of Claim 4.5
Claim 4.5. [Restated, see original statement.]
For any , if for every distinct , .
Proof.
We partition the output space into multiple intervals of length and define a set of relations . For each relation, we focus on the inputs whose outputs under lie in the corresponding intervals or . Recall the process of algorithm : for in ascending order, if for every , holds, then stops and outputs .
Suppose that the algorithm returns when ; then . If , there exists such that ; since we assume the correctness of every , should have stopped when , which results in a contradiction.
If , for every , then . If , suppose and , then .
4.2 Proof of Lemma 4.4
Lemma 4.4. [Restated, see original statement.]
For , and , , for large enough , let , consider a hardcore distribution and , , on which the closed approximation of function with distance is balanced around and is -hard to approximate with accuracy and distance for circuits of size . For any integer , define a function as follows:
For and any other distribution over , for any large enough , function is -hard to approximate with accuracy and distance on for circuits of size , where and
Proof.
For an integer , let , and consider a hardcore distribution and , , such that the closed approximation of the function with distance is balanced around on . Denote the distribution by and the distribution by . Since is balanced around on the distribution , . The hardness of approximating the function implies the indistinguishability of these two distributions and .
For , let . Then, for any fixed and any permutation of coordinates, we have the following fact.
Claim 4.6.
For , such that and is valid, for any circuit of size at most , for any , such that and any , we have
where .
Let . For any large even , for any circuit of size at most , the following holds,
(6)
The inequality (4.2) is obtained by using Claim 4.6; we use the probability with the same condition to give an upper bound for the original term. The last inequality follows from Claim 4.7. Therefore, for any distribution over , we have
Claim 4.7.
For large enough , for any circuit , the following value is upper-bounded by :
where and .
Proof.
Recall that denotes the tolerance of the approximation error, and for some integer , we let . Since are the multiples of , let , for some positive integer . By our assumption, , thus .
For simplicity, for any integer , let
It is clear that for . Note that , the value can be upper bounded by
where summation is justified by the basic property of probability. To calculate the above, we collect the sum of coefficients corresponding to each . For each , the term should be
(7)
for some functions . Let denote the coefficient of . Since the summation of all possible is at most , by the definition of probability, it is feasible to give an upper bound for the entire summation by computing the upper bound for . For any , should satisfy:
It is clear that , since for any ,
Then, , we derive .
Recall that , and assume , . When , we necessarily have . If there exists a negative that satisfies the inequality, then , which leads to a contradiction. Therefore,
which is equivalent to
Therefore, we can give an upper bound for ,
On the other hand, can be upper bounded in terms of , that is
Then, plug in ,
In the following, for , we denote
(8)
Since the maximum value of is at most the maximum value of , then
The first equality holds because . To find the maximum value of , we compare each pair of adjacent terms in the sequence: if and only if . We have
Therefore, for ,
4.2.1 Proof of Claim 4.6
Claim 4.6. [Restated, see original statement.]
For , such that and is valid, for any circuit of size at most , for any , such that and any , we have
where .
Proof.
Without loss of generality, suppose and . Suppose that there is a circuit of size at most and , , such that the inequality above does not hold. Then, there must exist a tuple and a permutation of entries, such that
In fact, , let , which is a fixed value independent of . Recall that is the tolerance of approximation error and is the radius of the intervals around and , while letting for some large enough integer . Define a circuit as follows:
1. Input: .
2. Compute .
3. If , output a value in .
4. Otherwise, output a value in .
If is large enough, the size of is less than . In the following, we will show that the circuit can approximate the function with good probability, resulting in a contradiction.
For any , which means , if outputs a value in , which is equivalent to , we necessarily have . For any value , , we then have
On the other hand, for any , , suppose outputs a value such that , which is equivalent to . The interval above covers the interval , since
The circuit will output a value in . Therefore, the probability that can successfully approximate function on is
which contradicts the fact that is -hard on .
References
- [1] Shweta Agrawal, Sagnik Saha, Nikolaj I. Schwartzbach, Akhil Vanukuri, and Prashant Nalini Vasudevan. k-sum in the sparse regime: Complexity and applications. In Leonid Reyzin and Douglas Stebila, editors, Advances in Cryptology - CRYPTO 2024 - 44th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 18-22, 2024, Proceedings, Part II, volume 14921 of Lecture Notes in Computer Science, pages 315–351. Springer, 2024. doi:10.1007/978-3-031-68379-4_10.
- [2] Sanjeev Arora and Boaz Barak. Computational Complexity - A Modern Approach. Cambridge University Press, 2009. URL: http://www.cambridge.org/catalogue/catalogue.asp?isbn=9780521424264.
- [3] Vahid R. Asadi, Alexander Golovnev, Tom Gur, and Igor Shinkar. Worst-case to average-case reductions via additive combinatorics. In Stefano Leonardi and Anupam Gupta, editors, STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 1566–1574. ACM, 2022. doi:10.1145/3519935.3520041.
- [4] Marshall Ball, Alon Rosen, Manuel Sabin, and Prashant Nalini Vasudevan. Average-case fine-grained hardness. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 483–496. ACM, 2017. doi:10.1145/3055399.3055466.
- [5] Boaz Barak, Moritz Hardt, and Satyen Kale. The uniform hardcore lemma via approximate bregman projections. In Claire Mathieu, editor, Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2009, New York, NY, USA, January 4-6, 2009, pages 1193–1200. SIAM, 2009. doi:10.1137/1.9781611973068.129.
- [6] Andrej Bogdanov and Luca Trevisan. On worst-case to average-case reductions for NP problems. SIAM J. Comput., 36(4):1119–1159, 2006. doi:10.1137/S0097539705446974.
- [7] Enric Boix-Adserà, Matthew S. Brennan, and Guy Bresler. The average-case complexity of counting cliques in erdős-rényi hypergraphs. In David Zuckerman, editor, 60th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2019, Baltimore, Maryland, USA, November 9-12, 2019, pages 1256–1280. IEEE Computer Society, 2019. doi:10.1109/FOCS.2019.00078.
- [8] Jin-yi Cai, Aduri Pavan, and D. Sivakumar. On the hardness of permanent. In Christoph Meinel and Sophie Tison, editors, STACS 99, 16th Annual Symposium on Theoretical Aspects of Computer Science, Trier, Germany, March 4-6, 1999, Proceedings, volume 1563 of Lecture Notes in Computer Science, pages 90–99. Springer, 1999. doi:10.1007/3-540-49116-3_8.
- [9] Nathan Geier. A tight computational indistinguishability bound for product distributions. In Eike Kiltz and Vinod Vaikuntanathan, editors, Theory of Cryptography - 20th International Conference, TCC 2022, Chicago, IL, USA, November 7-10, 2022, Proceedings, Part II, volume 13748 of Lecture Notes in Computer Science, pages 333–347. Springer, 2022. doi:10.1007/978-3-031-22365-5_12.
- [10] Elazar Goldenberg and Karthik C. S. Hardness amplification of optimization problems. In Thomas Vidick, editor, 11th Innovations in Theoretical Computer Science Conference, ITCS 2020, January 12-14, 2020, Seattle, Washington, USA, volume 151 of LIPIcs, pages 1:1–1:13. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPICS.ITCS.2020.1.
- [11] Oded Goldreich, Russell Impagliazzo, Leonid A. Levin, Ramarathnam Venkatesan, and David Zuckerman. Security preserving amplification of hardness. In 31st Annual Symposium on Foundations of Computer Science, St. Louis, Missouri, USA, October 22-24, 1990, Volume I, pages 318–326. IEEE Computer Society, 1990. doi:10.1109/FSCS.1990.89550.
- [12] Oded Goldreich, Noam Nisan, and Avi Wigderson. On yao’s xor-lemma. In Oded Goldreich, editor, Studies in Complexity and Cryptography. Miscellanea on the Interplay between Randomness and Computation - In Collaboration with Lidor Avigad, Mihir Bellare, Zvika Brakerski, Shafi Goldwasser, Shai Halevi, Tali Kaufman, Leonid Levin, Noam Nisan, Dana Ron, Madhu Sudan, Luca Trevisan, Salil Vadhan, Avi Wigderson, David Zuckerman, volume 6650 of Lecture Notes in Computer Science, pages 273–301. Springer, 2011. doi:10.1007/978-3-642-22670-0_23.
- [13] Oded Goldreich and Guy N. Rothblum. Counting t-cliques: Worst-case to average-case reductions and direct interactive proof systems. In Mikkel Thorup, editor, 59th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2018, Paris, France, October 7-9, 2018, pages 77–88. IEEE Computer Society, 2018. doi:10.1109/FOCS.2018.00017.
- [14] Parikshit Gopalan and Venkatesan Guruswami. Hardness amplification within NP against deterministic algorithms. J. Comput. Syst. Sci., 77(1):107–121, 2011. doi:10.1016/J.JCSS.2010.06.008.
- [15] Alexander Healy, Salil P. Vadhan, and Emanuele Viola. Using nondeterminism to amplify hardness. SIAM J. Comput., 35(4):903–931, 2006. doi:10.1137/S0097539705447281.
- [16] Shuichi Hirahara and Nobutaka Shimizu. Hardness self-amplification: Simplified, optimized, and unified. In Barna Saha and Rocco A. Servedio, editors, Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, Orlando, FL, USA, June 20-23, 2023, pages 70–83. ACM, 2023. doi:10.1145/3564246.3585189.
- [17] Russell Impagliazzo. Hard-core distributions for somewhat hard problems. In 36th Annual Symposium on Foundations of Computer Science, Milwaukee, Wisconsin, USA, 23-25 October 1995, pages 538–545. IEEE Computer Society, 1995. doi:10.1109/SFCS.1995.492584.
- [18] Satyen Kale. Boosting and hard-core set constructions: a simplified approach. Electron. Colloquium Comput. Complex., TR07-131, 2007. URL: https://eccc.weizmann.ac.il/eccc-reports/2007/TR07-131/index.html.
- [19] Adam R. Klivans and Rocco A. Servedio. Boosting and hard-core set construction. Mach. Learn., 51(3):217–238, 2003. doi:10.1023/A:1022949332276.
- [20] Richard J. Lipton. New directions in testing. In Joan Feigenbaum and Michael Merritt, editors, Distributed Computing And Cryptography, Proceedings of a DIMACS Workshop, Princeton, New Jersey, USA, October 4-6, 1989, volume 2 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 191–202. DIMACS/AMS, 1989. doi:10.1090/DIMACS/002/13.
- [21] Ryan O’Donnell. Hardness amplification within . J. Comput. Syst. Sci., 69(1):68–94, 2004. doi:10.1016/J.JCSS.2004.01.001.
- [22] Amit Sahai and Salil P. Vadhan. A complete problem for statistical zero knowledge. J. ACM, 50(2):196–249, 2003. doi:10.1145/636865.636868.
- [23] Luca Trevisan. On uniform amplification of hardness in NP. In Harold N. Gabow and Ronald Fagin, editors, Proceedings of the 37th Annual ACM Symposium on Theory of Computing, Baltimore, MD, USA, May 22-24, 2005, pages 31–38. ACM, 2005. doi:10.1145/1060590.1060595.
- [24] Luca Trevisan and Salil P. Vadhan. Pseudorandomness and average-case complexity via uniform reductions. Comput. Complex., 16(4):331–364, 2007. doi:10.1007/S00037-007-0233-X.
- [25] Salil Pravin Vadhan. A Study of Statistical Zero-Knowledge Proofs. PhD thesis, Harvard University, USA, 1999. AAI0801528.
- [26] Andrew Chi-Chih Yao. Theory and applications of trapdoor functions (extended abstract). In 23rd Annual Symposium on Foundations of Computer Science, Chicago, Illinois, USA, 3-5 November 1982, pages 80–91. IEEE Computer Society, 1982. doi:10.1109/SFCS.1982.45.
