Time-Space Tradeoffs of Truncation with Preprocessing
Abstract
Truncation of cryptographic outputs is a technique that was recently introduced in Baldimtsi et al. [2]. The general idea is to try out many inputs to some cryptographic algorithm until the output (e.g. a public-key or some hash value) falls into some sparse set and thus can be compressed: by trying out an expected different inputs one will find an output that starts with zeros.
Using such truncation one can for example save substantial gas fees on Blockchains where storing values is very expensive. While [2] show that truncation preserves the security of the underlying primitive, they only consider a setting without preprocessing. In this work we show that lower bounds on the time-space tradeoff for inverting random functions and permutations also hold with truncation, except for parameters ranges where the bound fails to hold for “trivial” reasons.
Concretely, it’s known that any algorithm that inverts a random function or permutation with range making queries and using bits of auxiliary input must satisfy . This lower bound no longer holds in the truncated setting where one must only invert a challenge from a range of size , as now one can simply save the replies to all challenges, which requires bits and allows to invert with query.
We show that with truncation, whenever is somewhat smaller than the bits required to store the entire truncated function table, the known lower bound applies.
Keywords and phrases:
Time-Space Lower Bounds, BlockchainsCopyright and License:
2012 ACM Subject Classification:
Theory of computation Interactive proof systems ; Theory of computation Cryptographic protocols ; Security and privacy Information-theoretic techniquesEditor:
Niv GilboaSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
1.1 Truncation
In [2] Baldimtsi, Chalkias, Chatzigiannis and Kelkar suggested truncation as a technique to shorten outputs of cryptographic primitives like hash values, signatures or public keys. The technique applies whenever there’s a part of the input that can be chosen arbitrarily, like the randomness in signing and key generation or parts of the input when hashing. The general idea is to simply try out many different inputs until the output lands in some sparse domain, say it starts with zeros, and such an output can be encoded using less bits than a general output.
While the applicability of this technique is limited by the fact that finding an output that starts with zeros will require around invocations of the primitive (so e.g., compressing by bits already requires around a million invocations), in [2] interesting applications are identified where this can make a big difference, including in the context of the Ethereum blockchain, where saving a few bits to be stored on-chain can lead to significant savings. We discuss another application to proofs of space as used in the Chia network in the open problems Section 6.
While truncation puts extra burden on the evaluator (say, a signer), there’s no extra cost for the parties that use the output (say, verify the signature). Moreover [2] show that truncation preserves security in the bit security framework of [9]. Informally, a primitive has bit security , if every adversary who breaks the security with advantage must run in time .
1.2 Time-Space Tradeoffs
While the preservation of bit security under truncation gives some confidence in this technique, it only considers a setting without precomputation. Unfortunately, truncation can decrease the security of primitives when the adversary is given some auxiliary input. Let us illustrate this considering the one-wayness of a permutation over bits, which we’ll denote with where . A random permutation does have bits of bit security, i.e., given a random , finding will require invocations to in expectation. But given bits of advice (that depend on but not the challenge ), it’s possible to invert using just invocations whenever
| (1) |
The general idea is to compute and store values where . On challenge one now applies until one of the stored values is hit, and then continues to apply to until one hits . For functions the best space-time tradeoff are somewhat worse. For random functions an attack with
is achieved by Hellman tables [7] or rainbow tables [8]. So, any permutation can be inverted with time and space, e.g. while random functions can be inverted with (for functions that are not random the existing bounds are somewhat worse [5]).
De, Trevisan and Tulsiani [4] (building on work by Yao [11], Gennaro-Trevisan [6] and Wee [10]) prove that the attack as in eq. (1) is basically optimal as any adversary who inverts a random permutation or function with a range of size must satisfy
| (2) |
Note that the above is seemingly wrong if or as one can invert with no space but queries or space and no queries. The proof of the lower bound assumes adversary must always query on its output, so is at least , and is assumed to be at least which is required to store the challenge.
The same lower bound applies for random functions, though note that in this case it’s not matching the upper bound. The actual lower bound is slightly more general showing that must hold for any adversary who inverts with some probability , but for this introduction we assume is some constant.
1.3 Time-Space Tradeoffs with Truncation
Now consider the truncated setting, where we have a random function or permutation , but must only invert it on outputs sampled from some truncated range, say where the first bits are set to zero. We’ll denote this range with where .
Now one can store the preimages of on using bits, and using this advice, it’s possible to find the preimage for any without invoking at all. As outlined above, this means , but it still contradicts the lower-bound from eq. (2) as for .
The main result in this work is Theorem 4, which basically states that the issue just described is the only reason for the lower bound to fail: as long as is too small to basically store the entire function table of the truncated function, the lower bound applies. While we state and prove the bound for functions, it can easily be adapted to permutations. We discuss other truncated primitive in the open problems Section 6.
The technical result implying Theorem 4 is stated in Lemma 5. It uses a so-called compression argument: it shows how an adversary that inverts a random function can be turned into an encoding for the function table of , where the length of the encoding depends on the space and time efficiency of the adversary. As the function table of a random function is incompressible, as stated in Fact 1, we can derive a lower bound on the space and time complexity of the adversary.
1.4 The Compression Argument
The starting point for proving the Lemma is a proof of the lower bound for inverting random functions from [1]. This proof is less elegant than the proof of De et al. from [4], which uses a high level argument about sets, while [1] provide pseudocode of the encoding and argue about the length of its output.
Their encoding is basically Algorithm 1 in this paper for . It takes as input an adversary who is guaranteed to invert some function on an fraction of the range and making no more than queries.
is then invoked on some challenges, and every time inverts a challenge we get one entry in the function table “for free”. With every invocation, can make up to queries to , and those queries can later no longer be used as challenges, i.e., we “spoil” up to of the challenges with every succesful inversion. Overall we can get to invert something in the order of challenges before running out of unspoiled challenges, and thus get an encoding that is around bits shorter than the function table. As a random function is incompressible, this then implies that the advice used by must be around . That’s how the lower bound in [1] is proven.
In the truncated setting when , is only guaranteed to invert on an fraction of the sparse domain , and the above argument would only gives us a bound.
A key observation is the fact that we get a lower bound even in the truncated case if we assume that not much more than a fraction of the queries made by map to the sparse domain, as then each compressed entry only spoils around of the available challenges.
To prove our lemma we now make a case distinction. If for every compressed value we typically do not spoil more than potential challenges, we use the observation above, so in this case we use basically the same argument as in [1].
In the other case the queries made by fall into the sparse set at least 12 times more often than a random query would. Knowing that the output on a query is in means we can encode it using just bits. We will encode the positions of the queries made by that fall into (there are around such queries) and then encode each output using just bits. As the queries that fall into are sufficiently dense (i.e., 12 times denser than a random query would), encoding those positions uses a bit less per entry than the bits we save knowing the entry is in . So overall we save bits, and thus for this case can conclude (again using the incompressibility of a random function table) that the space used by must be at least .
2 Notation and Basic Facts
We use brackets like or to denote ordered sets (aka. lists) and unordered sets, respectively. denotes some domain of size , and for notational convenience we assume is a power of two and identify with . For some and , we’ll denote with some subset of of size (the truncated range), say , the set of bits strings that start with zeros, but in principle any subset whose elements can be compressed to (not much more than) bits will do.
For a function and a set , we denote with the set , similarly for a list , is the list .
Randomized adversaries are treated as if they were deterministic. However, this is w.l.o.g. as they are only invoked within encoding/decoding procedures that have access to a shared randomness that can be used to derandomize them.
The following well known fact captures the fact that one cannot compress a random string.
Fact 1 (from [4]).
For any randomized encoding procedure and decoding procedure where
we have
Fact 2.
If a set is at least dense in , i.e., , and is known, then can be encoded using additional bits.
This fact follows from the inequality , which implies . Encoding can be done by identifying which elements to choose from .
3 Main Theorem and Lemma
De, Trevisan and Tulsiani [4] show that the simple time-space tradeoff eq. (1) for inverting permutations is basically tight.
Theorem 3 ([4], as stated in [1]).
Fix some and an oracle algorithm that takes an advice string of length and makes at most oracle queries. If with non-neglibile probability for a random function (or permutation) there exists a string such that
then
| (3) |
In this work we show that this result extends to the case where the challenge comes from a sparse set
Theorem 4 (main).
Fix some and an oracle algorithm that takes an advice string of length and makes at most oracle queries. If with non-neglibile probability for a random function (or permutation) there exists a string such that
then either
| (4) |
The theorem is proven using a compression argument as stated in the following lemma.
Lemma 5 (generalized version of a Lemma from [1]).
Let and be as Theorem 4, and assume . There are randomized encoding and decoding procedures such that if is a function and for some
then
| (5) |
and the length of is at most
| (6) |
Moreover for some which is defined by the encoding algorithm: if , we can improve the length of the encoding to
| (7) |
The above is the average number of queries that land in the sparse set (i.e., s.t. ) that (when invoked by as defined by Algorithm 1 below) makes for every value it inverts. As makes at most queries per challenge we have . For a random query we have with probability , so if is significantly larger than , this means that the queries made by are special in the sense that they map to much more often than random queries would. We use this crucial observation for a case distinction, deriving the left or the right hand side of eq. (4), depending on whether is below or above .
4 How Theorem 4 follows from Lemma 5
4.1 Proof of Thm. 4 if
If , the theorem follows from Lemma 5 using Fact 1 as follows: assume the function table of in the lemma is chosen uniformly at random (i.e., in Fact 1 is a uniform bit string), then the term in eq. (6) can be lower bounded as
Reordering we get
using our assumption that
So as claimed (on the rhs of eq. (4) in the Theorem). Note that the extra assumption that in the lemma doesn’t matter, as if it’s not satisfied the theorem is trivially true.
4.2 Proof of Thm. 4 if
If , the theorem again follows from Lemma 5 using Fact 1, i.e., we again assume the function table of in the lemma is chosen uniformly at random (i.e., in Fact 1 is a uniform bit string), and now the term in eq. (7) can be lower bounded as
Note that, as in this case we use eq. (7) rather than eq. (6), we have an extra term on the lhs. Reordering we get
and thus as claimed.
5 Proof of Lemma 5
We always assume that if outputs some value , it makes the query at some point. This is basically w.l.o.g. as we can turn any adversary into one satisfying this by making at most one extra query. If at some point makes an oracle query where , then we also w.l.o.g. assume that right after this query outputs and stops.
5.1 The Size of the Encoding
We will now upper bound the size of the encoding of as output in line (15) of the algorithm.
Let be the average number of elements we added to the bad set for every element added to the good set , then
| (8) |
To see this we note that when we leave the while loop (see line (8) of the algorithm ) it holds that
| (9) |
- :
-
Instead of we will actually encode the set , from this encoding (who gets , and thus knows ) can then reconstruct . We claim that the elements in are whp. at least dense in (equivalently, ). By Fact 2 we can thus encode using bits (the extra bits are used to encode the size of which is required so decoding later knows how to parse the encoding). To see that the ’s are dense whp. consider line (9) in which states . If we replace with , then the ’s would be whp. close to dense in as has size and the are uniformly random. As , using instead of will decrease the density by at most a factor . If we don’t have this density, i.e., , we consider encoding to have failed.
- :
-
Require bits as each . A more careful argument (using Fact 2 and that the are on average at most ) requires bits.
- :
-
Requires bits (using that and ).
- :
-
Is bits long.
- :
-
This is a list of elements in and can be encoded using bits, but we’ll encode it with less if is large.
Summing up we get
Using the assumption in the statement of the lemma, which in turn implies , we get
Plugging in the bound from eq. (8) for we get
proving eq. (6) in the Lemma.
5.2 Improved bound if
We’ll now prove a bound on the length of the encoding as stated in eq. (7) in the lemma. Here we assume that is large, i.e., . The bound on the length of the encoding improves the expression we got without this assumption, i.e., eq. (6), by bits. We will achieve this by improving the length of the encoding of from the trivial to by exploiting the fact that now a large fraction of the elements in falls into the sparse set , i.e.,
Claim 6.
If , can be encoded using bits.
It remains to prove the claim. By eq. (8) and eq. (9) , and , which implies
I.e., a fraction of the queries made during decoding falls into . Using this with Fact 2 we can encode which of the queries from map into using bits. For any , we can encode using bits as .
We can now encode by first encoding the positions of in , and then and separately, which requires
bits as claimed.
6 Conclusion and Open Problems
In this work we showed that the known time-space tradeoff for inverting random functions or permutations also holds for truncated outputs almost up to the point where is big enough to store the entire truncated function table. This shows that the general idea of truncating cryptographic primitives as suggested in [2] is secure even when preprocessing is considered (as long as the truncated function table is big enough so it can’t be stored in practice).
While in this paper we only considered one-wayness of random functions, we believe that our result can be adapted to time-space lower bounds for other primitives, showing the bounds apply also for their truncated analogues. An interesting example considered in [2] is the discrete logarithm problem, for which a time-space lower bound is known [3].
A likely more challenging but particularly interesting case is the adaption of the “beyond Hellman” proofs of space from [1] to the truncated setting. The key primitive in [1] are functions for which a time-space lower bound of (for any constant ) can be proven. This improves on the lower bound for “normal” functions, and the reason this doesn’t contradict known upper bounds (by rainbow tables) is the fact that those functions cannot be efficiently evaluated in forward direction (but their entire function table can be computed in time ). A proof that truncation is secure for these functions would allow for better security. Currently those functions are deployed in the proof of space underlying the chia.net blockchain, but there’s a trade-off that farmers (who are supposed to dedicate space analogous to miners dedicating computation in Bitcoin) can make: by dropping a few bits of every entry, they save space at the cost of having to do some extra computation when computing the proofs111https://github.com/Chia-Network/drplotter. One could use a truncated version of those functions, which would make the initialization of the space for the farmers somewhat more costly, but it would make such “bit dropping” attacks much less attractive.
References
- [1] Hamza Abusalah, Joël Alwen, Bram Cohen, Danylo Khilko, Krzysztof Pietrzak, and Leonid Reyzin. Beyond hellman’s time-memory trade-offs with applications to proofs of space. In Tsuyoshi Takagi and Thomas Peyrin, editors, ASIACRYPT 2017, Part II, volume 10625 of LNCS, pages 357–379. Springer, Heidelberg, December 2017. doi:10.1007/978-3-319-70697-9_13.
- [2] Foteini Baldimtsi, Konstantinos Chalkias, Panagiotis Chatzigiannis, and Mahimna Kelkar. Truncator: Time-space tradeoff of cryptographic primitives. Cryptology ePrint Archive, Paper 2022/1581, 2022. URL: https://eprint.iacr.org/2022/1581.
- [3] Henry Corrigan-Gibbs and Dmitry Kogan. The discrete-logarithm problem with preprocessing. In Jesper Buus Nielsen and Vincent Rijmen, editors, Advances in Cryptology – EUROCRYPT 2018 – 37th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Tel Aviv, Israel, April 29 – May 3, 2018 Proceedings, Part II, volume 10821 of Lecture Notes in Computer Science, pages 415–447. Springer, 2018. doi:10.1007/978-3-319-78375-8_14.
- [4] Anindya De, Luca Trevisan, and Madhur Tulsiani. Time space tradeoffs for attacks against one-way functions and PRGs. In Tal Rabin, editor, CRYPTO 2010, volume 6223 of LNCS, pages 649–665. Springer, Heidelberg, August 2010. doi:10.1007/978-3-642-14623-7_35.
- [5] Amos Fiat and Moni Naor. Rigorous time/space tradeoffs for inverting functions. In Cris Koutsougeras and Jeffrey Scott Vitter, editors, Proceedings of the 23rd Annual ACM Symposium on Theory of Computing, May 5-8, 1991, New Orleans, Louisiana, USA, pages 534–541. ACM, 1991. doi:10.1145/103418.103473.
- [6] Rosario Gennaro and Luca Trevisan. Lower bounds on the efficiency of generic cryptographic constructions. In 41st FOCS, pages 305–313. IEEE Computer Society Press, November 2000. doi:10.1109/SFCS.2000.892119.
- [7] Martin E. Hellman. A cryptanalytic time-memory trade-off. IEEE Trans. Inf. Theory, 26(4):401–406, 1980. doi:10.1109/TIT.1980.1056220.
- [8] Philippe Oechslin. Making a faster cryptanalytic time-memory trade-off. In Dan Boneh, editor, Advances in Cryptology – CRYPTO 2003, 23rd Annual International Cryptology Conference, Santa Barbara, California, USA, August 17-21, 2003, Proceedings, volume 2729 of Lecture Notes in Computer Science, pages 617–630. Springer, 2003. doi:10.1007/978-3-540-45146-4_36.
- [9] Shun Watanabe and Kenji Yasunaga. Bit security as computational cost for winning games with high probability. In Mehdi Tibouchi and Huaxiong Wang, editors, Advances in Cryptology – ASIACRYPT 2021 – 27th International Conference on the Theory and Application of Cryptology and Information Security, Singapore, December 6-10, 2021, Proceedings, Part III, volume 13092 of Lecture Notes in Computer Science, pages 161–188. Springer, 2021. doi:10.1007/978-3-030-92078-4_6.
- [10] Hoeteck Wee. On obfuscating point functions. In Harold N. Gabow and Ronald Fagin, editors, 37th ACM STOC, pages 523–532. ACM Press, May 2005. doi:10.1145/1060590.1060669.
- [11] Andrew Chi-Chih Yao. Coherent functions and program checkers (extended abstract). In 22nd ACM STOC, pages 84–94. ACM Press, May 1990. doi:10.1145/100216.100226.
