6
$\begingroup$

I'm curious as to how a backdoor can be built into a hash function. What might it take the form of? What's probability of getting one that can be backdoored in a way that can actually undermine its security? Could this backdoor be asymmetric?

asked Apr 15, 2015 at 2:04
$\endgroup$
3
  • $\begingroup$ Every backdoor is asymmetric in some fashion. That's what makes it a backdoor. $\endgroup$ Commented Apr 15, 2015 at 4:42
  • $\begingroup$ Is it possible to backdoor a hash function and still have it compatible with other implementations? $\endgroup$ Commented May 15, 2015 at 20:30
  • $\begingroup$ a paper for future readers who might be interested Malicious sha1 via modifying the constants used $\endgroup$ Commented Aug 8, 2016 at 2:49

3 Answers 3

10
$\begingroup$

Here is a "backdoored" hash function:

Let $p = 2q + 1$ be a big prime of length 2048ドル$ bits, such that $q$ is also prime. Let $a$ be an integer of order $q$ modulo $p,ドル i.e. $a \neq 1$ but $a^q = 1 \pmod p$; it can be shown that $a = 4$ is always a valid solution. Let $s$ be a (secret) integer between 1ドル$ and $q-1,ドル and let $b = a^s \pmod p$.

Then define operation $\phi(x, y)$ such that input $x$ is a sequence of 2048ドル$ bits, and $y$ is a sequence of 2042ドル$ bits (not 2048ドル$); the output is a 2048ドル$-bit value:

  • Concatenate $x$ and $y$ and split it back into $u$ and $v,ドル such that $u$ and $v$ both have length 2045ドル$ bits each: $$u || v = x || y$$

  • Interpret $u$ and $v$ as unsigned big integers, using some convention (e.g. big-endian). Note that both $u$ and $v$ are lower than 2ドル^{2045}$.

  • Compute: $$z = a^u b^v \pmod p$$

  • $\phi(x, y)$ is the encoding of $z$ over exactly 2048ドル$ bits (there again, using a fixed convention).

Note that $\phi(x, y) \neq 0$ for all $x$ and $y$.

Now that you have $\phi,ドル define a hash function $h$ as follows:

  • For input $m,ドル pad it with some "removable" padding sequence (e.g. a Merkle-Damgård compliant padding) so that the total length of $m$ is now a multiple of 2042ドル$.
  • Split $m$ into $n$ blocks of 2042ドル$ bits (denoted $m_1,ドル $m_2,ドル ... $m_n$).
  • Define $x_0 = 0$ (a sequence of 2048ドル$ bits, all of value 0ドル$).
  • For $i = 1$ to $n,ドル define: $x_i = \phi(x_{i-1}, m_i)$
  • Define the hash output $h(m)$ to be SHA-256($x_n$).

Let's see the security of this construction:

  • Preimages: for a given $t,ドル finding $m$ such that $h(m) = t$ implies, in particular, finding some $x_n$ such that SHA-256($x_n$) $= t,ドル so our function $h$ is at least as much preimage-resistant as SHA-256.

  • Second preimages: second preimages are a special case of collisions, so $h$ will resist second preimages at least as well as it resists collisions. Ideally we would want something better (i.e. resistance up to 2ドル^{256}$ for second preimages, even though collisions can be found in effort 2ドル^{128}$), but since collisions should be infeasible anyway, this is probably acceptable.

  • Collisions: suppose that $m \neq m'$ and $h(m) = h(m')$; $m$ yields a sequence of $n$ words $x_1$ to $x_n,ドル while $m'$ yields $x'_1$ to $x'_{n'}$. We suppose, without loss of generality, that $n' \geq n$. Then one of the following holds:

    • $x_n \neq x'_{n'}$ but SHA-256($x_n$) $=$ SHA-256($x'_{n'}$). We have found a collision on SHA-256.

    • There is an index $i$ such that $x_{i-1} \neq x'_{i-1}$ but $x_i = x'_i$. This means that we know integers of at most 2045ドル$ bits $u,ドル $v,ドル $u'$ and $v'$ such that $(u, v) \neq (u', v'),ドル but $a^u b^v = a^{u'} b^{v'} \pmod p$. This implies that $a^{u-u'} = b^{v'-v} \pmod p$. Since $a$ and $b$ have order $q$ (greater than 2ドル^{2046}$) and $u-u'$ and $v-v'$ cannot be both 0ドル,ドル then $u-u'$ and $v'-v$ are invertible modulo $q,ドル and we can write $b = a^g \pmod p$ with $g = (u-u')/(v'-v) \pmod q$ (i.e. $g$ is the secret value $s$). We just solved discrete logarithm.

    • There is an index $j$ such that $n \lt j \leq n'$ and $x'_j = x_n$. By "rewinding" the steps this implies that either we find an instance of the previous situation ($a^u b^v = a^{u'} b^{v'}$ for $(u,v) \neq (u',v')$), or we have $x'_{j-n} = x_0,ドル which is not possible since $x_0 = 0$ and $x'_{j-n} = \phi(x'_{j-n-1}, m_{j-n}) \neq 0$.

    Therefore, finding a collision on $h$ requires either finding a collision on SHA-256, or solving discrete logarithm modulo $p$. Since both are computationally infeasible (as far as we know), this function $h$ can be deemed "secure" as a hash function.


The backdoor is the knowledge of $s$. In all of the above, we used the $a$ and $b$ values "as is". The value of $s$ is not needed anywhere. However, if you know $s,ドル then you can compute collisions at will:

  • Generate two random blocks $m_1$ and $m'_1$. Compute the corresponding $x_1$ and $x'_1$.
  • Let $u$ be the first 2045ドル$ bits of $x_1,ドル and $u'$ be the first 2045ドル$ bits of $x'_1$.
  • Compute $k = (u-u')/s \pmod q$
  • If $k$ is less than 2ドル^{2045},ドル or greater than $q - 2^{2045},ドル then you can find $v$ and $v'$ of 2045ドル$ bits each, such that $v'-v = k \pmod q,ドル from which you can infer two blocks $m_2$ and $m'_2$ that match $(u,v)$ and $(u',v'),ドル respectively. You have your collision.
  • The condition on $k$ works with probability at least 1ドル/2$. If you were unlucky, just try again with new random blocks $m_1$ and $m'_1$.

The above is thus a hash function with a backdoor. However, mind the fine print:

  • The backdoor is explicit. While the knowledge of $s$ is necessary to exploit the backdoor, everybody knows that it is there. Conversely, if you generate $a$ and $b$ as NUMS numbers (thus with $s$ convincingly unknown to everybody), then the hash function is "backdoorless" (but also useless).

  • When you exploit the backdoor, you produce two colliding messages $m$ and $m',ドル from which anybody can recompute $s$. Thus, the backdoor is "one-shot".

  • The backdoor allows finding collisions, not preimages. A hash function designed to have a backdoor for preimages would need something else. Ideally you would use a one-way trapdoor permutation, but (to my knowledge) no such object is currently known, that would yield an output of suitable length (e.g. 256 bits).

answered Jun 14, 2015 at 13:26
$\endgroup$
4
  • $\begingroup$ Know of any way to construct a hash function with a multi-time backdoor? How did you come up with this one? $\endgroup$ Commented Apr 10, 2016 at 8:54
  • $\begingroup$ I came up with that example by my own thinking (I was initially looking for a way to make a hash function based on elliptic curves and with a security that could reduced to that of discrete logarithm). I am not aware of any publication of that construction, other than this answer (whether a StackExchange answer counts as "publication" is another question). $\endgroup$ Commented Apr 11, 2016 at 12:54
  • $\begingroup$ How we can infer $m_2$ from $v$? please explain $\endgroup$ Commented Aug 4, 2017 at 11:47
  • $\begingroup$ By construction, $u||v = x_1 || m_2$. So it's mostly a matter of splitting the bit string at the right place to account for the slight differences in lengths (2042, 2045, 2048 bits). $\endgroup$ Commented Aug 4, 2017 at 12:12
1
$\begingroup$

I'm not very familiar with hash functions, but consider the following toy example: take any hash function $H,ドル and define a new function $H'$ which on input $x$ discards the last bit and applies $H$ to the result. $H'$ is not collision resistant because flipping the last bit of any input does not change the hash. Yet this may not be immediately obvious, and so could be considered a "backdoor".

Of course, the design of such a function would need to be kept secret for the "backdoor" to work, this is why you shouldn't trust any crypto primitive whose design is not public. And indeed, even if the design is public, the backdoor can of course be more subtle than that, this is why you shouldn't trust any crypto primitive which hasn't been extensively studied, even if its design is public.

answered Apr 15, 2015 at 7:47
$\endgroup$
-1
$\begingroup$

The backdoor is possible. Imagine if we calculate the SHA256 of all combinations of a 256bit string, we already have a dictionary containing already a big subset of the SHA256 results. Once matched in the dictionary, the source can be replaced by the corresponding 256bit string. Now I agree and believe that bitcoin could be a conspiracy in that it helps to create SHA256 dictionaries!!! That could create messy conditions.

answered Mar 31, 2019 at 15:04
$\endgroup$
2
  • 4
    $\begingroup$ This is not a back door and even running the Bitcoin network at full power for centuries wouldn't help to find anything anywhere close to full preimages. $\endgroup$ Commented Mar 31, 2019 at 15:11
  • $\begingroup$ Have to upvote. It's not often that I get such a new and exciting conspiracy theory to fret over... $\endgroup$ Commented Mar 31, 2019 at 16:50

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.