X-MAS CTF 2019 - Hashed Presents v1 & v2 Writeup

These two challenges were almost the same, so I’m doing the writeup together. In reality the first one was a lot more solved since it has an unintended vulnerability using the fact that the step value was an even number, but since the indended solution is necessary to solve the v2 we’ll ignore this.

Hashed Presents v1 (25 solves)

We have to generate 10 collisions to a custom hash function that works like this

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
class secureHash(object):
def __init__(self):
self.bits = 128
self.mod = 2**128
self.mask = 2**128 - 1
self.step = 23643483844282862943960719738L
self.hash = 9144491976215488621715609182563L

def update(self, inp):
for ch in inp:
self.hash = ((self.hash + ord(ch)) * self.step) & self.mask

def hexdigest(self):
x = self.hash
out = ''
for i in range(self.bits/8):
out=hex(x & 0xff)[2:].replace('L','').zfill(2)+out
x >>= 8
return out

So basically we have that given a string where the ’s are digits (in their decimal representation) the hash function computes where is the step value in the code and is the hash value. The idea here is pretty simple: we want to find some such that there exists a string such that and such that the are small enough to keep a valid character. If we do this, we’ll have that and so we’ll have a collision string made by the string . So, how we can construct this vector? The easiest method (or, at least, the standard one) as far as i know is using the Lenstra-Lenstra-Lovász lattice basis reduction algorithm. I don’t want to go into the details of the algorithm (since it’s pointless here and it’s already implemented in Sagemath), but basically given a basis of a lattice in the algorithm gives you another basis that is short and nearly orthogonal in polynomial time. So, everything we need to do is to setup a basis for our lattice and then apply LLL. I choose the easiest base and setup the matrix in the following way

where (since the length of the given strings varies from 30 to 35) and is a parameter that we can let vary over the integers. So we generate some vectors like this from this matrix and we’re done, eventually trying to sum up the ’s with different values of the ’s in order to match some.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
mod = 2^128
n = 30
s = 23643483844282862943960719738

for t in range(1,50):
#print(t)
A = [[0 for i in range(n+1)] for i in range(n+1)]
A[0][n] = mod * t
for i in range(1,n+1):
A[i][n] = power_mod(s,(n - i),mod)
A[i][i-1] = 1

A=Matrix(A)
A=A.LLL()
for i in range(n+1):
if A[i][n] == 0 and A[i][n-1] == 0:
print(A[i])

Hashed Presents v2 (8 solves)

The problem here is quite similar, the only differences are: now the step is an odd number, and that there is a xor instead of an addition in the update function

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
class secureHash(object):
def __init__(self):
self.bits = 128
self.mod = 2**128
self.mask = 2**128 - 1
self.step = 23643483844282862943960719737L
self.hash = 9144491976215488621715609182563L

def update(self, inp):
for ch in inp:
self.hash = ((self.hash ^ ord(ch)) * self.step) & self.mask

def hexdigest(self):
x = self.hash
out = ''
for i in range(self.bits/8):
out=hex(x & 0xff)[2:].replace('L','').zfill(2)+out
x >>= 8
return out

So, what’s changes with the xor? From the theoretical point of view, almost nothing. In fact we can reconduct the first problem to the second one very easily: suppose that we have a collision for the first cipher and , then calling the value of after the th addition in the update function (i.e. when we are processing the th character), then since we can easily find a vector such that and do the same with and , obtaining a collision for the second hash function. So our strategy will be:

  • Finding a string that matches the hash of the given string but for the other hash function
  • Finding a collision on the first hash function
  • Retrieving the collision for the second hash function using the collision on the first one
    So that’s the end, right? No.

The problem here is that this method in general generates a collision, but we’ve no control on the size of the components of this collision (and, in general, we’ll have something that is 400+ and so not an ascii character).

What we need to do is to make the collision of the first hash as small as possible. We don’t care about it being positive, but we need to find a vector that is as close as possible to 0. So in practice we want to solve a Close Vector Problem in the lattice generated by our LLL vectors, and in particular we want to find the closest vector to (the vector we generated on the first point of our strategy). Since CVP is NP-Hard we’re going to approximate it. As far as I know the best way to do this is Babai’s algorithm, that is a polynomial time algorithm that works like this (in sage):

1
2
3
4
5
6
7
8
def Babai_closest_vector(B, target):
M = B.LLL()
G = M.gram_schmidt()[0]
small = target
for i in reversed(range(M.nrows())):
c = ((small * G[i]) / (G[i] * G[i])).round()
small -= M[i] * c
return target - small

where B is a basis of the chosen lattice. So now that we have a small collision for the first problem we can simply generate the collision for the second one and solve the problem.