FCSC2025 - Ça tourne au vinaigre

Ça tourne au vinaigre

[Grand-père] Les enfants, je pense que le cuisinier était un peu trop déterminé à implementer UOV à sa sauce. La cryptographie multivariée, ce n'est pas pour n'importe qui. J'ai réussi à collecter quelques signatures, on devrait pouvoir lui casser son implémentation. Pour ça, il suffit de pren...
[Petit-enfant 1] Papi ?
[Petit-enfant 2] Papi, ça va ?
[Petit-enfant 1] Il ne respire plus... C'est terrible.
[Petit-enfant 2] Mais il n'a même pas donné la clé publique, comment va-t-on faire ?

[Grandpa] Kids, I think the cook was a little too determined to implement UOV his own way. Multivariate cryptography is not for everyone. I managed to collect some signatures, we should be able to break his implementation. For that, we just need to tak...
[Grandchild 1] Grandpa?
[Grandchild 2] Grandpa, are you okay?
[Grandchild 1] He’s not breathing....
[Grandchild 2] But he didn’t even give the public key, how are we going to proceed?

Files attached to the challenge: ca-tourne-au-vinaigre.sage, output.txt

Technical description of the challenge

This challenge presents us with a homemade implementation of UOV, a digital signature scheme based on the hardness of solving a system of multivariate quadratic equations.

The goal of the challenge is to craft a signature for a given message, valid under the key which generated the 1600 given signatures. The public part of this key is not given.

TLDR

With a fraction of the 1600 pairs $(msg, sig)$, one can recover the columns of the variables substitutions relative to the vinegar variables. This set of columns can be completed with a base of its kernel (ie. the orthogonal space) to form a complete change of variable matrix. This allows to recreate a secret system by solving for the coefficients of each polynomial (in theory there should have been 1591 coefficients per polynomial, in practice there was 1225 due to a small fail by the challmakers, anyways it's solvable with the 1600 given signatures), equivalent to the "true" secret system up to a variables substitution in the oil space. We can use this equivalent secret system to craft our signature: derive the values of the vinegar variables from the message, solve the secret (now linear) system in the oil variables, perform the variables substitution and return the variables in the public base.

Music to listen to when reading this write-up: Purple Disco Machine, Kungs - Substitution

Detailed write-up

Intoduction on UOV

Given a certain number $m$ of variables $x_0, \dots, x_{m-1}$ (called oils) as well as $v$ more variables $x_m, \dots, x_{m+v-1}$ (called vinegars), one can construct a multivariate quadratic polynomial as follows (declaring $n = m+v$):

\[ P(x_0, \dots, x_{n-1}) = \delta + \sum_{i=0}^{n-1} \gamma_i x_i + \sum_{i=0}^{m-1} \sum_{j=m}^{n-1} \alpha_{i,j} x_i x_j + \sum_{i=m}^{n-1} \sum_{j=i}^{n-1} \beta_{i,j} x_i x_j \quad \text{for } k = 1, \dots, o \]

Such a polynomial is at the core of the UOV scheme. The constant coefficient and the first sum of this definitions are the easy part; they make up an affine function in all the variables. The second and third sum contain terms of degree 2, symbolizing the oils mixing with the vinegars (for the second sum) and the vinegars mixing with each other (for the third sum). Crucially, there are no oil–oil interaction terms (i.e., no $x_i​x_j$​ with both $i,j<m$), which ensures this polynomial is affine once the vinegar variables are fixed later on.

This polynomial minus the constant coefficient can be seen as a bilinear form $B(x, y) = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} c_{i,j} x_i y_j$ evaluated at $(x, x)$, with $c_{i,j} = 0$ for $i,j<m$. Note that all bilinear forms can be represented by $n \times n$ matrices filled with the coefficients $(c_{i,j})_{0\le i,j\lt n}$, such that $B(x, y) = x\top M_B y$. Our polynomial can thus be represented as $P(x) = x\top M_P x + \delta$.

Here is an example representation of such a matrix at a smaller scale and with the oils at the end instead of the beginning of the vector $x$. Grey area represent non-zero values.

This representation is taken from https://pqc-spring-school.nl/wp-content/uploads/2024/03/multivariate_handout.pdf

A private UOV key consists of a stack of $m$ such quadratic polynomials, each structured as described above. Because the polynomials are only non-linear in oil–vinegar and vinegar–vinegar terms, they form a special kind of system: when the vinegar variables are fixed arbitrarily (which is performed during signature generation), the resulting system becomes linear in the oil variables. This makes the system efficiently solvable, for example using Gauss elimination, as there are $m$ equations and $m$ oil variables.

To hide the structure of the UOV system and ensure security, the public key is generated by applying an invertible linear transformation to both the input variables and the output polynomials. This yields a system of $m$ general-looking quadratic equations in $n$ variables (all mixed-up together in monomials of degree 2), which appears hard to solve without knowledge of the original oil–vinegar separation. This transformed system is published as the public key, while the original structured system and the transformations together make up the private key.

To generate a signature using the private key, the signer begins by deriving values for the vinegar variables from the message, as well as right-hand sides for each equation (again from the message). Once these are fixed, each of the $m$ quadratic equations becomes linear in the oil variables, thanks to the structure of the system. The signer then solves this resulting linear system to find a valid set of oil variables that satisfy the private equations. The complete set of $n=m+v$ variables (both oil and vinegar) is then transformed using the inverse of the private input transformation to obtain a valid preimage for the public system. This preimage serves as the signature corresponding to a given hash value or message digest.

Thinking about the challenge

On a first inspection of this challenge, one can be influenced by the variable naming suggesting 24 oil variables and 36 vinegar variables. Indeed, this matches up with the sign() procedure fixing the last 36 variables before solving a linear system in the first 24 ones. This is also in line with all the literature available, suggesting that a proper UOV scheme should have around 2/3 as much oil variables as it has vinegar variables

Fold
class UOV:
    n = 60
    m = 24
    F = GF(256)
    Fxi = PolynomialRing(F, [f"x{i}" for i in range(n)])
    xi = list(Fxi._first_ngens(n))
    v = n - m

    # [...]

    def sign(self, message):
        shake = SHAKE256.new()
        shake.update(message)
        # Target values
        hash_values = self.bytes_to_vec(shake.read(self.m))
        # Vinegar values
        v_values = list(self.bytes_to_vec(shake.read(self.v)))

        mat = []
        constant_vec = []
        # Fix vinegar values and deduce a linear system in the oil variables
        for secret_pol in self.secret_system:
            new_secret_pol = secret_pol(self.xi[:self.m] + v_values)
            mat.append([new_secret_pol.coefficient({self.xi[i]: 1}) for i in range(self.m)])
            constant_vec.append(new_secret_pol.constant_coefficient())
        mat = Matrix(self.F,mat)

        # Deterministic signature fails if the matrix is not invertible
        if not mat.is_invertible():
            return 0

        constant_vec = vector(self.F, constant_vec)
        # Constant part of the linear system
        output_vec = constant_vec + hash_values
        # Oil values can be deduced with linear system solving
        o_values = mat.inverse() * output_vec
        # Oil and vinegar values are aggregated
        ov_values = vector(self.F, list(o_values) + v_values)
        # They are transformed to the public coordinates
        x_values = self.secret_cov.inverse() * ov_values

        return self.vec_to_bytes(x_values)

However, there is a small mix-up (pun intended) in the generate_secret_system() procedure

Fold
class UOV:
    n = 60
    m = 24
    F = GF(256)
    Fxi = PolynomialRing(F, [f"x{i}" for i in range(n)])
    xi = list(Fxi._first_ngens(n))
    v = n - m

    # [...]

    def generate_secret_system(self):
        self.secret_system = []
        for _ in range(self.m):
            curr_pol = self.F.random_element()
            for i in range(self.n):
                curr_pol += self.F.random_element() * self.xi[i]
                for j in range(max(self.v, i), self.n): # <---- HERE !!!!
                    curr_pol += self.F.random_element() * self.xi[i] * self.xi[j]
            self.secret_system.append(curr_pol)

Indeed, the nested loop mixing any variable with the vinegar variables is indexed between v (36) and n (60), so it only mixes with the last 24 variables!

As such, only the last 24 variables can be considered true vinegar variables in the sens that they mix with any other variables. For the rest of this writeup, I will call the other vinegar variables (between index 24 and 36) winegar. To be clear, these winegar variables are determined directly from the message when signing (so they act as vinegars here), but are not mixed with the oils in the secret system (and thus act as oils there). This, however, has very little impact on actually solving the challenge, as you will see later on.

In the challenge, the public key is not provided, but can we recover it ? As a reminder, the public key is a system of $m = 24$ quadratic equations in $n = m + v = 60$ variables. Thus, each equation is made of $1$ (constant) $+ n$ (degree 1) $+ \frac{n*(n+1)}2$ (degree 2) $= 1891$ coefficients. This sounds too much, as we only have 1600 signatures to our disposal. However, the private key polynomials each have $1 + n + \frac{m*(m+1)}2 = 1225$ coefficients (should have been $1 + n + \frac{v*(v+1)}2 = 1591$ if the secret polynomial was correctly implemented), which is far within our reach (< 1600).

To recover the secret polynomials, we first need to overcome a hurdle: uncovering the change of variables. This change of variable can be seen as changing base in our oil-vinegar space from a "good base" $(x_i)_{0 \le i < n}$ where our oil and vinegar subspaces are clearly distinct to a "bad base" $(y_i)_{0 \le i < n}$ where they look mixed up. This change of variable is represented with an invertible square matrix, let us call it $G$.

\[ y_i = Gx_i, \quad x_i = G^{-1}y_i \]

Solving the challenge

Let us come down to earth, and tackle the real data in front of us. I started off by reformating the input to have all the meaningful data at my fingertips.

Fold
from Crypto.Hash import SHAKE256
import json

n = 60
m = 24
v = n-m
F = GF(256)
Fxi = PolynomialRing(F, [f"x{i}" for i in range(n)])
xi = list(Fxi._first_ngens(n))

with open('output.txt', 'r') as f:
    data = json.load(f)
print("Data is good")

signatures = [{"msg":bytes.fromhex(d), "sig":bytes.fromhex(data[d])} for d in data]
for i in range(len(signatures)):
    s = SHAKE256.new()
    s.update(signatures[i]["msg"])
    _m = s.read(m)
    _v = s.read(v)
    signatures[i]["res"] = [F.from_integer(z) for z in _m] # right end side of the system
    signatures[i]["winegars_and_vinegar"] = [F.from_integer(z) for z in _v] # fixed (w/v)inegar values

As we know $v = 36$ "bad" base vectors for each signature (the signatures[i]["winegars_and_vinegar"]), and we want to transform them into $(x_i)_{0 \le i < v} = ((1, 0, 0, \dots, 0), (0, 1, 0, \dots, 0), ...)$, we can compute each corresponding column of the $G$ matrix as the solution to the linear system :

Fold
# vectors_winegars_and_vinegars[i] represents the 1600 values of x_i ("good vinegars"), one for each signature
vectors_winegars_and_vinegars = [Matrix(F, [signatures[i]["winegars_and_vinegar"][_v] for i in range(len(signatures))]).T for _v in range(v)]
# Line i of matrix_sigs represents the values of the 60 "bad variables" for the i-th signature
matrix_sigs = Matrix(F, [[F.from_integer(signatures[i]["sig"][j]) for j in range(n)] for i in range(len(signatures))])

transpo_winegars_and_vinegars = []

for _v in range(v):
    # We want a linear combination of the bad variables which equals the _v-th good vinegar
    temp = matrix_sigs.solve_right(vectors_winegars_and_vinegars[_v])
    transpo_winegars_and_vinegars.append(temp)

transpo_winegars_and_vinegars = block_matrix(F, [[z.T] for z in transpo_winegars_and_vinegars])
# sage: transpo_winegars_and_vinegars
# 36 x 60 dense matrix over Finite Field in z8 of size 2^8 (use the '.str()' method to see the entries)

transpo_winegars_and_vinegars now transforms a vector of 60 "bad" variables into the 36 "good" vinegars. It can be seen as a basis of the $V$ space, generated by the vinegar vectors.

Since the spaces $V$ and $O$ ($O$ is the space generated by the oil vectors) are orthogonal, it is possible to complete a basis of $V$ into a basis of the entire space simply by appending a basis of $O = Ker⁡(V)$ to our matrix transpo_winegars_and_vinegars.

TV = transpo_winegars_and_vinegars.T
O = TV.kernel() # kernel is left_kernel so needed to transpose beforehand

full_substitution = block_matrix(F, [[O.basis_matrix().T, TV]]).T
# base of O prepended to the base of V, to keep the order consistent with the challenge

Keep in mind that while the substitution is probably not equal to the "real" substitution that was truly used to generate the signatures, it doesn't matter. All that matters is that we have separated the oil and vinegar spaces, and have basis for both. This will allow us to represent the system in its "private" form, and solve for private polynomials coefficients

First, let's transform all our "bad" variables to "good" ones:

for i in range(len(signatures)):
    signatures[i]["good_basis"] = full_substitution*matrix_sigs[i]

To help us recover the coefficients of our "secret" (good-looking) polynomials with Gaussian elimination, we need to construct a matrix containing all the values of our monomials:

Fold
mat = []
for k in range(len(signatures)):
    l = [F.from_integer(1)] # Constant monomial, x^0 = 1
    for i in range(n):
        l.append(signatures[k]["good_basis"][i]) # Monomials of degree 1
        for j in range(max(v, i), n):
            l.append(signatures[k]["good_basis"][i]*signatures[k]["good_basis"][j]) # Monomials of degree 2
    mat.append(l)
mat = Matrix(F, mat)

Fun fact: when solving the challenge, only here did I realize the mistake in the creation of the secret polynomial! My script would not run because of some silly python mistake, so I compared my secret polynomials to "true" secret polynomials produced by the challenge code, and I noticed the number of coefficients was weird...

We can now use Gaussian elimination to get out all the coefficients of our good-looking polynomials:

Fold
secret_system = []
print("Solving system (looking for secret polynomials coefficients)")
assert len(signatures[0]["res"]) == m
for u in range(m):
    B = Matrix(F, [signatures[k]["res"][u] for k in range(len(signatures))]).T
    secret_poly_coeffs_vec = mat.solve_right(B)
    # Now we can construct the polynomial in the same order we put our monomials
    secret_poly = secret_poly_coeffs_vec[0][0] # Starting with the constant monomial
    index_coeff = 1
    for i in range(n):
        secret_poly += secret_poly_coeffs_vec[index_coeff][0]*xi[i] # Add the monomes of degree 1
        index_coeff += 1
        for j in range(max(v, i), n):
            secret_poly += secret_poly_coeffs_vec[index_coeff][0] * xi[i] * xi[j] # Add the monomes of degree 2
            index_coeff += 1
    secret_system.append(secret_poly)
    print("Solved line", u)

Great! We have $m$ good-looking polynomials! Let's use them to produce a signature:

Fold
msg = b"Un mauvais vinaigre fait une mauvaise vinaigrette!"

s = SHAKE256.new()
s.update(msg)
res = [F.from_integer(z) for z in s.read(m)]
good_win_and_vin = [F.from_integer(z) for z in s.read(v)]

_mat = []
constant_vec = []

for secret_pol in secret_system:
    new_secret_pol = secret_pol(xi[:m] + good_win_and_vin)
    _mat.append([new_secret_pol.coefficient({xi[i]: 1}) for i in range(m)])
    constant_vec.append(new_secret_pol.constant_coefficient())
_mat = Matrix(F,_mat)

output_vec = vector(F, res) - vector(F, constant_vec)
# Oil values can be deduced with linear system solving
o_values = _mat.inverse() * output_vec # Aka _mat.solve_right(output_vec)
# Oil and vinegar values are aggregated
ov_values = vector(F, list(o_values) + good_win_and_vin)

# They are transformed to the public coordinates
x_values = full_substitution.inverse() * ov_values
sig = bytes([_x.to_integer() for _x in x_values])

flag = sig.hex()
print("FCSC{"+flag+"}")

Thought process while solving the challenge