Side-Channel: Repeating Repetitions

Intro to the Side-Channel Universe

Side-channel attacks (SCA) are all about recovering secret information by observing physical phenomena and exploiting the relationship between the two. The most common scenario works like this: a secret key $k$ is embedded in a computing device that takes inputs $x_1,\dots,x_n$ and computes some function, for example an encryption or a decryption. The SCA assumption is that some partial evaluation leaks information through a physical change in a measured trace. For example, when computing an AES encryption, the permutation's look-up might leak some information about its output: the device must change the content of some register, and this creates a physical variation in the measured power trace since writing more (or fewer) bits requires more (or less) power.

Sadly, it is not obvious how these leaks are (or even can be) described precisely, so the best course of action is to work with a model: decide how a leak is described and which mathematical assumptions we require. In a nutshell, one defines a leakage model that specifies how the evaluation is leaked and which noise distribution is added; for example, the Hamming weight of the AES permutation's output is leaked and some independent Gaussian noise is added on top.

This freebie lives in the noiseless universe, where the leakages are exactly the model's outputs, but the inputs are selected uniformly at random.
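
To make the setting concrete, here is a minimal sketch of such a noiseless model in Python. The small 4-bit S-box, the key size, and the function names are assumptions chosen purely for illustration; they stand in for the AES case and are not part of the idea itself.

```python
import random

# A small 4-bit S-box used purely as a stand-in for the AES one (illustrative choice).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def hamming_weight(v: int) -> int:
    return bin(v).count("1")

def leak(x: int, k: int) -> int:
    """Noiseless leakage model: the Hamming weight of the S-box output."""
    return hamming_weight(SBOX[x ^ k])

def sample_traces(key: int, n: int, seed: int = 0):
    """Known-plaintext setting: inputs are uniform and known, the key is not."""
    rng = random.Random(seed)
    xs = [rng.randrange(len(SBOX)) for _ in range(n)]
    return [(x, leak(x, key)) for x in xs]

if __name__ == "__main__":
    print(sample_traces(key=0x7, n=5))
```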

Repeating Repetitions Problem

Anyone coming from theoretical cryptography will find that SCA is defined in the (somewhat unintuitive) known-plaintext model: the attacker has no power over the choice of inputs to the device, but they are known to it. Leaving the discussion of chosen-plaintext models in SCA open for the future, this model is based on the idea that the adversary merely observes the computing device and, since it must be physically connected to it, easily obtains the inputs but not the secret contained in the device. With this in mind, it doesn't look that weird!

An interesting quantity to study is the conditional entropy $H(K|(X,L))$ of the key $K$ given the inputs and leakages $(X,L)$. Besides being used directly in computing the mutual information, it specifies how much information about $K$ is still unknown after seeing $(X,L)$. Even more interesting is the conditional entropy $H(K|(X,L)_n)$ for $n$ inputs.
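
For a feel of what this looks like, here is a sketch that computes $H(K|(X,L))$ exactly for a single trace of the noiseless 4-bit toy model above (the model itself is an illustrative assumption): given the input and the leakage, the posterior on $K$ is uniform over the consistent keys, so the conditional entropy is just an average of logarithms of set sizes.

```python
from math import log2

# Same illustrative 4-bit toy model as before.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
HW = [bin(v).count("1") for v in range(16)]

def cond_entropy_one_trace() -> float:
    """Exact H(K | (X, L)) for one noiseless trace, with K and X uniform on 4 bits."""
    total = 0.0
    for x in range(16):
        for leak_value in range(5):                    # HW of a 4-bit value is 0..4
            consistent = [k for k in range(16) if HW[SBOX[x ^ k]] == leak_value]
            if consistent:
                p = (1 / 16) * (len(consistent) / 16)  # P(X = x, L = leak_value)
                total += p * log2(len(consistent))     # posterior on K is uniform
    return total

if __name__ == "__main__":
    print(f"H(K)          = {log2(16):.3f} bits")
    print(f"H(K | (X, L)) = {cond_entropy_one_trace():.3f} bits")
```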

However, since the $n$ inputs are randomly chosen, one must take the probability of repetitions into account when computing the entropy and, quite quickly, studying $H(K|(X,L)_n)$ becomes intractable.

Idea: Sum of Non-Repeating

The main idea is to partition the distribution of the input-leak pairs $(X,L)_n$ according to the number of distinct pairs: if only $k \in [1,n]$ of the $n$ inputs are distinct, one can treat the sample as if it came from the distribution $(X,L)_k$. This works because repetitions do not add information in the noiseless scenario.

The goal is then to transform the conditional entropy for $n$ inputs into a sum of conditional entropies for $k$ inputs, each weighted by some combinatorial coefficient $c_{k,n}$:

\[H(K|(X,L)_n) = \sum_{k=1}^{n} c_{k,n} \cdot H(K|(X,L)_{k})\]
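
The idea above does not pin down the coefficients $c_{k,n}$. A natural candidate, under the assumption that $c_{k,n}$ is simply the probability of observing exactly $k$ distinct values among $n$ uniform draws from an input space of size $N$, can be computed exactly via Stirling numbers of the second kind; the sketch below only makes that assumption explicit.

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Stirling number of the second kind: partitions of n items into k non-empty blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def c(k: int, n: int, N: int) -> float:
    """Assumed c_{k,n}: P(exactly k distinct values among n uniform draws from a set of size N)."""
    return comb(N, k) * factorial(k) * stirling2(n, k) / N ** n

if __name__ == "__main__":
    N, n = 16, 10
    coeffs = [c(k, n, N) for k in range(1, n + 1)]
    print([round(w, 4) for w in coeffs])
    print("sum =", round(sum(coeffs), 6))   # the coefficients sum to 1
```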

Why: Better Approximations

Improving bounds is essential for empirically oriented research. The current best upper bound1 (to my knowledge) for the mutual information is:

\[H(K) - H(K|(X,L)_n) = I(K;(X,L)_n) \leq n \cdot I(K;(X,L)_1) = n \cdot H(K) - n \cdot H(K|(X,L)_1)\]

where the bound becomes vacuous once $n$ is large enough, since the right-hand side eventually exceeds $H(K)$.
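
To see how quickly "large enough" kicks in, the sketch below computes $I(K;(X,L)_1)$ exactly for the noiseless 4-bit toy model (again an illustrative assumption); a single noiseless trace is so informative there that the linear bound exceeds $H(K)$ after only a couple of traces.

```python
from math import log2

# Same illustrative 4-bit toy model as before.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
HW = [bin(v).count("1") for v in range(16)]

def single_trace_mi() -> float:
    """I(K; (X, L)_1) = H(K) - H(K | (X, L)_1) for the noiseless 4-bit toy model."""
    h_cond = 0.0
    for x in range(16):
        for leak_value in range(5):
            consistent = sum(1 for k in range(16) if HW[SBOX[x ^ k]] == leak_value)
            if consistent:
                h_cond += (consistent / 16 ** 2) * log2(consistent)  # P(x, l) * log2 |consistent|
    return log2(16) - h_cond

if __name__ == "__main__":
    mi_1 = single_trace_mi()
    h_key = log2(16)
    n = 1
    while n * mi_1 <= h_key:        # first n at which the linear bound becomes vacuous
        n += 1
    print(f"I(K; (X, L)_1) = {mi_1:.3f} bits")
    print(f"n * I(K; (X, L)_1) exceeds H(K) = {h_key:.0f} bits already at n = {n}")
```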

Since both the coefficients and the entropies are non-negative, directly truncating the sum to its first $a < n$ terms immediately yields an upper bound that gets tighter as $a$ grows:

\[H(K|(X,L)_n) \geq \sum_{k=1}^{a} c_{k,n} \cdot H(K|(X,L)_{k})\]
\[\Longrightarrow \quad I(K;(X,L)_n) = H(K) - H(K|(X,L)_n) \leq H(K) - \sum_{k=1}^{a} c_{k,n} \cdot H(K|(X,L)_{k})\]
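
Putting the pieces together on the same toy model, and under the same assumptions as before (an illustrative 4-bit S-box, $c_{k,n}$ taken as the probability of exactly $k$ distinct inputs, and $(X,L)_k$ read as $k$ distinct known inputs with their noiseless leakages), one can brute-force the first few terms and watch the truncated bound tighten as $a$ grows.

```python
from functools import lru_cache
from itertools import combinations
from math import comb, factorial, log2

# Illustrative 4-bit toy model and assumed coefficients, as in the earlier sketches.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
HW = [bin(v).count("1") for v in range(16)]
N = 16                                   # size of the input (and key) space

@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def c(k: int, n: int) -> float:
    """Assumed c_{k,n}: P(exactly k distinct inputs among n uniform draws)."""
    return comb(N, k) * factorial(k) * stirling2(n, k) / N ** n

@lru_cache(maxsize=None)
def h_given_k_distinct(k: int) -> float:
    """Brute-force H(K | (X, L)_k) for k distinct known inputs with noiseless leakage."""
    total, count = 0.0, 0
    for xs in combinations(range(N), k):          # input order is irrelevant
        for true_key in range(N):
            leaks = [HW[SBOX[x ^ true_key]] for x in xs]
            consistent = sum(
                1 for cand in range(N)
                if all(HW[SBOX[x ^ cand]] == l for x, l in zip(xs, leaks))
            )
            total += log2(consistent)             # posterior on K is uniform
            count += 1
    return total / count

if __name__ == "__main__":
    n = 6                                         # number of (possibly repeating) traces
    for a in range(1, 5):
        bound = log2(N) - sum(c(k, n) * h_given_k_distinct(k) for k in range(1, a + 1))
        print(f"a = {a}:  I(K; (X, L)_{n}) <= {bound:.3f} bits")
```

Unlike the linear bound, this truncated bound can never exceed $H(K)$, and every additional term only subtracts more from it.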

How to Publish

As it stands, I sadly cannot see how this could be published on its own.

It might have a chance if presented together with this other freebie. In particular, if someone investigates the relation between the mutual information and the number of traces, this increasingly tight bound would be a nice result to add.


Footnotes

  1. I remember another upper bound which at least converges correctly, but it was mainly useful for asymptotic arguments.