Side-Channel: Splitting the Noise

Intro to the Side-Channel Universe

Side-channel attacks (SCA) recover secret information by observing a physical phenomenon and exploiting its relationship with the secret. The most common scenario works like this: a secret key $k$ is embedded in a computing device that takes inputs $x_1,\dots,x_n$ and computes some function, such as an encryption or decryption. The SCA assumption is that some partial evaluation of this function leaks information through a physical change in a measured trace. For example, when computing an AES encryption, the S-box look-up (a permutation on bytes) might leak information about its output: the device must change the content of some register, and doing so requires more (or less) power, which creates a measurable variation in the power trace.

Sadly, it is not obvious how these leaks are (or can be) described, so the best course of action is to build a model: decide how a leak is described and which mathematical assumptions are required. In a nutshell, one defines a leakage model that specifies what is leaked about the evaluation, together with a noise distribution. For example, the Hamming weight of the AES S-box output is leaked and some independent Gaussian noise is added.
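To make the "Hamming weight plus Gaussian noise" example concrete, here is a minimal sketch of such a leakage model. It is only an illustration under stated assumptions: the 4-bit PRESENT S-box stands in for the (much larger) AES S-box, and the names `leakage` and `trace` are hypothetical helpers, not part of any library.

```python
import random

# Assumption: the 4-bit PRESENT S-box is used as a small stand-in
# for the 8-bit AES S-box, purely to keep the example short.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(value):
    """Number of bits set to 1 in `value`."""
    return bin(value).count("1")


def leakage(key, x):
    """Noiseless leakage l: Hamming weight of the S-box output."""
    return hamming_weight(SBOX[x ^ key])


def trace(key, x, sigma, rng):
    """Simulated measurement t = l + n with independent Gaussian noise n."""
    return leakage(key, x) + rng.gauss(0.0, sigma)


rng = random.Random(1)
samples = [trace(0x0, 0x0, 0.5, rng) for _ in range(5000)]
mean_trace = sum(samples) / len(samples)
print(leakage(0x0, 0x0), round(mean_trace, 2))
```

Averaging many simulated traces for a fixed key and input recovers the noiseless leakage, which is the intuition behind treating the noise as a separate, independent ingredient of the model.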

Evaluating the security of a computing device against side-channel attacks is tricky.

Many leakage models and noise distributions can be considered, which makes both the security evaluation and the verification of the model's soundness hard problems to tackle.

Computing the Mutual Information

Despite how hard a problem might be, research is all about trying to solve it!

An important measure is the mutual information (MI) $I(K;T)$ between the secret key $K$ and the measured traces $T$ which, intuitively, quantifies how many bits of information about $K$ are contained in the measured traces $T$.

An interesting observation is that the definition of the mutual information $I(K;T)$ is agnostic of the model! In practice, however, MI is generally estimated rather than computed exactly, both because of the inputs' dimensionality and because a leakage model must be specified: computing $I(K;T)$ exactly requires knowing, and computing over, all the possible combinations of keys, traces and models. This can be seen in the formulation of MI:

\[I(K;T) = \sum_{k \in K} p_k \log \frac{1}{p_k} - \sum_{k \in K}\int_{t \in T} p_{k,t} \log \frac{p_t}{p_{k,t}} dt\]

Here $p_k$ is the probability of key $k$, $p_{k,t}$ is the joint density of key and trace, and $p_t = \sum_{k \in K} p_{k,t}$ is the marginal density of the trace. The first term is the entropy $H(K)$ and the second term is the conditional entropy $H(K|T)$.

A de facto standard assumption in the literature is that the noise distribution is independent of the input and secret distributions, meaning that each trace¹ $T$ is the sum of a partial evaluation leakage $l$ and a noise sample $n$.

Idea: Splitting the Noise

Given the independence of the noise, it would be interesting to see if one can separate the conditional entropy $H(K|T)$ into the conditional entropy of noiseless traces $H(K|L)$ and a factor $d(L,T)$ that dampens the information of the noiseless scenario. Something like:

\[\sum_{k \in K}\int_{t \in T} p_{k,t} \log \frac{p_t}{p_{k,t}} dt = \left(\sum_{k \in K}\int_{l \in L} p_{k,l} \log \frac{p_l}{p_{k,l}} dl\right) \cdot d(L,T)\]

The difficulty of such a manipulation lies mainly in finding a good equation “shape” together with a minimal set of assumptions (on the noise, inputs or leakage model) such that the equation holds for a general enough SCA scenario².

Why: More Model Combinations

The main reason is a massive simplification of the MI’s estimation.

First, computing the mutual information in the noiseless scenario is (often) computationally intensive but, nevertheless, easier than in the noisy scenario. MI in the noiseless scenario merely measures how much of the secret is leaked by the leakage model, which allows different leakage models to be compared.

Second, a detailed analysis of the noise contribution would be possible by only considering the noise distribution and not the whole leakage function. This would extend the analysis to noise distributions other than the Gaussian, which seems to be the only noise distribution considered in the literature.

Third, more models (leak+noise) allow a better evaluation of a device's security: one can design the device's security by assuming some omnipresent level of noise, which any physical measurement creates, and focus on reducing the mutual information of different leakage models in the noiseless scenario. Pragmatically, the majority of the literature considers a very limited number of leakage models while, from a direct look at the MI formula, it is intuitive that most of the MI depends only on the model used and on how good a model it is. Plus, the noise contribution is pushed into a single coefficient that can be empirically estimated and related to “the cost to achieve reduced noise” or, in other terms, “how expensive is a better measuring tool”.

How to Publish

The final goal is to analyse multi-variate leakages, i.e. the leakage is a vector of measurements/leakages. For example, one AES trace might contain the Hamming weights of the outputs of different S-boxes.
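A small noiseless sketch can show what a vector-valued leakage changes. The setting is an assumption made for illustration only: two independent, uniform 4-bit key nibbles, two fixed known inputs, the PRESENT S-box standing in for AES S-boxes, and the hypothetical helpers `vector_leak` and `summed_leak`. It compares the conditional entropy of the key given the vector of Hamming weights with the one given only their sum.

```python
import math
from itertools import product

# Assumption: PRESENT 4-bit S-box as a small stand-in for AES S-boxes.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

X = (0x3, 0xA)                             # two fixed, known inputs
KEYS = list(product(range(16), repeat=2))  # uniform key (k1, k2)


def hw(value):
    return bin(value).count("1")


def vector_leak(key):
    """Multi-variate leakage: one Hamming weight per S-box output."""
    return tuple(hw(SBOX[x ^ k]) for x, k in zip(X, key))


def summed_leak(key):
    """Uni-variate alternative: only the total Hamming weight is observed."""
    return sum(vector_leak(key))


def cond_entropy(leak, keys):
    """H(K|L) in bits for a uniform key and a deterministic leakage."""
    counts = {}
    for k in keys:
        counts[leak(k)] = counts.get(leak(k), 0) + 1
    n = len(keys)
    return sum((c / n) * math.log2(c) for c in counts.values())


h_vec = cond_entropy(vector_leak, KEYS)
h_sum = cond_entropy(summed_leak, KEYS)
print(f"H(K|vector) = {h_vec:.4f} bits, H(K|sum) = {h_sum:.4f} bits")
```

The vector leakage leaves less uncertainty about the key than the summed one, because the sum is a function of the vector and therefore cannot reveal more.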

As is, the whole study would (sadly) not be easily publishable because it is purely theoretical. However, showing that the approach makes it possible to study the security of multi-variate leakages would be an amazing result.

As a plus, this freebie is related to another freebie!

Footnotes

¹ For the sake of simplicity, the text does not tackle the far more complicated distinction between the partial evaluation, the leakage (the leakage model's output) and the measured trace (the actual real-world measurement). Clarifying this “massive notation/nomenclature confusion” might (easily) be an interesting project too!
² A good “shape” would make it easy to work with vectorial leakage/trace distributions.
