Linear-Feedback Shift Register
A linear-feedback shift register (LFSR) is a shift register whose input bit is a linear function of its previous state. It is sometimes used as a pseudorandom number generator in some cryptographic applications.
Galois LFSR
A Galois LFSR, also referred to as a binary LFSR, is a specific type of LFSR that operates in \(\mathbb{F}_2\). Here is a simple diagram illustrating how it works.

Below is a simple Python implementation of a Galois LFSR. In this type, addition corresponds to the \(XOR\) operation.
| |
Attacks
Just so you know
This section only refers to binary LFSRs (i.e Galois LFSRs)
In this section we’ll cover possible attacks to recover the initial state and the feedback polynomial of the LFSR, thus allowing to us to predict its outputs.
Recover the taps
To recover the taps of the LFSR, you can use the Berlekamp–Massey algorithm, which will find the shortest LFSR for a given binary output sequence.
Using Sage and its implementation of the Berlekamp–Massey algorithm
| |
Find initial state
Because the LFSR is a linear operation, we can define the state \(S_k\) after \(k\) operation of an LFSR, as the multuplicaiton of the transformation matrix by the initia state \(S_0\)
Let’s write
\[ S_0 = \left(\begin{array}{cc} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_{n-1} \end{array}\right), \quad S_k = \left(\begin{array}{cc} a_k \\ a_{k+1} \\ a_{k+2} \\ \vdots \\ a_{k+n-1} \end{array}\right) \]We then have :
\[ \begin{aligned} S_k = \left(\begin{array}{cc} a_k \\ a_{k+1} \\ a_{k+2} \\ \vdots \\ a_{k+n-1} \end{array}\right) &= \left(\begin{array}{cc} c_{n-1} & 1 & 0 & \cdots & 0 \\ c_{n-2} & 0 & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ c_1 & 0 & 0 & 0 & 1 \\ c_0 & 0 & 0 & \cdots & 0 \\ \end{array}\right) \cdot \left(\begin{array}{cc} a_{k-1} \\ a_{k} \\ a_{k+1} \\ \vdots \\ a_{k+n-2} \end{array}\right) \\ \\ &=\left(\begin{array}{cc} c_{n-1} & 1 & 0 & \cdots & 0 \\ c_{n-2} & 0 & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ c_1 & 0 & 0 & 0 & 1 \\ c_0 & 0 & 0 & \cdots & 0 \\ \end{array}\right)^k \cdot \left(\begin{array}{cc} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_{n-1} \end{array}\right) \end{aligned} \]with \(c_0, \ c_1, \cdots, \ c_{n-1}\) being the coefficients of the feedback polynomial that define the LFSR, and thus the taps.
Note
In the binary case, \(c_0, c_1, \cdots, c_{n-1} \in \mathbb{F}_2\)
We now define the matrix
\[ T = \left(\begin{array}{cc} c_{n-1} & 1 & 0 & \cdots & 0 \\ c_{n-2} & 0 & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ c_1 & 0 & 0 & 0 & 1 \\ c_0 & 0 & 0 & \cdots & 0 \\ \end{array}\right) \]and the relation simply becomes
\[ \begin{aligned} &S_k = T^k \cdot S_0 \\ \Longleftrightarrow \quad &S_0 = T^{-k} \cdot S_k \\ \end{aligned} \] | |
Reverse from output stream
If, as in the code example, you want to recover the seed from the given output sequence (not state, as used in the formula), you must account for the size of your sequence, which is the size of your LFSR.
Correlation attack
Suppose you have access to the bit stream of a PRNG \(R\), whose outputs are a combination of the outputs of three separate LFSRs, namely \(L_1\), \(L_2\), and \(L_3\).
Let’s say, for example, that the combination looks like
\[ R = \left\{\begin{aligned} &L_2 \quad \text{if} \ L_1 = 1 \\ &L_3 \quad \text{else} \\ \end{aligned}\right. \]If we examine the truth table produced by this relation, we can see that the output of \(R\) is correlated with the output of \(L_3\). Specifically, when \(R = 1\), \(L_3 = 1\) in 3 out of 4 cases, or \(75 \%\) of cases.
| \(L_1\) | \(L_2\) | \(L_3\) | \(R\) |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 |
| 0 | 1 | 0 | 0 |
| 1 | 1 | 0 | 1 |
| 0 | 0 | 1 | 1 |
| 1 | 0 | 1 | 0 |
| 0 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 |
In the case where the size of the state of this LFSR is only a fraction of a secret and small enough, you can brute-force the seed for this one by looking at the correlation percentage between \(R\) and the output of the seed you’re trying (given that you already know the taps). To do so, just keep the seed with the highest correlation factor.
Once recovered, you can now attack the two other LFSRs with this new information. In our case, you can start with \(L_2\), because from the table above, you can also see that its zeroes are correlated with \(R\)’s.
Important
For speed reasons, this Python script makes use of the numba Python library to parallelize the work. There is probably a better or faster way to do this, but I didn’t take the time to find one as this was good enough for me.
Here is how you can implement this attack in Python
| |