RSA Cryptosystem
RSA (Rivest–Shamir–Adleman) is a public-key cryptographic algorithm used to share data securely over a public channel. A private key is used to decrypt data, while a public key, derived from the private one, is used to encrypt data.
Textbook definition
This section describes the textbook (i.e no padding) RSA implementation and details the three basic primitives of any public-key cryptosystem :
Parameters
| Name | Description |
|---|---|
| \(p, q\) | The prime factors of \(N\) |
| \(N\) | The public modulus: \(N=p \cdot q\) |
| \(e\) | The public exponent coprime to \(N\) |
| \(\phi(N)\) | Euler’s totient of \(N\): \(\phi(N) = (p - 1)(q - 1)\) |
| \(d\) | The private exponent: \(d = e^{-1} \mod \phi(N)\) |
| \(M\) | The message to encrypt |
| \(C\) | The encrypted message: \(C = M^e \mod N\) |
Key generation
- Choose two large prime numbers \(p\) and \(q\).
- Compute \(N = p \cdot q\)
- Compute Euler’s Totient \(\phi(N) = (p - 1)(q - 1)\)
- Choose any integer \(e\) such that \(1 < e < \phi(N)\) and \(gcd(e, \phi(N)) = 1\)
- Compute \(d\), the modular multiplicative inverse of \(e\), such that \(ed = 1 \mod \phi(N)\)
Encryption
To encrypt a message \(M\), we use the public key \((N, e)\) to compute :
\[ C = M^e \mod N \]Decryption
To decrypt an encrypted message, we use the private key \((N, d)\) to compute :
\[ M = C^d \mod N \]Here is a simple implentation in Python
| |
Encrypted message: 3217
Decrypted message: 65Key structure
PEM format
Here is an Online decoder for PEM, DER and ASN.1
You can also read those file formats using the Pycryptodome python library.
| |
Decrypt using CRT
RSA decryption operation can be made significantly faster by using the Chinese Remainder Theoreom. To do that, the following values are precomputed and stored as part of the private key:
\[ \left\{\begin{aligned} d_p &= d \mod p-1 \\ d_q &= d \mod q-1 \\ q_{inv} &= q^{-1} \mod p \\ \end{aligned} \right. \]These values allow the recipient to compute the exponentiation \(M = C^d \mod N\) more efficiently as follows:
\[ \begin{aligned} M_1 &= C^{d_p} \mod p \\ M_2 &= C^{d_q} \mod q \\ h &= q_{inv}(M_1 - M_2) \mod p \\ M &= M_2 + h \cdot q \\ \end{aligned} \]Recover parameters
In this section, I will present various methods to recover the different parameters of the RSA cryptosystem given partial information.
Known P+Q or P-Q
Suppose you know \(p -q\), you can find that \(p^2 - (p - q)p = N\), which in turn give us :
\[ p^2 - (p - q)p - N = 0 \]a quadratic polynomial that we can solve for \(p\) !
Calculus
Check the formulas
We compute \(p\) as :
\[ p = \frac{-b + \sqrt{\Delta}}{2a} \]with
\[ \Delta = b^2 - 4ac = (p - q)^2 + 4N \]so that we have
\[ p = \frac{-(p - q) + \sqrt{(p - q)^2 + 4N}}{2} \]Skipping the steps, we have
\[ p = \frac{-(p - q) + \sqrt{(p - q)^2 + 4N}}{2} \]Important
When we know \(p + q\), we have \(p^2 - (p + q)p + N = 0\), so in this case we compute
\[p = \frac{-(p + q) + \sqrt{(p + q)^2 - 4N}}{2}\]Known Euler’s Totient
Suppose you know \(\phi\), then you may find that :
\[ \begin{aligned} N + 1 - \phi &= pq + 1 - (p - 1)(q - 1) \\ &= pq + 1 - (pq - p - q + 1) \\ &= p + q \\ \end{aligned} \]Knowing this brings us back to this case.
Another possibility
Since you know \(e\) and \(\phi\), you could simply calculate \(d = e^{-1} \mod \phi\)
Known private exponent
Suppose you know \(d\) and \(e\), then you have:
\[ e*d \equiv 1 \mod \phi \quad \Rightarrow \quad e*d - 1 = k\phi \]Find the biggest number \(x\) such that
\[ k\phi = 0 \mod (2^x) \quad \Leftrightarrow \quad k\phi = r \cdot 2^x \]Choose \(g \in [0, N-1]\)
Repeat step 2 until found
For \(t\) going from \(0\) to \(x\)
- Calculate \(y = g^{\frac{k\phi}{2^t}} \mod N\)
- Calculate \(p = gcd(y - 1, N)\)
- if \(1 < p < N\) and \(N = 0 \mod p\), then
- You found \(p\) and \(q = \frac{N}{p}\)
| |
Another method
We know
\[ e*d - 1 = k\phi \quad \Rightarrow \quad k = \frac{e*d - 1}{\phi} \]Yet, we don’t know \(\phi\), but we can make the not so bad approximation \(\phi \approx N\) to approximate \(k\) as
\[ k \approx \left\lfloor\frac{e*d - 1}{N} \right\rfloor \]Rearranging the terms, we can calculate
\[ \phi = \frac{e*d - 1}{k} \]If the result is not an integer, increment \(k\) until it is. That’s because of the approximation. You now can recover \(p\) and \(q\) this way
Small public exponent
Alternatively, if \(e\) is small, then \(ed = k\phi + 1\) where \(k < e\).
| |
Known CRT exponent
Note
What follows works equally well for \(d_q\) !
Suppose you know \(d_p\), then we know that :
\[ e * d \equiv 1 \ (\text{mod} \ \phi) \quad \Rightarrow \quad e * d_p \equiv 1 \ (\text{mod} \ p - 1) \]Choosing \(e = 65537\) as usual, we can recover \(p\) the following way :
\[ e * d_p = 1 + k_p(p - 1) \]for some integer \(k_p\). AS \(e\) is fairly small, so will \(k_p\). We can rearrange the above for:
\[ p = \frac{e * d_p - 1}{k_p} + 1 \]with the only unknown being \(k_p\).
Using python, we find a potential prime very quickly:
| |
Known Dp and Dq
Note
What is described here is more for the Maths than anything else, because it requires you to already know \(p\) and \(q\), which are generally the numbers we’re looking for.
If we know both \(d_p\) and \(d_q\), then by construction :
\[ \left\{\begin{aligned} d_p &= d \mod p-1 \\ d_q &= d \mod q-1 \\ \end{aligned} \right. \]which means we have
\[ \left\{\begin{aligned} d &= d_p^{-1} \mod p-1 \\ d &= d_q^{-1} \mod q-1 \\ \end{aligned} \right. \]and \(d\) is the solution to this system of modular equations, and thus we can be calculated using the CRT.
| |
Retrieve modulus
This section cover how to retrieve the modulus \(N\), knowing the public exponent \(e\) and given a cipher oracle.
Known public exponent
If you know \(e\), simply choose at least two random messages \(M_1, M_2\) and use the oracle to get \(C_1, C_2\).
Now by definition, you know that :
\[ \begin{aligned} C_1 \equiv M_1^e \mod N \quad \Rightarrow \quad C_1 = M_1^e + k_1N \\ C_2 \equiv M_2^e \mod N \quad \Rightarrow \quad C_2 = M_2^e + k_2N \\ \end{aligned} \]So, you can calculate
\[ gcd(C_1 - M_1^e, C_2 - M_2^e) = gcd(k_1N, k_2N) = k_3N \]with a small factor \(k_3\)
Caution
As we don’t know the modulus, we’re calculating the whole numbers \(M_i^e\), which can be very intensive for big \(e\), thus taking much time and memory to compute.
Important
By keeping this method and using more messages (\(M_3, M_4, \cdots\)), you increase the chances of \(k_3\) being just \(1\), hence giving you directly the modulus \(N\) as a result of the last formula.
Unknown or Too big
In the more general case, if \(e\) it is too big to use the first method or even unknown, you can use this one.
Note
As in the previous method, taking more messages increase the chances of getting exactly the correct modulus \(N\) as the result of the calculations.
As before, take two random messages \(M_1, M_2\), square them to get \(M_1^{'} = M_1^2, M_2^{'} = M_2^2\) and finally use the oracle to get the values of \(C_1, C_2, C_1^{'}\) and \(C_2^{'}\).
Now, since \(M_i^{'} = M_i^2\), it follows that \(C_i^{'} \equiv C_i^2 \mod N\) and thus \(C_i^2 - C_i^{'} = k_iN\). As a consequence, using the same arguments as in the firt method, \(N\) can be factored from
\[ gcd(C_1^2 - C_1^{'}, C_2^2 - C_2^{'}) = kN \]Tools
Factorize
| Tool | Description |
|---|---|
| Integer Factorization Calculator | Online tool for integer factorization using advanced methods (ECM, etc.) |
| Cado-NFS | Implementation of the Number Field Sieve for factoring large integers |
| FactorDB | Online database of known factorizations and integer properties |
CTF
| Tool | Description |
|---|---|
| RsaCtfTool | Automated RSA attack and analysis tool for CTFs and cryptanalysis |
| RSA Calculator | Online RSA computation and parameter analysis tool |