% TODO: use {equation} instead of {center} \myheading{How \ac{RSA} works} \leveldown{} RSA public-key cryptosystem (named after its inventors: Ron Rivest, Adi Shamir and Leonard Adleman) is the most used asymmetric public-private key cryptosystem. \myheading{Fermat little theorem} Fermat little theorem states that if $p$ is prime, this congruence is valid for any $a$ in the environment of modulo arithmetic of base $p$: \begin{center} $a^{p-1} \equiv 1 \pmod p$ \end{center} There are proofs, which are, perhaps, too tricky for this article which is intended for beginners. So far, you can just take it as granted. Here is one important property of this theorem. The expression $a^{p-1}$ will always return $1$ modulo $p$ for \textit{any} $a$ and \textit{any} prime $p$. The following Python code would always work: \lstinputlisting[style=custompy]{prime/fermat.py} Wolfram Mathematica can even reduce it: \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= PowerMod[x, (p-1), p] Out[]= 1 In[]:= Mod[x^(p-1), p] Out[]= 1 \end{lstlisting} This theorem can be used to sieve prime numbers. So you take, for example, 10 and test it. Let's take some random $a$ value (123) (Wolfram Mathematica): \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= Mod[123^(10 - 1), 10] Out[]= 3 \end{lstlisting} We've got 3, which is not 1, indicating the 10 is not prime. On the other hand, 11 is prime: \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= Mod[123^(11 - 1), 11] Out[]= 1 \end{lstlisting} This method is not perfect, some composite p numbers can lead to 1, for example p=1105, but can be used as a method to sieve vast amount of prime numbers candidates. \myheading{Euler's totient function} It is a number of coprime numbers under some $n$. Denoted as $\phi (n)$, pronounced as \textit{phi}. For the sake of simplification, you may just keep in mind that if $n=pq$ (i.e., product of two prime numbers), $\varphi (pq)=(p-1)(q-1)$. This is true for RSA environment. \myheading{Euler's theorem} Euler's theorem is a generalization of Fermat little theorem. It states: \begin{center} $a^{\varphi (n)} \equiv 1 \pmod{n}$ \end{center} But again, for the sake of simplification, we may keep in mind that Euler's theorem in the RSA environment is this: \begin{center} $a^{(p-1)(q-1)} \equiv 1 \pmod{n}$ \end{center} ... where $n=pq$ and both $p$ and $q$ are prime numbers. This theorem is central to RSA algorithm. \myheading{RSA example} There are \textit{The Sender} and \textit{The Receiver}. \textit{The Receiver} generates two big prime numbers ($p$ and $q$) and publishes its product ($n=pq$). Both $p$ and $q$ are kept secret. For the illustration, let's randomly pick $p$ and $q$ among the first 50 prime numbers in Wolfram Mathematica: \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= p = Prime[RandomInteger[50]] Out[]= 89 In[]:= q = Prime[RandomInteger[50]] Out[]= 43 In[]:= n = p*q Out[]= 3827 \end{lstlisting} 3827 is published as public key, named \textit{public key modulus} or \textit{modulo}. It is semiprime. There is also public key exponent $e$, which is not secret, is often 65537, but we will use 17 to keep all results tiny. Now \textit{The Sender} wants to send a message (123 number) to \textit{The Receiver} and he/she uses one-way function: \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= e = 17 Out[]= 17 In[]:= encrypted = Mod[123^e, n] Out[]= 3060 \end{lstlisting} 3060 is encrypted message, which can be decrypted only using $p$ and $q$ values separately. This is one-way function, because only a part of exponentiation result is left. One and important consequence is that even \textit{The Sender} can't decrypt it. This is why you can encrypt a piece of text in PGP/GnuPG to someone using his/her public key, but can't decrypt it. Perhaps, that's how CryptoLockers works, making impossible to decrypt the files. To recover message (123), $p$ and $q$ values must be known. First, we get the result of Euler's totient function $(p-1)(q-1)$ (this is the point where $p$ and $q$ values are needed): \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= totient = (p - 1)*(q - 1) Out[]= 3696 \end{lstlisting} Now we calculating decrypting exponent using multiplicative modulo inverse (multiplicative inverse was also described in here (\ref{modulo_arith}) ($e^{-1} \pmod{totient=(p-q)(q-1)}$): \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= d = PowerMod[e, -1, totient] Out[]= 2609 \end{lstlisting} Now decrypt the message: \begin{lstlisting}[caption=Wolfram Mathematica] In[18]:= Mod[encrypted^d, n] Out[18]= 123 \end{lstlisting} So the $d$ exponent forms another one-way function, restoring the work of what was done during encryption. \myheading{So how it works?} It works, because $e$ and $d$ exponents are reciprocal to each other by modulo $totient=(p-1)(q-1)$: \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= Mod[e*d, totient] (* check *) Out[]= 1 \end{lstlisting} This allows... \begin{equation} \label{RSA:first} m^{ed}=m \pmod n \end{equation} Or in Mathematica: \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= Mod[123^(e*d), n] Out[]= 123 \end{lstlisting} So the encryption process is $m^{e} \pmod{n}$, decryption is $(m^{e})^{d}=m \pmod{n}$. To prove congruence \ref{RSA:first}, we first should prove the following congruence: \begin{center} $ed \equiv 1 \pmod{((p-1)(q-1))}$ \end{center} ... using modular arithmetic rules, it can be rewritten as: \begin{center} $ed = 1+h (p-1)(q-1)$ \end{center} $h$ is some unknown number which is present here because it's not known how many times the final result was \textit{rounded} while exponentiation (this is modulo arithmetic after all). So $m^{ed}=m \pmod{n}$ can be rewritten as: \begin{center} $m^{1 + h((p-1)(q-1))} \equiv m \pmod{n}$ \end{center} ...and then to: \begin{center} $m \left(m^{(p-1)(q-1)}\right)^{h} \equiv m \pmod{n}$ \end{center} The last expression can be simplified using Euler's theorem (stating that $a^{(p-1)(q-1)} \equiv 1 \pmod{n}$). The result is: \begin{center} $m (1)^{h} \equiv m \pmod{n}$ \end{center} ... or just: \begin{center} $m \equiv m \pmod{n}$ \end{center} \myheading{Breaking \ac{RSA}} We can try to factor $n$ semiprime (or RSA modulus) in Mathematica: \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= FactorInteger[n] Out[]= {{43, 1}, {89, 1}} \end{lstlisting} And we getting correct $p$ and $q$, but this is possible only for small values. When you use some big ones, factorizing is extremely slow, making RSA unbreakable, if implemented correctly. The bigger $p$, $q$ and $n$ numbers, the harder to factorize $n$, so the bigger keys in bits are, the harder it to break. \myheading{The difference between my simplified example and a real RSA algorithm} In my example, public key is $n=pq$ (product) and secret key are $p$ and $q$ values stored separately. This is not very efficient, to calculate totient and decrypting exponent each time. So in practice, a public key is $n$ and $e$, and a secret key is at least $n$ and $d$, and $d$ is stored in secret key precomputed. For example, here is my PGP public key\footnote{\url{https://yurichev.com/pgp.html}}: \begin{lstlisting} dennis@...:~$ gpg --export-options export-reset-subkey-passwd --export-secret-subkeys 0x3B262349\! | pgpdump Old: Secret Key Packet(tag 5)(533 bytes) Ver 4 - new Public key creation time - Tue Jun 30 02:08:38 EEST 2015 Pub alg - RSA Encrypt or Sign(pub 1) RSA n(4096 bits) - ... RSA e(17 bits) - ... ... \end{lstlisting} ... so there are available openly big (4096 bits) $n$ and $e$ (17 bits). And here is my PGP secret key: \begin{lstlisting} dennis@...:~$ gpg --export-options export-reset-subkey-passwd --export-secret-subkeys 0x55B5C64F\! | pgpdump gpg: about to export an unprotected subkey You need a passphrase to unlock the secret key for user: "Dennis Yurichev " 4096-bit RSA key, ID 55B5C64F, created 2015-06-29 gpg: gpg-agent is not available in this session Old: Secret Key Packet(tag 5)(533 bytes) Ver 4 - new Public key creation time - Tue Jun 30 02:08:38 EEST 2015 Pub alg - RSA Encrypt or Sign(pub 1) RSA n(4096 bits) - ... RSA e(17 bits) - ... ... Old: Secret Subkey Packet(tag 7)(1816 bytes) Ver 4 - new Public key creation time - Tue Jun 30 02:08:38 EEST 2015 Pub alg - RSA Encrypt or Sign(pub 1) RSA n(4096 bits) - ... RSA e(17 bits) - ... RSA d(4093 bits) - ... RSA p(2048 bits) - ... RSA q(2048 bits) - ... RSA u(2048 bits) - ... Checksum - 94 53 ... \end{lstlisting} ... it has all variables stored in the file, including $d$, $p$ and $q$. \myheading{A real example: OpenSSL} \begin{lstlisting}[caption=Linux console] # Generating a 32-bit RSA key pair: $ openssl genrsa -out keypair.pem 32 # Dump the private key: $ openssl rsa -noout -text -in keypair.pem # Extract the public key from private: $ openssl rsa -in keypair.pem -pubout -out pub.pub # Dump the public key: $ openssl rsa -noout -text -inform PEM -pubin -in pub.pub \end{lstlisting} N.B.: your build of OpenSSL may deny to generate too small keys\footnote{\url{https://superuser.com/questions/1640182/why-am-i-getting-an-error-when-trying-to-generate-rsa-128}}. \begin{lstlisting}[caption=The private key] $ openssl rsa -noout -text -in keypair.pem RSA Private-Key: (32 bit, 2 primes) modulus: 3231798799 (0xc0a1560f) publicExponent: 65537 (0x10001) privateExponent: 1892254033 (0x70c98151) prime1: 64109 (0xfa6d) prime2: 50411 (0xc4eb) exponent1: 42305 (0xa541) exponent2: 13863 (0x3627) coefficient: 29340 (0x729c) \end{lstlisting} \begin{lstlisting}[caption=The public key] $ openssl rsa -noout -text -inform PEM -pubin -in pub.pub RSA Public-Key: (32 bit) Modulus: 3231798799 (0xc0a1560f) Exponent: 65537 (0x10001) \end{lstlisting} The private key also have many precomputed values. All they are described in RFC 3447\footnote{\url{https://datatracker.ietf.org/doc/html/rfc3447}}. But the most important values are two primes, of course. Let's check them all in Wolfram Mathematica: \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= prime1=p=64109 Out[]= 64109 In[]:= prime2=q=50411 Out[]= 50411 In[]:= modulus=n=p*q Out[]= 3231798799 In[]:= lambdaN=LCM[p-1,q-1] (* not used. Phi is used instead: (p-1)(q-1) *) Out[]= 1615842140 In[]:= exponent=privateExponent=d=65537 Out[]= 65537 In[]:= privateExponent=e=PowerMod[d,-1,(p-1)(q-1)] Out[]= 1892254033 In[]:= exponent1=PowerMod[d,-1,p-1] Out[]= 42305 In[]:= exponent2=PowerMod[d,-1,q-1] Out[]= 13863 In[]:= coefficient=PowerMod[q,-1,p] Out[]= 29340 \end{lstlisting} Recovering all these values from the public key is a question of factoring modulus: \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= FactorInteger[modulus] Out[]= {{50411, 1}, {64109, 1}} \end{lstlisting} When dumping bigger keys, OpenSSL prints long hexadecimal numbers, like: \begin{lstlisting} RSA Public-Key: (256 bit) Modulus: 00:c7:b5:dc:5c:b0:73:bd:1d:b2:96:bb:49:26:a6: 82:67:89:77:21:5e:39:54:89:a8:d1:d5:89:af:5a: 34:06:ed Exponent: 65537 (0x10001) \end{lstlisting} Just remove colons (':') and join all the numbers, so the Modulus here is:\\ 0x00c7b5dc5cb073bd1db296bb4926a682678977215e395489a8d1d589af5a3406ed. \myheading{The \ac{RSA} signature} It's possible to sign a message to publish it, so everyone can check the signature. For example, \textit{The Publisher} wants to sign the message (let's say, 456). Then he/she uses $d$ exponent to compute signature: \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= signed = Mod[456^d, n] Out[]= 2282 \end{lstlisting} Now he publishes $n=pq$ (3827), $e$ (17 in our example), the message (456) and the signature (2282). Some other \textit{Consumers} can verify its signature using $e$ exponent and $n$: \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= Mod[2282^e, n] Out[]= 456 \end{lstlisting} ... this is another illustration that $e$ and $d$ exponents are complement each other, by modulo $totient=(p-1)(q-1)$. The signature can only be generated with the access to $p$ and $q$ values, but it can be verified using product ($n=pq$) value. \myheading{Hybrid cryptosystem} \ac{RSA} is slow, because exponentiation is slow and exponentiation by modulo is also slow. Perhaps, this is the reason why it was considered impractical by GCHQ when Clifford Cocks first came with this idea in 1970s. So in practice, if \textit{The Sender} wants to encrypt some big piece of data to \textit{The Receiver}, a random number is generated, which is used as a key for symmetrical cryptosystem like DES or AES. The piece of data is encrypted by the random key. The key is then encrypted by \ac{RSA} to \textit{The Receiver} and destroyed. \textit{The Receiver} recovers the random key using \ac{RSA} and decrypts all the data using DES or AES. \levelup{}