RSA Encryption, Part 3

Finally, we get to the point: how RSA encryption works. You may want to review parts 1 and 2 to make sure all that modular arithmetic stuff is fresh in your mind.

3. RSA Encryption
For those of you who are unaware, the term encryption is used to describe any method that changes a message so that only its intended recipient can read it. RSA encryption (named after its creators) belongs to a special class of encryption called public key encryption. In public key encyption, separate keys are used to encrypt and decrypt the message: the public key is known to everyone, and can be used to encrypt a message, while the private key is known only to the recipient, and is required to decode the message. This is opposed to symmetric-key encryption, in which the same key must be known to both the sender and receiver.
Thus, if you want to send me a message, all you have to do is encode your message with my public key, and only I will be able to read it (since only I have the corresponding private key). If I then want to send you a message in response, I would simply encrypt my response using your public key.
Think safes: public key encryption would be akin to a safe with a slot on top for inserting messages. Anyone with access to the safe (that is, who has the public key) can drop messages into the safe through the slot, but only the person who has the key to the safe (the private key) can read them.

As a quick refresher, remember what we finished with last time. For all cases important to us, if we want to determine a^k (mod n), then
a^k ≡ a^{k mod Φ(n)} (mod n).

First of all, we need some convention for mapping letters to numbers and vice versa. Let’s use the convention that A=0, B=1, C=2, …, Y=24, Z=25.^†

Next, we begin the encryption process by choosing any two prime numbers; for convenience, let’s choose 3 and 11. The product of these will be our n: thus, n = 3*11 = 33. Also, since n is the product of two primes, Φ(n) = (3-1)(11-1) = 20 (as was discussed last time).
We also choose some number less than (and relatively prime to) Φ(n) and compute its inverse (mod Φ(n)) – we will see why in a minute. Let’s choose e=3 and d = e^-1 = 7 (since 3*7 ≡ 21 ≡ 1 (mod Φ(n))). Keep in mind that, since we defined e and d to be inverses, their product ed ≡ 1 (mod Φ(n)).
We’ll call the pair of numbers (d,n)=(7,33) our Private Key, and the pair of numbers (e,n)=(3,33) our Public Key.

Now we have everything we need – this is where it gets cool. Let’s try to encrypt the message “HELLO” using the public key. We will do this by encoding the message into numbers using our convention and then taking each number in turn to the power of e (mod n). We can then decrypt by taking each number to the power d (mod n). Let’s go over the example, and then I’ll explain why it works.

Under our encoding, the message “HELLO” is given by the numbers “7 4 11 11 14”. Since e=3, we raise 7 to the 3rd power to get 7³ ≡ 13 (mod n); so the first number in our encrypted message is 13. Likewise, 4³ ≡ 31 (mod n). After a bit of calculation, we get an encrypted message of “13 31 11 11 5”. We can now transmit this message down the line without having to worry about somebody reading it without the private key.
Now let’s decrypt the message: taking each of the numbers “13 31 11 11 5” and raising them in turn by the power d=7, we get “7 4 11 11 14”, which decodes to the message “HELLO”. Great!

So how did this all work? The reasoning is simple, actually. If you’ll remember back to last time, we concluded with the formula
a^k ≡ a^{k (mod Φ(n))} (mod n)
In this case, a is our message, and k is e, our public key exponent. Thus, for each digit a, what we’re transmitting is a^e (mod n). When the message is received, we take it to the exponent d; thus, the new value is
(a^e)^d ≡ a^ed ≡ a^{ed mod Φ(n)} (mod n)
by the above formula. However, if you’ll remember, ed ≡ 1 (mod Φ(n)). Thus, after decryption we’re left with simply a (mod n), the original message!^††

Yep – it’s just that simple. So simple, in fact, it might seem too simple. Why, you may be asking, can’t someone who’s given e (part of the public key) just calculate its inverse to get d (the unknown half of the private key)? Keep in mind that d is the inverse of e (mod Φ(n)) – and in order to calculate Φ(n), we need to know p and q, the two primes which make up n. And here’s the point – integers, as every computer scientist knows, are hard to factor into primes. Very hard^‡. This means that if n is very large (usually along the order or tens or hundreds of decimal digits), it will be nearly impossible for even the fastest super computers to determine p and q, meaning an attacker is very unlikely to be able to determine Φ(n) and thus determine d. It is this inability to "reverse" our choice of n that gives RSA its security.

^† For those of you more familiar with computers, mappings based on strings of several ASCII or Unicode characters are most common.
^†† The astute reader will have noticed that this only holds true if the encoded message is relatively prime to n. As it turns out, however, for RSA it just happens to hold true even when the message is not relatively prime to n; a proof of this can be found on wikipedia.
^‡ Probably. That is, it’s believed to be very hard, but nobody knows for sure.