RSA

From Computer Science
Revision as of 23:28, 2 October 2023 by Jkinne (talk | contribs) (Another Assignment)
Jump to: navigation, search

RSA encryption encrypts and decrypts messages using modular exponentiation. It is a public-key encryption scheme, so that a person using the system will have two different keys - a "public" key that is shared publicly and a "private" key that is kept private. If someone wants to send a message to me, they encrypt the message with my public key, and it supposed to be the case that I am the only who can decrypt the message (using my private key). Clearly there needs to be a relation between the private key and public key for this to work.

This page walks you through choosing an RSA public/private key pair, using small numbers, in a way that you can do many of the computations "by hand" using algorithms you have learned in a course (e.g., CS 303).

Public Key

The process starts with "large" primes p and q. For our example, we use p=241 and q=251. Okay, let's get started...

Fermat prime test / modular exponentation

Confirm that the primes are indeed prime. We use the Fermat prime test, and show how this works for p. We need to check that 2p-1 = 1 mod p. So we need to compute 2240 mod 241. Note that Python can do this with (pow(2, 240, 241)), but we want to see how to do this using the algorithm for fast modular exponentiation. So, here is my work for computing 2240 mod 241:

  • 21 mod 241 = 2
  • 22 mod 241 = 4
  • 24 mod 241 = 16
  • 28 mod 241 = 15
  • 216 mod 241 = 225
  • 232 mod 241 = 15
  • 264 mod 241 = 225
  • 2128 mod 241 = 15
  • Note that 240 = 16 + 32 + 64 + 128, so 2240 mod 241 = (216)(232)(264)(2128) mod 241 = 225 * 15 * 225 * 15 mod 241 = 1

We conclude that p=241 passes the Fermat prime test, and we're going to say that we're okay with using it as one of our primes for the algorithm.

Let's assume that we have also checked that q=251 is also prime (passes the Fermat prime test). For the Fermat prime test we could have used some other base, but 2 is just as easy as any other, so you can use 2 for your base in the test as well.

Modulus and Totient

We set our modulus n = p*q = 241*251 = 60491. This will be part of our public key. Note that we share the value of n, but do NOT share p and q.

We will need the so-called "Carmichael totient" function of n. Namely we need the value lcm(p-1, q-1). We will call the value "totient". We can compute the lcm using the gcd, and compute the gcd using the Euclidean algorithm.

  • totient = lcm(p-1, q-1) = (p-1)*(q-1) / gcd(p-1, q-1)
  • So let's compute gcd(p-1, q-1) = gcd(240, 250)
    • 250 = 240*1 + 10, so gcd(250, 240) = gcd(240, 10)
    • 240 = 10*24 + 0, so gcd(240, 10) = gcd(10, 0) = 10
  • We conclude that gcd(p-1, q-1) = 10, and therefore lcm(p-1,q-1) = 240*250 / 10 = 6000

So we set totient = 6000.

Encryption exponent

We need to pick an encryption exponent e that has no common factors with our totient. We will pick the smallest one that works. We try e=2, e=3, ..., until we find an e with gcd(e, 6000) = 1.

  • gcd(2, 6000) = 2, so 2 is no good.
  • gcd(3, 6000) = 3, so 3 is no good.
  • Similarly, 4, 5, and 6 don't work.
  • gcd(7, 6000) = 1.

So we will use e=7 as our encryption exponent.

We now have our public key, which is (n, e) = (60491, 7)

Private Key

The private key consists of the same modulus n together with a different exponent. We are looking for the exponent d such that d is the inverse of e mod totient. So we are looking for d that satisfies d*e = 1 mod totient.

Note that e was chosen so that gcd(e, totient) = 1. Also, note that for any a and b, the extended Euclidean algorithm can be used to find s and t such that gcd(a, b) = s*a + t*b. If we apply the extended Euclidean algorithm to e and totient, we will have s and t such that 1 = gcd(e, totient) = s*e + t*totient. Notice that this means s*e = 1 mod totient, so s will be the exponent d we want.

To find d, we need to run the extended Euclidean algorithm on e and totient, so on 7 and 6000.

  • 6000 = 7*857 + 1, so gcd(6000,7) = gcd(7, 1)
  • gcd(7, 1) = 1 = 6000 - 7*857, so 1 = 6000*1 + 7*(-857)

We get that -857 is the inverse of 7 mod 6000. Equivalently, -857+6000 = 5143 is the inverse of 7 mod 6000.

We now have our private key: (n, d) = (60491, 5143).

Send a Message

Let's now send a message using the public key computed above. The public key we are using is: (n, e) = (60491, 7). Let's say we want to send the message "cool!". We first need to have the binary representation for the text. We use the ASCII values, and assume we are going to encrypt each byte of the message separately (in real life we would encrypt much more than that at a time, and be using much much larger primes and numbers).

  • cool! in ASCII is: 99, 111, 111, 108, 33. Note - you can look up each one, have Python do it for you, or you can use a tool like CyberChef (under Data Format, use To CharCode).
  • We encode these as follows, using Python:
    • pow(99, 7, 60491) = 41054
    • pow(111, 7, 60491) = 29085
    • pow(111, 7, 60491) = 29085
    • pow(108, 7, 60491) = 31620
    • pow(33, 7, 60491) = 53346

The encrypted message that we send is thus: 41054, 29085, 29085, 31620, 53346

Receive a Message

The person receiving the message uses their private key. In our case we would perform pow(x, 5143, 60491) for each of the values x that was sent to us. You can verify that we do indeed get back the values that were encrypted (99, 111, 111, 108, 33).

Assignment

If you are given this as an assignment you will be assigned primes p and q, and you need do the following.

  1. Fermat prime test on both p and q, showing that they are prime. Demonstrate the fast modular exponentiation algorithm to do this.
  2. Totient - demonstrate the using the Euclidean algorithm to find the gcd that is used in finding the totient value. Use the gcd you find to determine the totient.
  3. Encryption exponent e - find the smallest e that has no factors with your totient. Together with n, you have your public key.
  4. Decryption exponent d - demonstrate using the extended Euclidean algorithm to find the decryption exponent d (your private key).
  5. You will be sent a message encrypted with your public key. Decrypt the message and include the plaintext of the message in your solution.
  6. Make up some response to the encrypted message, and encrypt it using the public key that was demonstrated on this page (that is, you are encrypting this to send to your instructor, using the public key on this page). Include the encrypted message in your solution.

Note that this is a "doing / demonstration" assignment and not a "proofs" assignment. Still, keep your work organized, and use the numbering indicated here to make it easier for your instructor to follow along.

Good luck!

Grading Notes

Some things that people commonly have done wrong on this assignment...

  • You need to be careful and precise with your writing. Only use an equals sign when two quantities are equal to each other. Do not use = when you really mean "-->" or when you really should be using a ":". For example, when computing the gcd of 15 and 10, do not write this: gcd(15, 10) = 15 = 1*10 + 5. That first "=" should really be a ":" or something along those lines.
  • Passing the Fermat primality test does not prove that a number is prime. If a number is prime, then it must pass the test. But there are some numbers that pass the test but are not prime. So you should not say anything like "thus we have proved that p is prime". Be precise in your language: "thus p has passed the Fermat primality test and we will use it" (or something along those lines).
  • For the totient part, you had to demonstrate the Euclidean algorithm, and to find the decryption exponent you had to demonstrate the extended Euclidean algorithm. Note that many of you had totients and e's where these parts were very easy (very few iterations were required before you got to the gcd). Make sure you are able to do both the regular and extended Euclidean algorithm even when the algorithms takes a handful of iterations.
  • For the decryption part, you should explain what you did to decrypt - did you use code, did you use Python pow, etc.? It should be enough to convince me that you did it.
  • For the encryption part, most people encrypted using their own encryption key. You were supposed to encrypt using /my/ encryption key (the one on this page) - because this is how we would use this to exchange messages (I send you messages using your encryption key, you send me messages using my encryption key).

Another Assignment

For a more advanced version of this assignment, we write code for RSA with a "real" block length. Rather than encrypting 8 bits at a time, we should be encrypting something like 1024, 2048, etc. many bits at a time. This will require being able to read and write file data that is /not/ ASCII data (namely, reading it in "binary" mode). You will write a program to generate an RSA public and private key, encrypt, and decrypt.

The following are the rules of how your program should work...

  • Takes command line arguments and works like this...
Usage: ./rsa.py encrypt key_filename filename [filename.encrypted]
       ./rsa.py decrypt key_filename filename.encryped [filename.encrypted.decrypted]
       ./rsa.py generate_key filename [#bits]
        #bits: for each chunk, defaults to 1024, must be multiple of 8
        generating key will create files key_filename.private.json, key_filename.public.json
        encrypt: encrypts #bits at a time,
                  write out 1 extra byte per chunk
                  (you can think about why)
                  first 10 bytes of the file is the total # of bytes
                  of the original file (in big-ending order)
	decrypt: decrypts one chunk at a time
	(that is 1 byte longer than the decrypted chunk)
	key files are either private or public, and encrypt/decrypt should
         take this into account - encrypting with a private key uses d as the
         exponent, and decrypting with a public key uses e.
  • public key file is a dictionary written in json format with keys: "n", "e", "chunksize"
  • private key file is a dictionary written in joson format with keys: "n", "e", "chunksize", "d"

If you are going to write this in Python, the following are some things you might need to use...

  • You'll pick primes that are chosen at random and are 1 bit larger than half of the chunk size. Pick a random starting point with the right # of bits (use randint), test if it is prime (use the Fermat prime test and also make sure it is not divisible by 2, 3, 5, 7, 11), if it is not prime check the next one.
  • If you have code for h2 that works, you probably can use it for larger integers as well (thank you very much Python!).
  • Reading/writing json (for the public and private key)... import json and use json.loads, json.dumps.
  • Reading/writing raw "binary" data... Use open(filename, 'rb') or open(filename, 'wb').
  • Viewing/converting raw binary data as a large integer. A bytes array (that you get from read'ing 'rb' data) has a method .hex, and you can "convert" to integer using the int function with base 16. You can go the other direction (taking a large int and getting the bytes) using the to_bytes method.

If you are good at Python, understand h2, have h2 code that works for small integers, then that may be enough for you to do this. You should first get the generation of keys working, then encryption, and last decryption. Test and debug often.

Note that since this is a more advanced assignment, I won't be providing too much help...