This is a transcript of a talk I gave as part of MIT’s 18.434 class, the “Seminar in Theoretical Computer Science” as part of MIT’s communication requirement. (Insert snarky comment about MIT’s CI-* requirements here.) It probably would have made a nice math circle talk for high schoolers but I felt somewhat awkward having to present it to a bunch of students who were clearly older than me.
1. Preliminaries
1.1. Modular arithmetic
In middle school you might have encountered questions such as
Exercise 1
What is ?
You could answer such questions by listing out for small
and then finding a pattern, in this case of period
. However, for large moduli this “brute-force” approach can be time-consuming.
Fortunately, it turns out that one can predict the period in advance.
Theorem 2 (Euler’s little theorem)
- Let
. Then
.
- (Fermat) If
is a prime, then
for every
.
Proof: Part (a) is a special case of Lagrange’s Theorem: if is a finite group and
, then
is the identity element. Now select
. Part (b) is the case
.
Thus, in the middle school problem we know in advance that because
. This bound is sharp for primes:
Theorem 3 (Primitive roots)
For every prime there’s a
such that
but
for any
. (Hence
.)
For a proof, see the last exercise of my orders handout.
We will define the following anyways:
Definition 4
We say an integer (thought of as an exponent) annihilates the prime
if
for every
,
- or equivalently,
.
Theorem 5 (All/nothing)
Suppose an exponent does not annihilate the prime
. Then more than
of
satisfy
.
Proof: Much stronger result is true: in then
.
1.2. Repeated Exponentiation
Even without the previous facts, one can still do:
Theorem 6 (Repeated exponentation)
Given and
, one can compute
with
multiplications mod
.
The idea is that to compute , one just multiplies
. All the
can be computed in
steps, and
.
1.3. Chinese remainder theorem
In the middle school problem, we might have noticed that to compute , it suffices to compute it modulo
, because we already know it is odd. More generally, to understand
it suffices to understand
modulo each of its prime powers.
The formal statement, which we include for completeness, is:
Theorem 7 (Chinese remainder theorem)
Let ,
, \dots,
be distinct primes, and
integers. Then there is a ring isomorphism given by the natural projection
In particular, a random choice of amounts to a random choice of
mod each prime power.
For an example, in the following table we see the natural bijection between and
.
2. The RSA algorithm
This simple number theory is enough to develop the so-called RSA algorithm. Suppose Alice wants to send Bob a message over an insecure channel. They can do so as follows.
- Bob selects integers
,
and
(with
huge) such that
is a semiprime and
- Bob publishes both the number
and
(the public key) but keeps the number
secret (the private key).
- Alice sends the number
across the channel.
- Bob computes
and hence obtains the message
.
In practice, the in RSA is at least
bits long.
The trick is that an adversary cannot compute from
and
without knowing the prime factorization of
. So the security relies heavily on the difficulty of factoring.
Remark 8
It turns out that we basically don’t know how to factor large numbers : the best known classical algorithms can factor an
-bit number in
time (“general number field sieve”). On the other hand, with a quantum computer one can do this in time.
3. Primality Testing
Main question: if we can’t factor a number quickly, can we at least check it’s prime?
In what follows, we assume for simplicity that is squarefree, i.e.
for distinct primes
, This doesn’t substantially change anything, but it makes my life much easier.
3.1. Co-RP
Here is the goal: we need to show there is a random algorithm which does the following.
- If
is composite then
- More than half the time
says “definitely composite”.
- Occasionally,
says “possibly prime”.
- More than half the time
- If
is prime,
always says “possibly prime”.
If there is a polynomial time algorithm that does this, we say that PRIMES is in Co-RP. Clearly, this is a very good thing to be true!
3.2. Fermat
One idea is to try to use the converse of Fermat’s little theorem: given an integer , pick a random number
and see if
. (We compute using repeated exponentiation.) If not, then we know for sure
is not prime, and we call
a Fermat witness modulo
.
How good is this test? For most composite , pretty good:
Proposition 9
Let be composite. Assume that there is a prime
such that
does not annihilate
. Then over half the numbers mod
are Fermat witnesses.
Proof: Apply the Chinese theorem then the “all-or-nothing” theorem.
Unfortunately, if doesn’t satisfy the hypothesis, then all the
satisfy
!
Are there such which aren’t prime? Such numbers are called Carmichael numbers, but unfortunately they exist, the first one is
.
Remark 10
For , there are more than
Carmichael numbers at most
.
Thus these numbers are very rare, but they foil the Fermat test.
Exercise 11
Show that a Carmichael number is not a semiprime.
3.3. Rabin-Miller
Fortunately, we can adapt the Fermat test to cover Carmichael numbers too. It comes from the observation that if is prime, then
.
So let , where
is odd. For example, if
then
. Then we compute
,
, \dots,
. For example in the case
and
:
And there we have our example! We have , so
isn’t prime.
So the Rabin-Miller test works as follows:
- Given
, select a random
and compute powers of
as in the table.
- If
, stop,
is composite (Fermat test).
- If
, see if the entry just before the first
is
. If it isn’t then we say
is a RM-witness and
is composite.
- Otherwise,
is “possibly prime”.
How likely is probably?
Theorem 12
If is Carmichael, then over half the
are RM witnesses.
Proof: We sample randomly again by looking modulo each prime (Chinese theorem). By the theorem on primitive roots, show that the probability the first
appears in any given row is
. This implies the conclusion.
Exercise 13
Improve the in the problem to
by using the fact that Carmichael numbers aren’t semiprime.
3.4. AKS
In August 6, 2002, it was in fact shown that PRIMES is in P, using the deterministic AKS algorithm. However, in practice everyone still uses Miller-Rabin since the implied constants for AKS runtime are large.