|
From Real-World Cryptography by David Wong In this article, the author teaches readers about the Diffie-Hellman key exchange standard, which was the very first key exchange algorithm ever invented, and the Elliptic Curve Diffie-Hellman key exchange standard, which is the Diffie-Hellman built with elliptic curves. |
Take 37% off Real-World Cryptography by entering fccwong into the discount code box at checkout at manning.com.
There are many key exchange algorithms, and all are based on mathematical problems believed to be hard to solve. This is unlike symmetric cryptography primitives that most often rely on statistical analysis and years of cryptanalysis.
All right, what are you going to learn in this section?
- The first key exchange algorithm I will talk about is actually the very first key exchange algorithm ever invented (and one of the first formalization of a public key cryptographic algorithm). It is called Diffie-Hellman (DH) after the last names of the two inventors Whitfield Diffie and Martin E. Hellman.
- The second key exchange algorithm I will talk about is actually the same Diffie-Hellman algorithm but built with elliptic curves, it is thus usually called Elliptic Curve Diffie-Hellman (ECDH).
I will do my best to give good intuitions on how these algorithms work, but there’s no way around it: there is going to be some math. Brace yourself!
Diffie-Hellman (DH)
In 1976, Whitfield Diffie and Martin E. Hellman wrote their seminal paper on the DH key exchange algorithm entitled “New Direction in Cryptography” (what a title right?).
The construction was based on a mathematical problem that was (and still is) believed to be hard: how to find a discrete logarithm in a group (a mathematical concept I will explain later).
In this section I will provide explanations on what is a group, what is this discrete logarithm problem, and what group DH chose to use. You will see later that ECDH is the same algorithm but using a different group. At the end of this section you will hopefuly leave with a strong intuition on how a DH key exchange algorithm works, and how it can be used in practice.
As I’ve stated earlier, a key exchange is a few mathematical operations happening in a group. But what’s a group?
It’s two things:
- A set of elements.
- An operation defined on these elements that satisfy certain properties.
This is a bit too abstract for us. Here is what we use in practice for DH:
First. The set of strictly positive integers 1, 2, 3, 4, …, p-1 where p is a prime number. Different standards will specify different numbers for p.
Remember, a prime number is just a number that can only be divided by 1 or itself. The first prime numbers are 2, 3, 5, 7, 11, and so on.
Prime numbers are everywhere in asymmetric cryptography! And fortunately, we have efficient algorithms to find large ones, but to speed things up most cryptographic libraries will instead look for pseudo-primes (which are numbers that have a very high probability of being primes). Interestingly, such optimizations have been broken several times in the past, the most infamous occurrence was in 2017 when more than a million devices were found to have generated incorrect primes for their cryptographic applications.
Second. A modular multiplication operation defined on this set of numbers.
If you’ve never heard about modular arithmetic and what “modulo” means, read on…
Let’s take for example the number 7, and write its Euclidian division with 5 as
7 = 5 x 1 + 2
Notice that the remainder is 2. The equation above can be read as
“7 is congruent to 2 modulo 5” or 7 = 2 mod 5
We call 5 the modulus in this equation. Similarly:
- 8 = 1 mod 7
- 54 = 2 mod 13
- 170 = 0 mod 17
- and so on…
The classical way of picturing such a concept is with a clock. Once we reach the last number (eleven), incrementing means going back to 0. This is illustrated in figure 1.
Figure 1. The group of integers modulo the prime number 5 can be pictured as a clock that resets to 0 after the number 4. Thus 5 is represented as 0, 6 as 1, 7 as 2, 8 as 3, 9 as 4, 10 as 0, and so on.
A modular multiplication is very natural to define over such a set of numbers. Let’s take the following multiplication for example:
3 x 2 = 6
With what you learned above, you know that 6 is congruent to 1 modulo 5, and thus the equation can be rewritten as
3 x 2 = 1 mod 5
Quite straightforward isn’t it?
I said earlier that the operation satisfies a number of properties. One of them is that every element defined in our group must have an inverse. We say that an element a has an inverse b if
a x b = 1 mod n
We usually write the inverse b of a as a-1. For example, we know that 2 is the inverse of 3 mod 5 so we can write
3-1 = 2 mod 5
It happens that when we use the positive numbers modulo a prime, only the zero element lacks an inverse (indeed, can you find an element b such that 0 x b = 1 mod 5?). This is the reason why we do not include zero as one of our element in our group.
OK, we now have a group: the set of strictly positive integers 1, 2, …, p-1 for p a prime number, along with the modular multiplication.
The group we formed also happens to be a finite field, a much broader mathematical structure which I will ignore for now (briefly, a finite field defines two operations instead of one: a multiplication and an addition). For this reason, DH defined over this group is sometimes called FFDH (for Finite Field Diffie-Hellman). Since cryptographers like to talk about DH as the general construction that can be instantiated over any group, FFDH is sometimes used to avoid ambiguities with other instantiations of DH (like Elliptic Curve Diffie-Hellman).
One more thing I need to define for convenience, you can write:
- 3 x 3 as 32
- 3 x 3 x 3 as 33
- and so on…
This allows me to introduce the notion of a subgroup. By choosing a number that we call a generator (or base), and by multiplying it over and over by itself, you can sometimes produce a subgroup. For example, the generator 4 defines a subgroup consisting of the numbers 1 and 4:
- 41 mod 5 = 4
- 42 mod 5 = 1
- 43 mod 5 = 4 (we start again from the beginning)
- 34 mod 5 = 1
- and so on …
It happens that when our modulus is prime, every element of our group can generate a subgroup. Different subgroups have different sizes, which we call orders. I illustrate this in figure 2.
Figure 2. The different subgroups of the multiplicative group modulo 5. They all include the number 1 (called the identity element) and have different orders (number of elements).
All right, you now understand:
- What a group is.
- A generator can be used to generate a whole (sub)group.
Groups are the center of a huge amount of different cryptographic primitives. It is thus important to have good intuitions about them, if you want to understand how other cryptographic primitives work.
One more thing, what is the discrete logarithm problem?
Imagine that I take a generator, let’s say 3, and give you a random element it generates, let’s say 2 = 3X mod 5. Asking you “what is the X here?” is the same as asking you “find the discrete logarithm of 2 in base 3“. Thus, the discrete logarithm problem in our group is about finding out how many times we multiplied the generator with itself in order to produce a given group element.
This is a very important concept. Take a few minutes to understand it before continuing.
In our example group, you can quickly find out that 3 is the answer. Indeed 33 = 2 mod 5. But if we picked a way bigger prime number than 5, things get much more complicated: it becomes a hard problem.
This is the secret sauce behind Diffie-Hellman!
You now know enough to understand how to generate a keypair in DH:
- All the participants must agree on a large prime p and a generator g.
- Each participant generates a random number X, this becomes their private key.
- Each participant derives their public key as gX mod p.
The fact that the discrete logarithm problem is hard means that no one should be able to recover the private key out of the public key. I illustrate this in figure 3.
Figure 3. Choosing a private key in Diffie-Hellman is like choosing an index in a list of numbers produced by a generator g. The discrete logarithm problem is to find the index from the number alone.
While we do have algorithms to compute discrete logarithms, they are not very efficient. On the other hand, if I give you the solution to the problem (the X), we have very fast algorithm for you to check that indeed I provided you with the right solution (gX mod p is fast to compute).
Like everything in cryptography, it is not impossible to find a solution by simply trying to guess. Yet, by choosing large enough parameters (here a large prime number), it is possible to reduce the efficacy of such a search for a solution down to negligible odds. Meaning that even after hundreds of years of random tries, your odds of finding a solution should still be statistically close to zero.
Great. How do we use all of this math for our DH key exchange algorithm?
Imagine that:
- Alice has a private key a and a public key A = ga mod p.
- Bob has a private key b and a public key B = gb mod p.
With the knowledge of Bob’s public key, Alice can compute the shared secret as
Ba mod p
Bob can do a similar computation with Alice’s public key and his own private key
Ab mod p
Very naturally, we can see that these two calculations end up computing the same number!
Ba = (gb)a = gab = (ga)b = Ab mod p
And that’s the magic of DH. From an outsider, just observing the public keys A and B does not help in any way to compute the result of the key exchange gab mod p.
By the way, in theoretical cryptography the idea that observing ga mod p and gb mod p does not help you to compute gab mod p is called the Computational Diffie-Hellman Assumption (CDH). It is often confused with the stronger Decisional Diffie-Hellman Assumption (DDH) which intuitively states that given the tuple (ga mod p, gb mod p, gz mod p), nobody should be able to confidently guess if the latter element is the result of a key exchange between the two public keys or just a random element of the group. Both are very useful theoretical assumptions that have been used to build many different algorithms in cryptography.
Next, you will learn about how real-world applications make use of this algorithm, and the different standards that exist.
Diffie-Hellman Standards
Now that you have seen how Diffie-Hellman works, you understand that participants need to agree on a set of parameters, specifically on a prime number p and a generator g.
In this section you’ll learn about how real-world applications choose these parameters, and what are the different standards that exist.
First thing first, the prime number p. As I stated earlier, the bigger, the better. Since DH is based on a the discrete logarithm problem, its security is directly correlated to the best attacks known on the problem. Any advances in this area can weaken the algorithm. With time, we’ve managed to obtain a pretty good idea of how fast (or slow) these advances are, and how much is enough security. The current known best practices are to use a prime number of 2048 bits.
In general, keylength.com summarizes recommendations on parameter lengths for common cryptographic algorithms. The results are taken from authoritative documents produced by research groups or government bodies like the ANSSI (France), the NIST (US), and the BSI (Germany). While they do not always agree, they often converge towards similar orders of magnitude.
In the past, many libraries and software were generating and hardcoding their own parameters. Unfortunately, they were often found to be either weak, or worse, backdoored. While blindly following standards might seem like a good idea in general, DH is one of the unfortunate counter-example that exist, as it was found that some of them might have been backdoored, as was the case for RFC 5114. Due to all of these issues, newer protocols and libraries have converged towards either deprecating DH (in favor of ECDH) or using the groups defined in the better standard RFC 7919.[4]
For this reason, best practice nowadays is to use RFC 7919 which defines several groups of different sizes and security. For example, ffdhe2048 is the group defined by the 2048-bit prime modulus
p = 32317006071311007300153513477825163362488057133489075174588434139269806834136210002792056362640164685458556357935330816928829023080573472625273554742461245741026202527916572972862706300325263428213145766931414223654220941111348629991657478268034230553086349050635557712219187890332729569696129743856241741236237225197346402691855797767976823014625397933058015226858730761197532436467475855460715043896844940366130497697812854295958659597567051283852132784468522925504568272879113720098931873959143374175837826000278034973198552060607533234122603254684088120031105907484281003994966956119696956248629032338072839127039L
and with generator g = 2
It is common to choose the number 2 for the generator, as computers are quite efficient at multiplying with 2 (it’s a simple left shift <<instruction).
The group size (or order) is also specified as q = (p-1)/2.
This implies that both private keys and public keys will be around 2048-bit of size.
In practice, these are quite large sizes for keys (compare that with symmetric keys for example, that are usually 128-bit). You will see in the next section that defining a group over the elliptic curves allow us to obtain much smaller keys for the same amount of security.
Elliptic Curve Diffie-Hellman (ECDH)
It turns out that the Diffie-Hellman algorithm, which we just discussed, can be implemented in more than just modulo a prime number. All we need is just a group! It also turns out that a group can be made from elliptic curves (a type of curves in mathematics). The idea was proposed in 1985 by Neal Koblitz and Victor S. Miller independently, and much later adopted when Elliptic Curve Cryptography (ECC) (cryptographic algorithms that are based on elliptic curves) started seeing standardization around 2000. The world of applied cryptography quickly adopted algorithms based on elliptic curves as they provided way smaller keys than the previous generation’s algorithms. Compared to the recommended 2048-bit parameters in DH, parameters of 256-bit were possible with the elliptic curve variant of the algorithm.
Elliptic curve cryptography has remained at its full strength since it was first presented in 1985. “[…] The United States, the UK, Canada and certain other NATO nations have all adopted some form of elliptic curve cryptography for future systems to protect classified information throughout and between their governments”
— NSA
The Case for Elliptic Curve Cryptography (2005)
Let’s now explain how elliptic curves work. First and foremost, it is good to understand that elliptic curves are just curves! Meaning that they are defined by all the coordinates x and y that solves an equation. Specifically this equation
y2 + a1 x y + a3 y = x3 + a2 x2 + a4 x + a6
for some a1, a2, a3, a4, and a6.
Note that for most practical curves today, this equation can be simplified to y2 = x3 + ax + b. While the simplification is not possible for two types of curves (called binary curves and curves of characteristic 3), they are used rarely enough that we will use the simplified form in the rest of this chapter.
Figure 4 shows an example of an elliptic curve, with two points taken at random.
Figure 4. One example of an elliptic curve defined by an equation.
At some point in the history of elliptic curves, it was found that a group could be constructed over them. From there, implementing Diffie-Hellman on top of these groups was straightforward. I will use this section to explain the intuition behing elliptic curve cryptography.
Groups over elliptic curves are often defined as additive groups. Unlike multiplicative groups defined in the previous section, the + sign is used instead.
Using an addition or a multiplication does not matter much in practice, it is just a matter of preference. While most of cryptography uses a multiplicative notation, the literature around elliptic curve has gravitated around an additive notation, and thus this is what I will use when refering to elliptic curve groups in this book.
This time, I will define the operation before defining the elements of the group. Our addition operation is defined in the following way:
- Draw a line going through two points you want to add. The line hits the curve in another point.
- Draw a vertical line from this newly found point. The vertical line hits the curve in yet another point.
- This point is the result of adding the original two points together.
This process is illustrated in figure 5.
Figure 5. An addition operation can be defined over points of an elliptic curve by using geometry!
There are two special cases where this rule won’t work. Let’s define them as well:
- How do we add a point to itself? The answer is to draw the tangent to that point (instead of drawing a line between two points).
- What happens if the line we draw in step 1 (or step 2) does not hit the curve at any other point? Well, this is embarassing and we need this special case to work and produce a result. The solution is to define the result as a fictive point (something we made up). This fictive point is called the point at infinity (that we usually write with a big letter O).
Figure 6 illustrates these special cases.
Figure 6. Building on figure 5, addition on an elliptic curve is also defined when adding a point to itself, or when two points cancel each other to result in the point at infinity.
I know this point at infinity is some next-level weirdness, but don’t worry too much about it. It is really just something we came up with in order to make the addition operation work. Oh and by the way, it behaves like a zero, and it is our identity element:
O + O = O
and for any point P on the curve
P + O = P
All good. So far we’ve seen that to create a group on top of an elliptic curve, we need:
- An elliptic curve equation that defines a set of valid points.
- A definition of what an addition means in this set.
- An imaginary point called a point at infinity.
I know this is a lot of information to unpack, but we are missing one last thing. Elliptic curve cryptography is defined over a finite field. In practice, what this means is that our coordinates are the numbers 1, 2, …, p-1 for some large prime number p. This should sound familiar!
For this reason, when thinking of elliptic curve cryptography, you should be thinking of a graph that looks much more like the one on the right in figure 7.
Figure 7. Elliptic Curve Cryptography (ECC) in practice is mostly specified with elliptic curves in coordinates modulo a large prime number p. This means that what we use in cryptography looks much more like the right graph than the left graph.
And that’s it!
We now we have a group we can do cryptography on. The same way we had a group made with the numbers (excluding 0) modulo a prime number and a multiplication operation for Diffie-Hellman.
How can we do Diffie-Hellman with this group defined on elliptic curves? Let’s see how the discrete logarithm works now in this group.
Let’s take a point G and add it to itself x times to produce another point P via the addition operation we defined. We can write that as P = G+…+G x times, or use some mathematical syntactic sugar to write that as P = [x]G which reads “x times G”. The elliptic curve discrete logarithm problem (ECDLP) is to find the number x from knowing just P and G.
That’s all we needed, you now know enough to understand how to generate a keypair in ECDH:
- All the participants agree on an elliptic curve equation, a finite field (most likely a prime number), and a generator G (usually called based point in elliptic curve cryptography).
- Each participant generates a random number X, this becomes their private key.
- Each participant derives their public key as [X]G.
As the elliptic curve discrete logarithm problem is hard, you guessed it, no one should be able to recover your private key just by looking at your public key. I illustrate this in figure 8.
Figure 8. Choosing a private key in ECDH is like choosing an index in a list of numbers produced by a generator (or base point) G. The elliptic curve discrete logarithm problem (ECDLP) is to find the index from the number alone.
All of this might be a bit confusing, as the operation we had defined for our DH group was a multiplication, and for an elliptic curve we now use an addition. Again, these distinctions do not matter at all, as they are equivalent. You can see a comparison in figure 9.
Figure 9. Some comparisons between the group used in Diffie-Hellman and the group used in Elliptic Curve Diffie-Hellman.
You should now be convinced that the only thing that matters for cryptography is that we have a group defined with its operation, and that the discrete logarithm for this group is hard. See figure 10.
Figure 10. A comparison between the discrete logarithm problem modulo large primes, and the discrete logarithm problem in elliptic curve cryptography. They both relate to the Diffie-Hellman key exchange as the problem is to find the private key from a public key.
Last note on the theory, the group we formed on top of elliptic curves has some difference with the group we formed with the strictly positive integers modulo a prime number. Due to some of these differences, the strongest attacks known against DH (index calculus) do not work well on the elliptic curve groups. This is the main reason why parameters for ECDH can be much much lower than the parameters for DH, for the same level of security.
OK we are done with the theory. Let’s go back to defining ECDH. Imagine that:
- Alice has a private key a and a public key A = [a]G.
- Bob has a private key b and a public key B = [b]G.
With the knowledge of Bob’s public key, Alice can compute the shared secret as
[a]B
Bob can do a similar computation with Alice’s public key and his own private key
[b]A
Very naturally, we can see that these two calculations end up computing the same number!
[a]B = [a][b]G = [ab]G = [b][a]G = [b]A
And no passive adversary should be able to derive the shared point just from observing the public keys.
Looks familiar right?
Next, let’s talk about standards!
Elliptic Curve Diffie-Hellman Standards
The standardization of ECDH has been even more chaotic than DH. Many standardization bodies have come out and specified a large number of different curves, which was then followed by many wars over which curve was more secure. A large amount of research, mostly led by Daniel J. Bernstein, pointed out the fact that a number of curves standardized by the NIST could potentially be part of a weaker class of curves only known to the NSA.
I no longer trust the constants. I believe the NSA has manipulated them through their relationships with industry.
— Bruce Schneier
The NSA Is Breaking Most Encryption on the Internet (2013)
Nowadays, most of the curves in use come from two different standards:
- NIST FIPS 186-4, which was originally published in 2000.
- RFC 7748 which was published in 2016.
NIST FIPS 186-4 specifies 15 different curves, amongst which P-256 is the most widely used curve on the internet. It is defined with the points on the equation
y2 = x3 – 3x + b mod p
where
b = 41058363725152142129326129780047268409114441015993725554835256314039467401291
and
p = 2256 – 2224 + 2192 + 296 – 1
It defines a curve of order
n = 115792089210356248762697446949407573529996955224135760342422259061068512044369
meaning that there are n points (including the point at infinity).
The base point is the point
G = (48439561293906451759052585252797914202762949526041747995844080717082404635286, 36134250956749795798585127919587881956611106672985015071877198253568414405109)
The curve provides 128-bit of security. For applications that are using other cryptographic algorithms providing 256-bit security instead of 128-bit of security (for example AES with a 256-bit key), P-512 is available in the same standard to match the level of security.
Interestingly P-256, and other curves defined in FIPS 186-4, are said to be generated from a seed. For P-256, the seed is the bytestring 0xc49d360886e704936a6678e1139d26b7819f7e90. I’ve talked about this notion of “nothing-up-my-sleeve” numbers before, that are constants aimed at proving that there was no room for backdoor during the design of the algorithm. Unfortunately, there isn’t much explanation behind this seed, other than the fact that it is specified along the curve’s parameter.
RFC 7748 specifies Curve25519, the curve defined by the equation
y2 = x3+486662x2+x mod p
where
p = 2255 – 19
It defines a curve of order
n = 2252 + 27742317777372353535851937790883648493
The base point is
G = (9, 14781619447589544791020593568409986887264606134616475288964881837755586237401)
The curve also provides 128-bit of security. For 256-bit of security, Curve448 is available in the same standard.
That’s all for this article. If you want to learn more about the book, you can check it out on our browser-based liveBook reader here.