
From Real-World Cryptography by David Wong

This article describes how Transport Layer Security works.


Take 37% off Real-World Cryptography by entering fccwong into the discount code box at checkout at manning.com.


Today, Transport Layer Security (TLS) is the de facto standard to secure communications between applications. In this article you will learn more about how TLS works underneath the surface, and how it is used in practice. You will find this section useful to understand how to use TLS properly, but also to understand how most (if not all) secure transport protocols work. (You will also find out why it is hard and strongly discouraged to redesign or reimplement such protocols.)

At a high level, TLS is split into two phases:

  • A handshake phase where a secure communication is negotiated and created between two participants.
  • A post-handshake phase where communications are encrypted between the two participants.

This idea is shown in figure 1.


Figure 1. At a high level, secure transport protocols first create a secure connection during a handshake phase. After that applications on both sides of the secure connection can communicate securely.


At this point, it is assumed that you have the following (correct) intuition about how these two steps work:

  • The handshake is, at its core, simply a key exchange. The handshake ends up with the two participants agreeing on a set of symmetric keys.
  • The post-handshake phase is purely about encrypting messages between participants, using an authenticated encryption algorithm and the set of keys produced at the end of the handshake.

Most transport security protocols work this way, and the interesting parts of these protocols always lie in the handshake phase.

Next, let’s take a look at the handshake phase.

The TLS Handshake

As you’ve seen, TLS is (and most transport security protocols are) divided into two parts: a handshake and a post-handshake phase. Let’s look at the handshake first. The handshake itself has four aspects that I want to tell you about:

 

  1. Negotiation. TLS is highly configurable. Both a client and a server can be configured to negotiate a range of SSL and TLS versions, as well as a menu of acceptable cryptographic algorithms. The negotiation phase of the handshake aims at finding common ground between the client’s and the server’s configurations, in order to securely connect the two peers.
  2. Key exchange. The whole point of the handshake is to perform a key exchange between the two participants. What key exchange algorithm to use? This is one of the things decided as part of the negotiation process.
  3. Authentication. It is trivial for a MITM attacker to impersonate either side of a key exchange. For this reason, key exchanges must be authenticated. (Your browser must have a way to make sure that it is talking to google.com and not your Internet service provider, for example.)
  4. Session Resumption. As browsers often connect to the same websites again and again, key exchanges can be costly and slow down a user’s experience. For this reason, mechanisms to fast-track secure sessions without redoing a key exchange are integrated into TLS.

This is a long list. As fast as greased lightning, let’s start with the first item.

Negotiation in TLS: What Version and What Algorithms?

Most of the complexity in TLS comes from the negotiation of the different moving parts of the protocol. Infamously, this negotiation has also been the source of many issues in the history of TLS. Attacks like FREAK, LOGJAM, DROWN, etc. took advantage of weaknesses present in older versions to break more recent versions of the protocol (as long as a server offered to negotiate these versions). While not all protocols have versioning, or allow for different algorithms to be negotiated, SSL/TLS was designed for the web. As such, SSL/TLS needed a way to maintain backward compatibility with older clients and servers that could be slow to update.

This is what happens on the web today: your browser might be recent and up-to-date, made to support TLS 1.3, but when you visit some old web page, chances are that the server behind it only supports versions of TLS up to 1.2 or 1.1 (or worse). Conversely, many websites must support older browsers (and thus older versions of TLS) as some users are stuck in the past.

Most versions of SSL/TLS have security issues (TLS 1.2 and TLS 1.3 are the exceptions), which should make you think that it is safer to support only TLS 1.3, and maybe TLS 1.2, but nothing older. Yet, some large companies must support a large number of older TLS clients due to their business. It is not uncommon to find TLS libraries that support these older and broken versions by implementing mitigations to known attacks, mitigations that are often too complex to implement correctly. For example, the Lucky13 and Bleichenbacher98 attacks, which broke some versions of the protocol (and thus all implementations of the protocol), have been mitigated in many “hardened” TLS implementations, yet researchers have been rediscovering them in different TLS implementations pretty much once every year since they were found.
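To make the "only TLS 1.2 and newer" advice concrete, here is a minimal sketch using Python's standard ssl module. The function name make_modern_client_context is hypothetical and chosen for this illustration; create_default_context, minimum_version, and TLSVersion are part of the ssl module (Python 3.7 and later).

```python
import ssl

def make_modern_client_context() -> ssl.SSLContext:
    """Build a client-side context that refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()            # sensible defaults plus the system trust store
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse SSL 3.0, TLS 1.0, and TLS 1.1
    return ctx

ctx = make_modern_client_context()
print(ctx.minimum_version)
```

With such a context, a handshake against a server that only speaks an older version simply fails instead of silently downgrading.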

Negotiation starts with the client sending a first request called a Client Hello to the server. The Client Hello contains a range of supported SSL and TLS versions, a suite of cryptographic algorithms that the client is willing to use, and some more information that can be relevant for the rest of the handshake or for the application. The suite of cryptographic algorithms includes:

  • One or more key exchange algorithms. TLS 1.3 defines the following algorithms allowed for negotiations: ECDH with P-256, P-384, P-521, X25519, X448; and FFDH with the groups defined in RFC 7919. Previous versions of TLS also offered RSA key exchanges, but they were removed in the latest version.
  • Two (for different parts of the handshake) or more digital signature algorithms. TLS 1.3 specifies RSA PKCS#1 v1.5 and the newer RSA-PSS, as well as more recent elliptic curve algorithms like ECDSA and EdDSA. Note that digital signatures are specified with a hash function which allows you to negotiate, for example, RSA-PSS with either SHA-256 or SHA-512.
  • One or more hash functions to be used with HMAC and HKDF. TLS 1.3 specifies SHA-256 and SHA-384, two instances of the SHA-2 hash function. This choice of hash function is unrelated to the one used by the digital signature algorithm.
  • One or more authenticated encryption algorithms. These can include AES-GCM (with keys of 128 or 256 bits), ChaCha20-Poly1305, and AES-CCM.

The server then responds with a Server Hello message that contains one of each type of cryptographic algorithm, cherry-picked from the client’s selection.
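The selection step can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionary keys and algorithm labels below are readable stand-ins rather than the identifiers TLS sends on the wire, the function names are hypothetical, and the sketch follows the client's order of preference, whereas real servers are free to apply their own.

```python
from typing import Dict, List, Optional, Sequence

def pick_first_common(offered: Sequence[str], supported: Sequence[str]) -> Optional[str]:
    """Return the first algorithm offered by the client that the server supports."""
    for name in offered:
        if name in supported:
            return name
    return None

def negotiate(client_hello: Dict[str, List[str]],
              server_config: Dict[str, List[str]]) -> Dict[str, str]:
    """Pick one algorithm of each type, or fail if any type has no common choice."""
    chosen = {}
    for kind, offered in client_hello.items():
        pick = pick_first_common(offered, server_config.get(kind, []))
        if pick is None:
            raise ValueError(f"no common {kind} algorithm, aborting the handshake")
        chosen[kind] = pick
    return chosen

# Hypothetical configurations, for illustration only.
client_hello = {
    "key_exchange": ["X25519", "P-256"],
    "signature": ["ECDSA", "RSA-PSS"],
    "hash": ["SHA-256", "SHA-384"],
    "aead": ["AES-128-GCM", "ChaCha20-Poly1305"],
}
server_config = {
    "key_exchange": ["P-256", "P-384"],
    "signature": ["RSA-PSS"],
    "hash": ["SHA-384", "SHA-256"],
    "aead": ["ChaCha20-Poly1305"],
}

server_hello = negotiate(client_hello, server_config)
print(server_hello)
```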



If the server is unable to find an algorithm it supports, it has to abort the connection. In some cases, however, the server does not have to abort the connection and can ask the client to provide more information instead. To do this, the server replies with a message called a Hello Retry Request asking for a specific piece of information. The client can then resend its Client Hello, this time with the added requested information.

TLS and Forward Secure Key Exchanges

The key exchange is the most important part of the TLS handshake! Without it, there’s obviously no symmetric key being negotiated. But for a key exchange to happen, the client and the server must first trade their respective public keys.

In TLS 1.2 and previous versions, the key exchange is done once both participants know which key exchange algorithm to use. This means that they first negotiate which algorithm to use, and then exchange their public keys.

In TLS 1.3, to avoid that first negotiating round trip (one client message and one server message), the client speculatively sends a public key in the very first message (the Client Hello). If the client fails to predict the server’s choice of key exchange algorithm then the client will have to send a new Client Hello containing the correct public key. For example:

  • The client sends a TLS 1.3 Client Hello announcing that it can do either an X25519 or an X448 key exchange. It also sends an X25519 public key.
  • The server does not support X25519, but does support X448. It sends a Hello Retry Request to the client announcing that it only supports X448.
  • The client sends the same Client Hello but with an X448 public key instead.
  • The handshake goes on.

I illustrate this difference in figure 2.


Figure 2. In TLS 1.2, the client waits for the server to choose which key exchange algorithm to use before sending a public key. In TLS 1.3, the client speculates on which key exchange algorithm(s) the server will settle on, and preemptively sends a public key (or several) in the first message, potentially avoiding an extra round trip.
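The speculative key share and the Hello Retry Request fallback can be sketched as follows. This is a simplified model, not the TLS wire format: the function names are hypothetical, messages are reduced to Python values, and the public keys are placeholder byte strings.

```python
from typing import Dict, List, Tuple

def server_respond(offered_groups: List[str],
                   key_shares: Dict[str, bytes],
                   server_groups: List[str]) -> Tuple[str, str]:
    """Answer a Client Hello with a Server Hello or a Hello Retry Request."""
    common = [g for g in offered_groups if g in server_groups]
    if not common:
        raise ValueError("no common key exchange algorithm, aborting")
    for group in common:
        if group in key_shares:                # the client guessed right
            return ("server_hello", group)
    return ("hello_retry_request", common[0])  # ask for a key share we support

def client_handshake(server_groups: List[str]) -> Tuple[str, int]:
    """Return the agreed group and how many Client Hellos were needed."""
    offered = ["X25519", "X448"]
    shares = {"X25519": b"<placeholder X25519 public key>"}  # speculative guess
    hellos_sent = 1
    kind, group = server_respond(offered, shares, server_groups)
    if kind == "hello_retry_request":
        # Same Client Hello, but with a public key for the requested group.
        shares = {group: b"<placeholder public key for " + group.encode() + b">"}
        hellos_sent += 1
        kind, group = server_respond(offered, shares, server_groups)
    return group, hellos_sent

print(client_handshake(["X25519", "X448"]))  # guess accepted on the first try
print(client_handshake(["X448"]))            # one Hello Retry Request needed
```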


TLS 1.3 is full of such optimizations, which are important for the web. Indeed many people in the world have unstable or slow connections, and it is important to keep non-application communication to the bare minimum required.

Furthermore, in TLS 1.3 and unlike previous versions of TLS, all key exchanges are ephemeral. This means that for each new session, the client and the server both generate new key pairs, then get rid of them as soon as the key exchange is done. Why? The key word is forward secrecy!

Imagine what would happen if instead, a TLS server used a single private key for every key exchange it performed with its clients.

A compromise of the server’s private key at some point in time would be devastating, as a MITM attacker would then be able to decrypt all previously recorded conversations. (Do you understand how?)

Instead, by performing ephemeral key exchanges and getting rid of private keys as soon as a handshake ends, the server protects against such attackers. I illustrate this in figure 3.


Figure 3. In TLS 1.3 each session starts with an ephemeral key exchange. If a server is compromised at some point in time, no previous sessions will be impacted.
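To illustrate what "ephemeral" means in practice, here is a toy finite-field Diffie-Hellman exchange in Python. The parameters are an assumption made for readability (a 127-bit Mersenne prime and the base 3); they are far too small for real use and are not one of the RFC 7919 groups. The point is only that each session generates fresh key pairs and forgets the private halves afterward.

```python
import secrets

# Toy parameters for illustration only: NOT a standardized or secure group.
P = 2**127 - 1  # a Mersenne prime
G = 3

def ephemeral_keypair():
    """Generate a fresh (private, public) pair for a single session."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public

def run_session() -> int:
    """Run one key exchange and return the shared secret both sides compute."""
    c_priv, c_pub = ephemeral_keypair()
    s_priv, s_pub = ephemeral_keypair()
    client_secret = pow(s_pub, c_priv, P)
    server_secret = pow(c_pub, s_priv, P)
    assert client_secret == server_secret
    # Drop the ephemeral private keys once the exchange is done. (In Python,
    # del only removes the references; it does not wipe memory.)
    del c_priv, s_priv
    return client_secret

first = run_session()
second = run_session()
print(first != second)  # each session ends up with an unrelated secret
```

Since no long-term private key takes part in computing the shared secret, stealing the server's long-term key later does not help an attacker recompute the secrets of past sessions.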


Once ephemeral public keys are traded, a key exchange is performed, and keys can be derived. TLS 1.3 derives different keys at different points in time, to encrypt different phases with independent keys.

The first two messages, the Client Hello and the Server Hello, cannot be encrypted as no public keys were traded at this point. But after that, as soon as the key exchange happens, TLS 1.3 encrypts the rest of the handshake. (This is unlike previous versions of TLS that did not encrypt any of the handshake messages.)

To derive the different keys, TLS 1.3 uses HKDF with the hash function negotiated. HKDF-Extract is used on the output of the key exchange (to remove any biases) while HKDF-Expand is used with different info parameters to derive the encryption keys. For example, “c hs traffic” (for “client handshake traffic”) is used to derive symmetric keys for the client to encrypt to the server during the handshake, and “s ap traffic” (for “server application traffic”) is used to derive symmetric keys for the server to encrypt to the client after the handshake.
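Here is a minimal sketch of that derivation in Python, using only the standard library. HKDF-Extract and HKDF-Expand follow RFC 5869, and the label encoding follows the HkdfLabel structure of the TLS 1.3 specification. Everything else is a simplifying assumption: the real key schedule chains several extract steps and uses a derived salt, and the shared secret and transcript below are placeholders.

```python
import hashlib
import hmac

HASH = hashlib.sha256            # the negotiated hash function (assumed here)
HASH_LEN = HASH().digest_size

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract: concentrate the input keying material into a uniform key."""
    return hmac.new(salt or b"\x00" * HASH_LEN, ikm, HASH).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand: stretch a key into `length` bytes bound to `info`."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]

def hkdf_expand_label(secret: bytes, label: bytes, context: bytes, length: int) -> bytes:
    """HKDF-Expand with the TLS 1.3 info encoding (length, "tls13 " + label, context)."""
    full_label = b"tls13 " + label
    info = (length.to_bytes(2, "big")
            + bytes([len(full_label)]) + full_label
            + bytes([len(context)]) + context)
    return hkdf_expand(secret, info, length)

# Placeholders standing in for a real key exchange output and real messages.
shared_secret = bytes(32)
transcript_hash = HASH(b"ClientHello || ServerHello").digest()

# Simplification: a single extract step with an empty salt.
handshake_secret = hkdf_extract(b"", shared_secret)
client_hs = hkdf_expand_label(handshake_secret, b"c hs traffic", transcript_hash, HASH_LEN)
server_hs = hkdf_expand_label(handshake_secret, b"s hs traffic", transcript_hash, HASH_LEN)
print(client_hs.hex())
print(server_hs.hex())
```

The application traffic secrets (“c ap traffic” and “s ap traffic”) are derived the same way, from a later secret and a longer transcript.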

Don’t forget that unauthenticated key exchanges are insecure. Next, you’ll see how TLS addresses this.

TLS Authentication And The Web Public Key Infrastructure

After some negotiations, and after the key exchange has taken place, the handshake must go on. What happens next is the other most important part of TLS: authentication. As I said earlier, it is trivial to intercept a key exchange and impersonate one (or both) sides of the key exchange. In this section I’ll explain how your browser cryptographically validates that it is talking to the right website, and not to an impersonator.

But let’s take a step back. There is something I haven’t told you so far. A TLS 1.3 handshake is actually split into three different stages (as illustrated in figure 4):

  1. Key Exchange. This phase contains the ClientHello and ServerHello messages which provide some negotiation and perform the key exchange. All messages (including handshake messages) after this phase are encrypted.
  2. Server Parameters. Messages in this phase contain additional negotiation data from the server. This is negotiation data that does not have to be contained in the first message of the server, and that could benefit from being encrypted.
  3. Authentication. This phase includes authentication information from both the server and the client.

Figure 4. A TLS 1.3 handshake is divided into 3 phases: the key exchange phase, the server parameters phase, and finally the authentication phase.


On the web, authentication in TLS is usually one-sided. Only the browser verifies that google.com is indeed google.com, but google.com does not verify who you are (or at least not as part of TLS).

CLIENT AUTHENTICATION is often delegated to the application layer for the web (via a form asking you for your credentials most often). That being said, client authentication can also happen in TLS if requested by the server (during the server parameters phase). When both sides of the connection are authenticated, we talk about mutually-authenticated TLS (sometimes abbreviated as mTLS). Client authentication is done the exact same way as server authentication, and can happen at any point in time after the authentication of the server (during the handshake or in the post-handshake phase).

Let’s now answer the question: when connecting to google.com, how does your browser verify that you are indeed handshaking with google.com?

Using the web public key infrastructure (web PKI)!

There are two sides to the web PKI:

First, browsers must trust a set of root public keys that we call Certificate Authorities (CAs). Usually, browsers will either use a hardcoded set of trusted public keys or rely on the operating system to provide them.

NOTE:  For the web, there exist hundreds of these CAs, which are independently run by different companies and organizations across the world. It is quite a complex system to analyze, and these CAs can sometimes also sign the public keys of intermediate CAs that in turn also have the authority to sign the public keys of websites. For this reason, organizations like the Certification Authority Browser Forum (CA/Browser Forum) enforce rules and decide when new organizations can join the set of trusted public keys, and when a CA can no longer be trusted and must be removed from that set.

Second, websites that want to use HTTPS must have a way to obtain a certification from these CAs (a signature over their signing public key). In order to do this, a website owner (or a webmaster as we used to say) must prove to a CA that they own a specific domain.

NOTE:  Obtaining a certificate for your own website used to involve a fee. This is no longer the case nowadays, as Certificate Authorities like Let’s Encrypt provide these for free.

For example, to prove that you own example.com a CA might ask you to host a file at example.com/some_path/file.txt that contains some random numbers generated for your request.



After this, a CA can provide a signature over the website’s public key. As the CA’s signature is usually valid for a period of years, we say that it is over a long-term signing public key (as opposed to an ephemeral public key). More specifically, CAs do not actually sign public keys, but instead they sign certificates. A certificate contains the long-term public key, along with some additional important metadata like the domain name if the certificate is for a website.

To prove to your browser that the server it is talking to is indeed google.com, the server sends a certificate chain as part of the TLS handshake, which comprises:

  1. Its own (leaf) certificate containing, among other things, the name google.com, Google’s long-term signing public key, as well as a signature from a CA.
  2. A chain of intermediate CA certificates, from the one that signed Google’s certificate to the root CA that signed the last intermediate CA.

This is a bit wordy so I illustrated this in figure 5.


Figure 5. Web browsers only have to trust a relatively small set of root Certificate Authorities (CAs) in order to trust the whole web. These CAs are stored in what is called a trust store. In order for a website to be trusted by a browser, the website must have its (leaf) certificate be signed by one of these CAs. Sometimes root CAs only sign intermediate CAs, who in turn sign other intermediate CAs or leaf certificates. This is called the web public key infrastructure (web PKI).


This certificate chain is sent in a Certificate TLS message by the server, and by the client as well if the client has been asked to authenticate.

Following this, the server can use its certified long-term key pair to sign all handshake messages that have been received and sent so far (in what is called a Certificate Verify message).

I’ve recapitulated this flow (where only the server authenticates itself) in figure 6.


Figure 6. The authentication part of a handshake starts with the server sending a certificate chain to the client. The certificate chain starts with the leaf certificate (the certificate containing the website’s public key and additional metadata like the domain name) and ends with a root certificate that is trusted by the browser. Each certificate contains a signature from the certificate above it in the chain.
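In practice you rarely validate a certificate chain by hand; the TLS library does it against the trust store. As a minimal sketch, Python's standard ssl module can be asked for a client context that loads the system's trusted CAs, requires a valid chain, and checks that the leaf certificate matches the hostname. The helper name fetch_leaf_certificate is hypothetical, and calling it requires network access.

```python
import socket
import ssl

# Default client context: system trust store, chain validation, hostname check.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # the chain must validate
print(ctx.check_hostname)                    # the leaf must match the hostname

def fetch_leaf_certificate(hostname: str, port: int = 443) -> dict:
    """Handshake with `hostname` and return the validated leaf certificate's fields."""
    with socket.create_connection((hostname, port), timeout=10) as sock:
        # wrap_socket performs the TLS handshake; it raises
        # ssl.SSLCertVerificationError if the chain or hostname check fails.
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()
```

For example, fetch_leaf_certificate("google.com") would return a dictionary with fields such as the subject, the issuer, and the validity dates.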


The signature in the Certificate Verify message proves to the client what the server has seen so far. Without this signature, a MITM attacker could intercept the server’s handshake messages and replace the ephemeral public key of the server contained in the Server Hello message, allowing the attacker to successfully impersonate the server.

Take a few moments to understand why an attacker cannot replace the server’s ephemeral public key in the presence of the Certificate Verify signature.
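One way to see it is to look at what gets signed. In TLS 1.3 the server signs 64 space characters, a context string, a zero byte, and a hash of every handshake message so far. The sketch below builds that input in Python; the handshake messages are placeholder byte strings (an assumption for readability, not real encodings), and the signature itself is left out because Python's standard library has no signature algorithm.

```python
import hashlib
from typing import List

CONTEXT = b"TLS 1.3, server CertificateVerify"

def transcript_hash(messages: List[bytes]) -> bytes:
    """Hash all handshake messages seen so far, in order."""
    h = hashlib.sha256()
    for message in messages:
        h.update(message)
    return h.digest()

def certificate_verify_input(messages: List[bytes]) -> bytes:
    """Build the byte string that the server's long-term key signs."""
    return b" " * 64 + CONTEXT + b"\x00" + transcript_hash(messages)

# What the server sent and therefore signed (placeholder messages).
server_view = [b"ClientHello", b"ServerHello key_share=SERVER_KEY", b"Certificate"]
# What the client received after a MITM swapped in its own ephemeral key.
client_view = [b"ClientHello", b"ServerHello key_share=ATTACKER_KEY", b"Certificate"]

signed_by_server = certificate_verify_input(server_view)
checked_by_client = certificate_verify_input(client_view)
print(signed_by_server == checked_by_client)  # False: the signature will not verify
```

The client verifies the signature against its own view of the handshake, so any change to the Server Hello changes the transcript hash, and the attacker cannot produce a new valid signature without the server's long-term private key.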

Story time.

A few years ago I was hired to review a custom-TLS protocol made by a large company. It turned out that their protocol had the server provide a signature that did not cover the ephemeral key. When I told them about the issue, the whole room went silent for a full minute.

It was of course a substantial mistake: an attacker who intercepted the custom handshake and replaced the ephemeral key with their own would have successfully impersonated the server. The lesson here is that it is important not to reinvent the wheel. Secure transport protocols are hard to get right and if history has shown anything, they can fail in many unexpected ways. Instead, you should rely on mature protocols like TLS and make sure you use a popular implementation that has received a substantial amount of public attention.

Finally, in order to officially end the handshake, both sides of the connection must send a Finished message as part of the Authentication phase. A Finished message contains an authentication tag produced by HMAC (instantiated with the negotiated hash function for the session). This allows both the client and the server to tell the other side: “these are all the messages I have sent and received, in order, during this handshake”. If the handshake was intercepted and tampered with by a MITM attacker, this integrity check allows the participants to detect it and abort the connection. (This is especially useful as some handshake modes are not signed.)
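The Finished check can be sketched with Python's hmac module. The finished key is assumed to come out of the key derivation (here it is just random bytes), and the transcripts are placeholder byte strings rather than real handshake messages.

```python
import hashlib
import hmac
import secrets

def finished_tag(finished_key: bytes, transcript: bytes) -> bytes:
    """HMAC over the hash of every handshake message sent and received so far."""
    transcript_hash = hashlib.sha256(transcript).digest()
    return hmac.new(finished_key, transcript_hash, hashlib.sha256).digest()

def check_finished(finished_key: bytes, my_transcript: bytes, received_tag: bytes) -> bool:
    """Recompute the peer's tag from our own view of the handshake and compare."""
    expected = finished_tag(finished_key, my_transcript)
    return hmac.compare_digest(expected, received_tag)  # constant-time comparison

finished_key = secrets.token_bytes(32)  # assumption: derived from the handshake secrets
honest = b"ClientHello|ServerHello|EncryptedExtensions"
tampered = b"ClientHello|ServerHello(modified)|EncryptedExtensions"

tag = finished_tag(finished_key, honest)
print(check_finished(finished_key, honest, tag))    # True: both sides saw the same messages
print(check_finished(finished_key, tampered, tag))  # False: abort the connection
```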

Before heading to a different aspect of the handshake, let’s double click on X.509 certificates, as they are an important detail of many cryptographic protocols.

Authentication Via X.509 Certificates

While certificates are optional in TLS 1.3 (you can always use plain keys), many applications and protocols (not just the web) make heavy use of them in order to certify additional metadata. Specifically, the X.509 certificate standard version 3 is used.

X.509 is a pretty old standard that was meant to be flexible enough to be used in a multitude of scenarios, from email to web pages. The X.509 standard uses a description language called Abstract Syntax Notation One (ASN.1) to specify the information contained in a certificate. An ASN.1 description looks like this:

 
 Certificate  ::=  SEQUENCE  {
     tbsCertificate       TBSCertificate,
     signatureAlgorithm   AlgorithmIdentifier,
     signatureValue       BIT STRING  }
  

You can literally read this as a structure that contains three fields:

  • tbsCertificate. The “to-be-signed” certificate. This contains all the information that one wants to certify. For the web, this can contain a domain name (google.com), a public key, an expiration date, and so on.
  • signatureAlgorithm. The algorithm used to sign the certificate.
  • signatureValue. The signature from a Certificate Authority.

By the way, the last two values are not contained in the actual certificate (tbsCertificate). Do you know why?

You can easily check what’s in an X.509 certificate by connecting to any website using HTTPS, and using your browser functionalities to observe the certificate chain sent by the server. See figure 7 for an example.


Figure 7. Using Chrome’s Certificate viewer, we can observe the certificate chain sent by a Google server. The root Certificate Authority is “Global Sign,” which is trusted by your browser. Down the chain, an intermediate CA called “GTS CA 1O1” is trusted due to its certificate containing a signature from “Global Sign.” In turn, Google’s leaf certificate, valid for *.google.com (google.com, mail.google.com, and so on), contains a signature from “GTS CA 1O1.”


You might encounter X.509 certificates as .pem files, which contain base64-encoded content surrounded by a human-readable hint of what the base64-encoded data contains (here a certificate). The following snippet represents the content of a certificate in the .pem format:

 
 -----BEGIN CERTIFICATE-----
 MIIJQzCCCCugAwIBAgIQC1QW6WUXJ9ICAAAAAEbPdjANBgkqhkiG9w0BAQsFADBC
 MQswCQYDVQQGEwJVUzEeMBwGA1UEChMVR29vZ2xlIFRydXN0IFNlcnZpY2VzMRMw
 EQYDVQQDEwpHVFMgQ0EgMU8xMB4XDTE5MTAwMzE3MDk0NVoXDTE5MTIyNjE3MDk0
 NVowZjELMAkGA1UEBhMCVVMxEzARBgNVBAgTCkNhbGlmb3JuaWExFjAUBgNVBAcT
 [...]
 vaoUqelfNJJvQjJbMQbSQEp9y8EIi4BnWGZjU6Q+q/3VZ7ybR3cOzhnaLGmqiwFv
 4PNBdnVVfVbQ9CxRiplKVzZSnUvypgBLryYnl6kquh1AJS5gnJhzogrz98IiXCQZ
 c7mkvTKgCNIR9fedIus+LPHCSD7zUQTgRoOmcB+kwY7jrFqKn6thTjwPnfB5aVNK
 dl0nq4fcF8PN+ppgNFbwC2JxX08L1wEFk2LvDOQgKqHR1TRJ0U3A2gkuMtf6Q6au
 3KBzGW6l/vt3coyyDkQKDmT61tjwy5k=
 -----END CERTIFICATE-----
  

If you decode the base64 content surrounded by the BEGIN CERTIFICATE and END CERTIFICATE lines, you end up with a Distinguished Encoding Rules (DER) encoded certificate. DER is a deterministic (only one way to encode) binary encoding used to translate X.509 certificates into bytes. All these encodings are often quite confusing to newcomers! I recap all of this in figure 8.


Figure 8. On the top left corner, an X.509 certificate is written using the ASN.1 notation. It is then transformed into bytes that can be signed via the DER encoding. As this is not text that can easily be copied around and be recognized by humans, it is base64 encoded. The last touch wraps the base64 data with some handy contextual information using the PEM format.
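The PEM layer is simple enough to undo by hand. The following Python sketch converts between PEM text and DER bytes using only base64; the function names are hypothetical, and the sample bytes are a tiny stand-in DER structure (a SEQUENCE holding the INTEGER 1), not a real certificate.

```python
import base64

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"

def der_to_pem(der: bytes) -> str:
    """Base64-encode DER bytes and wrap them in the PEM header and footer."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]  # 64 characters per line
    return "\n".join([PEM_HEADER, *lines, PEM_FOOTER]) + "\n"

def pem_to_der(pem: str) -> bytes:
    """Strip the PEM header and footer, then base64-decode what is left."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    if lines[0] != PEM_HEADER or lines[-1] != PEM_FOOTER:
        raise ValueError("not a PEM-encoded certificate")
    return base64.b64decode("".join(lines[1:-1]))

# Stand-in DER bytes for illustration: SEQUENCE { INTEGER 1 }.
sample_der = bytes.fromhex("3003020101")
sample_pem = der_to_pem(sample_der)
print(sample_pem)
print(pem_to_der(sample_pem) == sample_der)
```

Python's ssl module also ships PEM_cert_to_DER_cert and DER_cert_to_PEM_cert, which perform the same conversion for certificates.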


DER only encodes information as “here is an integer” or “this is a bytearray.” Field names described in ASN.1 like tbsCertificate are thus lost after encoding. Decoding DER without the knowledge of the original ASN.1 description of what each field truly means is thus pointless. Handy command line tools like OpenSSL allow you to decode and translate in human terms the content of a DER-encoded certificate. For example, if you download google.com’s certificate, you can use the following command to display its content in your terminal:

 
 $ openssl x509 -in google.pem -text
 Certificate:
     Data:
         Version: 3 (0x2)
         Serial Number:
             0b:54:16:e9:65:17:27:d2:02:00:00:00:00:46:cf:76
         Signature Algorithm: sha256WithRSAEncryption
         Issuer: C = US, O = Google Trust Services, CN = GTS CA 1O1
         Validity
             Not Before: Oct  3 17:09:45 2019 GMT
             Not After : Dec 26 17:09:45 2019 GMT
         Subject: C = US, ST = California, L = Mountain View, O = Google LLC, CN = *.google.com
         Subject Public Key Info:
             Public Key Algorithm: id-ecPublicKey
                 Public-Key: (256 bit)
                 pub:
                     04:74:25:79:7d:6f:77:e4:7e:af:fb:1a:eb:4d:41:
                     b5:27:10:4a:9e:b8:a2:8c:83:ee:d2:0f:12:7f:d1:
                     77:a7:0f:79:fe:4b:cb:b7:ed:c6:94:4a:b2:6d:40:
                     5c:31:68:18:b6:df:ba:35:e7:f3:7e:af:39:2d:5b:
                     43:2d:48:0a:54
                 ASN1 OID: prime256v1
                 NIST CURVE: P-256
 [...]
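To see for yourself that DER keeps only types and lengths and no field names, here is a minimal tag-length-value walker in Python. It is a sketch under assumptions: it only names three tags, it does not handle multi-byte tags or validate strict DER rules, and the sample bytes are a hand-made structure rather than a real certificate.

```python
from typing import List, Tuple

TAG_NAMES = {0x02: "INTEGER", 0x03: "BIT STRING", 0x30: "SEQUENCE"}

def read_tlv(data: bytes, offset: int = 0) -> Tuple[int, bytes, int]:
    """Return (tag, value, next_offset) for the element starting at `offset`."""
    tag = data[offset]
    first_len = data[offset + 1]
    offset += 2
    if first_len < 0x80:           # short form: the byte is the length
        length = first_len
    else:                          # long form: low 7 bits give the number of length bytes
        n = first_len & 0x7F
        length = int.from_bytes(data[offset:offset + n], "big")
        offset += n
    return tag, data[offset:offset + length], offset + length

def describe(data: bytes, depth: int = 0) -> List[str]:
    """List every element as 'TYPE (n bytes)', recursing into SEQUENCEs."""
    lines, offset = [], 0
    while offset < len(data):
        tag, value, offset = read_tlv(data, offset)
        name = TAG_NAMES.get(tag, f"tag 0x{tag:02x}")
        lines.append("  " * depth + f"{name} ({len(value)} bytes)")
        if tag == 0x30:
            lines.extend(describe(value, depth + 1))
    return lines

# Hand-made sample: SEQUENCE { INTEGER 5, BIT STRING 0xFF with 0 unused bits }.
sample = bytes.fromhex("30070201050302 00ff".replace(" ", ""))
for line in describe(sample):
    print(line)
```

Nothing in the output says which element is the signature or the serial number; that knowledge lives only in the ASN.1 description.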
  

Having said all of this, X.509 certificates are quite controversial.

Validating X.509 certificates was comically dubbed “The Most Dangerous Code in the World” by a team of researchers in 2012. This is because DER is a difficult encoding to parse correctly, and the complexity of X.509 certificates means that many mistakes are potentially devastating. For this reason, I don’t recommend that any modern application use X.509 certificates unless it has to.

Pre-Shared Keys and Session Resumption in TLS, Or How to Avoid Key Exchanges

Key exchanges can be costly, and are sometimes not needed. For example, you might have two machines that only connect to each other, and you might not want to have to deal with a public-key infrastructure in order to secure their communications. TLS 1.3 offers a way to avoid this overhead with pre-shared keys (PSK).

A pre-shared key is simply a secret that both the client and the server know, and that can be used to derive symmetric keys for the session.

In TLS 1.3, a PSK handshake works by having the client advertise in its Client Hello message that it supports a list of PSK identifiers. If the server recognizes one of them, it can say so in its response (the Server Hello message) and both can avoid doing a key exchange (if they want to). By doing this, the authentication phase is skipped, making the Finished message at the end of the handshake important to prevent MITM attacks.

NOTE:  This does not mean that the same set of symmetric keys is derived for every session using the same PSK. Using different keys for different sessions is extremely important, as you do not want these sessions to be linked. Worse, since encrypted messages might be different between sessions, this could lead to nonce reuses and their catastrophic implications. To mitigate this, both the Client Hello and Server Hello messages have a random field which is randomly generated for every new session. These random fields are used in the derivation of symmetric keys in TLS, effectively creating never-seen-before encryption keys every time you create a new connection.
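The effect of the random fields can be illustrated with a toy derivation in Python. This is a deliberate simplification: real TLS 1.3 feeds the randoms into the key derivation through the hash of the handshake messages, whereas the sketch below mixes them directly into a single HMAC call. The function names and the label are hypothetical.

```python
import hashlib
import hmac
import secrets

def session_key(psk: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """Toy derivation: bind the PSK to this session's two random values."""
    message = b"toy session key" + client_random + server_random
    return hmac.new(psk, message, hashlib.sha256).digest()

def new_session(psk: bytes) -> bytes:
    """Simulate one handshake: both sides pick fresh 32-byte randoms."""
    client_random = secrets.token_bytes(32)  # sent in the Client Hello
    server_random = secrets.token_bytes(32)  # sent in the Server Hello
    return session_key(psk, client_random, server_random)

psk = secrets.token_bytes(32)  # the same pre-shared key for every session
key_a = new_session(psk)
key_b = new_session(psk)
print(key_a != key_b)  # same PSK, different session keys
```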

Another use case for PSKs is session resumption. Session resumption is about reusing secrets created from a previous session or connection. If you have already connected to google.com and have already verified their certificate chain and agreed on a shared secret, why do this dance again a few minutes or hours later? If you know how browsers work, you probably know that they also create several TCP connections to the web pages they visit in order to load a page quickly. Do we really want to do a full TLS handshake for all of these separate TCP connections as well?

TLS 1.3 offers a way to generate a PSK after a handshake was successfully performed, which can be used in subsequent connections to avoid having to redo a full handshake.

If the server wants to offer this feature, it can send New Session Ticket message(s) at any time during the post-handshake phase. There are multiple ways for the server to create so-called “session tickets.” For example, the server can send an identifier, associated to the relevant information in a database (forcing the server to keep a state), or the server can send the authenticated encryption of the required information to perform session resumption with the client (allowing the server’s session resumption mechanism to be stateless). These are not the only ways, but as this mechanism is quite complex and most of the time not necessary I won’t touch more of it in this article.

Next let’s see the easiest part of TLS: how application data is encrypted.

How TLS 1.3 Encrypts Application Data

Once a handshake has taken place, and symmetric keys have been derived, both the client and the server can send each other encrypted application data. But this is not all: TLS also ensures that such messages cannot be replayed or reordered.

To do this, the nonce used by the authenticated encryption algorithm starts at a fixed value and is incremented for each new message. If a message is replayed, or reordered, the nonce will be different from what is expected and decryption will fail. When this happens the connection is killed.
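Here is a small Python sketch of this counter mechanism. In TLS 1.3 the per-record nonce is the record's sequence number XORed into a fixed IV derived during the handshake, and it is never sent on the wire: each side counts on its own. Since the standard library has no authenticated encryption, the sketch uses an HMAC tag over the nonce and the payload as a stand-in for the AEAD (so nothing is actually encrypted here); the names seal and Receiver are hypothetical.

```python
import hashlib
import hmac
import secrets

def record_nonce(static_iv: bytes, seq: int) -> bytes:
    """XOR the sequence number (left-padded with zeros) into the fixed IV."""
    padded = seq.to_bytes(len(static_iv), "big")
    return bytes(a ^ b for a, b in zip(static_iv, padded))

def seal(key: bytes, static_iv: bytes, seq: int, payload: bytes) -> bytes:
    """Stand-in for AEAD encryption: append a tag bound to the nonce."""
    nonce = record_nonce(static_iv, seq)
    return payload + hmac.new(key, nonce + payload, hashlib.sha256).digest()

class Receiver:
    def __init__(self, key: bytes, static_iv: bytes):
        self.key, self.static_iv, self.seq = key, static_iv, 0

    def open(self, record: bytes) -> bytes:
        payload, tag = record[:-32], record[-32:]
        nonce = record_nonce(self.static_iv, self.seq)  # the nonce we expect next
        expected = hmac.new(self.key, nonce + payload, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("record replayed, reordered, or tampered with")
        self.seq += 1
        return payload

key, iv = secrets.token_bytes(32), secrets.token_bytes(12)
first, second = seal(key, iv, 0, b"hello"), seal(key, iv, 1, b"world")

receiver = Receiver(key, iv)
print(receiver.open(first))   # accepted as record 0
print(receiver.open(second))  # accepted as record 1
try:
    receiver.open(first)      # replay of record 0
    replay_detected = False
except ValueError:
    replay_detected = True    # a real implementation kills the connection here
print(replay_detected)
```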

NOTE:  Encryption does not always hide the length of what is being encrypted. TLS 1.3 comes with record padding, which can be configured to pad application data with a random number of zero bytes before encrypting it, effectively hiding the true length of the message. In spite of this, statistical attacks that remove the added noise can exist, and it is not straightforward to mitigate them. If you really require this security property, you should refer to the TLS 1.3 specification.
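As a rough sketch of how this padding works, the plaintext of a TLS 1.3 record is laid out as the content, then one byte giving the content type, then any number of zero bytes. The receiver strips zeros from the end until it hits the type byte. The Python below shows only this framing (the function names are hypothetical); choosing how much padding to add is left to the caller.

```python
from typing import Tuple

APPLICATION_DATA = 23  # content type value for application data

def pad_record(content: bytes, padding_len: int,
               content_type: int = APPLICATION_DATA) -> bytes:
    """Build the plaintext to encrypt: content, type byte, then zero padding."""
    return content + bytes([content_type]) + b"\x00" * padding_len

def unpad_record(inner: bytes) -> Tuple[bytes, int]:
    """Strip trailing zeros; the last non-zero byte is the content type."""
    end = len(inner)
    while end > 0 and inner[end - 1] == 0:
        end -= 1
    if end == 0:
        raise ValueError("no content type found in record")
    return inner[:end - 1], inner[end - 1]

padded = pad_record(b"hi\x00\x00", padding_len=5)
print(len(padded))            # 10 bytes on the wire instead of 5
print(unpad_record(padded))   # the original content and its type
```

Because the type byte is never zero, content that itself ends with zero bytes still round-trips correctly.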

Starting in TLS 1.3, clients have the possibility to send encrypted data as part of their first series of messages (right after the Client Hello message), provided that the server agrees. This means that browsers do not necessarily have to wait until the end of the handshake to start sending application data to the server. This mechanism is called early data or 0-RTT (for zero round-trip-time) and can only be used in combination with a PSK (as it allows derivation of symmetric keys during the Client Hello message).

This feature was quite controversial during the development of the standard, as a passive attacker can replay an observed Client Hello followed by the encrypted 0-RTT data. This is why 0-RTT must be used only with application data that can be replayed safely.

For the web, browsers treat every GET query as idempotent, meaning that it should not change any state on the server side and is only meant to retrieve data (unlike POST queries, for example). This is of course not always the case, and applications have been known to do whatever they want to do. For this reason, if you are confronted with the decision of using 0-RTT or not, it is simpler just not to use it.

That’s all for this article. For more, check out the book on our browser-based liveBook platform here.