How does BEAST work?
Last month, I did a talk at GOTO Berlin where I explained the basics of Transport Layer Security. During the talk, the audience asked a few questions through the app. One of them was: “How does Beast work?” and I wasn’t able to answer that one on stage, unfortunately. Since it’s an interesting question, I’ll answer it here. Unfortunately, understanding BEAST is a bit harder than the talk itself…
BEAST stands for Browser Exploit Against SSL/TLS. In itself, it isn’t a vulnerability. Thai Duong and Juliano Rizzo made a demonstration of a longer-known vulnerability. This vulnerability was published back in 2004 and applied to SSL 3.0 and TLS 1.0. BEAST showed that this old vulnerability was in fact useable for a real-world attack.
Block Ciphers and Cipher Block Chaining
To understand what BEAST is about, we must dive a bit deeper into cryptography. BEAST attacks so-called block ciphers. Block ciphers are encryption mechanisms that operate on fixed-length groups of bits (or blocks). In its simplest form, called Electronic Codebook (ECB), a block cipher will operate on each block at a time. The disadvantage is that if the input contains certain patterns, you can recognise those patterns in the (encrypted) output.
You can solve this by using a more advanced form, called a Cipher Block Chaining (CBC). In this form, you combine each block of input with the previous encrypted block before it is being encrypted. In a mathematical form:
\[E_x = C ( P_x \oplus E_{x-1} )\]
where
- \(E_x\) is the xth encrypted block
- \(P_x\) is the xth input block
- \(C\) is the encryption method itself
- \(\oplus\) is the
XOR
operation
A chosen-plaintext attack
The type of vulnerability that BEAST uses is a chosen-plaintext attack. This means that we can somehow control the input to encrypt. When using asymmetric cryptography, we can encrypt any input we like, since we only need the public key for that.
Encryption methods take an \(IV\), or initialization vector, next to the plain-text input. The goal of this \(IV\) is to make it harder for an attacker to detect relations between the encrypted message and the plain text message. To do this, the \(IV\) must be random or pseudorandom. But as we just saw, when we apply CBC, the \(IV\) isn’t random at all… it is very predictable instead!
If we would encrypt a block where the input equals the previous encrypted block (so \(P_2 = E_1\)), we are effectively encrypting a bunch of zeroes. Not that interesting, of course, but it cancels out the \(IV\)!
We can apply this trick to guess an earlier block of input (\(P_1\)) that we didn’t control. For example, our victim sending their username and password to a banking system. Let’s call this guess \(g\). Now we encrypt \(P_2 \oplus E_0 \oplus g\) and see if it matches \(E_1\). Since \(P_2 = E_0\), we have cancelled out the \(IV\) again and we are now encrypting only \(P_1\) without the \(IV\). If it matches \(E_1\), we have correctly guessed \(g = P_1\)!
Application
Note, that this is only possible when the attacker can somehow connect to the target system (e.g., a bank) through the victims computer (e.g., the customer of the bank). Usually, a same-origin policy constraint prevents this.
With BEAST, Duong and Rizzo demonstrated that this attack was not only theoretically possible. Back then, there was a defect in Java that allowed to circumvent this same-origin policy constraints. The same type of defect might have existed in other programming environments, too. Duong and Rizzo used this defect to build a Java applet that applied the above vulnerability.
Consequences and follow-up
As I said, the vulnerability itself was already known for quite some years. It had even been fixed already in TLS 1.1, but until BEAST, this version of TLS wasn’t used that often.
Initially, people started migrating their servers to use RC4. This is another encryption method that doesn’t rely on blocks, but instead on a stream of input. A few years later, in 2013, this encryption method was also proven to be vulnerable to several attacks.
Eventually, upgrading to later versions of TLS proved to be a better solution… :-)