**09/22/2015 Update: A bug was found in the code. The code block has been updated with the fixed code.**

**Description**

RC4 is one of the most widely used ciphers in the world. It is used in WEP, WPA, SSL, BitTorrent, PDF, etc. It is one of the simplest to understand and implement. It is also one of the fastest algorithms available.

**What is RC4**

It is a stream cipher, which means that it encrypts a stream of data byte by byte as opposed to a block cipher that encrypts groups of bytes at a time, usually with the inclusion of byte shifting. This usually means that block ciphers are more secure than stream ciphers. One of the weaknesses of a stream cipher, more so than in a block cipher, is that the layout of the data is unchanged. Therefore, if you know the layout of the data, you can flip a bit to change the meaning of the message. For example, if you know the message format is "Deposit: ###", then you can essentially flip one of the bits at byte 10 and hope to change 100 to 900. What is needed then, is a way to authenticate a message. Usually, a cryptographic hash function is used for this purpose. In essence, this message authentication code allows you to verify the sender and the message.

It is a secret key algorithm, which means that it uses a key that should only be known by the intended sender and recipient. A major problem with this is the secure transmission of the key. Public key cryptography solves this by using so called "hard" math where calculating in one direction is easy while solving in the opposite direction takes much much longer. One example is multiplication vs factorization. Multiplying two numbers is easy but attempting to find the two numbers used to get that number takes a long time. This type of "hard" math allows you to transmit public information that the sender and recipient uses to compute a key without having to transmit a key. So why isn't public key cryptography used exclusively? It is a lot slower than secret key cryptograpy.

**General Weaknesses of Cryptography**

A weakness of cryptography is that they can be hacked using brute force. This means that a message can stay a secret only for as long as it takes a computer to try every password until it finds the one you used. Each byte that you add to a password means that it will take roughly 256 times longer to crack your password. So the question is, how long do you need that message to stay a secret?

Now, that isn't actually a weakness of the algorithm per se. An actual weakness is that many algorithms are subject to mathematical analysis that may reveal what key was used to encrypt the data. Given enough encrypted data using the same or similar keys will result in a crack quicker than it would take using brute force. One way to mitigate this is the use of a nonce, initialization vector, or salt. They are basically random bits that are used with the key so that even though you are using the same key, each message is different because the random bits in effect change the key that is used.

This is why you should choose a longer password and why you should change your passwords often.

**How RC4 Works**

RC4 encrypts a stream of data by generating a pseudorandom stream of bytes from a keystream. This output is XORed with the message.

To generate the inital keystream, you start out with a 256 item array filled with the sequence 0-255. Looping 256 times, each byte in the key is used to calculate two numbers between 0 and 255. The two numbers at those indexes are then swapped.

With this, the keystream is generated by calculating two indexes, swapping the numbers at those indexes, adding those two numbers mod 256 and then returning the number at that index.

This number is then XORed with one byte of the message. This continues until the end of the message is reached.

**Specific Weaknesses of RC4**

Mathematical analysis has shown that the first few bytes of the keystream follow a pattern that could reveal the key or keystream that was used to create it. It is suggested that they first few bytes of the keystream should be discarded to reduce this relationship. The standard is 768 bytes but the recommended value is 3072 bytes.

I mention the following weakness because it could be a major weakness in the algorithm but I do not understand enough about the attack to say anything worthwhile about it. Souradyuti Paul and Bart Preneel gave a proof of a prior weakness posited by Itsik Mantin and Adi Shamir about combinatorial analysis of the cipher stream. In their paper, they discuss the weakness and offer a modification to the algorithm that should mitigate it.

**Sample Implementation of RC4**

This is a function that works in many VB implementations. It even works in a Visual Basic Script and was, in fact, coded specifically for VBScript. But it should be directly portable to VBA. It is an inlined version because calling functions take a lot of overhead. One thing I did to make it more secure is to drop the first 3072 bytes of the keystream. Encryption and Decryption is done using the same function. I validated the pre-drop version against test values.

Expand|Select|Wrap|Line Numbers

- Function RunRC4(sMessage, strKey)
- Dim kLen, x, y, i, j, temp
- Dim s(256), k(256)
- 'Init keystream
- klen = Len(strKey)
- For i = 0 To 255
- s(i) = i
- k(i) = Asc(Mid(strKey, (i Mod klen) + 1, 1))
- Next
- j = 0
- For i = 0 To 255
- j = (j + k(i) + s(i)) Mod 256
- temp = s(i)
- s(i) = s(j)
- s(j) = temp
- Next
- x = 0
- y = 0
- 'Drop n bytes from keystream
- For i = 1 To 3072
- x = (x + 1) Mod 256
- y = (y + s(x)) Mod 256
- temp = s(x)
- s(x) = s(y)
- s(y) = temp
- Next
- 'Encode/Decode
- For i = 1 To Len(sMessage)
- x = (x + 1) Mod 256
- y = (y + s(x)) Mod 256
- temp = s(x)
- s(x) = s(y)
- s(y) = temp
- RunRC4 = RunRC4 & (s((s(x) + s(y)) Mod 256) Xor Asc(Mid(sMessage, i, 1))) & ","
- Next
- End Function

Download an example to play with.