I started reading RC4 from a book and was not able to understand some phrases correctly.
The RC4 algorithm is remarkably simple and easy to understand. A variable length key of from 1 to 256 bytes is used to initialize a 256-byte state vector S. At all times S contains a permutation of all 8-bit numbers from 0 to 255.
From the above my interpretation is that if suppose we use Java as our programming language
- Each character is 2 bytes and the key length can be from 1 till 128(256/2)length long.
- A vector/extendable-array is created array is created. But what is the length of the array? Is the size of array is 256?
- At all times array S contains a permutation of all the 8-bit numbers from 0 to 255? Could someone explain this to me.
Kindly help me as I am feeling lost in my first chapter to study cryptography.
1 Answer 1
RC4 is applied bytewise, regardless how long a simple character is. There were times where a character was one byte, but today those (old) characters mean bytes. To be consistent with current implementations and in case of ASCII-characters (those whose ordinal number is less than 128) you should chop the (upper) byte containing the zero. "password" has a length of 8 and will be applied 32 times in the initial permutation step. Otherwise the permutation-array will contain artifacts, if only every "second" byte is effectively used for permutation.
the KSA of RC4
for i from 0 to 255
j := (j + S[i] + key[i mod keylength]) mod 256
swap values of S[i] and S[j]
endfor
If you use 2 bytes per ascii char, one of the bytes will contain a "0" for every second byte. For every second password byte the key-schedule-algorithm will reduce to:
for i from 0 to 255
j := (j + S[i] + 0) mod 256
swap values of S[i] and S[j]
endfor
Wich makes the permutation of S and the inner state exploitabe. If the first password byte is null, then S[0] is swapped with S[0], which is no swap at all. I suggest you make some experiments on that - it could be interesting for you - S[0] may later been exchanged but maybe not. And it may also be exchanged with one of those other unlucky bytes. Which may lead to a more or less constant permutation values on certain values of S[i].
The security of this algorithm is dependend on the secrecy of the permutation of S. But if you can guess some of the bytes of S by knowing some of password properties, you can effectively reduce uncertaincy as an attacker.
The streamcipher itself is applied byte wise. The RC4 pseudorandom stream is a stream of bytes and your data to encrypt is a stream of bytes.
- "At all times array S contains a permutation of all the 8-bit numbers from 0 to 255"
The array S has a length of 256 bytes, it contains every possible byte (8bit => 2^8) exactly once. At the start S is ordered in that way, that S[0]=0 .. and S[255]=255. Each Password byte will exchange two elements with each other. All the elements from 0 to 255 are still there, but not at the position as before. S is the internal state after applying of the password (this state is hidden/secret and not known to anyone but you). Your password is the key, how to create this special and hopefully unique permutation.
hth. please ask if i missed to explain something.
-
"At all times array S contains a permutation of all the 8-bit numbers from 0 to 255"? Could you elaborate this.zilcuanu– zilcuanu2015年02月23日 13:45:56 +00:00Commented Feb 23, 2015 at 13:45
-
Each of the 256 elements of the array contains a (different) value also in the range [0,255]. So, the array provides a bijective map from [0,255] onto itself. Any bijection from a set onto itself must also describe a reordering, or permutation, of that set.Useless– Useless2015年02月23日 13:59:41 +00:00Commented Feb 23, 2015 at 13:59
-
One of the best explanation. Thanks @thepacker. So as per my understanding we still consider each character as one byte for backward compatibility right. Also what is meant by " Otherwise the permutation-array will contain artifacts, if only every "second" byte is effectively used for permutation." Please explain with an example.zilcuanu– zilcuanu2015年02月23日 14:06:54 +00:00Commented Feb 23, 2015 at 14:06
-
Your original quote doesn't seem to contain the word character, so where did that come from? Just stick with byte, or use octet for perfect clarity.Useless– Useless2015年02月23日 16:29:35 +00:00Commented Feb 23, 2015 at 16:29
-
@Useless: I do not understand your point. I used the word Predeep used in qestion one and referred to that. And a byte was usually often enough denoted as "unsigned char" at least in C.thepacker– thepacker2015年02月23日 16:35:30 +00:00Commented Feb 23, 2015 at 16:35