I am trying to "decode" a file in which every byte has had its bits shifted. I have successfully decoded it by reading the file and then, byte by byte, doing the following (b_int
is the integer value of a single byte; it is always 8 bits, never 7 or 9):
- right-shift all bits one place: b_int >> 1
- left-shift all bits seven places: b_int << 7
- take the lower 8 bits of the left-shifted value: '{0:08b}'.format(left_shifted)[-8:] (and convert this string back to an int with int(binary_8_bits, 2))
- OR the right-shifted value with the lower 8 bits of the left-shifted value: right_shifted | lower_8_left_shifted
- convert the int back to a byte: bytes([shifted])
- write the result to a file
All this works, but it is very slow, since every byte is processed individually in the for byte in data loop.
My question: is there a faster way of doing this?
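For reference, the steps above collapse into a single expression: an 8-bit rotate right by one place. A minimal sketch of the equivalence (rotate_right_8 is an illustrative name, not part of the program below):

```python
def rotate_right_8(b_int):
    # Rotate an 8-bit value right by one place: the low bit
    # wraps around to become the new top bit (bit 7).
    # Masking with 0xFF keeps only the lower 8 bits of the left shift,
    # which is what the '{0:08b}'-formatting trick computes.
    return ((b_int << 7) & 0xFF) | (b_int >> 1)

# 0b00000011 rotates to 0b10000001
assert rotate_right_8(0b00000011) == 0b10000001
```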
Full program:
def shift_bits(b_int):
    return (b_int >> 1) | (int('{0:08b}'.format(b_int << 7)[-8:], 2))

def decode_ss(filename):
    with open(filename, 'rb') as openfile:
        data = openfile.read()
    outputfilename = ''.join(filename.split('.')[:-2]) + '.txt'
    with open(outputfilename, 'wb') as output:
        for byte in data:
            output.write(bytes([shift_bits(byte)]))

filename = r'C:\inputfile.txt.ss'
decode_ss(filename)
print('done')
4 Answers
Since b_int is limited to bytes, we can change your code a little to remove the string formatting.
The left-shifted value can only contribute one of two states, depending on whether b_int & 1 is 1 or 0, so the lower-8-bits step is the same as ((b_int & 1) << 7).
(b_int >> 1) is as simple as it gets, so that can stay the same.
And so I changed it to:
def shift_bits(b_int):
    return (b_int >> 1) | (int('{0:08b}'.format(b_int << 7)[-8:], 2))

def p_shift_bits(b_int):
    return ((b_int & 1) << 7) | (b_int >> 1)
Which works as intended:

for i in range(1 << 9):
    if shift_bits(i) != p_shift_bits(i):
        print(i)
        break
However, you can still speed it up if you use functools.lru_cache:

import functools

@functools.lru_cache(None)
def pc_shift_bits(b_int):
    return ((b_int & 1) << 7) | (b_int >> 1)
These gave the following timings:
>>> timeit.timeit('shift_bits(i)', 'from __main__ import shift_bits; i=183')
1.9397294419024576
>>> timeit.timeit('p_shift_bits(i)', 'from __main__ import p_shift_bits; i=183')
0.4003988483403518
# Yes, heavily biased, but if the length of data is greater than 256, it comes into play.
>>> timeit.timeit('pc_shift_bits(i)', 'from __main__ import pc_shift_bits; i=183')
0.15648568020111497
- \$\begingroup\$ Thanks for your quick response. I tried both p_shift_bits(b_int) and pc_shift_bits(b_int), and both improve performance. Based on a 10 MB input file, p_shift_bits(b_int) is the fastest and cuts processing time by 50%. Surprisingly, pc_shift_bits(b_int) with the decorator only cuts processing time by 20-25%. I will experiment a little more with this. \$\endgroup\$ – Edwin van Mierlo, Jul 7, 2017 at 13:23
- \$\begingroup\$ I will not yet mark this as the answer, leaving it open for others to review/answer. \$\endgroup\$ – Edwin van Mierlo, Jul 7, 2017 at 13:24
- \$\begingroup\$ @EdwinvanMierlo Strange that lru_cache doesn't speed up the process more, as it should work roughly the same way as your answer: lazily build a table to look up on... \$\endgroup\$ – Jul 7, 2017 at 14:15
- \$\begingroup\$ Yes: I read up about functools.lru_cache(); this is where I got the idea to actually do a static lookup table. It really is speed I am after in this particular case. However, your answer has pointed me in the right direction; furthermore, I learned about functools! Something new to include in my projects! Thanks! \$\endgroup\$ – Edwin van Mierlo, Jul 7, 2017 at 14:23
- \$\begingroup\$ @JaredGoguen The LRU above is unbounded, as I passed None in as the maxsize. \$\endgroup\$ – Jul 7, 2017 at 15:08
Since it appears not to have been mentioned, I want to point out that the suggestion to build a lookup table as you go is reasonable only because you are using Python: the overhead of interpreting (and then doing) the bit shifts seems to be greater than retrieval from a dictionary (per @peilonrayz's benchmark). Just inlining the operation instead of having it in a separate function might give you a measurable speedup.
Down on the CPU level, bit shifting and ORing are very fast, and it's possible that ((b_int & 1) << 7) | (b_int >> 1) is faster than the table lookup (although with an L1 cache hit it'll probably be the same). If you're doing this frequently with large files, I would go through the trouble of writing the C code just to see how fast this operation actually is. The code is about as long as the Python anyway.
Regardless of the language used, you can expect a big relative speedup (probably more than anything else you can do here) by writing more than a byte in a single call. See what happens when you decode the entire data and then write it in chunks of 4096.
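As a sketch of that last point: in CPython the whole per-byte mapping can also be pushed down to C with bytes.translate, so the data is decoded in a single pass and can then be written in one call (p_shift_bits is taken from the earlier answer; TABLE and decode are illustrative names, not from the original post):

```python
def p_shift_bits(b_int):
    # rotate right by one: the low bit becomes the new top bit
    return ((b_int & 1) << 7) | (b_int >> 1)

# 256-entry translation table: index = input byte, value = output byte
TABLE = bytes(p_shift_bits(i) for i in range(256))

def decode(data):
    # translate() maps every byte through TABLE in one C-level pass
    return data.translate(TABLE)

# 0b00000011 -> 0b10000001, 0b10000000 -> 0b01000000
assert decode(bytes([0b00000011, 0b10000000])) == bytes([0b10000001, 0b01000000])
```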
- \$\begingroup\$ I would love to decode the entire data. I have been experimenting to speed this up, but not with any success. \$\endgroup\$ – Edwin van Mierlo, Jul 7, 2017 at 21:32
- \$\begingroup\$ I am not a programmer of any language; I just do some Python "on the side", so writing this in C or any other low-level language would not be an option for me at the moment. I think the learning curve exceeds the time I have to spend on this. \$\endgroup\$ – Edwin van Mierlo, Jul 7, 2017 at 21:35
- \$\begingroup\$ I don't have the time to test, but I doubt in-lining shift_bits would have that big an impact (interpreted Python is slow). As for the second paragraph, using an array as Edwin van Mierlo did is probably the fastest (\$O(1)\$ in C) and simplest way (native Python). But your last paragraph is definitely a good improvement, and worthy of up-votes alone. \$\endgroup\$ – Jul 7, 2017 at 23:39
One thing which will speed things up considerably (and avoid crashing on very large input) is to read the file in parts. You can use file_pointer.read(buffer_length) to read a fixed length every time, or you can use for line in file_pointer to read line by line.
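A sketch of the fixed-length approach (decode_stream, CHUNK, and the inline shift function are illustrative names and choices, not from the post):

```python
def shift(b):
    # same rotate-right-by-one as in the other answers
    return ((b & 1) << 7) | (b >> 1)

CHUNK = 4096  # buffer size; an arbitrary choice, tune as needed

def decode_stream(infile, outfile):
    # read fixed-size chunks so memory use stays constant
    # no matter how large the input file is
    with open(infile, 'rb') as src, open(outfile, 'wb') as dst:
        while True:
            chunk = src.read(CHUNK)
            if not chunk:  # b'' signals end of file
                break
            dst.write(bytes(shift(b) for b in chunk))
```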
- \$\begingroup\$ I have experimented with various buffer_length values, but as my input file(s) are all about 10 MB, I found reading it all with .read() the fastest way, at least for the moment. for line in file_pointer will only work if you have lines; as the LF is bit-shifted as well, there is no "line" in the file as such, unless you see the file as one very large line, which is the same as what I do now, reading it all at once. \$\endgroup\$ – Edwin van Mierlo, Jul 7, 2017 at 13:31
After some experimenting, I realized that the bit-shifting is limited to the values 0 to 255, and that a list can be created as a lookup for those values. Using the index of the list as byte value from the input file, the element of the list would be the byte value for my output file.
With the following code I created my list:
def p_shift_bits(b_int):
    return ((b_int & 1) << 7) | (b_int >> 1)

l = list()
for i in range(256):  # note: range(0, 255) would miss the final value, 255
    l.append(bytes([p_shift_bits(i)]))
print(l)
At this point I have a list which is my lookup for the actual processing, so my code looks like this:
_MY_LOOKUP = [b'\x00', b'\x80', b'\x01', b'\x81', b'\x02', b'\x82', b'\x03',
b'\x83', b'\x04', b'\x84', b'\x05', b'\x85', b'\x06', b'\x86',
b'\x07', b'\x87', b'\x08', b'\x88', b'\t', b'\x89', b'\n',
b'\x8a', b'\x0b', b'\x8b', b'\x0c', b'\x8c', b'\r', b'\x8d',
b'\x0e', b'\x8e', b'\x0f', b'\x8f', b'\x10', b'\x90', b'\x11',
b'\x91', b'\x12', b'\x92', b'\x13', b'\x93', b'\x14', b'\x94',
b'\x15', b'\x95', b'\x16', b'\x96', b'\x17', b'\x97', b'\x18',
b'\x98', b'\x19', b'\x99', b'\x1a', b'\x9a', b'\x1b', b'\x9b',
b'\x1c', b'\x9c', b'\x1d', b'\x9d', b'\x1e', b'\x9e', b'\x1f',
b'\x9f', b' ', b'\xa0', b'!', b'\xa1', b'"', b'\xa2', b'#',
b'\xa3', b'$', b'\xa4', b'%', b'\xa5', b'&', b'\xa6', b"'",
b'\xa7', b'(', b'\xa8', b')', b'\xa9', b'*', b'\xaa', b'+',
b'\xab', b',', b'\xac', b'-', b'\xad', b'.', b'\xae', b'/',
b'\xaf', b'0', b'\xb0', b'1', b'\xb1', b'2', b'\xb2', b'3',
b'\xb3', b'4', b'\xb4', b'5', b'\xb5', b'6', b'\xb6', b'7',
b'\xb7', b'8', b'\xb8', b'9', b'\xb9', b':', b'\xba', b';',
b'\xbb', b'<', b'\xbc', b'=', b'\xbd', b'>', b'\xbe', b'?',
b'\xbf', b'@', b'\xc0', b'A', b'\xc1', b'B', b'\xc2', b'C',
b'\xc3', b'D', b'\xc4', b'E', b'\xc5', b'F', b'\xc6', b'G',
b'\xc7', b'H', b'\xc8', b'I', b'\xc9', b'J', b'\xca', b'K',
b'\xcb', b'L', b'\xcc', b'M', b'\xcd', b'N', b'\xce', b'O',
b'\xcf', b'P', b'\xd0', b'Q', b'\xd1', b'R', b'\xd2', b'S',
b'\xd3', b'T', b'\xd4', b'U', b'\xd5', b'V', b'\xd6', b'W',
b'\xd7', b'X', b'\xd8', b'Y', b'\xd9', b'Z', b'\xda', b'[',
b'\xdb', b'\\', b'\xdc', b']', b'\xdd', b'^', b'\xde', b'_',
b'\xdf', b'`', b'\xe0', b'a', b'\xe1', b'b', b'\xe2', b'c',
b'\xe3', b'd', b'\xe4', b'e', b'\xe5', b'f', b'\xe6', b'g',
b'\xe7', b'h', b'\xe8', b'i', b'\xe9', b'j', b'\xea', b'k',
b'\xeb', b'l', b'\xec', b'm', b'\xed', b'n', b'\xee', b'o',
b'\xef', b'p', b'\xf0', b'q', b'\xf1', b'r', b'\xf2', b's',
b'\xf3', b't', b'\xf4', b'u', b'\xf5', b'v', b'\xf6', b'w',
b'\xf7', b'x', b'\xf8', b'y', b'\xf9', b'z', b'\xfa', b'{',
            b'\xfb', b'|', b'\xfc', b'}', b'\xfd', b'~', b'\xfe', b'\x7f',
            b'\xff']
def decode_ss(filename):
    with open(filename, 'rb') as openfile:
        data = openfile.read()
    outputfilename = ''.join(filename.split('.')[:-2]) + '.txt'
    with open(outputfilename, 'wb') as output:
        for byte in data:
            output.write(_MY_LOOKUP[byte])

filename = r'C:\inputfile.txt.ss'
decode_ss(filename)
I believe this is (currently) the fastest method, faster than doing the bit-shifting during execution.
[EDIT] For completeness, after some tinkering and tips from the comments section, my code now looks like this:
def p_shift_bits(b_int):
    return ((b_int & 1) << 7) | (b_int >> 1)

def decode_ss(filename):
    my_lookup = [bytes([p_shift_bits(i)]) for i in range(256)]
    with open(filename, 'rb') as openfile:
        data = openfile.read()
    outputfilename = ''.join(filename.split('.')[:-2]) + '.txt'
    with open(outputfilename, 'wb') as output:
        for byte in data:
            output.write(my_lookup[byte])

filename = r'C:\inputfile.txt.ss'
decode_ss(filename)
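One further tweak, building on the advice about write calls in the other answers: the same lookup can feed a single b''.join, so the output file is written in one call instead of once per byte. A sketch of that combination (same names as above; note range(256) so the table covers every possible byte value):

```python
def p_shift_bits(b_int):
    return ((b_int & 1) << 7) | (b_int >> 1)

def decode_ss(filename):
    # range(256), not range(0, 255), so index 255 is covered too
    my_lookup = [bytes([p_shift_bits(i)]) for i in range(256)]
    with open(filename, 'rb') as openfile:
        data = openfile.read()
    outputfilename = ''.join(filename.split('.')[:-2]) + '.txt'
    with open(outputfilename, 'wb') as output:
        # build the whole result first, then write it in one call
        output.write(b''.join(my_lookup[byte] for byte in data))
```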