I am trying to "decode" a file in which every byte has had its bits shifted. I have successfully decoded it by reading the file and then, byte by byte, doing the following (b_int
is the integer value of a single byte; it is always 8 bits, never 7 or 9):
- right-shift all bits one place: b_int >> 1
- left-shift all bits seven places: b_int << 7
- take the lower 8 bits of the left-shifted value: '{0:08b}'.format(left_shifted)[-8:] (and convert this string back to an int with int(binary_8_bits, 2))
- OR the right-shifted value with the lower 8 bits of the left-shifted value: right_shifted | lower_8_left_shifted
- convert the int back to a byte: bytes([shifted])
- write the result to a file
All this works, but it is very slow, since every byte is processed individually in the for byte in data loop.
My question: is there a faster way of doing this?
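For reference, the steps above collapse into a single expression: an 8-bit rotate right by one place. A minimal sketch of the equivalence (rotate_right_8 is an illustrative name, not part of the program below):

```python
def rotate_right_8(b_int):
    # Rotate an 8-bit value right by one place: the low bit
    # wraps around to become the new top bit (bit 7).
    # Masking with 0xFF keeps only the lower 8 bits of the left shift,
    # which is what the '{0:08b}'-formatting trick computes.
    return ((b_int << 7) & 0xFF) | (b_int >> 1)

# 0b00000011 rotates to 0b10000001
assert rotate_right_8(0b00000011) == 0b10000001
```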
Full program:
def shift_bits(b_int):
    return (b_int >> 1) | (int('{0:08b}'.format(b_int << 7)[-8:], 2))

def decode_ss(filename):
    with open(filename, 'rb') as openfile:
        data = openfile.read()
    outputfilename = ''.join(filename.split('.')[:-2]) + '.txt'
    with open(outputfilename, 'wb') as output:
        for byte in data:
            output.write(bytes([shift_bits(byte)]))

filename = r'C:\inputfile.txt.ss'
decode_ss(filename)
print('done')
4 Answers
Since b_int is limited to bytes, we can change your code a little to remove the string formatting.
The left-shifted value can only contribute one of two states, depending on whether b_int & 1 is 1 or 0, so the lower-8-bits step is the same as ((b_int & 1) << 7).
(b_int >> 1) is as simple as it gets, so that can stay the same.
And so I changed it to:
def shift_bits(b_int):
    return (b_int >> 1) | (int('{0:08b}'.format(b_int << 7)[-8:], 2))

def p_shift_bits(b_int):
    return ((b_int & 1) << 7) | (b_int >> 1)
Which works as intended:

for i in range(1 << 9):
    if shift_bits(i) != p_shift_bits(i):
        print(i)
        break
However, you can still speed it up if you use functools.lru_cache:

import functools

@functools.lru_cache(None)
def pc_shift_bits(b_int):
    return ((b_int & 1) << 7) | (b_int >> 1)
These gave the following timings:
>>> timeit.timeit('shift_bits(i)', 'from __main__ import shift_bits; i=183')
1.9397294419024576
>>> timeit.timeit('p_shift_bits(i)', 'from __main__ import p_shift_bits; i=183')
0.4003988483403518
# Yes, heavily biased, but if the length of data is greater than 256, it comes into play.
>>> timeit.timeit('pc_shift_bits(i)', 'from __main__ import pc_shift_bits; i=183')
0.15648568020111497
- \$\begingroup\$ Thanks for your quick response. I tried both p_shift_bits(b_int) and pc_shift_bits(b_int), and both improve performance. Based on a 10 MB input file, p_shift_bits(b_int) is the fastest and cuts processing time by 50%. Surprisingly, pc_shift_bits(b_int) with the decorator only cuts processing time by 20-25%. I will experiment a little more with this. \$\endgroup\$ – Edwin van Mierlo, Jul 7, 2017 at 13:23
- \$\begingroup\$ I will not yet mark this as the answer, leaving it open for others to review/answer. \$\endgroup\$ – Edwin van Mierlo, Jul 7, 2017 at 13:24
- \$\begingroup\$ @EdwinvanMierlo Strange that lru_cache doesn't speed up the process more, as it should work roughly the same way as your answer: lazily build a table to look up on... \$\endgroup\$ – Jul 7, 2017 at 14:15
- \$\begingroup\$ Yes: I read up about functools.lru_cache(); this is where I got the idea to actually do a static lookup table. It really is speed I am after in this particular case. However, your answer has pointed me in the right direction; furthermore, I learned about functools! Something new to include in my projects! Thanks! \$\endgroup\$ – Edwin van Mierlo, Jul 7, 2017 at 14:23
- \$\begingroup\$ @JaredGoguen The LRU above is unbounded, as I passed None in as the maxsize. \$\endgroup\$ – Jul 7, 2017 at 15:08
Since it appears not to have been mentioned, I want to point out that the suggestion to build a lookup table as you go is reasonable only because you are using Python: the overhead of interpreting (and then doing) the bit shifts seems to be greater than retrieval from a dictionary (per @peilonrayz's benchmark). Just inlining the operation instead of having it in a separate function might give you a measurable speedup.
Down on the CPU level, bit shifting and ORing are very fast, and it's possible that ((b_int & 1) << 7) | (b_int >> 1) is faster than the table lookup (although with an L1 cache hit it'll probably be the same). If you're doing this frequently with large files, I would go through the trouble of writing the C code just to see how fast this operation actually is. The code is about as long as the Python anyway.
Regardless of the language used, you can expect a big relative speedup (probably more than anything else you can do here) by writing more than a byte in a single call. See what happens when you decode the entire data and then write it in chunks of 4096.
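As a sketch of that last point: in CPython the whole per-byte mapping can also be pushed down to C with bytes.translate, so the data is decoded in a single pass and can then be written in one call (p_shift_bits is taken from the earlier answer; TABLE and decode are illustrative names, not from the original post):

```python
def p_shift_bits(b_int):
    # rotate right by one: the low bit becomes the new top bit
    return ((b_int & 1) << 7) | (b_int >> 1)

# 256-entry translation table: index = input byte, value = output byte
TABLE = bytes(p_shift_bits(i) for i in range(256))

def decode(data):
    # translate() maps every byte through TABLE in one C-level pass
    return data.translate(TABLE)

# 0b00000011 -> 0b10000001, 0b10000000 -> 0b01000000
assert decode(bytes([0b00000011, 0b10000000])) == bytes([0b10000001, 0b01000000])
```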
- \$\begingroup\$ I would love to decode the entire data. I have been experimenting to speed this up, but not with any success. \$\endgroup\$ – Edwin van Mierlo, Jul 7, 2017 at 21:32
- \$\begingroup\$ I am not a programmer of any language; I just do some Python "on the side", so writing this in C or any other low-level language would not be an option for me at the moment. I think the learning curve exceeds the time I have to spend on this. \$\endgroup\$ – Edwin van Mierlo, Jul 7, 2017 at 21:35
- \$\begingroup\$ I don't have the time to test, but I doubt in-lining shift_bits would have that big an impact (interpreted Python is slow). As for the second paragraph, using an array as Edwin van Mierlo did is probably the fastest (\$O(1)\$ in C) and simplest way (native Python). But your last paragraph is definitely a good improvement, and worthy of up-votes alone. \$\endgroup\$ – Jul 7, 2017 at 23:39
One thing which will speed things up considerably (and avoid crashing on very large input) is to read the file in parts. You can use file_pointer.read(buffer_length) to read a fixed length every time, or you can use for line in file_pointer to read line by line.
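A sketch of the fixed-length approach (decode_stream, CHUNK, and the inline shift function are illustrative names and choices, not from the post):

```python
def shift(b):
    # same rotate-right-by-one as in the other answers
    return ((b & 1) << 7) | (b >> 1)

CHUNK = 4096  # buffer size; an arbitrary choice, tune as needed

def decode_stream(infile, outfile):
    # read fixed-size chunks so memory use stays constant
    # no matter how large the input file is
    with open(infile, 'rb') as src, open(outfile, 'wb') as dst:
        while True:
            chunk = src.read(CHUNK)
            if not chunk:  # b'' signals end of file
                break
            dst.write(bytes(shift(b) for b in chunk))
```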
- \$\begingroup\$ I have experimented with various buffer_length values, but as my input file(s) are all about 10 MB, I found reading it all with .read() the fastest way, at least for the moment. for line in file_pointer will only work if you have lines; as the LF is bit-shifted as well, there is no "line" in the file as such, unless you see the file as one very large line, which is the same as what I do now, reading it all at once. \$\endgroup\$ – Edwin van Mierlo, Jul 7, 2017 at 13:31
After some experimenting, I realized that the bit-shifting is limited to the values 0 to 255, and that a list can be created as a lookup for those values. Using the index of the list as byte value from the input file, the element of the list would be the byte value for my output file.
With the following code I created my list:
def p_shift_bits(b_int):
    return ((b_int & 1) << 7) | (b_int >> 1)

l = list()
for i in range(256):  # note: range(0, 255) would miss the final value, 255
    l.append(bytes([p_shift_bits(i)]))
print(l)
At this point I have a list which is my lookup for the actual processing, so my code looks like this:
_MY_LOOKUP = [b'\x00', b'\x80', b'\x01', b'\x81', b'\x02', b'\x82', b'\x03',
b'\x83', b'\x04', b'\x84', b'\x05', b'\x85', b'\x06', b'\x86',
b'\x07', b'\x87', b'\x08', b'\x88', b'\t', b'\x89', b'\n',
b'\x8a', b'\x0b', b'\x8b', b'\x0c', b'\x8c', b'\r', b'\x8d',
b'\x0e', b'\x8e', b'\x0f', b'\x8f', b'\x10', b'\x90', b'\x11',
b'\x91', b'\x12', b'\x92', b'\x13', b'\x93', b'\x14', b'\x94',
b'\x15', b'\x95', b'\x16', b'\x96', b'\x17', b'\x97', b'\x18',
b'\x98', b'\x19', b'\x99', b'\x1a', b'\x9a', b'\x1b', b'\x9b',
b'\x1c', b'\x9c', b'\x1d', b'\x9d', b'\x1e', b'\x9e', b'\x1f',
b'\x9f', b' ', b'\xa0', b'!', b'\xa1', b'"', b'\xa2', b'#',
b'\xa3', b'$', b'\xa4', b'%', b'\xa5', b'&', b'\xa6', b"'",
b'\xa7', b'(', b'\xa8', b')', b'\xa9', b'*', b'\xaa', b'+',
b'\xab', b',', b'\xac', b'-', b'\xad', b'.', b'\xae', b'/',
b'\xaf', b'0', b'\xb0', b'1', b'\xb1', b'2', b'\xb2', b'3',
b'\xb3', b'4', b'\xb4', b'5', b'\xb5', b'6', b'\xb6', b'7',
b'\xb7', b'8', b'\xb8', b'9', b'\xb9', b':', b'\xba', b';',
b'\xbb', b'<', b'\xbc', b'=', b'\xbd', b'>', b'\xbe', b'?',
b'\xbf', b'@', b'\xc0', b'A', b'\xc1', b'B', b'\xc2', b'C',
b'\xc3', b'D', b'\xc4', b'E', b'\xc5', b'F', b'\xc6', b'G',
b'\xc7', b'H', b'\xc8', b'I', b'\xc9', b'J', b'\xca', b'K',
b'\xcb', b'L', b'\xcc', b'M', b'\xcd', b'N', b'\xce', b'O',
b'\xcf', b'P', b'\xd0', b'Q', b'\xd1', b'R', b'\xd2', b'S',
b'\xd3', b'T', b'\xd4', b'U', b'\xd5', b'V', b'\xd6', b'W',
b'\xd7', b'X', b'\xd8', b'Y', b'\xd9', b'Z', b'\xda', b'[',
b'\xdb', b'\\', b'\xdc', b']', b'\xdd', b'^', b'\xde', b'_',
b'\xdf', b'`', b'\xe0', b'a', b'\xe1', b'b', b'\xe2', b'c',
b'\xe3', b'd', b'\xe4', b'e', b'\xe5', b'f', b'\xe6', b'g',
b'\xe7', b'h', b'\xe8', b'i', b'\xe9', b'j', b'\xea', b'k',
b'\xeb', b'l', b'\xec', b'm', b'\xed', b'n', b'\xee', b'o',
b'\xef', b'p', b'\xf0', b'q', b'\xf1', b'r', b'\xf2', b's',
b'\xf3', b't', b'\xf4', b'u', b'\xf5', b'v', b'\xf6', b'w',
b'\xf7', b'x', b'\xf8', b'y', b'\xf9', b'z', b'\xfa', b'{',
            b'\xfb', b'|', b'\xfc', b'}', b'\xfd', b'~', b'\xfe', b'\x7f',
            b'\xff']
def decode_ss(filename):
    with open(filename, 'rb') as openfile:
        data = openfile.read()
    outputfilename = ''.join(filename.split('.')[:-2]) + '.txt'
    with open(outputfilename, 'wb') as output:
        for byte in data:
            output.write(_MY_LOOKUP[byte])

filename = r'C:\inputfile.txt.ss'
decode_ss(filename)
I believe this is (currently) the fastest method, faster than doing the bit-shifting during execution.
[EDIT] For completeness, after some tinkering and tips from the comments section, my code now looks like this:
def p_shift_bits(b_int):
    return ((b_int & 1) << 7) | (b_int >> 1)

def decode_ss(filename):
    my_lookup = [bytes([p_shift_bits(i)]) for i in range(256)]
    with open(filename, 'rb') as openfile:
        data = openfile.read()
    outputfilename = ''.join(filename.split('.')[:-2]) + '.txt'
    with open(outputfilename, 'wb') as output:
        for byte in data:
            output.write(my_lookup[byte])

filename = r'C:\inputfile.txt.ss'
decode_ss(filename)
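One further tweak, building on the advice about write calls in the other answers: the same lookup can feed a single b''.join, so the output file is written in one call instead of once per byte. A sketch of that combination (same names as above; note range(256) so the table covers every possible byte value):

```python
def p_shift_bits(b_int):
    return ((b_int & 1) << 7) | (b_int >> 1)

def decode_ss(filename):
    # range(256), not range(0, 255), so index 255 is covered too
    my_lookup = [bytes([p_shift_bits(i)]) for i in range(256)]
    with open(filename, 'rb') as openfile:
        data = openfile.read()
    outputfilename = ''.join(filename.split('.')[:-2]) + '.txt'
    with open(outputfilename, 'wb') as output:
        # build the whole result first, then write it in one call
        output.write(b''.join(my_lookup[byte] for byte in data))
```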