
I've rewritten a piece of JavaScript in Python 3.7 that decompresses a file compressed in a specific format. The project I got this from is available here.

The code I've come up with is as close an analogue as I could manage (I'm not the best at JavaScript):

def decompress_lz2(data):
    global loop_count                  # external counter; must be defined by the caller
    lb_len = 0                         # length of the pending look-back copy
    lb_dist = 0                        # distance of the pending look-back copy
    escape = 0x16                      # number of literal bytes to copy next
    off_input = 0
    output = b''
    while off_input < len(data):
        loop_count += 1
        if lb_len:
            # copy lb_len bytes starting lb_dist bytes before the end of output
            off_output = len(output) - lb_dist
            repeat = max(0, off_output + lb_len - len(output))
            chunk = output[off_output:off_output + lb_len - repeat]
            output += chunk
            if repeat:
                # the reference runs past the end of output: pad with the last byte copied
                repeat_chunk = bytes([chunk[-1]]) * repeat
                output += repeat_chunk
            lb_len = 0
        if escape:
            # copy `escape` literal bytes straight from the input
            chunk = data[off_input:off_input + escape]
            output += chunk
            off_input += escape
            escape = 0
        flag = data[min(off_input, len(data) - 1)]
        off_input += 1
        lb_len = flag >> 5             # top 3 bits: length token of a look-back copy
        if lb_len:
            if lb_len == 7:
                # maximum token: read extension bytes until one is not 0xff
                while True:
                    next_ = data[off_input]
                    off_input += 1
                    lb_len += next_
                    if next_ != 0xff:
                        break
            lb_len += 2
            lb_dist = (flag & 0x1F) << 8
            lb_dist += (1 + data[off_input])
            off_input += 1
            if lb_dist == 0x2000:
                # maximum short distance: read a 16-bit extended distance
                lb_dist += (data[off_input] << 8)
                off_input += 1
                lb_dist += data[off_input]
                off_input += 1
        else:
            escape = flag + 1          # low flag values encode a run of literals
    return output

Here, data is a bytes object read from a file opened in binary mode. My code and the original both produce the same output, but where the original takes only a few seconds to run, mine takes ~10 minutes on the same file; testing with multiple files yields similar results. My specific efficiency question is: what can I do to speed up this script on the same system while keeping the output identical?

I'm open to the idea of multithreading/multiprocessing, though I don't think it's feasible here because each step of the decompression depends on the output produced so far.

Example file, though it's very small and runs quickly on both implementations. It must be fed to decompress_lz2 as bytes.
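
For reference, this is roughly how I drive it (the file names below are just placeholders; loop_count has to be defined up front because the function declares it global):

# Minimal usage sketch -- the paths are placeholders for whatever file you have.
loop_count = 0  # decompress_lz2 declares this global and increments it

with open("example.lz2", "rb") as f:   # binary mode, so read() returns bytes
    data = f.read()

result = decompress_lz2(data)

with open("example.out", "wb") as f:
    f.write(result)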

asked Sep 22, 2020 at 0:04
  • Try using a line profiler to find exactly where the problem is: pip install line-profiler, add the @profile decorator to the function you care about (in your case decompress_lz2), then run kernprof -lv -o /tmp/out.lprof filename.py in a terminal. Commented Sep 22, 2020 at 10:24
  • Maybe also share some example data so that people can test it on their own system. Commented Sep 22, 2020 at 10:27
  • The files produced and consumed by this decompression are proprietary. Let me see if I can share an example file. Commented Sep 22, 2020 at 15:54

1 Answer


I can't test it at the moment, but I suspect that the culprit is

output += chunk

in a couple of places. Because bytes objects are immutable, each of these concatenations copies all of output into a new object large enough to hold chunk as well, so the loop as a whole is potentially O(n²). It would be more efficient to append each chunk to a list and use b''.join() at the end to concatenate them all at once.
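
A quick way to see the effect (a rough, machine-dependent micro-benchmark sketch; the function names and sizes here are arbitrary):

# Rough micro-benchmark sketch: repeated bytes concatenation re-copies
# everything accumulated so far, while list-append + join does not.
import timeit

def build_with_concat(n):
    out = b''
    for _ in range(n):
        out += b'x' * 16          # copies all of `out` every iteration
    return out

def build_with_join(n):
    parts = []
    for _ in range(n):
        parts.append(b'x' * 16)   # O(1) append
    return b''.join(parts)        # one final O(n) concatenation

print(timeit.timeit(lambda: build_with_concat(20_000), number=3))
print(timeit.timeit(lambda: build_with_join(20_000), number=3))

Applied to decompress_lz2, the change looks like this: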

def decompress_lz2(data):
    global loop_count
    lb_len = 0
    lb_dist = 0
    escape = 0x16
    off_input = 0
    output = []  # <-- make output a list
    while off_input < len(data):
        loop_count += 1
        if lb_len:
            off_output = len(output) - lb_dist
            repeat = max(0, off_output + lb_len - len(output))
            chunk = output[off_output:off_output + lb_len - repeat]
            output.append(chunk)  # <-- changed '+=' to '.append()'
            if repeat:
                repeat_chunk = bytes([chunk[-1]]) * repeat
                output.append(repeat_chunk)  # <-- changed '+=' to '.append()'
            lb_len = 0
        if escape:
            chunk = data[off_input:off_input + escape]
            output.append(chunk)  # <-- changed '+=' to '.append()'
            off_input += escape
            escape = 0
        flag = data[min(off_input, len(data) - 1)]
        off_input += 1
        lb_len = flag >> 5
        if lb_len:
            if lb_len == 7:
                while True:
                    next_ = data[off_input]
                    off_input += 1
                    lb_len += next_
                    if next_ != 0xff:
                        break
            lb_len += 2
            lb_dist = (flag & 0x1F) << 8
            lb_dist += (1 + data[off_input])
            off_input += 1
            if lb_dist == 0x2000:
                lb_dist += (data[off_input] << 8)
                off_input += 1
                lb_dist += data[off_input]
                off_input += 1
        else:
            escape = flag + 1
    return b''.join(output)  # <-- concatenate the chunks
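
One caveat, since I haven't been able to run any of this: the look-back branch indexes output by byte offset, and once output is a list of chunks, len(output) and output[off_output:...] count chunks rather than bytes. If that turns out to bite, a bytearray gives the same cheap amortized appends while keeping byte-level slicing. A minimal sketch of that variant (equally untested):

def decompress_lz2(data):
    global loop_count                  # same external counter as above
    lb_len = 0
    lb_dist = 0
    escape = 0x16
    off_input = 0
    output = bytearray()               # mutable, so += extends in place
    while off_input < len(data):
        loop_count += 1
        if lb_len:
            off_output = len(output) - lb_dist
            repeat = max(0, off_output + lb_len - len(output))
            chunk = output[off_output:off_output + lb_len - repeat]
            output += chunk            # amortized O(1) per byte, no full copy
            if repeat:
                output += bytes([chunk[-1]]) * repeat
            lb_len = 0
        if escape:
            output += data[off_input:off_input + escape]
            off_input += escape
            escape = 0
        flag = data[min(off_input, len(data) - 1)]
        off_input += 1
        lb_len = flag >> 5
        if lb_len:
            if lb_len == 7:
                while True:
                    next_ = data[off_input]
                    off_input += 1
                    lb_len += next_
                    if next_ != 0xff:
                        break
            lb_len += 2
            lb_dist = (flag & 0x1F) << 8
            lb_dist += (1 + data[off_input])
            off_input += 1
            if lb_dist == 0x2000:
                lb_dist += (data[off_input] << 8)
                off_input += 1
                lb_dist += data[off_input]
                off_input += 1
        else:
            escape = flag + 1
    return bytes(output)               # hand back an immutable bytes object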
answered Sep 23, 2020 at 19:52
  • Ha! I made this change and was about to update the question when I saw your answer. I did essentially what you did, because I realized that the slicing and the += concatenation were the inefficiency, due to the memory copying and potentially huge reallocations they cause. My new solution gets rid of all the slicing and of creating new objects on every iteration, so that excessive copying and garbage collection need not happen, by iterating through the existing data and output directly. I'm marking your answer as the answer because it does essentially the same thing. Commented Sep 23, 2020 at 20:20
  • @Stephen How much faster is this? From ~10 minutes down to how long? Commented Sep 23, 2020 at 21:45
  • This version? I haven't actually tested it, so I make no guarantees, but the version that I wrote up took execution from ~10 minutes down to ~1.5 seconds, an insane improvement, which ultimately brought it into the same ballpark as the JavaScript implementation. Commented Sep 23, 2020 at 21:57
