\$\begingroup\$

I am writing a program that compresses an image using Huffman encoding. I need to write the tree data structure and the encoded bytes to files, then read them back and decode them. My reading and writing code is very slow.

In writetree() I need to write the width, the height, the number of compressed bits, and the tree data structure. I am not sure whether these can be combined into a single file.

I also think I use too many for loops, which makes the code very slow on a very long string.

from PIL import Image
import numpy as np
import json
import sys, string
trim = ('0', ('127', '255'))
width = 4
height = 4
longstring = "1100100111001101011110010011010110101111001001101011010111100100110101101011"
def decode (tree, str) :
    output = ""
    list = []
    p = tree
    count = 0
    for bit in str :
        if bit == '0' : p = p[0] # Head up the left branch
        else : p = p[1] # or up the right branch
        if type(p) == type("") :
            output += p # found a character. Add to output
            list.append(int(p))
            p = tree # and restart for next character
    return list
def writetree(tree,height, width,compress):
    with open('structure.txt', 'w') as outfile:
        json.dump(trim, outfile)
    outfile.close()
    f = open("info.txt", "w")
    f.write(str(height)+"\n")
    f.write(str(width)+"\n")
    f.write(str(compress)+"\n")
    f.close()
def readtree():
    with open('structure.txt') as json_file:
        data = json.load(json_file)
    k = open("info.txt", "r")
    heightread = k.readline().strip("\n")
    widthread = k.readline().strip("\n")
    compressread = k.readline().strip("\n")
    json_file.close()
    k.close()
    return tuple(data), int(heightread), int(widthread), int(compressread)
def writefile():
    print("Write file")
    with open('file', 'wb') as f:
        bit_strings = [longstring[i:i + 8] for i in range(0, len(longstring), 8)]
        byte_list = [int(b, 2) for b in bit_strings]
        print(byte_list)
        realsize = len(bytearray(byte_list))
        print('Compress number of bits: ', len(longstring))
        writetree(trim,height,width,len(longstring))
        f.write(bytearray(byte_list))
    f.close()
def readfile():
    print("Read file")
    byte_list = []
    longbin = ""
    with open('file', 'rb') as f:
        value = f.read(1)
        while value != b'':
            byte_list.append(ord(value))
            value = f.read(1)
    print(byte_list)
    for a in byte_list[:-1]:
        longbin = longbin + '{0:08b}'.format(a)
    trim_read, height_read, width_read , compress_read = readtree()
    sodu = compress_read%8
    '''
    because the this string is split 8 bits at the time, and the current compress_read is 76 
    so the sodu is 4. I have to convert the last byte_list into 4bits not 8 bits
    '''
    if sodu == 0:
        longbin = longbin + '{0:08b}'.format(byte_list[-1])
    elif sodu == 1:
        longbin = longbin + '{0:01b}'.format(byte_list[-1])
    elif sodu == 2:
        longbin = longbin + '{0:02b}'.format(byte_list[-1])
    elif sodu == 3:
        longbin = longbin + '{0:03b}'.format(byte_list[-1])
    elif sodu == 4:
        longbin = longbin + '{0:04b}'.format(byte_list[-1])
    elif sodu == 5:
        longbin = longbin + '{0:05b}'.format(byte_list[-1])
    elif sodu == 6:
        longbin = longbin + '{0:06b}'.format(byte_list[-1])
    elif sodu == 7:
        longbin = longbin + '{0:07b}'.format(byte_list[-1])
    print(longbin)
    print("Decode/ show image:")
    pixels = decode(trim_read, longbin)
    it = iter(pixels)
    pixels = list(zip(it,it,it))
    #print(pixels)
    image_out = Image.new("RGB", (width_read, height_read))
    image_out.putdata(pixels)
    #image_out.show()
writefile()
readfile()
asked Nov 22, 2019 at 20:31
\$\endgroup\$
Comments

  • Hello! I think you have a great first post. Could you briefly explain how your algorithm works? That might draw more attention to your question, but in any case I think you've done a good job. Commented Nov 23, 2019 at 0:59
  • I am using Huffman encoding. It is quite long to explain here. Commented Nov 23, 2019 at 5:50
  • For now, I need to write the file and save the data structure (trim), along with the height, width, and compressed bit count of the image. I think I loop over the data again and again, so it gets much slower on a longer string. Commented Nov 23, 2019 at 5:52
  • You can see in the readfile function that I read the file and append each byte to the list, then loop over the list again to convert each value with '{0:08b}'.format(a). Commented Nov 23, 2019 at 5:55
  • Are you still looking for answers to this? :) Commented Dec 16, 2019 at 2:51

1 Answer
\$\begingroup\$

Unused

ruff identifies some unused code.

These lines can be deleted:

import numpy as np
import sys, string
count = 0
realsize = len(bytearray(byte_list))

The tree input to this function is unused:

def writetree(tree,height, width,compress):

It can be removed:

def writetree(height, width,compress):

It must then be removed from calls to writetree as well.
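
An alternative to deleting the parameter, assuming the intent was to save whichever tree is passed in rather than the global trim, is to actually use it. The sketch below also stores the tree together with the height, width, and bit count in a single JSON file. The names write_header, read_header, bit_count, and the path argument are hypothetical and do not appear in the question.

```python
import json


def write_header(path, tree, height, width, bit_count):
    """Store the tree and the image metadata together in one JSON file."""
    header = {
        "tree": tree,
        "height": height,
        "width": width,
        "bit_count": bit_count,
    }
    with open(path, "w") as outfile:  # the with block closes the file
        json.dump(header, outfile)


def read_header(path):
    """Return (tree, height, width, bit_count) as saved by write_header."""
    with open(path) as infile:
        header = json.load(infile)
    # JSON has no tuple type, so the tree comes back as nested lists.
    return header["tree"], header["height"], header["width"], header["bit_count"]
```

Because the tree is only ever indexed with [0] and [1], nested lists should work in decode just as the nested tuples do. The with statement closes each file, so no explicit close() call is needed.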

Comments

Commented-out code should be deleted to remove clutter:

#print(pixels)
#image_out.show()

Naming

The variables named list and str have the same names as Python built-ins, which can be confusing. To eliminate the confusion, rename them to something like pixels and string, respectively. The first clue is that they have special coloring (syntax highlighting) in the question, as they do when I copy the code into my editor.

The PEP 8 style guide recommends snake_case for function and variable names.

For example, writetree would be write_tree.

Consider more meaningful names for some of the variables, such as p and sodu.

Documentation

The PEP 8 style guide recommends adding docstrings for functions. For example, with the decode function, describe the input and return types and what is being decoded.
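
As a rough sketch of what that could look like, here is the decode function with a docstring. It also drops the unused output and count variables and avoids shadowing the built-ins. The names bits, values, and node are my own choices, not from the question.

```python
def decode(tree, bits):
    """Decode a string of '0'/'1' characters into a list of pixel values.

    tree is a nested pair (left, right) whose leaves are strings holding
    integer values, for example ('0', ('127', '255')).
    bits is the encoded bit string, for example '01011'.
    Returns the decoded values as a list of int.
    """
    values = []
    node = tree
    for bit in bits:
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):  # reached a leaf
            values.append(int(node))
            node = tree  # restart at the root for the next symbol
    return values
```

With the tree from the question, decode(('0', ('127', '255')), '01011') returns [0, 127, 255].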

Layout

The black program can be used to automatically format the code with consistent use of whitespace around operators and space between functions. This will also split the following line:

if bit == '0' : p = p[0] # Head up the left branch

into 2, which is a good practice:

if bit == "0":
    p = p[0]  # Head up the left branch

DRY

This expression is repeated several times in the writefile function:

len(longstring)

You can assign it to a variable so that len is only called once.

The following:

longbin = longbin + '{0:08b}'.format(a)

can be simplified using the augmented assignment operator:

longbin += '{0:08b}'.format(a)

It can be further simplified with an f-string:

longbin += f'{a:08b}'
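
The same idea extends to the if/elif chain on sodu in readfile: a nested format specification lets the width be a variable, so one line can cover all eight cases. The sketch below is one possible way to do it, with hypothetical helper names bits_to_bytes and bytes_to_bits. It also builds the string with str.join instead of repeated concatenation, which may help with the slowness on long strings, though I have not measured it.

```python
def bits_to_bytes(bits):
    """Pack a '0'/'1' string into bytes, 8 bits at a time.

    The final chunk may be shorter than 8 bits, as in the question.
    """
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def bytes_to_bits(data, bit_count):
    """Rebuild the bit string from the packed bytes and the original length."""
    if not data:
        return ""
    last_width = bit_count % 8 or 8  # width of the final, possibly short, chunk
    head = "".join(f"{byte:08b}" for byte in data[:-1])
    return head + f"{data[-1]:0{last_width}b}"
```

In readfile, the whole file could then be read with a single f.read() call and passed to bytes_to_bits along with compress_read, instead of reading one byte at a time.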
answered Jul 14 at 11:41
\$\endgroup\$
