Storing data in the pixels of an image

Question 1

I've recently redone something I wrote around a year ago since I'd done it wrong (it required the original image to decode an imagine, which can be avoided), so I'm wondering if I could either get feedback on the style of writing, or on which bits can be broken, since I've tried to make it fairly robust. The idea is to store stuff in images by modifying the pixels, so you could send files over image hosts and stuff like that.

First off, I tried following that 72/80 line thing for the first time, and had to do some not so nice cuts/renames, especially with the list comprehension bits, so that's why a lot of stuff is split over multiple lines.

The one bit I'm not so sure about is the Imgur authentication. Since user accounts get a 4MB upload limit and anonymous accounts get a 1MB upload limit (and this obviously requires all images to stay as PNG), I've kind of forced the code to use an account if upload is enabled. If it is not the users account (there is an option to authenticate your own account), it'll just upload to a default account I set up, where the pyimgur.Imgur client is un-pickled from a string. If you successfully authenticate, it'll use your account instead until you use imgur_auth = False, where it'll reset it.

So, as to the actual code, if you input an image to write over, it'll calculate the minimum bits it can use to store the required data. The first red value will be used as a header, then the data will be split and replace the last few bits of each colour in the image. This goes up to 7, then after that at 8, there is no point in keeping the original resolution since the original image would exist no more, so it just reverts to as if you never input an image.

If you don't input an image, you can choose a ratio since the resolution can be anything, and it defaults to '16:9'. If you input an image and it reverts to this method, it calculates the ratio from the resolution of the image.

I mentioned a header before and I go into more detail in the code, but basically, the header is a number between 0 and 255, where the final integer is the bits used per colour, and the first two is how many bytes are used to store the length of data. If it is 28, that means 2 bytes are used to store the data length. You would get the data length from the numbers at data[1:3], by converting them to binary, then putting them together and converting back to an integer. As a simple example, if data[1:3] = [255, 255], pieced together would be 1111111111111111, converted back to int would be 65535. The data would be stored at data[4:65539]. The maximum value this can represent is around 2.6*10^239.

If the bits per colour is below 8, you need to get the last n bits of every single colour, join it together in one big string, then split every 8 characters.

Also, because I poorly implemented a default settings class last time, I got a config file working this time. I initially did it with ConfigParser, but got loads of bugs, so I just wrote a new thing from scratch based off it.

Requirements: PIL, requests, pyimgur, UsefulThings (dropbox link)
UsefulThings is just a group of functions and stuff that I can reuse elsewhere so this code doesn't get messy with non related stuff. Link to it here, you can just copy the functions into the code then find and replace 'UsefulThings.' with ''.

from __future__ import division
from collections import defaultdict
from math import log
import os
import random
import cPickle
import StringIO
import urllib2
import webbrowser
import zlib
import UsefulThings
import logging
logging.getLogger('requests').setLevel(logging.WARNING)
try:
 from PIL import Image
except ImportError:
 raise ImportError('python imaging library module was not found')
#Import internet bits, stop upload features if any don't exist
global override_upload
override_upload = False
try:
 import pyimgur
 import requests
except ImportError:
 output_text = ['Warning: Error importing pyimgur', 
 ', disabling the upload features.']
 try:
 import requests
 except ImportError:
 output_text = output_text[0] + [' and requests'] + output_text[1]
 print ''.join(output_text)
 override_upload = True
class ISGlobals(object):
 """Class for determining default values, where it will read any 
 changes from a config file.
 There are three different groups of variables:
 Required (shown and are reset if edited) - This would mainly be
 for things to show the user, such as comments.
 Default (shown and can be edited) - These are the core variables
 used in the code, things the user would normally pass in 
 through the main function.
 Hidden (not shown but can be edited) - These are the less 
 important variables, which aren't normally needed. Things 
 such as overrides should go here, which can be used if a 
 program uses this code and needs to force a few things to 
 happen.
 By default, the config file is stored in appdata, though this has
 only been tested if it works on Windows.
 """
 #Build list of directories
 link_dict = {}
 link_dict['%USERDIR'] = os.path.expanduser( "~" ).replace( "\\", "/" )
 link_dict['%PYTHONDIR'] = os.getcwd().replace( "\\", "/" )
 link_dict['%APPDATA'] = os.getenv('APPDATA')
 location = '%APPDATA/VFXConfig.ini'
 for k in link_dict:
 location = location.replace(k, link_dict[k])
 def __init__(self, save_all=False):
 """Define the default values, then check them against the 
 values stored in the config.
 """
 #Set to true to overwrite all values
 reset_config = False
 required_globals = defaultdict(dict)
 default_globals = defaultdict(dict)
 hidden_globals = defaultdict(dict)
 rg = required_globals['ImageStore']
 dg = default_globals['ImageStore']
 hg = hidden_globals['ImageStore']
 rg[(';You can use %PYTHONDIR, %USERDIR or %APPDATA'
 ' as links to directories.')] = None
 dg['ShowAllValuesHereOnNextRun'] = False
 dg['DefaultImageName'] = 'ImageDataStore.png'
 dg['DefaultImageDirectory'] = '%USERDIR/ImageStore'
 dg['DefaultCustomImage'] = 'http://bit.ly/1G3u3cV'
 dg['UseDefaultCustomImageIfNone'] = False
 dg['DefaultShouldVerify'] = True
 dg['DefaultShouldSave'] = True
 dg['DefaultShouldUpload'] = False
 dg['DefaultShouldOpenOnUpload'] = False
 hg['ForceDefaultVerify'] = False
 hg['ForceNoSave'] = False
 hg['ForceNoUpload'] = False
 hg['ForceNoOpenOnUpload'] = False
 hg['ForceDefaultCustomImage'] = False
 hg['ForceNoCustomImage'] = False
 required_globals = dict(required_globals)
 default_globals = dict(default_globals)
 hidden_globals = dict(hidden_globals)
 #Write the values to the config
 config_parser = UsefulThings.NewConfigParser(self.location)
 required_globals = config_parser.write(required_globals, 
 write_values=True, 
 update_values=True)
 default_globals = config_parser.write(default_globals, 
 write_values=True, 
 update_values=reset_config)
 hidden_globals = config_parser.write(hidden_globals, 
 write_values=save_all, 
 update_values=reset_config)
 #Get all the important values
 #Uses write() so the hidden_globals will be returned too
 all_globals = dict(default_globals).copy()
 all_globals['ImageStore'].update(hidden_globals['ImageStore'])
 self.global_dict = config_parser.write(all_globals, 
 write_values=False, 
 update_values=False)
 self.global_dict = self.global_dict['ImageStore']
 self.global_dict = {k: v for k, v in self.global_dict.iteritems() 
 if k[0] not in config_parser.comment_markers}
 #Convert links to the correct format
 for i in self.link_dict:
 self.global_dict = {k: (v.replace(i, self.link_dict[i]) if 
 isinstance(v, str) else v) 
 for k, v in self.global_dict.iteritems()}
 if not save_all and self.global_dict['ShowAllValuesHereOnNextRun']:
 ISGlobals(save_all=True)
 def get(self, x):
 """Get one of the stored values.
 This should be used after assigning ISGlobals to a variable, 
 so that it doesn't rebuild the list each time a new variable is
 looked up.
 """
 return self.global_dict[x]
class ImageStoreError(Exception):
 pass
class ImageStore(object):
 """Class for writing and reading data in the actual pixels in an 
 image.
 Header Information:
 To mark the bits used and end of file, a header is stored at the 
 beginning of the image. 
 The first byte of the image (red value of the pixel), is used as
 an integer that stores two values. The last number determins how
 many bits are used per pixel, and the first two numbers 
 determine how many bytes are being used to store the number of 
 bytes of data being held in the image.
 The second part is what was just mentioned above, where the 
 number of bytes is converted to binary and joined, then split 
 into single bytes to allow it to be written to the image. To 
 avoid a noticeable line of pixels in the top corner, this is 
 encoded with the rest of the image. This means if the bits per 
 pixel value is below 8, the image must be decoded before this 
 information can be read.
 For example, if the number was 123, the image would be storing 3
 bits per colour, and the next 12 bytes would contain the length 
 of data. Once the image is read and original data pieced back 
 together, bytes at the index of [1:14] will store how many bytes 
 the data is. The data will then be found at [14:14+length].
 As to a practical example, if the number was 48, and the next 4 
 bytes were [39, 29, 177, 252], the number of bytes would be 
 656257532, and the data would be at image_data[6:656257538].
 The absolute maximum data size through using this method is 
 2.6*10^239 bytes, which is more than the number of atoms in the
 universe, so I think it's safe to say nobody will go above it.
 """
 max_data_len = int('1' * 8 * 99, 2)
 imgur_main = 'http://i.imgur.com'
 imgur_delete = 'http://imgur.com/delete'
 imgur_api = 'https://api.imgur.com/3/{}'
 def __init__(self, image_path=None, allow_print=True):
 """Format the image path ready to either read or write.
 Note that there are no checks on the directory or file, since 
 you do not need a writeable image for ImageStore.read(), and 
 the image/path doesn't need to exist for ImageStore.write().
 Parameters:
 image_path (str or None): Path to the image.
 Depending on if the path or filename is not provided, 
 the code will automatically fill in the blanks with
 default values.
 allow_print (bool): If printing the current progress of the
 code execution is allowed.
 Set to False to totally disable all printing.
 """
 self.defaults = ISGlobals().get
 self.allow_print = allow_print
 #Get a valid image path based on the input
 default_dir = self.defaults('DefaultImageDirectory')
 default_name = self.defaults('DefaultImageName')
 self.path = image_path
 if image_path is None:
 self.path = '{}/{}'.format(default_dir, default_name)
 else:
 if ':/' not in self.path:
 if self.path.startswith('/'):
 self.path = self.path[1:]
 self.path = '{}/{}'.format(default_dir, self.path)
 #Format the extension
 self._original_extension = None
 path_split = self.path.split('/')
 if '.' in path_split[-1]:
 self._original_extension = path_split[-1].split('.')[-1]
 self.path = (self.path[::-1].split('.', 1)[1])[::-1]
 self.path += '.png'
 def _save_image(self, image_object):
 """Save the PIL image object to disk.
 If the file path doesn't exist, it will be created first.
 """
 if UsefulThings.make_file_path(self.path):
 try:
 image_object.save(self.path, 'PNG')
 except IOError:
 raise IOError('unable to write image')
 else:
 raise IOError('unable to write image path: {}'.format(self.path))
 def _read_image(self, image_location, require_png=False): 
 """Open an image, supports opening from a URL as well as normal
 files.
 Parameters:
 image_location (str): Path to image.
 require_png (bool): If the image must be a PNG.
 If True and image isn't a PNG, it will throw an error.
 """
 if not isinstance(image_location, (str, unicode)):
 return None
 #Load from URL
 if any(value in image_location for value in ('http://', 
 'https://', 
 'ftp://')):
 try:
 image_location = StringIO.StringIO(urllib2.urlopen(
 image_location).read())
 except urllib2.URLError:
 raise urllib2.URLError('failed to load image from URL')
 #Open image
 try:
 return Image.open(image_location).convert('RGB')
 except IOError:
 #Check if image exists but just not with png extension
 if require_png:
 if (self._original_extension is not None
 and 'png' not in self._original_extension
 and '.png' in image_location):
 try:
 Image.open(image_location.replace('.png', 
 '.{}'.format(self._original_extension)))
 raise ImageStoreError('image format needs to be PNG')
 except IOError:
 pass
 raise IOError('no image file found')
 raise IOError('failed to open image file')
 def _imgur_request(self, client, url_prefix, parameters):
 """Send request to the Imgur API.
 Currently only used for uploading images.
 """
 return client._send_request(self.imgur_api.format(url_prefix), 
 method='POST', 
 params=parameters)
 def _print(self, item, indent_num=0):
 """Print wrapper to allow disabling all messages."""
 if self.allow_print:
 print '{}{}'.format(' ' * indent_num, item)
 def _time(self, verb, TimeThisObject):
 """Wrapper for formatting the time correctly.
 Allows the format to be easily edited without having to change
 each string.
 """
 self._print('{}: {}'.format(TimeThisObject.output(), verb), 1)
 def write(self, input_data, custom_image=None, image_ratio=None, 
 verify=None, save_image=None, upload_image=None, 
 open_on_upload=None, imgur_title=None, imgur_description=None, 
 imgur_auth=None):
 """Write data directly into an image.
 If a custom image is used, bits per byte is calculated, which 
 determins how many bits are used to store data in each colour 
 value in the image. This can range from 1, which is virtually 
 invisible to the human eye, to 8, which would entirely erase the
 original image. 
 The input image is processed to remove the last few bits from
 each colour, and is padded back to 8 bits using the input data.
 Parameters:
 input_data (any): Data to be written to the image.
 May be in almost any format as it is pickled.
 custom_image (str or None/False, optional): Path or URL to 
 an image to write over. 
 Since the output has to be a PNG, a large custom image
 will result in a large file size, no matter what the
 input data is.
 Leave as None to write the smallest image possible, or
 use the default custom image depending on the config.
 image_ratio (str or None, optional): Set the width to height
 ratio if there is no custom image, in the format 'w:h'.
 If a custom image is given, but the data turns out too
 large to store in it, although it'll scrap the custom
 image and act like one wasn't provided, it will inherit
 the ratio from that image.
 Will default to a 16:9 resolution if left as None.
 verify (bool or None, optional): Read the image after 
 creation to make sure the read data matches the original
 input data.
 Will check the file and/or URL depending on where the
 image was saved.
 Disabling it will not have any adverse effects, since
 the code should catch most problems during writing.
 If None, it will use the default value provided in the 
 config.
 save_image (bool or None, optional): If the file should be 
 saved to disk.
 If None, it will use the default value provided in the 
 config.
 upload_image (bool or None, optional): If the file should be
 uploaded to imgur.
 If None, it will use the default value provided in the 
 config.
 open_on_upload (bool or None, optional): If the uploaded 
 link should be opened by the default web browser.
 If None, it will use the default value provided in the 
 config.
 imgur_title (str or None, optional): Title to give to the 
 Imgur upload.
 imgur_description (str or None, optional): Description to 
 give to the Imgur upload.
 imgur_auth (class, str or None, optional): Used to upload 
 images to the account of the authenticated user.
 Use imgur_log_in() to get the pyimgur instance for this, 
 and after code execution, a string will be provided to
 use here for the purpose of not having to authenticate 
 again.
 """
 #Get default values from config
 if verify is None or self.defaults('ForceDefaultVerify'):
 verify = self.defaults('DefaultShouldVerify')
 if self.defaults('ForceNoSave'):
 save_image = False
 elif save_image is None:
 save_image = self.defaults('DefaultShouldSave')
 if override_upload or self.defaults('ForceNoUpload'):
 upload_image = False
 else:
 if upload_image is None:
 upload_image = self.defaults('DefaultShouldUpload')
 if self.defaults('ForceNoOpenOnUpload'):
 open_on_upload = False
 elif open_on_upload is None:
 open_on_upload = self.defaults('DefaultShouldOpenOnUpload')
 bits_per_byte = 8
 self._print('Writing to image...', 0)
 with UsefulThings.TimeThis(print_time=self.allow_print) as t: 
 encoded_data = [ord(letter) for letter in self.encode(input_data)]
 self._time('Encoded data', t)
 #Build header info
 num_bytes = len(encoded_data)
 if num_bytes > self.max_data_len:
 message = 'well done, you just broke the laws of physics'
 raise ImageStoreError(message)
 nb_binary = str(bin(num_bytes))[2:]
 nb_length = len(nb_binary)
 nb_integer_parts = nb_length // 8 + (1 if nb_length % 8 else 0)
 nb_total_length = nb_integer_parts * 8
 nb_new = [nb_binary.zfill(nb_total_length)[i:i+8] 
 for i in range(0, nb_total_length, 8)]
 pixel_data = [int(i, 2) for i in nb_new] + encoded_data
 self._time('Calculated header', t)
 #Try read custom image from config if nothing has been given
 if (not self.defaults('ForceNoCustomImage')
 and (custom_image is True 
 or (custom_image is None
 and self.defaults('UseDefaultCustomImageIfNone'))
 or self.defaults('ForceDefaultCustomImage'))):
 try:
 self._read_image(self.defaults('DefaultCustomImage'))
 custom_image = self.defaults('DefaultCustomImage')
 except (IOError, urllib2.URLError):
 pass
 #Totally disable
 if (custom_image is False
 or self.defaults('ForceNoCustomImage')):
 custom_image = None
 if custom_image is not None:
 #Read and process custom image
 custom_image_input = self._read_image(custom_image)
 image_width, image_height = custom_image_input.size
 max_image_bytes = image_width * image_height
 self._time('Read custom image', t)
 #Calculate required bits per byte
 total_data_bytes = len(pixel_data) + 1
 self._print('Image resolution: {}x{} ({} pixels)'.format(
 image_width, image_height, max_image_bytes), 1)
 self._print('Input data: {} bytes'.format(total_data_bytes), 1)
 self._print(('Checking the smallest possible '
 'bits per byte to use...'), 1)
 bits_per_byte = 1
 while bits_per_byte < 9:
 storage_needed = total_data_bytes * (8 / (bits_per_byte))
 self._print(" {}: Up to {} bytes".format(bits_per_byte, 
 int(round(storage_needed))), 1)
 if storage_needed < max_image_bytes:
 break
 bits_per_byte += 1
 #Data won't fit in image, revert to normal method
 if bits_per_byte == 8:
 custom_image = None
 image_ratio = '{}:{}'.format(image_width, image_height)
 break
 self._time('Calculated bits to use', t)
 #Only continue if data fits in image
 if custom_image is not None:
 #Process both parts of data
 joined_binary_data = ''.join(str(bin(x))[2:].zfill(8) 
 for x in pixel_data)
 #Pad the end a little to stop an unusual error
 joined_binary_data += '0' * bits_per_byte
 split_binary_data = UsefulThings.split_list(
 joined_binary_data, 
 bits_per_byte)
 num_pixels_needed = len(split_binary_data)
 split_image_data = UsefulThings.flatten_list(
 custom_image_input.getdata())
 self._time('Processed input data and image', t)
 reduced_image = [str(bin(i))[2:].zfill(8)[:-bits_per_byte] 
 for i in split_image_data]
 self._time('Reduced bits of custom image', t)
 #Duplicate split_binary_data until it is long enough
 #Faster overall compared to picking random values
 extra_length_needed = (len(split_image_data) - 
 len(split_binary_data))
 while len(split_binary_data) < extra_length_needed:
 split_binary_data *= 2
 pixel_data = [int(reduced_image[i] + split_binary_data[i-1],
 2) for i in range(len(reduced_image))]
 self._time('Merged input data with custom image', t)
 else:
 self._print('Data does not fit in image, '
 'reverting to normal method...', 1)
 if custom_image is None:
 #Convert input to numbers
 pixel_data = [0] + pixel_data
 input_bytes = len(pixel_data)
 pixel_data += [0] * (3 - input_bytes % 3) #Pad to multiple of 3
 total_bytes = len(pixel_data) // 3
 required_pixels = int(total_bytes * 8 / bits_per_byte)
 #Calculate width and height of image
 if ':' in str(image_ratio):
 image_ratio_split = [max(1, float(i)) 
 for i in image_ratio.split(':')]
 else:
 image_ratio_split = [16, 9]
 x = pow(required_pixels * image_ratio_split[0] * 
 image_ratio_split[1], 0.5)
 #Don't let any dimensions go over the number of bytes
 image_width = max(1, min(required_pixels, 
 int(round(x/image_ratio_split[1]))))
 image_width //= 3
 image_width *= 3
 image_height = required_pixels / image_width
 #Round height up if left as a decimal
 if float(image_height) != float(int(image_height)):
 image_height += 1
 image_height = int(image_height)
 image_dimensions = image_width * image_height
 self._time('Calculated image size', t)
 remaining_pixels = image_dimensions - required_pixels
 pixel_data += [random.choice(pixel_data) 
 for i in range(3 * remaining_pixels)]
 self._time('Padded data', t)
 #Write first byte as header
 initial_header = int(str(nb_integer_parts) + str(bits_per_byte))
 pixel_data[0] = initial_header
 #Draw image
 image_output = Image.new('RGB', (image_width, image_height))
 image_data = image_output.load()
 #Assign pixels
 for y in range(image_height):
 for x in range(image_width):
 cp = 3 * (x + y * image_width)
 image_data[x, y] = tuple(pixel_data[cp:cp + 3])
 self._time('Created image', t)
 #Build StringIO file
 #Even if not uploading, can be used for file size
 output_StringIO = StringIO.StringIO()
 image_output.save(output_StringIO, 'PNG')
 contents = output_StringIO.getvalue()
 output_StringIO.close() 
 if save_image:
 self._save_image(image_output)
 self._time('Saved file', t)
 self._print('Path to file: {}'.format(self.path), 2)
 if upload_image: 
 client = self.choose_client(imgur_auth)
 #Renew the token
 if self.validate_client(client):
 client.refresh_access_token()
 #Send upload request to Imgur
 imgur_params = {'image': contents.encode('base64'), 
 'title': imgur_title, 
 'description': imgur_description}
 image_upload = self._imgur_request(client,
 'image',
 imgur_params)
 uploaded_image_type = image_upload['type'].split('/')[1]
 uploaded_image_size = image_upload['size']
 uploaded_image_link = image_upload['link']
 uploaded_image_id = image_upload['id']
 uploaded_image_delete_link = image_upload['deletehash']
 self._time('Uploaded image', t)
 #Detect if image has uploaded correctly, delete if not
 if (uploaded_image_type.lower() == 'png' and
 uploaded_image_size == len(contents)):
 i_link = 'Link to image: {}'
 i_delete = 'Link to delete image: {}/{}'
 self._print(i_link.format(uploaded_image_link), 2)
 self._print(i_delete.format(self.imgur_delete,
 uploaded_image_delete_link), 2)
 if open_on_upload:
 webbrowser.open(uploaded_image_link)
 else:
 output = 'Image failed to upload correctly - '
 #Check if it was converted
 if uploaded_image_type.lower() != 'png':
 output += 'file was too large.'
 else:
 output += 'unknown reason'
 self._print(output, 1)
 upload_image = False
 pyimgur.Image(image_upload, client).delete()
 #Validate the image
 if save_image or upload_image:
 if verify:
 failed = 0
 read_save_image = input_data
 read_upload_image = input_data
 if save_image:
 read_save_image = ImageStore(self.path, 
 allow_print=False).read()
 if upload_image:
 read_upload_image = ImageStore(uploaded_image_link, 
 allow_print=False).read()
 #Deactivate features based on the problem
 if read_save_image != read_upload_image:
 failed += 1
 upload_image = False
 pyimgur.Image(image_upload, client).delete()
 if read_save_image != input_data:
 failed += 1
 save_image = False
 if failed == 2:
 raise ImageStoreError('image failed validation')
 self._time('Verified file', t)
 #Build dictionary to return
 output_path = {'size': len(contents)}
 if save_image:
 output_path['path'] = self.path
 if upload_image:
 output_path['url'] = '{}/{}.{}'.format(self.imgur_main,
 uploaded_image_id, 
 uploaded_image_type)
 output_path['url_delete'] = '{}/{}'.format(
 self.imgur_delete,
 uploaded_image_delete_link)
 return output_path
 def read(self):
 """Attempt to read the stored data from an image.
 To undo the write process, if a custom image is used, each 
 colour must be broken down into bits and the last few bits are 
 taken then pieced together. If these are split into groups of 8
 and converted back to characters, it results in the original 
 encoded string.
 """
 self._print('Reading image...', 0)
 with UsefulThings.TimeThis(print_time=self.allow_print) as t:
 image_input = self._read_image(self.path, 
 require_png=True)
 self._time('Read image', t)
 #Get data and header
 image_data = UsefulThings.flatten_list(image_input.getdata())
 image_header = str(image_data[0])
 bytes_per_pixel = int(image_header[-1])
 nb_parts = int(image_header[:-1])
 image_data = image_data[1:]
 self._time('Processed image', t)
 #Get the required pixels as binary, piece together, and
 # convert back to int
 if bytes_per_pixel != 8:
 image_data_parts = [str(bin(i))[2:].zfill(8)[8-bytes_per_pixel:]
 for i in image_data]
 image_data_binary = ''.join(image_data_parts)
 image_data_split = UsefulThings.split_list(image_data_binary, 8)
 num_bytes = int(''.join(image_data_split[:nb_parts]), 2)
 truncated_data = [int(i, 2) for i in 
 image_data_split[nb_parts:num_bytes + nb_parts]]
 else:
 #Does same as above, but avoids converting the entire 
 # data to binary to save time
 nb_raw = image_data[:nb_parts]
 nb_binary = ''.join(str(bin(i))[2:].zfill(8) 
 for i in nb_raw)
 num_bytes = int(nb_binary, 2)
 truncated_data = image_data[nb_parts:num_bytes + nb_parts]
 self._time('Got required data', t)
 decoded_data = self.decode(''.join(chr(number) 
 for number in truncated_data))
 self._time('Decoded data', t)
 return decoded_data
 def encode(self, data, b64=False):
 """Encode the image data from the raw input into something that
 can be converted into bytes.
 """
 data = zlib.compress(cPickle.dumps(data))
 if b64:
 data = base64.b64encode(data)
 return data
 def decode(self, data, b64=False):
 """Decode the data returned from encode()."""
 try:
 if b64:
 data = base64.b64decode(data)
 data = cPickle.loads(zlib.decompress(data))
 return data
 except (TypeError, cPickle.UnpicklingError, zlib.error):
 raise ImageStoreError('failed to decode image data')
 def validate_client(self, client):
 """Validate the client manually since pyimgur only gives a
 generic 'Exception' if it is invalid.
 """
 if isinstance(client, pyimgur.Imgur):
 try:
 return (client.client_secret is not None
 and client.refresh_token is not None)
 except AttributeError:
 pass
 return False
 def choose_client(self, imgur_auth):
 """Check the input client or return the default one.
 If an authenticate attempt is successful, the client data 
 will be written to the config file. This can be automatically
 deleted by setting imgur_auth to False.
 Parameters:
 imgur_auth (str, bool, pyimgur.Imgur): The client used
 by the Imgur API.
 The code will make an attempt to read it from 
 multiple formats.
 Order of attempts:
 Decode input string
 Prompt for login details if imgur_auth == True
 Read raw output from pyimgur.Imgur
 Reset config value if imgur_auth == False
 Read and decode config string
 Use default value
 """
 used_auth = None
 client = None
 config_parser = UsefulThings.NewConfigParser(ISGlobals.location)
 #Decode client from input string or ask for input
 if isinstance(imgur_auth, str):
 try:
 client = self.decode(imgur_auth.strip(), b64=True)
 if not self.validate_client(client):
 raise ImageStoreError()
 except (ImageStoreError, AttributeError):
 imgur_auth = None
 client = None
 else:
 used_auth = 'Decoded input'
 if (client is None 
 and (isinstance(imgur_auth, pyimgur.Imgur) or imgur_auth is True)):
 if imgur_auth is True:
 imgur_auth = imgur_log_in()
 #Use raw input as client 
 if self.validate_client(imgur_auth):
 #Write to config
 config = {'ImageStore': {'LastImgurClient': encoded_client}}
 encoded_client = self.encode(imgur_auth, b64=True)
 config_parser.write(config, 
 write_values=True, 
 update_values=True)
 self._print('Your encoded client string is as follows. '
 'Use this for imgur_auth to bypass the login.', 1)
 self._print(encoded_client, 1)
 used_auth = 'Input'
 client = imgur_auth
 else:
 #Move onto next part instead
 imgur_auth = None
 if client is None and not isinstance(imgur_auth, pyimgur.Imgur):
 #Reset
 try:
 if imgur_auth is False:
 raise ImageStoreError()
 #Read config for previous client info
 config_values = config_parser.read()['ImageStore']
 encoded_client = config_values['LastImgurClient']
 client = self.decode(''.join(encoded_client), b64=True)
 if not self.validate_client(client):
 raise ImageStoreError()
 used_auth = 'Previously cached client'
 #Use default value
 except (KeyError, ImageStoreError):
 if not self.validate_client(client):
 #Reset config
 config = {'ImageStore': {'LastImgurClient': ''}}
 config_parser.write(empty_config,
 write_values=True, 
 update_values=True)
 #Use default client
 encoded_client = ['eJxtkEtPwzAQhO/+I+0JZf1YO0ckQIpUuL',
 'T0GvmxaS2aEMUpUv89diooIC6RNd9kRrPr',
 'OF5ifzhPrFm+I7B1GDnbriY70yn2cW7Pia',
 'bltWKjYC9pu4qptef5SMMcfbaFDCRrqopl',
 '9kFT7C5ZUVmBovxOmihRScIl6cb8Kea8rx',
 '79h17/7G0c4nDI3Czcek8ptfP7Gw1ZrNne',
 'E1i0VPNaO+ODANTKmBpk6IKiTiqP6I3SeW',
 'jF/un/2QGwlFxBG8tKKJepAlTGcOs6xEC+',
 'yILdjIn8tCwEmc0KpVaA2awrbSU3ngunlH',
 'GBG8EtOVVbAKuLX5WUh8en+9fNrt00z82u',
 'qMgauJ52oi5f7/i9FzTbIxqBXAXnuDAOrN',
 'R5MrigKutqcMgJbfCCZ7dhyd19ArUPmi4=']
 client = self.decode(''.join(encoded_client), b64=True)
 used_auth = 'Default client'
 self._print('Imgur authentication being used: {}'.format(used_auth), 1)
 return client
def imgur_log_in():
 client = pyimgur.Imgur('0d10882abf66dec', 
 '5647516abf707a428c23b558bd2832aeb59a11a7')
 auth_url = client.authorization_url('pin')
 webbrowser.open(auth_url)
 pin = raw_input('Please enter the pin number shown... ')
 print pin
 try:
 client.exchange_pin(pin)
 except requests.HTTPError:
 print 'Error: Invalid pin number'
 return None
 return client

Here's some examples of usage with the output to show how long it takes.

Write a large mp3:
>>> with open('B:/Music/Linkin Park/13 - Numb.mp3') as f:
... is = ImageStore().write(f.read(), upload_image=True)
Writing to image...
 3.41 seconds: Encoded data
 3.45 seconds: Calculated header
 3.51 seconds: Calculated image size
 3.51 seconds: Padded data
 5.34 seconds: Created image
 6.33 seconds: Saved file
 Path to file: D:/Peter/Documents/ImageStore/ImageDataStore.png
 Imgur authentication being used: Previously cached client
 47.85 seconds: Uploaded image
 Image failed to upload correctly - file was too large.
 51.63 seconds: Verified file
Total time taken: 51.63 seconds
{'path': 'D:/Peter/Documents/ImageStore/ImageDataStore.png', 'size': 7333390}
Write a smaller mp3:
>>> with open('B:/Music/Linkin Park/Pushing Me Away.mp3') as f:
... is = ImageStore().write(f.read(), upload_image=True)
Writing to image...
 1.86 seconds: Encoded data
 1.88 seconds: Calculated header
 1.92 seconds: Calculated image size
 1.92 seconds: Padded data
 2.89 seconds: Created image
 3.41 seconds: Saved file
 Path to file: D:/Peter/Documents/ImageStore/ImageDataStore.png
 Imgur authentication being used: Previously cached client
 24.19 seconds: Uploaded image
 Link to image: http://i.imgur.com/H4mLTI5.png
 Link to delete image: http://imgur.com/delete/bRFxgtiFOannUjV
 29.29 seconds: Verified file
Total time taken: 29.29 seconds
{'url': 'http://i.imgur.com/H4mLTI5.png', 'path': 'D:/Peter/Documents/ImageStore/ImageDataStore.png', 'url_delete': 'http://imgur.com/delete/bRFxgtiFOannUjV', 'size': 4129767}
Save mp3 from URL:
>>> with open('B:/new file.mp3', 'w') as f:
... f.write(ImageStore('http://i.imgur.com/H4mLTI5.png').read())
Reading image...
 1.23 seconds: Read image
 1.61 seconds: Processed image
 1.63 seconds: Got required data
 2.69 seconds: Decoded data
Total time taken: 2.70 seconds
#B:/new file.mp3 still contains the exif data too, obviously I'm not going to leave that link up though :P
To generate the image below:
>>> with open(some_wallpaper) as f:
... ImageStore().write(f.read(), upload_image=True, custom_image='http://topwalls.net/wallpapers/2012/01/Nature-sea-scenery-travel-photography-image-768x1366.jpg')
Writing to image...
 0.27 seconds: Encoded data
 0.27 seconds: Calculated header
 2.90 seconds: Read custom image
 Image resolution: 1366x768 (1049088 pixels)
 Input data: 582474 bytes
 Checking the smallest possible bits per byte to use...
 1: Up to 4659792 bytes
 2: Up to 2329896 bytes
 3: Up to 1553264 bytes
 4: Up to 1164948 bytes
 5: Up to 931958 bytes
 2.91 seconds: Calculated bits to use
 3.80 seconds: Processed input data and image
 5.65 seconds: Reduced bits of custom image
 7.20 seconds: Merged input data with custom image
 7.97 seconds: Created image
 8.39 seconds: Saved file
 Path to file: D:/Peter/Documents/ImageStore/ImageDataStore.png
 Imgur authentication being used: Previously cached client
 21.28 seconds: Uploaded image
 Link to image: http://i.imgur.com/RRUe0Mo.png
 Link to delete image: not telling
 29.04 seconds: Verified file
Total time taken: 29.05 seconds

If you run this image through the code and save the data with a jpg extension, you'll get a 1080p leafy wallpaper. It needed 5 bits to store the data, so it's dropped the quality quite considerably, normally it's not really noticeable. image

Link to the latest version is here.

Question 2

Please do not update the code in your question once it has been answered - that invalidates the answers. If you would like further advice, ask a followup question. See this question on meta and the related/linked questions therein.

Question 3

Ah right thanks, wasn't aware of that, I'll put up a link to where the latest version is then

Question 4

In short, your class is too big. Try splitting it into two or three classes, Steganography, Imgur and URLOpen.

Making Steganography it's own class is a must for this program.

Line's

First off, I tried following that 72/80 line thing for the first time

It's really good that you done that. For your strings say encoded_client. You can use operator-less raw string concatenation. Also if you don't like on delimiter lining you can use an alternate.

encoded_client = (
 'eJxtkEtPwzAQhO/+I+0JZf1YO0ckQIpUuL'
 'T0GvmxaS2aEMUpUv89diooIC6RNd9kRrPr'
 ...
 'R5MrigKutqcMgJbfCCZ7dhyd19ArUPmi4='
)

I think that looks nicer. And now you don't have to care about concatenating them.

As for your cuts/renames. I personally would say that that good variable names are better than 81 characters in a line. However normally you can do something to achieve both. First, there is normally a problem if you are indenting more than say 5 times. You could probably re-factor the code into two or more functions.

# This is kinda hard to read.
# But is allowed.
decoded_data = self.decode(''.join(chr(number) 
 for number in truncated_data))
 
# This is just wrong.
decoded_data = self.decode(''.join(chr(number) 
 for number in truncated_data))
# This is one way to solve the problem
decoded_data = self.decode(
 ''.join(chr(number) for number in truncated_data)
)
# And an alternate
decoded_data = self.decode(
 ''.join(chr(number) for number in truncated_data))

I prefer the third way, however some prefer the forth way. So if you see the third way and not the forth way in my review then either is good.

As for very long comprehensions I think splitting them into multiple lines is very good. And you do that for the most part.

# Harder to read.
(chunk for chunk in my_list if chunk)
# Easier to read
(
 chunk
 for chunk in my_list
 if chunk
)

The above is what I find. I know that I use the former more than the latter, but I find the latter easier to read, and if my_list was a generator, then you could easily merge them together.

General

global is used to change things in global scope when in a lower scope. While I can see override_upload in a function, I can't change it.

It's like there is a number on a display past some glass. You can see it, and you know what it is. But when you try to use your pen to overwrite it you just get the writing on the glass. Now you can't see the number on display, as your new writing looks like it's on display. And everyone behind you sees yours as if it were the writing on display.

In short, you're not behind any glass. And so you don't need global.
Try to limit the size of trys. You have two imports in a try. Change this so there are two different trys, and do in if if you need to limit the second try.
I hate my_dict = {};my_dict['a'] = 'a'. It seems stupid, you can just put the letters in the dict. This both stops you slaving away when changing the name of the dict. And make it so that I know what the dict contains straight away. my_dict = {'a': 'a'}.
Using on delimiter line calls impairs your programs readability. It's easier to just read on the left side of the screen, then to jump from left to right.
Your ISGlobals.get function can be removed if you are ok with changing the classes __dict__. self.__dict__.update(global_dict) is how you would do this. However if you can't change your program to not use strings you would have to use getattr, and that can get ugly. Or you could use the dictionary instead.
URL's are normally very alike to Unix paths. Rather than re-inventing the wheel, you can use posixpath. This will allow for possibly safer path changes as your current path methods aren't too safe.
I think mixing opening via URL and system is a very bad Idea. I would rather as a developer have the option for one or the other, I don't want to download some file, when I specified a file on the system.
Your write is way too long. To be honest I kinda don't want to read something that large. It's ~300 lines long, and has indents up to and including 6. It should really be split up into more functions no matter how you look at it.
I think you really need to remove the Imgur implementation in this program. All it does is add complexity to your program.

First I think that your class should have an 'encode' - put data in the image, and a 'decode' - take information out of it.

Then you can pass a PIL image to it, and the functions do there magic. No confusion over how to handle Imgur, no confusion over reading or writing and no compression.

Your aim should not be to throw as many features into this as possible. (Just to note I'm not saying don't write this with Imgur support, that's a really cool idea.) I just think Imgur should be designed afterwards. And defiantly not implemented in the same class, let alone the same functions.

My preferance on API's

The way I would like to run you program is something like the following:

file_path = '...'
# The option for me to NOT use `load_url`.
image = (
 load_url(file_path)
 if is_url(file_path) else
 load(file_path)
)
image = encode('My message', image, bits=4)
save_image(image)

This is way more customisable, and allows me the end user to have more control over the system. If I don't want to upload to Imgur, as shown above, then I don't. If I don't want to download the file, I don't. If I want to encode it, then I add a function to do that. While this will make a harder to understand main, it's better that way as then your code is easier to maintain.

Lets say I use none of the protocols 'http', 'https' or 'ftp', but I have a way to use some other protocol, and can get the image into a PIL image. Can I use your program? Not really.

write

Some of your error messages are not descriptive, and can be taken as rude.

'well done, you just broke the laws of physics'

Changing that to 'message too large for image' is better.

Also you should do the check before encoding the data... We might be 'dumb', but we don't want our time waisted.

You have a lot of noise surrounding all your code, and so I don't fully understand it.

But I would really recommend splitting it into functions. You could probably make the functions:

Make header
Custom image
Normal image
Create image (Draw image and Assign pixels)
Save image
Upload image

But not only that, you could probably make some smaller helper functions too.

How I would implement the encoding

But if I were to implement your encode function, I would take message, image (list) and bits. I would then do binary logic to get the numbers. You could possibly get a faster system if you store the numbers inverted.

# Byte = 8 bits
# Chunk = `bits` bits
# Converts from a byte to bits then to chunks
def from_bytes(message, bits):
 bit = 0
 total = 0
 for byte in message:
 for _ in range(8):
 bit += 1
 total <<= 1
 total += (byte & 128) >> 7
 if bit == bits:
 yield total
 bit = 0
 total = 0
 byte <<= 1
 while True:
 yield 0
# Add image and message data.
def encode(message, image, bits):
 NOT_BITS = ~((1 << bits) - 1)
 return (
 (byte & NOT_BITS) + data
 for byte, data in zip(image, from_bytes(message, bits))
 #for byte, data in itertools.izip(image, from_bytes(message, bits)) # Python2
 )
# Modified decode. Non-8bits one.
# I removed the header for simplicity
def decode(image, bits):
 image_data = ''.join(
 str(bin(i))[2:].zfill(8)[8 - bits:]
 for i in image
 )
 return [
 int(j, 2)
 for j in (
 image_data[i:i + 8]
 for i in range(0, len(image_data), 8)
 )
 ]

If you were to link this up with PIL to get proper encoding and decoding. Then you force the user to pass a PIL image, then you could drasticly reduce the complexity of your program.

It allows you to do things like:

# You may want to have `chr` and `ord` done inside,
# but this is more flexible.
''.join(map(chr, decode(encode(map(ord, 'Hi'), [0, 0, 0, 0], 4), 4)))
# Hi

So I would say that it's safe. But I would put a check on the size of the bits.

Also slap PIL onto this and put it into a class and then you have half your program in almost no lines. Where you could encode or decode any image.

Then you just need to get a couple of functions/a class to make the Imgur and other web aspects of the program. And it should be much easier to read.

Performance wise the above is the best I could get, but I didn't experiment too much. My speed test and results.

Program design

As we talked a little more about your code. It seems that you don't know how you should split your code up. What should a class do, what should or shouldn't be in one, etc.

As your program is very large, we want to minimise the amount of arguments into your function. And you should really aim at removing some aspects of the program from the main class.

As the end user may just want to encode a list, and use something other than PIL, you would want to give them that option. I know that I wouldn't really use the Imgur part, as I have no need to upload to Imgur when I have enough storage here. But others may want to. To be honest I like when the end user can use an API as if it's just like using a Python one. Simple, clean, and nice to read.

from PIL import Image
import urllib2
image_location = 'my_image_location.png'
new_image = 'my_image_location_new.png'
load_from_url = False
add_header = True
message = "Haya there I'm a message!"
amount_of_bits = 3
# Open an image.
image = Image.open(
 StringIO.StringIO(urllib2.urlopen(image_location).read())
 if load_from_url else
 image_location
).convert('RGB')
width, height = image.size
# Add header if you wish.
if add_header:
 header = Steganography.generate_header(message, amount_of_bits)
 message = header + message
# Flatten the image and encode it
encoded_generator = Steganography.encode(
 message,
 (
 j
 for i in image.getdata()
 for j in i
 )
 amount_of_bits
)
# Stolen directly from your write function
# May be worth making this a function, but not in `Steganography`
encoded_image = Image.new('RGB', (width, height))
encoded_data = encoded_image.load()
for y in range(height):
 for x in range(width):
 image_data[x, y] = (
 next(encoded_generator),
 next(encoded_generator),
 next(encoded_generator)
 )
encoded_image.save(new_image)

So this is way more verbose than the way you have it at the moment. However, it is much simpler to read, understand, modify and use.

There are 6/7 variables that will change on an end user using the entire of your program. However this does not mean that it's perfect to go in a function! You can't imagine all the different possibilitys that the end user needs, however you know how to write a function to perform a type of steganography. And so your class should only contain the information to do that.

In the above there is only two functions that aren't defined. Steganography.generate_header and Steganography.encode. I already put how I would make the encode, and I'm sure you know how to implement your header. However the last bit could be put in a function, make_image below.

class Steganography:
 @static
 def generate_header(message, bits):
 # return header
 
 @static
 def read_header(iterator, bits):
 # you could do some cool stuff here.
 # But I'll let you try it yourself,
 # have a look at `make_image` for reference
 
 @static
 def encode(message, iterator, bits):
 # return encode_generator
 
 @static
 def decode(iterator, bits):
 # return message
 
class ImageHelpers:
 @static
 def make_image(iterator, height, width):
 encoded_image = Image.new('RGB', (width, height))
 encoded_data = encoded_image.load()
 # Work for anything that can work in a for loop.
 iterator = iter(iterator)
 for y in range(height):
 for x in range(width):
 image_data[x, y] = (
 next(iterator),
 next(iterator),
 next(iterator)
 )
 return encoded_image
 
 # You gave the option to pass no image.
 @static
 def make_blank_image(...):

If you re-write your code with only this input, no global variables and to work in the example above you will have easier to maintain code. There will no longer be any noise in the hardest parts of the program. And should also be small like the examples I gave above.

At first you should not 'just get your code to work'. You should get it to work on it's own. If I asked for a packet of 'chips', then I would expect 'chips', not 'chips' with 'a sausage' and 'some fish'. Just like when I ask for 'steganography' I would expect to get 'steganography', not 'steganography' with 'PIL' and 'Imgur'.

After building this base then you can edit the code to do all the default/global stuff. Also, I would recommend avoiding read and write in the steganography class, this is as it will add PIL and Imgur to the steganography class. If you must do that, then I would make new functions to do it for you.

Question 5

Let us continue this discussion in chat.

Question 6

There is quite a lot of code to review, so I'm sorry if I miss out on the details. However let us give it a go, and give you some comments on the style and code of your solution.

Style review

Good variable and function names – It seems like you overall has quite good variable names, and are consistent in your naming. You do have some exceptions like output_StringIO but they are understandable why you use those
Indentation is mostly good – This is almost a given using Python, however in some places you could use the trick of inverting the test, and breaking/returning to reduce nesting level. See code below
```
#Instead of this, which increases indentation level...
if save_image:
 # Save image code
# ... you could revert it, and do
if not save_image:
 return # or break
```

Save image code

Switch if/else blocks – In some cases you have really short else blocks, but lengthy if blocks. If you switch these around, it is easier to follow your logic, instead of browsing a lot up and down to verify what the actual condition was, and why am I here.
```
if my_positive_condition:
 # 50 lines of code
else:
 # 2 lines of code
# Replace with:
if not my_positive_condition:
 # 2 lines of code
else: 
 # 50 lines of code
```
Seems to be good comments and docstrings – Seemingly appropriate level of documenting and commenting code. Not too much unneccessary comments, either. Also like the self._print(), which easily allows you remove all output
Try avoiding magic numbers – Numbers like 8 are references throughout the code, and could be replaced by a constant like BITS_PR_BYTE. Not a major biggie, but if you were to change into another bit format, i.e. 16 bits imaging, you would have a lot of places to clean up
Use more vertical spacing – I would add some vertical spacing here and there. Especially like at the start of write() were you have a sequence of if/if/elif/if/else thightly connected. This is easily read as one block of if/elif/elif/elif/else.

Code review

In general I like the idea of yours to store extra data in the lower bits of the image data. I would possibly be a little more strict and not allow the entire image to be replaced by data, but that is merely a point of view. However there are some other code aspects I would like address:

Add a magic header – I would strongly recommend to add a magic header in the first bytes, so that you are able to recognize if this is one of your files, or if it has been tampered (or simply is just another plain image file). One option could be to encode a magic string in the lower 2/3 bits of the first 10 pixels, which would be similar in all image processed by you
Split the larger functions – The write() and read() methods are way to large. As a general rule of thumb, most methods should fit on the screen without the need of scrolling. This would also reduce the parameter count, which is also a good thing
Consider introducing more classes – It seems like you could consider having one class for handling the image, another one for handling upload, and possibly also another for handling actual file write/read operations. Don't be afraid of classes if they can help you increase readability and general understanding of how to do your work. Monolithic code is never good...
Verify your logic in ISGlobals – The class starts of with code not belonging to any class/instance methods? And within the __init__() method you seem to (re-)assign to required_globals ignoring that you've already have something there. And to finally set them based on the result of a config_parser.write() also seems strange...

Question 7

Thanks a lot for the tips. I did think something looked strange about a small else after a big if, so I'll get them switched. I didn't think of 16 bit images, I'll get it under bits_per_byte_max then maybe later add a check from PIL. Replacing the entire image is mainly for aesthetic purposes, since if I don't, it's a very obvious line where the data has finished. I had an end of data marker in my old code similar to your magic header, so I suppose I could implement something like that at the start. read() isn't so long, but Joe suggested some ways of breaking up write()

Question 8

I tried to cut down on classes are someone said on some earlier code that I should stick it all under one class haha, but I'll try split it into a few. The purpose of ISGlobals is to set default values, but then read the config to see if the user has specified something else (then one final pass to get all the values together). required_globals is only really used for comments that should always be in the config file. I'll make the config_parser.read() function accept an input sometime in the future, but currently it needs to be write() to update and return all the values passed in.

Question 9

@Peter, A lot related to coding comes down to personal preferences, but most do agree on the somewhat flimsical statement of "most methods should fit on the screen without the need of scrolling". You are of course free to do whatever you like, but the easier it reads the easier it is to maintain both now and in the future!

Question 10

I still feel a bit funny about the 80 characters but will stick with it haha, but I've not heard the fit on screen without scrolling thing before so I'll see if I can cut it down. I just thought there'd be no point making a new function if it's only used once throughout the class, assuming the comments show what it's doing.

Question 11

The point with the "one screen" is that you can see the overview of what happens easily. Some state is hard as if you can't see what the function does directly, it is too complicated. You might have heard of the "Single responsiblity"...

Peilonrayz ♦Peilonrayz 44.4k7 gold badges80 silver badges157 bronze badges · Accepted Answer · 2015-10-01 22:35:57Z

In short, your class is too big. Try splitting it into two or three classes, Steganography, Imgur and URLOpen.

Making Steganography it's own class is a must for this program.

Line's

First off, I tried following that 72/80 line thing for the first time

It's really good that you done that. For your strings say encoded_client. You can use operator-less raw string concatenation. Also if you don't like on delimiter lining you can use an alternate.

encoded_client = (
 'eJxtkEtPwzAQhO/+I+0JZf1YO0ckQIpUuL'
 'T0GvmxaS2aEMUpUv89diooIC6RNd9kRrPr'
 ...
 'R5MrigKutqcMgJbfCCZ7dhyd19ArUPmi4='
)

I think that looks nicer. And now you don't have to care about concatenating them.

As for your cuts/renames. I personally would say that that good variable names are better than 81 characters in a line. However normally you can do something to achieve both. First, there is normally a problem if you are indenting more than say 5 times. You could probably re-factor the code into two or more functions.

# This is kinda hard to read.
# But is allowed.
decoded_data = self.decode(''.join(chr(number) 
 for number in truncated_data))
 
# This is just wrong.
decoded_data = self.decode(''.join(chr(number) 
 for number in truncated_data))
# This is one way to solve the problem
decoded_data = self.decode(
 ''.join(chr(number) for number in truncated_data)
)
# And an alternate
decoded_data = self.decode(
 ''.join(chr(number) for number in truncated_data))

I prefer the third way, however some prefer the forth way. So if you see the third way and not the forth way in my review then either is good.

As for very long comprehensions I think splitting them into multiple lines is very good. And you do that for the most part.

# Harder to read.
(chunk for chunk in my_list if chunk)
# Easier to read
(
 chunk
 for chunk in my_list
 if chunk
)

The above is what I find. I know that I use the former more than the latter, but I find the latter easier to read, and if my_list was a generator, then you could easily merge them together.

General

global is used to change things in global scope when in a lower scope. While I can see override_upload in a function, I can't change it.

It's like there is a number on a display past some glass. You can see it, and you know what it is. But when you try to use your pen to overwrite it you just get the writing on the glass. Now you can't see the number on display, as your new writing looks like it's on display. And everyone behind you sees yours as if it were the writing on display.

In short, you're not behind any glass. And so you don't need global.
Try to limit the size of trys. You have two imports in a try. Change this so there are two different trys, and do in if if you need to limit the second try.
I hate my_dict = {};my_dict['a'] = 'a'. It seems stupid, you can just put the letters in the dict. This both stops you slaving away when changing the name of the dict. And make it so that I know what the dict contains straight away. my_dict = {'a': 'a'}.
Using on delimiter line calls impairs your programs readability. It's easier to just read on the left side of the screen, then to jump from left to right.
Your ISGlobals.get function can be removed if you are ok with changing the classes __dict__. self.__dict__.update(global_dict) is how you would do this. However if you can't change your program to not use strings you would have to use getattr, and that can get ugly. Or you could use the dictionary instead.
URL's are normally very alike to Unix paths. Rather than re-inventing the wheel, you can use posixpath. This will allow for possibly safer path changes as your current path methods aren't too safe.
I think mixing opening via URL and system is a very bad Idea. I would rather as a developer have the option for one or the other, I don't want to download some file, when I specified a file on the system.
Your write is way too long. To be honest I kinda don't want to read something that large. It's ~300 lines long, and has indents up to and including 6. It should really be split up into more functions no matter how you look at it.
I think you really need to remove the Imgur implementation in this program. All it does is add complexity to your program.

First I think that your class should have an 'encode' - put data in the image, and a 'decode' - take information out of it.

Then you can pass a PIL image to it, and the functions do there magic. No confusion over how to handle Imgur, no confusion over reading or writing and no compression.

Your aim should not be to throw as many features into this as possible. (Just to note I'm not saying don't write this with Imgur support, that's a really cool idea.) I just think Imgur should be designed afterwards. And defiantly not implemented in the same class, let alone the same functions.

My preferance on API's

The way I would like to run you program is something like the following:

file_path = '...'
# The option for me to NOT use `load_url`.
image = (
 load_url(file_path)
 if is_url(file_path) else
 load(file_path)
)
image = encode('My message', image, bits=4)
save_image(image)

This is way more customisable, and allows me the end user to have more control over the system. If I don't want to upload to Imgur, as shown above, then I don't. If I don't want to download the file, I don't. If I want to encode it, then I add a function to do that. While this will make a harder to understand main, it's better that way as then your code is easier to maintain.

Lets say I use none of the protocols 'http', 'https' or 'ftp', but I have a way to use some other protocol, and can get the image into a PIL image. Can I use your program? Not really.

write

Some of your error messages are not descriptive, and can be taken as rude.

'well done, you just broke the laws of physics'

Changing that to 'message too large for image' is better.

Also you should do the check before encoding the data... We might be 'dumb', but we don't want our time waisted.

You have a lot of noise surrounding all your code, and so I don't fully understand it.

But I would really recommend splitting it into functions. You could probably make the functions:

Make header
Custom image
Normal image
Create image (Draw image and Assign pixels)
Save image
Upload image

But not only that, you could probably make some smaller helper functions too.

How I would implement the encoding

But if I were to implement your encode function, I would take message, image (list) and bits. I would then do binary logic to get the numbers. You could possibly get a faster system if you store the numbers inverted.

# Byte = 8 bits
# Chunk = `bits` bits
# Converts from a byte to bits then to chunks
def from_bytes(message, bits):
 bit = 0
 total = 0
 for byte in message:
 for _ in range(8):
 bit += 1
 total <<= 1
 total += (byte & 128) >> 7
 if bit == bits:
 yield total
 bit = 0
 total = 0
 byte <<= 1
 while True:
 yield 0
# Add image and message data.
def encode(message, image, bits):
 NOT_BITS = ~((1 << bits) - 1)
 return (
 (byte & NOT_BITS) + data
 for byte, data in zip(image, from_bytes(message, bits))
 #for byte, data in itertools.izip(image, from_bytes(message, bits)) # Python2
 )
# Modified decode. Non-8bits one.
# I removed the header for simplicity
def decode(image, bits):
 image_data = ''.join(
 str(bin(i))[2:].zfill(8)[8 - bits:]
 for i in image
 )
 return [
 int(j, 2)
 for j in (
 image_data[i:i + 8]
 for i in range(0, len(image_data), 8)
 )
 ]

If you were to link this up with PIL to get proper encoding and decoding. Then you force the user to pass a PIL image, then you could drasticly reduce the complexity of your program.

It allows you to do things like:

# You may want to have `chr` and `ord` done inside,
# but this is more flexible.
''.join(map(chr, decode(encode(map(ord, 'Hi'), [0, 0, 0, 0], 4), 4)))
# Hi

So I would say that it's safe. But I would put a check on the size of the bits.

Also slap PIL onto this and put it into a class and then you have half your program in almost no lines. Where you could encode or decode any image.

Then you just need to get a couple of functions/a class to make the Imgur and other web aspects of the program. And it should be much easier to read.

Performance wise the above is the best I could get, but I didn't experiment too much. My speed test and results.

Program design

As we talked a little more about your code. It seems that you don't know how you should split your code up. What should a class do, what should or shouldn't be in one, etc.

As your program is very large, we want to minimise the amount of arguments into your function. And you should really aim at removing some aspects of the program from the main class.

As the end user may just want to encode a list, and use something other than PIL, you would want to give them that option. I know that I wouldn't really use the Imgur part, as I have no need to upload to Imgur when I have enough storage here. But others may want to. To be honest I like when the end user can use an API as if it's just like using a Python one. Simple, clean, and nice to read.

from PIL import Image
import urllib2
image_location = 'my_image_location.png'
new_image = 'my_image_location_new.png'
load_from_url = False
add_header = True
message = "Haya there I'm a message!"
amount_of_bits = 3
# Open an image.
image = Image.open(
 StringIO.StringIO(urllib2.urlopen(image_location).read())
 if load_from_url else
 image_location
).convert('RGB')
width, height = image.size
# Add header if you wish.
if add_header:
 header = Steganography.generate_header(message, amount_of_bits)
 message = header + message
# Flatten the image and encode it
encoded_generator = Steganography.encode(
 message,
 (
 j
 for i in image.getdata()
 for j in i
 )
 amount_of_bits
)
# Stolen directly from your write function
# May be worth making this a function, but not in `Steganography`
encoded_image = Image.new('RGB', (width, height))
encoded_data = encoded_image.load()
for y in range(height):
 for x in range(width):
 image_data[x, y] = (
 next(encoded_generator),
 next(encoded_generator),
 next(encoded_generator)
 )
encoded_image.save(new_image)

So this is way more verbose than the way you have it at the moment. However, it is much simpler to read, understand, modify and use.

There are 6/7 variables that will change on an end user using the entire of your program. However this does not mean that it's perfect to go in a function! You can't imagine all the different possibilitys that the end user needs, however you know how to write a function to perform a type of steganography. And so your class should only contain the information to do that.

In the above there is only two functions that aren't defined. Steganography.generate_header and Steganography.encode. I already put how I would make the encode, and I'm sure you know how to implement your header. However the last bit could be put in a function, make_image below.

class Steganography:
 @static
 def generate_header(message, bits):
 # return header
 
 @static
 def read_header(iterator, bits):
 # you could do some cool stuff here.
 # But I'll let you try it yourself,
 # have a look at `make_image` for reference
 
 @static
 def encode(message, iterator, bits):
 # return encode_generator
 
 @static
 def decode(iterator, bits):
 # return message
 
class ImageHelpers:
 @static
 def make_image(iterator, height, width):
 encoded_image = Image.new('RGB', (width, height))
 encoded_data = encoded_image.load()
 # Work for anything that can work in a for loop.
 iterator = iter(iterator)
 for y in range(height):
 for x in range(width):
 image_data[x, y] = (
 next(iterator),
 next(iterator),
 next(iterator)
 )
 return encoded_image
 
 # You gave the option to pass no image.
 @static
 def make_blank_image(...):

If you re-write your code with only this input, no global variables and to work in the example above you will have easier to maintain code. There will no longer be any noise in the hardest parts of the program. And should also be small like the examples I gave above.

At first you should not 'just get your code to work'. You should get it to work on it's own. If I asked for a packet of 'chips', then I would expect 'chips', not 'chips' with 'a sausage' and 'some fish'. Just like when I ask for 'steganography' I would expect to get 'steganography', not 'steganography' with 'PIL' and 'Imgur'.

After building this base then you can edit the code to do all the default/global stuff. Also, I would recommend avoiding read and write in the steganography class, this is as it will add PIL and Imgur to the steganography class. If you must do that, then I would make new functions to do it for you.

\$\begingroup\$ Let us continue this discussion in chat. \$\endgroup\$

Peilonrayz
– Peilonrayz ♦

2015年10月02日 20:12:44 +00:00
Commented Oct 2, 2015 at 20:12

Stack Exchange Network

Storing data in the pixels of an image

2 Answers 2

Line's

General

My preferance on API's

write

How I would implement the encoding

Program design

Style review

Save image code

Code review

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Storing data in the pixels of an image

2 Answers 2

Line's

General

My preferance on API's

write

How I would implement the encoding

Program design

Style review

Save image code

Code review

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions