I've recently redone something I wrote around a year ago since I'd done it wrong (it required the original image to decode an imagine, which can be avoided), so I'm wondering if I could either get feedback on the style of writing, or on which bits can be broken, since I've tried to make it fairly robust. The idea is to store stuff in images by modifying the pixels, so you could send files over image hosts and stuff like that.
First off, I tried following that 72/80 line thing for the first time, and had to do some not so nice cuts/renames, especially with the list comprehension bits, so that's why a lot of stuff is split over multiple lines.
The one bit I'm not so sure about is the Imgur authentication. Since user accounts get a 4MB upload limit and anonymous accounts get a 1MB upload limit (and this obviously requires all images to stay as PNG), I've kind of forced the code to use an account if upload is enabled. If it is not the users account (there is an option to authenticate your own account), it'll just upload to a default account I set up, where the pyimgur.Imgur
client is un-pickled from a string. If you successfully authenticate, it'll use your account instead until you use imgur_auth = False
, where it'll reset it.
So, as to the actual code, if you input an image to write over, it'll calculate the minimum bits it can use to store the required data. The first red value will be used as a header, then the data will be split and replace the last few bits of each colour in the image. This goes up to 7, then after that at 8, there is no point in keeping the original resolution since the original image would exist no more, so it just reverts to as if you never input an image.
If you don't input an image, you can choose a ratio since the resolution can be anything, and it defaults to '16:9'
. If you input an image and it reverts to this method, it calculates the ratio from the resolution of the image.
I mentioned a header before and I go into more detail in the code, but basically, the header is a number between 0 and 255, where the final integer is the bits used per colour, and the first two is how many bytes are used to store the length of data. If it is 28, that means 2 bytes are used to store the data length. You would get the data length from the numbers at data[1:3]
, by converting them to binary, then putting them together and converting back to an integer. As a simple example, if data[1:3] = [255, 255]
, pieced together would be 1111111111111111
, converted back to int would be 65535
. The data would be stored at data[4:65539]
. The maximum value this can represent is around 2.6*10^239
.
If the bits per colour is below 8, you need to get the last n bits of every single colour, join it together in one big string, then split every 8 characters.
Also, because I poorly implemented a default settings class last time, I got a config file working this time. I initially did it with ConfigParser
, but got loads of bugs, so I just wrote a new thing from scratch based off it.
Requirements: PIL, requests, pyimgur, UsefulThings (dropbox link)
UsefulThings is just a group of functions and stuff that I can reuse elsewhere so this code doesn't get messy with non related stuff. Link to it here, you can just copy the functions into the code then find and replace 'UsefulThings.'
with ''
.
from __future__ import division
from collections import defaultdict
from math import log
import os
import random
import cPickle
import StringIO
import urllib2
import webbrowser
import zlib
import UsefulThings
import logging
logging.getLogger('requests').setLevel(logging.WARNING)
try:
from PIL import Image
except ImportError:
raise ImportError('python imaging library module was not found')
#Import internet bits, stop upload features if any don't exist
global override_upload
override_upload = False
try:
import pyimgur
import requests
except ImportError:
output_text = ['Warning: Error importing pyimgur',
', disabling the upload features.']
try:
import requests
except ImportError:
output_text = output_text[0] + [' and requests'] + output_text[1]
print ''.join(output_text)
override_upload = True
class ISGlobals(object):
"""Class for determining default values, where it will read any
changes from a config file.
There are three different groups of variables:
Required (shown and are reset if edited) - This would mainly be
for things to show the user, such as comments.
Default (shown and can be edited) - These are the core variables
used in the code, things the user would normally pass in
through the main function.
Hidden (not shown but can be edited) - These are the less
important variables, which aren't normally needed. Things
such as overrides should go here, which can be used if a
program uses this code and needs to force a few things to
happen.
By default, the config file is stored in appdata, though this has
only been tested if it works on Windows.
"""
#Build list of directories
link_dict = {}
link_dict['%USERDIR'] = os.path.expanduser( "~" ).replace( "\\", "/" )
link_dict['%PYTHONDIR'] = os.getcwd().replace( "\\", "/" )
link_dict['%APPDATA'] = os.getenv('APPDATA')
location = '%APPDATA/VFXConfig.ini'
for k in link_dict:
location = location.replace(k, link_dict[k])
def __init__(self, save_all=False):
"""Define the default values, then check them against the
values stored in the config.
"""
#Set to true to overwrite all values
reset_config = False
required_globals = defaultdict(dict)
default_globals = defaultdict(dict)
hidden_globals = defaultdict(dict)
rg = required_globals['ImageStore']
dg = default_globals['ImageStore']
hg = hidden_globals['ImageStore']
rg[(';You can use %PYTHONDIR, %USERDIR or %APPDATA'
' as links to directories.')] = None
dg['ShowAllValuesHereOnNextRun'] = False
dg['DefaultImageName'] = 'ImageDataStore.png'
dg['DefaultImageDirectory'] = '%USERDIR/ImageStore'
dg['DefaultCustomImage'] = 'http://bit.ly/1G3u3cV'
dg['UseDefaultCustomImageIfNone'] = False
dg['DefaultShouldVerify'] = True
dg['DefaultShouldSave'] = True
dg['DefaultShouldUpload'] = False
dg['DefaultShouldOpenOnUpload'] = False
hg['ForceDefaultVerify'] = False
hg['ForceNoSave'] = False
hg['ForceNoUpload'] = False
hg['ForceNoOpenOnUpload'] = False
hg['ForceDefaultCustomImage'] = False
hg['ForceNoCustomImage'] = False
required_globals = dict(required_globals)
default_globals = dict(default_globals)
hidden_globals = dict(hidden_globals)
#Write the values to the config
config_parser = UsefulThings.NewConfigParser(self.location)
required_globals = config_parser.write(required_globals,
write_values=True,
update_values=True)
default_globals = config_parser.write(default_globals,
write_values=True,
update_values=reset_config)
hidden_globals = config_parser.write(hidden_globals,
write_values=save_all,
update_values=reset_config)
#Get all the important values
#Uses write() so the hidden_globals will be returned too
all_globals = dict(default_globals).copy()
all_globals['ImageStore'].update(hidden_globals['ImageStore'])
self.global_dict = config_parser.write(all_globals,
write_values=False,
update_values=False)
self.global_dict = self.global_dict['ImageStore']
self.global_dict = {k: v for k, v in self.global_dict.iteritems()
if k[0] not in config_parser.comment_markers}
#Convert links to the correct format
for i in self.link_dict:
self.global_dict = {k: (v.replace(i, self.link_dict[i]) if
isinstance(v, str) else v)
for k, v in self.global_dict.iteritems()}
if not save_all and self.global_dict['ShowAllValuesHereOnNextRun']:
ISGlobals(save_all=True)
def get(self, x):
"""Get one of the stored values.
This should be used after assigning ISGlobals to a variable,
so that it doesn't rebuild the list each time a new variable is
looked up.
"""
return self.global_dict[x]
class ImageStoreError(Exception):
pass
class ImageStore(object):
"""Class for writing and reading data in the actual pixels in an
image.
Header Information:
To mark the bits used and end of file, a header is stored at the
beginning of the image.
The first byte of the image (red value of the pixel), is used as
an integer that stores two values. The last number determins how
many bits are used per pixel, and the first two numbers
determine how many bytes are being used to store the number of
bytes of data being held in the image.
The second part is what was just mentioned above, where the
number of bytes is converted to binary and joined, then split
into single bytes to allow it to be written to the image. To
avoid a noticeable line of pixels in the top corner, this is
encoded with the rest of the image. This means if the bits per
pixel value is below 8, the image must be decoded before this
information can be read.
For example, if the number was 123, the image would be storing 3
bits per colour, and the next 12 bytes would contain the length
of data. Once the image is read and original data pieced back
together, bytes at the index of [1:14] will store how many bytes
the data is. The data will then be found at [14:14+length].
As to a practical example, if the number was 48, and the next 4
bytes were [39, 29, 177, 252], the number of bytes would be
656257532, and the data would be at image_data[6:656257538].
The absolute maximum data size through using this method is
2.6*10^239 bytes, which is more than the number of atoms in the
universe, so I think it's safe to say nobody will go above it.
"""
max_data_len = int('1' * 8 * 99, 2)
imgur_main = 'http://i.imgur.com'
imgur_delete = 'http://imgur.com/delete'
imgur_api = 'https://api.imgur.com/3/{}'
def __init__(self, image_path=None, allow_print=True):
"""Format the image path ready to either read or write.
Note that there are no checks on the directory or file, since
you do not need a writeable image for ImageStore.read(), and
the image/path doesn't need to exist for ImageStore.write().
Parameters:
image_path (str or None): Path to the image.
Depending on if the path or filename is not provided,
the code will automatically fill in the blanks with
default values.
allow_print (bool): If printing the current progress of the
code execution is allowed.
Set to False to totally disable all printing.
"""
self.defaults = ISGlobals().get
self.allow_print = allow_print
#Get a valid image path based on the input
default_dir = self.defaults('DefaultImageDirectory')
default_name = self.defaults('DefaultImageName')
self.path = image_path
if image_path is None:
self.path = '{}/{}'.format(default_dir, default_name)
else:
if ':/' not in self.path:
if self.path.startswith('/'):
self.path = self.path[1:]
self.path = '{}/{}'.format(default_dir, self.path)
#Format the extension
self._original_extension = None
path_split = self.path.split('/')
if '.' in path_split[-1]:
self._original_extension = path_split[-1].split('.')[-1]
self.path = (self.path[::-1].split('.', 1)[1])[::-1]
self.path += '.png'
def _save_image(self, image_object):
"""Save the PIL image object to disk.
If the file path doesn't exist, it will be created first.
"""
if UsefulThings.make_file_path(self.path):
try:
image_object.save(self.path, 'PNG')
except IOError:
raise IOError('unable to write image')
else:
raise IOError('unable to write image path: {}'.format(self.path))
def _read_image(self, image_location, require_png=False):
"""Open an image, supports opening from a URL as well as normal
files.
Parameters:
image_location (str): Path to image.
require_png (bool): If the image must be a PNG.
If True and image isn't a PNG, it will throw an error.
"""
if not isinstance(image_location, (str, unicode)):
return None
#Load from URL
if any(value in image_location for value in ('http://',
'https://',
'ftp://')):
try:
image_location = StringIO.StringIO(urllib2.urlopen(
image_location).read())
except urllib2.URLError:
raise urllib2.URLError('failed to load image from URL')
#Open image
try:
return Image.open(image_location).convert('RGB')
except IOError:
#Check if image exists but just not with png extension
if require_png:
if (self._original_extension is not None
and 'png' not in self._original_extension
and '.png' in image_location):
try:
Image.open(image_location.replace('.png',
'.{}'.format(self._original_extension)))
raise ImageStoreError('image format needs to be PNG')
except IOError:
pass
raise IOError('no image file found')
raise IOError('failed to open image file')
def _imgur_request(self, client, url_prefix, parameters):
"""Send request to the Imgur API.
Currently only used for uploading images.
"""
return client._send_request(self.imgur_api.format(url_prefix),
method='POST',
params=parameters)
def _print(self, item, indent_num=0):
"""Print wrapper to allow disabling all messages."""
if self.allow_print:
print '{}{}'.format(' ' * indent_num, item)
def _time(self, verb, TimeThisObject):
"""Wrapper for formatting the time correctly.
Allows the format to be easily edited without having to change
each string.
"""
self._print('{}: {}'.format(TimeThisObject.output(), verb), 1)
def write(self, input_data, custom_image=None, image_ratio=None,
verify=None, save_image=None, upload_image=None,
open_on_upload=None, imgur_title=None, imgur_description=None,
imgur_auth=None):
"""Write data directly into an image.
If a custom image is used, bits per byte is calculated, which
determins how many bits are used to store data in each colour
value in the image. This can range from 1, which is virtually
invisible to the human eye, to 8, which would entirely erase the
original image.
The input image is processed to remove the last few bits from
each colour, and is padded back to 8 bits using the input data.
Parameters:
input_data (any): Data to be written to the image.
May be in almost any format as it is pickled.
custom_image (str or None/False, optional): Path or URL to
an image to write over.
Since the output has to be a PNG, a large custom image
will result in a large file size, no matter what the
input data is.
Leave as None to write the smallest image possible, or
use the default custom image depending on the config.
image_ratio (str or None, optional): Set the width to height
ratio if there is no custom image, in the format 'w:h'.
If a custom image is given, but the data turns out too
large to store in it, although it'll scrap the custom
image and act like one wasn't provided, it will inherit
the ratio from that image.
Will default to a 16:9 resolution if left as None.
verify (bool or None, optional): Read the image after
creation to make sure the read data matches the original
input data.
Will check the file and/or URL depending on where the
image was saved.
Disabling it will not have any adverse effects, since
the code should catch most problems during writing.
If None, it will use the default value provided in the
config.
save_image (bool or None, optional): If the file should be
saved to disk.
If None, it will use the default value provided in the
config.
upload_image (bool or None, optional): If the file should be
uploaded to imgur.
If None, it will use the default value provided in the
config.
open_on_upload (bool or None, optional): If the uploaded
link should be opened by the default web browser.
If None, it will use the default value provided in the
config.
imgur_title (str or None, optional): Title to give to the
Imgur upload.
imgur_description (str or None, optional): Description to
give to the Imgur upload.
imgur_auth (class, str or None, optional): Used to upload
images to the account of the authenticated user.
Use imgur_log_in() to get the pyimgur instance for this,
and after code execution, a string will be provided to
use here for the purpose of not having to authenticate
again.
"""
#Get default values from config
if verify is None or self.defaults('ForceDefaultVerify'):
verify = self.defaults('DefaultShouldVerify')
if self.defaults('ForceNoSave'):
save_image = False
elif save_image is None:
save_image = self.defaults('DefaultShouldSave')
if override_upload or self.defaults('ForceNoUpload'):
upload_image = False
else:
if upload_image is None:
upload_image = self.defaults('DefaultShouldUpload')
if self.defaults('ForceNoOpenOnUpload'):
open_on_upload = False
elif open_on_upload is None:
open_on_upload = self.defaults('DefaultShouldOpenOnUpload')
bits_per_byte = 8
self._print('Writing to image...', 0)
with UsefulThings.TimeThis(print_time=self.allow_print) as t:
encoded_data = [ord(letter) for letter in self.encode(input_data)]
self._time('Encoded data', t)
#Build header info
num_bytes = len(encoded_data)
if num_bytes > self.max_data_len:
message = 'well done, you just broke the laws of physics'
raise ImageStoreError(message)
nb_binary = str(bin(num_bytes))[2:]
nb_length = len(nb_binary)
nb_integer_parts = nb_length // 8 + (1 if nb_length % 8 else 0)
nb_total_length = nb_integer_parts * 8
nb_new = [nb_binary.zfill(nb_total_length)[i:i+8]
for i in range(0, nb_total_length, 8)]
pixel_data = [int(i, 2) for i in nb_new] + encoded_data
self._time('Calculated header', t)
#Try read custom image from config if nothing has been given
if (not self.defaults('ForceNoCustomImage')
and (custom_image is True
or (custom_image is None
and self.defaults('UseDefaultCustomImageIfNone'))
or self.defaults('ForceDefaultCustomImage'))):
try:
self._read_image(self.defaults('DefaultCustomImage'))
custom_image = self.defaults('DefaultCustomImage')
except (IOError, urllib2.URLError):
pass
#Totally disable
if (custom_image is False
or self.defaults('ForceNoCustomImage')):
custom_image = None
if custom_image is not None:
#Read and process custom image
custom_image_input = self._read_image(custom_image)
image_width, image_height = custom_image_input.size
max_image_bytes = image_width * image_height
self._time('Read custom image', t)
#Calculate required bits per byte
total_data_bytes = len(pixel_data) + 1
self._print('Image resolution: {}x{} ({} pixels)'.format(
image_width, image_height, max_image_bytes), 1)
self._print('Input data: {} bytes'.format(total_data_bytes), 1)
self._print(('Checking the smallest possible '
'bits per byte to use...'), 1)
bits_per_byte = 1
while bits_per_byte < 9:
storage_needed = total_data_bytes * (8 / (bits_per_byte))
self._print(" {}: Up to {} bytes".format(bits_per_byte,
int(round(storage_needed))), 1)
if storage_needed < max_image_bytes:
break
bits_per_byte += 1
#Data won't fit in image, revert to normal method
if bits_per_byte == 8:
custom_image = None
image_ratio = '{}:{}'.format(image_width, image_height)
break
self._time('Calculated bits to use', t)
#Only continue if data fits in image
if custom_image is not None:
#Process both parts of data
joined_binary_data = ''.join(str(bin(x))[2:].zfill(8)
for x in pixel_data)
#Pad the end a little to stop an unusual error
joined_binary_data += '0' * bits_per_byte
split_binary_data = UsefulThings.split_list(
joined_binary_data,
bits_per_byte)
num_pixels_needed = len(split_binary_data)
split_image_data = UsefulThings.flatten_list(
custom_image_input.getdata())
self._time('Processed input data and image', t)
reduced_image = [str(bin(i))[2:].zfill(8)[:-bits_per_byte]
for i in split_image_data]
self._time('Reduced bits of custom image', t)
#Duplicate split_binary_data until it is long enough
#Faster overall compared to picking random values
extra_length_needed = (len(split_image_data) -
len(split_binary_data))
while len(split_binary_data) < extra_length_needed:
split_binary_data *= 2
pixel_data = [int(reduced_image[i] + split_binary_data[i-1],
2) for i in range(len(reduced_image))]
self._time('Merged input data with custom image', t)
else:
self._print('Data does not fit in image, '
'reverting to normal method...', 1)
if custom_image is None:
#Convert input to numbers
pixel_data = [0] + pixel_data
input_bytes = len(pixel_data)
pixel_data += [0] * (3 - input_bytes % 3) #Pad to multiple of 3
total_bytes = len(pixel_data) // 3
required_pixels = int(total_bytes * 8 / bits_per_byte)
#Calculate width and height of image
if ':' in str(image_ratio):
image_ratio_split = [max(1, float(i))
for i in image_ratio.split(':')]
else:
image_ratio_split = [16, 9]
x = pow(required_pixels * image_ratio_split[0] *
image_ratio_split[1], 0.5)
#Don't let any dimensions go over the number of bytes
image_width = max(1, min(required_pixels,
int(round(x/image_ratio_split[1]))))
image_width //= 3
image_width *= 3
image_height = required_pixels / image_width
#Round height up if left as a decimal
if float(image_height) != float(int(image_height)):
image_height += 1
image_height = int(image_height)
image_dimensions = image_width * image_height
self._time('Calculated image size', t)
remaining_pixels = image_dimensions - required_pixels
pixel_data += [random.choice(pixel_data)
for i in range(3 * remaining_pixels)]
self._time('Padded data', t)
#Write first byte as header
initial_header = int(str(nb_integer_parts) + str(bits_per_byte))
pixel_data[0] = initial_header
#Draw image
image_output = Image.new('RGB', (image_width, image_height))
image_data = image_output.load()
#Assign pixels
for y in range(image_height):
for x in range(image_width):
cp = 3 * (x + y * image_width)
image_data[x, y] = tuple(pixel_data[cp:cp + 3])
self._time('Created image', t)
#Build StringIO file
#Even if not uploading, can be used for file size
output_StringIO = StringIO.StringIO()
image_output.save(output_StringIO, 'PNG')
contents = output_StringIO.getvalue()
output_StringIO.close()
if save_image:
self._save_image(image_output)
self._time('Saved file', t)
self._print('Path to file: {}'.format(self.path), 2)
if upload_image:
client = self.choose_client(imgur_auth)
#Renew the token
if self.validate_client(client):
client.refresh_access_token()
#Send upload request to Imgur
imgur_params = {'image': contents.encode('base64'),
'title': imgur_title,
'description': imgur_description}
image_upload = self._imgur_request(client,
'image',
imgur_params)
uploaded_image_type = image_upload['type'].split('/')[1]
uploaded_image_size = image_upload['size']
uploaded_image_link = image_upload['link']
uploaded_image_id = image_upload['id']
uploaded_image_delete_link = image_upload['deletehash']
self._time('Uploaded image', t)
#Detect if image has uploaded correctly, delete if not
if (uploaded_image_type.lower() == 'png' and
uploaded_image_size == len(contents)):
i_link = 'Link to image: {}'
i_delete = 'Link to delete image: {}/{}'
self._print(i_link.format(uploaded_image_link), 2)
self._print(i_delete.format(self.imgur_delete,
uploaded_image_delete_link), 2)
if open_on_upload:
webbrowser.open(uploaded_image_link)
else:
output = 'Image failed to upload correctly - '
#Check if it was converted
if uploaded_image_type.lower() != 'png':
output += 'file was too large.'
else:
output += 'unknown reason'
self._print(output, 1)
upload_image = False
pyimgur.Image(image_upload, client).delete()
#Validate the image
if save_image or upload_image:
if verify:
failed = 0
read_save_image = input_data
read_upload_image = input_data
if save_image:
read_save_image = ImageStore(self.path,
allow_print=False).read()
if upload_image:
read_upload_image = ImageStore(uploaded_image_link,
allow_print=False).read()
#Deactivate features based on the problem
if read_save_image != read_upload_image:
failed += 1
upload_image = False
pyimgur.Image(image_upload, client).delete()
if read_save_image != input_data:
failed += 1
save_image = False
if failed == 2:
raise ImageStoreError('image failed validation')
self._time('Verified file', t)
#Build dictionary to return
output_path = {'size': len(contents)}
if save_image:
output_path['path'] = self.path
if upload_image:
output_path['url'] = '{}/{}.{}'.format(self.imgur_main,
uploaded_image_id,
uploaded_image_type)
output_path['url_delete'] = '{}/{}'.format(
self.imgur_delete,
uploaded_image_delete_link)
return output_path
def read(self):
"""Attempt to read the stored data from an image.
To undo the write process, if a custom image is used, each
colour must be broken down into bits and the last few bits are
taken then pieced together. If these are split into groups of 8
and converted back to characters, it results in the original
encoded string.
"""
self._print('Reading image...', 0)
with UsefulThings.TimeThis(print_time=self.allow_print) as t:
image_input = self._read_image(self.path,
require_png=True)
self._time('Read image', t)
#Get data and header
image_data = UsefulThings.flatten_list(image_input.getdata())
image_header = str(image_data[0])
bytes_per_pixel = int(image_header[-1])
nb_parts = int(image_header[:-1])
image_data = image_data[1:]
self._time('Processed image', t)
#Get the required pixels as binary, piece together, and
# convert back to int
if bytes_per_pixel != 8:
image_data_parts = [str(bin(i))[2:].zfill(8)[8-bytes_per_pixel:]
for i in image_data]
image_data_binary = ''.join(image_data_parts)
image_data_split = UsefulThings.split_list(image_data_binary, 8)
num_bytes = int(''.join(image_data_split[:nb_parts]), 2)
truncated_data = [int(i, 2) for i in
image_data_split[nb_parts:num_bytes + nb_parts]]
else:
#Does same as above, but avoids converting the entire
# data to binary to save time
nb_raw = image_data[:nb_parts]
nb_binary = ''.join(str(bin(i))[2:].zfill(8)
for i in nb_raw)
num_bytes = int(nb_binary, 2)
truncated_data = image_data[nb_parts:num_bytes + nb_parts]
self._time('Got required data', t)
decoded_data = self.decode(''.join(chr(number)
for number in truncated_data))
self._time('Decoded data', t)
return decoded_data
def encode(self, data, b64=False):
"""Encode the image data from the raw input into something that
can be converted into bytes.
"""
data = zlib.compress(cPickle.dumps(data))
if b64:
data = base64.b64encode(data)
return data
def decode(self, data, b64=False):
"""Decode the data returned from encode()."""
try:
if b64:
data = base64.b64decode(data)
data = cPickle.loads(zlib.decompress(data))
return data
except (TypeError, cPickle.UnpicklingError, zlib.error):
raise ImageStoreError('failed to decode image data')
def validate_client(self, client):
"""Validate the client manually since pyimgur only gives a
generic 'Exception' if it is invalid.
"""
if isinstance(client, pyimgur.Imgur):
try:
return (client.client_secret is not None
and client.refresh_token is not None)
except AttributeError:
pass
return False
def choose_client(self, imgur_auth):
"""Check the input client or return the default one.
If an authenticate attempt is successful, the client data
will be written to the config file. This can be automatically
deleted by setting imgur_auth to False.
Parameters:
imgur_auth (str, bool, pyimgur.Imgur): The client used
by the Imgur API.
The code will make an attempt to read it from
multiple formats.
Order of attempts:
Decode input string
Prompt for login details if imgur_auth == True
Read raw output from pyimgur.Imgur
Reset config value if imgur_auth == False
Read and decode config string
Use default value
"""
used_auth = None
client = None
config_parser = UsefulThings.NewConfigParser(ISGlobals.location)
#Decode client from input string or ask for input
if isinstance(imgur_auth, str):
try:
client = self.decode(imgur_auth.strip(), b64=True)
if not self.validate_client(client):
raise ImageStoreError()
except (ImageStoreError, AttributeError):
imgur_auth = None
client = None
else:
used_auth = 'Decoded input'
if (client is None
and (isinstance(imgur_auth, pyimgur.Imgur) or imgur_auth is True)):
if imgur_auth is True:
imgur_auth = imgur_log_in()
#Use raw input as client
if self.validate_client(imgur_auth):
#Write to config
config = {'ImageStore': {'LastImgurClient': encoded_client}}
encoded_client = self.encode(imgur_auth, b64=True)
config_parser.write(config,
write_values=True,
update_values=True)
self._print('Your encoded client string is as follows. '
'Use this for imgur_auth to bypass the login.', 1)
self._print(encoded_client, 1)
used_auth = 'Input'
client = imgur_auth
else:
#Move onto next part instead
imgur_auth = None
if client is None and not isinstance(imgur_auth, pyimgur.Imgur):
#Reset
try:
if imgur_auth is False:
raise ImageStoreError()
#Read config for previous client info
config_values = config_parser.read()['ImageStore']
encoded_client = config_values['LastImgurClient']
client = self.decode(''.join(encoded_client), b64=True)
if not self.validate_client(client):
raise ImageStoreError()
used_auth = 'Previously cached client'
#Use default value
except (KeyError, ImageStoreError):
if not self.validate_client(client):
#Reset config
config = {'ImageStore': {'LastImgurClient': ''}}
config_parser.write(empty_config,
write_values=True,
update_values=True)
#Use default client
encoded_client = ['eJxtkEtPwzAQhO/+I+0JZf1YO0ckQIpUuL',
'T0GvmxaS2aEMUpUv89diooIC6RNd9kRrPr',
'OF5ifzhPrFm+I7B1GDnbriY70yn2cW7Pia',
'bltWKjYC9pu4qptef5SMMcfbaFDCRrqopl',
'9kFT7C5ZUVmBovxOmihRScIl6cb8Kea8rx',
'79h17/7G0c4nDI3Czcek8ptfP7Gw1ZrNne',
'E1i0VPNaO+ODANTKmBpk6IKiTiqP6I3SeW',
'jF/un/2QGwlFxBG8tKKJepAlTGcOs6xEC+',
'yILdjIn8tCwEmc0KpVaA2awrbSU3ngunlH',
'GBG8EtOVVbAKuLX5WUh8en+9fNrt00z82u',
'qMgauJ52oi5f7/i9FzTbIxqBXAXnuDAOrN',
'R5MrigKutqcMgJbfCCZ7dhyd19ArUPmi4=']
client = self.decode(''.join(encoded_client), b64=True)
used_auth = 'Default client'
self._print('Imgur authentication being used: {}'.format(used_auth), 1)
return client
def imgur_log_in():
client = pyimgur.Imgur('0d10882abf66dec',
'5647516abf707a428c23b558bd2832aeb59a11a7')
auth_url = client.authorization_url('pin')
webbrowser.open(auth_url)
pin = raw_input('Please enter the pin number shown... ')
print pin
try:
client.exchange_pin(pin)
except requests.HTTPError:
print 'Error: Invalid pin number'
return None
return client
Here's some examples of usage with the output to show how long it takes.
Write a large mp3:
>>> with open('B:/Music/Linkin Park/13 - Numb.mp3') as f:
... is = ImageStore().write(f.read(), upload_image=True)
Writing to image...
3.41 seconds: Encoded data
3.45 seconds: Calculated header
3.51 seconds: Calculated image size
3.51 seconds: Padded data
5.34 seconds: Created image
6.33 seconds: Saved file
Path to file: D:/Peter/Documents/ImageStore/ImageDataStore.png
Imgur authentication being used: Previously cached client
47.85 seconds: Uploaded image
Image failed to upload correctly - file was too large.
51.63 seconds: Verified file
Total time taken: 51.63 seconds
{'path': 'D:/Peter/Documents/ImageStore/ImageDataStore.png', 'size': 7333390}
Write a smaller mp3:
>>> with open('B:/Music/Linkin Park/Pushing Me Away.mp3') as f:
... is = ImageStore().write(f.read(), upload_image=True)
Writing to image...
1.86 seconds: Encoded data
1.88 seconds: Calculated header
1.92 seconds: Calculated image size
1.92 seconds: Padded data
2.89 seconds: Created image
3.41 seconds: Saved file
Path to file: D:/Peter/Documents/ImageStore/ImageDataStore.png
Imgur authentication being used: Previously cached client
24.19 seconds: Uploaded image
Link to image: http://i.imgur.com/H4mLTI5.png
Link to delete image: http://imgur.com/delete/bRFxgtiFOannUjV
29.29 seconds: Verified file
Total time taken: 29.29 seconds
{'url': 'http://i.imgur.com/H4mLTI5.png', 'path': 'D:/Peter/Documents/ImageStore/ImageDataStore.png', 'url_delete': 'http://imgur.com/delete/bRFxgtiFOannUjV', 'size': 4129767}
Save mp3 from URL:
>>> with open('B:/new file.mp3', 'w') as f:
... f.write(ImageStore('http://i.imgur.com/H4mLTI5.png').read())
Reading image...
1.23 seconds: Read image
1.61 seconds: Processed image
1.63 seconds: Got required data
2.69 seconds: Decoded data
Total time taken: 2.70 seconds
#B:/new file.mp3 still contains the exif data too, obviously I'm not going to leave that link up though :P
To generate the image below:
>>> with open(some_wallpaper) as f:
... ImageStore().write(f.read(), upload_image=True, custom_image='http://topwalls.net/wallpapers/2012/01/Nature-sea-scenery-travel-photography-image-768x1366.jpg')
Writing to image...
0.27 seconds: Encoded data
0.27 seconds: Calculated header
2.90 seconds: Read custom image
Image resolution: 1366x768 (1049088 pixels)
Input data: 582474 bytes
Checking the smallest possible bits per byte to use...
1: Up to 4659792 bytes
2: Up to 2329896 bytes
3: Up to 1553264 bytes
4: Up to 1164948 bytes
5: Up to 931958 bytes
2.91 seconds: Calculated bits to use
3.80 seconds: Processed input data and image
5.65 seconds: Reduced bits of custom image
7.20 seconds: Merged input data with custom image
7.97 seconds: Created image
8.39 seconds: Saved file
Path to file: D:/Peter/Documents/ImageStore/ImageDataStore.png
Imgur authentication being used: Previously cached client
21.28 seconds: Uploaded image
Link to image: http://i.imgur.com/RRUe0Mo.png
Link to delete image: not telling
29.04 seconds: Verified file
Total time taken: 29.05 seconds
If you run this image through the code and save the data with a jpg extension, you'll get a 1080p leafy wallpaper. It needed 5 bits to store the data, so it's dropped the quality quite considerably, normally it's not really noticeable. image
Link to the latest version is here.
-
\$\begingroup\$ Please do not update the code in your question once it has been answered - that invalidates the answers. If you would like further advice, ask a followup question. See this question on meta and the related/linked questions therein. \$\endgroup\$Mat– Mat2015年10月06日 16:36:08 +00:00Commented Oct 6, 2015 at 16:36
-
\$\begingroup\$ Ah right thanks, wasn't aware of that, I'll put up a link to where the latest version is then \$\endgroup\$Peter– Peter2015年10月06日 18:22:57 +00:00Commented Oct 6, 2015 at 18:22
2 Answers 2
In short, your class is too big. Try splitting it into two or three classes, Steganography
, Imgur
and URLOpen
.
Making Steganography
it's own class is a must for this program.
Line's
First off, I tried following that 72/80 line thing for the first time
It's really good that you done that.
For your strings say encoded_client
.
You can use operator-less raw string concatenation.
Also if you don't like on delimiter lining you can use an alternate.
encoded_client = (
'eJxtkEtPwzAQhO/+I+0JZf1YO0ckQIpUuL'
'T0GvmxaS2aEMUpUv89diooIC6RNd9kRrPr'
...
'R5MrigKutqcMgJbfCCZ7dhyd19ArUPmi4='
)
I think that looks nicer. And now you don't have to care about concatenating them.
As for your cuts/renames. I personally would say that that good variable names are better than 81 characters in a line. However normally you can do something to achieve both. First, there is normally a problem if you are indenting more than say 5 times. You could probably re-factor the code into two or more functions.
# This is kinda hard to read.
# But is allowed.
decoded_data = self.decode(''.join(chr(number)
for number in truncated_data))
# This is just wrong.
decoded_data = self.decode(''.join(chr(number)
for number in truncated_data))
# This is one way to solve the problem
decoded_data = self.decode(
''.join(chr(number) for number in truncated_data)
)
# And an alternate
decoded_data = self.decode(
''.join(chr(number) for number in truncated_data))
I prefer the third way, however some prefer the forth way. So if you see the third way and not the forth way in my review then either is good.
As for very long comprehensions I think splitting them into multiple lines is very good. And you do that for the most part.
# Harder to read.
(chunk for chunk in my_list if chunk)
# Easier to read
(
chunk
for chunk in my_list
if chunk
)
The above is what I find.
I know that I use the former more than the latter,
but I find the latter easier to read, and if my_list
was a generator,
then you could easily merge them together.
General
global
is used to change things in global scope when in a lower scope. While I can seeoverride_upload
in a function, I can't change it.It's like there is a number on a display past some glass. You can see it, and you know what it is. But when you try to use your pen to overwrite it you just get the writing on the glass. Now you can't see the number on display, as your new writing looks like it's on display. And everyone behind you sees yours as if it were the writing on display.
In short, you're not behind any glass. And so you don't need
global
.Try to limit the size of
try
s. You have two imports in atry
. Change this so there are two differenttry
s, and do in if if you need to limit the secondtry
.I hate
my_dict = {};my_dict['a'] = 'a'
. It seems stupid, you can just put the letters in the dict. This both stops you slaving away when changing the name of the dict. And make it so that I know what the dict contains straight away.my_dict = {'a': 'a'}
.Using on delimiter line calls impairs your programs readability. It's easier to just read on the left side of the screen, then to jump from left to right.
Your
ISGlobals.get
function can be removed if you are ok with changing the classes__dict__
.self.__dict__.update(global_dict)
is how you would do this. However if you can't change your program to not use strings you would have to usegetattr
, and that can get ugly. Or you could use the dictionary instead.URL's are normally very alike to Unix paths. Rather than re-inventing the wheel, you can use
posixpath
. This will allow for possibly safer path changes as your current path methods aren't too safe.I think mixing opening via URL and system is a very bad Idea. I would rather as a developer have the option for one or the other, I don't want to download some file, when I specified a file on the system.
Your
write
is way too long. To be honest I kinda don't want to read something that large. It's ~300 lines long, and has indents up to and including 6. It should really be split up into more functions no matter how you look at it.I think you really need to remove the Imgur implementation in this program. All it does is add complexity to your program.
First I think that your class should have an 'encode' - put data in the image, and a 'decode' - take information out of it.
Then you can pass a PIL image to it, and the functions do there magic. No confusion over how to handle Imgur, no confusion over reading or writing and no compression.
Your aim should not be to throw as many features into this as possible. (Just to note I'm not saying don't write this with Imgur support, that's a really cool idea.) I just think Imgur should be designed afterwards. And defiantly not implemented in the same class, let alone the same functions.
My preferance on API's
The way I would like to run you program is something like the following:
file_path = '...'
# The option for me to NOT use `load_url`.
image = (
load_url(file_path)
if is_url(file_path) else
load(file_path)
)
image = encode('My message', image, bits=4)
save_image(image)
This is way more customisable, and allows me the end user to have more control over the system.
If I don't want to upload to Imgur, as shown above, then I don't.
If I don't want to download the file, I don't.
If I want to encode it, then I add a function to do that.
While this will make a harder to understand main
,
it's better that way as then your code is easier to maintain.
Lets say I use none of the protocols 'http', 'https' or 'ftp', but I have a way to use some other protocol, and can get the image into a PIL image. Can I use your program? Not really.
write
Some of your error messages are not descriptive, and can be taken as rude.
'well done, you just broke the laws of physics'
Changing that to 'message too large for image' is better.
Also you should do the check before encoding the data... We might be 'dumb', but we don't want our time waisted.
You have a lot of noise surrounding all your code, and so I don't fully understand it.
But I would really recommend splitting it into functions. You could probably make the functions:
- Make header
- Custom image
- Normal image
- Create image (Draw image and Assign pixels)
- Save image
- Upload image
But not only that, you could probably make some smaller helper functions too.
How I would implement the encoding
But if I were to implement your encode function,
I would take message
, image
(list) and bits
.
I would then do binary logic to get the numbers.
You could possibly get a faster system if you store the numbers inverted.
# Byte = 8 bits
# Chunk = `bits` bits
# Converts from a byte to bits then to chunks
def from_bytes(message, bits):
bit = 0
total = 0
for byte in message:
for _ in range(8):
bit += 1
total <<= 1
total += (byte & 128) >> 7
if bit == bits:
yield total
bit = 0
total = 0
byte <<= 1
while True:
yield 0
# Add image and message data.
def encode(message, image, bits):
NOT_BITS = ~((1 << bits) - 1)
return (
(byte & NOT_BITS) + data
for byte, data in zip(image, from_bytes(message, bits))
#for byte, data in itertools.izip(image, from_bytes(message, bits)) # Python2
)
# Modified decode. Non-8bits one.
# I removed the header for simplicity
def decode(image, bits):
image_data = ''.join(
str(bin(i))[2:].zfill(8)[8 - bits:]
for i in image
)
return [
int(j, 2)
for j in (
image_data[i:i + 8]
for i in range(0, len(image_data), 8)
)
]
If you were to link this up with PIL to get proper encoding and decoding. Then you force the user to pass a PIL image, then you could drasticly reduce the complexity of your program.
It allows you to do things like:
# You may want to have `chr` and `ord` done inside,
# but this is more flexible.
''.join(map(chr, decode(encode(map(ord, 'Hi'), [0, 0, 0, 0], 4), 4)))
# Hi
So I would say that it's safe. But I would put a check on the size of the bits.
Also slap PIL onto this and put it into a class and then you have half your program in almost no lines. Where you could encode or decode any image.
Then you just need to get a couple of functions/a class to make the Imgur and other web aspects of the program. And it should be much easier to read.
Performance wise the above is the best I could get, but I didn't experiment too much. My speed test and results.
Program design
As we talked a little more about your code. It seems that you don't know how you should split your code up. What should a class do, what should or shouldn't be in one, etc.
As your program is very large, we want to minimise the amount of arguments into your function. And you should really aim at removing some aspects of the program from the main class.
As the end user may just want to encode a list, and use something other than PIL, you would want to give them that option. I know that I wouldn't really use the Imgur part, as I have no need to upload to Imgur when I have enough storage here. But others may want to. To be honest I like when the end user can use an API as if it's just like using a Python one. Simple, clean, and nice to read.
from PIL import Image
import urllib2
image_location = 'my_image_location.png'
new_image = 'my_image_location_new.png'
load_from_url = False
add_header = True
message = "Haya there I'm a message!"
amount_of_bits = 3
# Open an image.
image = Image.open(
StringIO.StringIO(urllib2.urlopen(image_location).read())
if load_from_url else
image_location
).convert('RGB')
width, height = image.size
# Add header if you wish.
if add_header:
header = Steganography.generate_header(message, amount_of_bits)
message = header + message
# Flatten the image and encode it
encoded_generator = Steganography.encode(
message,
(
j
for i in image.getdata()
for j in i
)
amount_of_bits
)
# Stolen directly from your write function
# May be worth making this a function, but not in `Steganography`
encoded_image = Image.new('RGB', (width, height))
encoded_data = encoded_image.load()
for y in range(height):
for x in range(width):
image_data[x, y] = (
next(encoded_generator),
next(encoded_generator),
next(encoded_generator)
)
encoded_image.save(new_image)
So this is way more verbose than the way you have it at the moment. However, it is much simpler to read, understand, modify and use.
There are 6/7 variables that will change on an end user using the entire of your program. However this does not mean that it's perfect to go in a function! You can't imagine all the different possibilitys that the end user needs, however you know how to write a function to perform a type of steganography. And so your class should only contain the information to do that.
In the above there is only two functions that aren't defined.
Steganography.generate_header
and Steganography.encode
.
I already put how I would make the encode, and I'm sure you know how to implement your header.
However the last bit could be put in a function, make_image
below.
class Steganography:
@static
def generate_header(message, bits):
# return header
@static
def read_header(iterator, bits):
# you could do some cool stuff here.
# But I'll let you try it yourself,
# have a look at `make_image` for reference
@static
def encode(message, iterator, bits):
# return encode_generator
@static
def decode(iterator, bits):
# return message
class ImageHelpers:
@static
def make_image(iterator, height, width):
encoded_image = Image.new('RGB', (width, height))
encoded_data = encoded_image.load()
# Work for anything that can work in a for loop.
iterator = iter(iterator)
for y in range(height):
for x in range(width):
image_data[x, y] = (
next(iterator),
next(iterator),
next(iterator)
)
return encoded_image
# You gave the option to pass no image.
@static
def make_blank_image(...):
If you re-write your code with only this input, no global variables and to work in the example above you will have easier to maintain code. There will no longer be any noise in the hardest parts of the program. And should also be small like the examples I gave above.
At first you should not 'just get your code to work'. You should get it to work on it's own. If I asked for a packet of 'chips', then I would expect 'chips', not 'chips' with 'a sausage' and 'some fish'. Just like when I ask for 'steganography' I would expect to get 'steganography', not 'steganography' with 'PIL' and 'Imgur'.
After building this base then you can edit the code to do all the default/global stuff.
Also, I would recommend avoiding read
and write
in the steganography
class,
this is as it will add PIL
and Imgur
to the steganography
class.
If you must do that, then I would make new functions to do it for you.
-
\$\begingroup\$ Let us continue this discussion in chat. \$\endgroup\$2015年10月02日 20:12:44 +00:00Commented Oct 2, 2015 at 20:12
There is quite a lot of code to review, so I'm sorry if I miss out on the details. However let us give it a go, and give you some comments on the style and code of your solution.
Style review
- Good variable and function names – It seems like you overall has quite good variable names, and are consistent in your naming. You do have some exceptions like
output_StringIO
but they are understandable why you use those Indentation is mostly good – This is almost a given using Python, however in some places you could use the trick of inverting the test, and breaking/returning to reduce nesting level. See code below
#Instead of this, which increases indentation level... if save_image: # Save image code # ... you could revert it, and do if not save_image: return # or break
Save image code
Switch
if/else
blocks – In some cases you have really shortelse
blocks, but lengthyif
blocks. If you switch these around, it is easier to follow your logic, instead of browsing a lot up and down to verify what the actual condition was, and why am I here.if my_positive_condition: # 50 lines of code else: # 2 lines of code # Replace with: if not my_positive_condition: # 2 lines of code else: # 50 lines of code
Seems to be good comments and docstrings – Seemingly appropriate level of documenting and commenting code. Not too much unneccessary comments, either. Also like the
self._print()
, which easily allows you remove all output- Try avoiding magic numbers – Numbers like
8
are references throughout the code, and could be replaced by a constant likeBITS_PR_BYTE
. Not a major biggie, but if you were to change into another bit format, i.e. 16 bits imaging, you would have a lot of places to clean up - Use more vertical spacing – I would add some vertical spacing here and there. Especially like at the start of
write()
were you have a sequence ofif/if/elif/if/else
thightly connected. This is easily read as one block ofif/elif/elif/elif/else
.
Code review
In general I like the idea of yours to store extra data in the lower bits of the image data. I would possibly be a little more strict and not allow the entire image to be replaced by data, but that is merely a point of view. However there are some other code aspects I would like address:
- Add a magic header – I would strongly recommend to add a magic header in the first bytes, so that you are able to recognize if this is one of your files, or if it has been tampered (or simply is just another plain image file). One option could be to encode a magic string in the lower 2/3 bits of the first 10 pixels, which would be similar in all image processed by you
- Split the larger functions – The
write()
andread()
methods are way to large. As a general rule of thumb, most methods should fit on the screen without the need of scrolling. This would also reduce the parameter count, which is also a good thing Consider introducing more classes – It seems like you could consider having one class for handling the image, another one for handling upload, and possibly also another for handling actual file write/read operations. Don't be afraid of classes if they can help you increase readability and general understanding of how to do your work. Monolithic code is never good...
Verify your logic in
ISGlobals
– The class starts of with code not belonging to any class/instance methods? And within the__init__()
method you seem to (re-)assign torequired_globals
ignoring that you've already have something there. And to finally set them based on the result of aconfig_parser.write()
also seems strange...
-
\$\begingroup\$ Thanks a lot for the tips. I did think something looked strange about a small
else
after a bigif
, so I'll get them switched. I didn't think of 16 bit images, I'll get it underbits_per_byte_max
then maybe later add a check from PIL. Replacing the entire image is mainly for aesthetic purposes, since if I don't, it's a very obvious line where the data has finished. I had an end of data marker in my old code similar to your magic header, so I suppose I could implement something like that at the start.read()
isn't so long, but Joe suggested some ways of breaking upwrite()
\$\endgroup\$Peter– Peter2015年10月01日 23:27:22 +00:00Commented Oct 1, 2015 at 23:27 -
\$\begingroup\$ I tried to cut down on classes are someone said on some earlier code that I should stick it all under one class haha, but I'll try split it into a few. The purpose of
ISGlobals
is to set default values, but then read the config to see if the user has specified something else (then one final pass to get all the values together).required_globals
is only really used for comments that should always be in the config file. I'll make theconfig_parser.read()
function accept an input sometime in the future, but currently it needs to bewrite()
to update and return all the values passed in. \$\endgroup\$Peter– Peter2015年10月01日 23:33:00 +00:00Commented Oct 1, 2015 at 23:33 -
\$\begingroup\$ @Peter, A lot related to coding comes down to personal preferences, but most do agree on the somewhat flimsical statement of "most methods should fit on the screen without the need of scrolling". You are of course free to do whatever you like, but the easier it reads the easier it is to maintain both now and in the future! \$\endgroup\$holroy– holroy2015年10月01日 23:41:30 +00:00Commented Oct 1, 2015 at 23:41
-
\$\begingroup\$ I still feel a bit funny about the 80 characters but will stick with it haha, but I've not heard the fit on screen without scrolling thing before so I'll see if I can cut it down. I just thought there'd be no point making a new function if it's only used once throughout the class, assuming the comments show what it's doing. \$\endgroup\$Peter– Peter2015年10月01日 23:55:54 +00:00Commented Oct 1, 2015 at 23:55
-
\$\begingroup\$ The point with the "one screen" is that you can see the overview of what happens easily. Some state is hard as if you can't see what the function does directly, it is too complicated. You might have heard of the "Single responsiblity"... \$\endgroup\$holroy– holroy2015年10月02日 00:05:09 +00:00Commented Oct 2, 2015 at 0:05