Python Steganographer using PIL

Question 1

I've made a Python steganographer using the Python Imaging Library. It encodes the message, as a string of binary digits, in the 'n' least significant bits of the red channel of each pixel of the image. Here is my code; please suggest some improvements.

from PIL import Image
import os

def str2bin(message):
    binary = bin(int.from_bytes(message.encode('utf-8'), 'big'))
    return binary[2:]

def bin2str(binary):
    n = int(binary, 2)
    return n.to_bytes((n.bit_length() + 7) // 8, 'big').decode()

def hide(filename, message, bits=2):
    image = Image.open(filename)
    binary = str2bin(message) + '00000000'
    if (len(binary)) % 8 != 0:
        binary = '0'*(8 - ((len(binary)) % 8)) + binary
    data = list(image.getdata())
    newData = []
    index = 0
    for pixel in data:
        if index < len(binary):
            pixel = list(pixel)
            pixel[0] >>= bits
            pixel[0] <<= bits
            pixel[0] += int('0b' + binary[index:index+bits], 2)
            pixel = tuple(pixel)
            index += bits
        newData.append(pixel)
    image.putdata(newData)
    image.save('\\'.join(filename.split('\\')[0:-1]) + '/coded-'+os.path.basename(filename), 'PNG')
    return len(binary)

def unhide(filename, bits=2):
    image = Image.open(filename)
    data = image.getdata()
    binary = ''
    index = 0
    while not (len(binary) % 8 == 0 and binary[-8:] == '00000000'):
        value = '00000000' + bin(data[index][0])[2:]
        binary += value[-bits:]
        index += 1
    message = bin2str(binary)
    return message

if __name__ == '__main__':
    file = open('tmiab.txt', encoding='utf-8')
    hide('E:\\Python\\Steganography\\img.png', file.read(), 2)
    print(unhide('E:\\Python\\Steganography\\coded-img.png', 2)[:10000])

I am able to encode the novel Three Men In A Boat in a 1600 by 1200 pixel image by just replacing the last two bits.

I am looking for some improvements like reducing time consumption.

Please help. Thanks.

Answer

Time consumption suggestions

Doing list(Image.open(filename).getdata()) means that you read the entire image into a list before starting any processing. This costs time and memory on the order of the number of pixels. It looks like you only ever need to process one pixel at a time, so you can instead iterate over the result of getdata() directly, treating the pixels as a stream. This becomes crucial when dealing with even bigger data structures, especially when the full data won't fit in memory. If the result of getdata() is not the best fit for your purposes, there are many other ways of iterating over the pixels in an image. Iterating over the text data is probably less crucial (since the data should be much smaller), but it would also be an improvement over reading the entire thing at once.
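As a rough sketch of the streaming idea, the embedding step could be written as a generator that takes any iterable of pixel tuples (for example the object returned by getdata()) and yields modified pixels one at a time. The name embed_stream and its parameters are made up for this illustration, and it assumes the red value is the first element of each pixel tuple and that each chunk already fits in the given number of bits.

```python
def embed_stream(pixels, chunks, bits=2):
    """Lazily yield pixels with one chunk written into the low `bits` of red.

    `pixels` is any iterable of channel tuples; `chunks` is an iterable of
    ints that each fit in `bits` bits (hypothetical helper, not part of PIL).
    Pixels left over once the chunks run out are passed through unchanged.
    """
    mask = (1 << bits) - 1
    chunks = iter(chunks)
    for pixel in pixels:
        chunk = next(chunks, None)
        if chunk is None:
            # Payload exhausted: leave the remaining pixels untouched.
            yield pixel
        else:
            # Clear the low bits of red, then OR in the payload chunk.
            yield ((pixel[0] & ~mask) | chunk,) + tuple(pixel[1:])


demo = list(embed_stream([(255, 10, 20), (7, 0, 0), (100, 100, 100)],
                         [0b01, 0b10]))
```

Only the reading side is streamed here. How the result is written back to the image (all at once or in pieces) is a separate decision that this sketch does not address.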

You are currently doing many value conversions, from UTF-8 string to binary and back, binary to int and back, and pixel to list (presumably to separate colour values). Avoiding these conversions should massively improve the speed of your program. In general, settling on a single "datatype of exchange" is a good idea, and in this case integers are an obvious choice - they map easily to both text and pixel values, and they are blazingly fast to work with.
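One possible way to stay with integers throughout is to split the payload bytes into small integer chunks with shifts and masks, and to pack them back the same way. The helper names bytes_to_chunks and chunks_to_bytes are invented for this sketch, and it assumes the chunk width divides 8 so that no chunk straddles a byte boundary.

```python
def bytes_to_chunks(payload, bits=2):
    """Yield `payload` as ints of `bits` bits each, most significant first."""
    if bits not in (1, 2, 4, 8):
        raise ValueError("this sketch assumes bits is 1, 2, 4 or 8")
    mask = (1 << bits) - 1
    for byte in payload:  # iterating over bytes yields ints
        for shift in range(8 - bits, -1, -bits):
            yield (byte >> shift) & mask


def chunks_to_bytes(chunks, bits=2):
    """Inverse of bytes_to_chunks: pack `bits`-wide ints back into bytes."""
    if bits not in (1, 2, 4, 8):
        raise ValueError("this sketch assumes bits is 1, 2, 4 or 8")
    per_byte = 8 // bits
    out = bytearray()
    value = 0
    for count, chunk in enumerate(chunks, start=1):
        value = (value << bits) | chunk
        if count % per_byte == 0:
            out.append(value)
            value = 0
    # Any trailing chunks that do not fill a whole byte are dropped.
    return bytes(out)


payload = "héllo".encode("utf-8")
round_trip = chunks_to_bytes(bytes_to_chunks(payload))
```

No string of '0' and '1' characters is ever built, and the text is encoded to bytes only once at the boundary of the program.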

General suggestions

You have optional parameters which are never used with anything other than their default values. This adds clutter, and makes me suspicious of whether the functions would even do the right thing if called with anything else.

Use argparse to allow specifying the input data file and image file. This makes the program much more versatile and acceptance testable.
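A minimal sketch of such a command line follows. The argument names (image, data, --output, --bits) and the default output name are placeholders chosen for illustration, not something the original program defines.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical 'hide' command line."""
    parser = argparse.ArgumentParser(
        description="Hide a data file in the low bits of an image.")
    parser.add_argument("image", help="path of the cover image")
    parser.add_argument("data", help="path of the file to hide")
    parser.add_argument("-o", "--output", default="coded.png",
                        help="where to write the result (assumed default)")
    parser.add_argument("-b", "--bits", type=int, choices=range(1, 9),
                        default=2, help="low bits of red used per pixel")
    return parser


# Passing an explicit list keeps the example independent of sys.argv.
args = build_parser().parse_args(["img.png", "tmiab.txt", "--bits", "2"])
```

Restricting --bits with choices means an out-of-range value is rejected by argparse before any image is opened.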

Test driving this code should point towards some limitations of this program:

What if the data doesn't fit in memory?
What if the image is too small to encode the entire text?
What if the user specifies bits > 8?
Why assume UTF-8 rather than just encoding arbitrary byte values? The extra limitation of UTF-8 makes it harder to ensure that the data is correctly encoded and decoded, and limits the utility of the program. You might for example want to encode an image within another image.
What if the image does not use eight bits per colour channel?
What is the maximum perceptive colour distance between an unmodified and a steganographic pixel? That is, could a person possibly detect the colour difference, for example in a high quality photo with big uniform patches or a uniformly coloured generated picture?
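Two of these questions can be answered with simple arithmetic before touching any pixels. The sketch below assumes 8-bit channels, one modified channel per pixel, and a one-byte terminator as in the question's code; check_capacity and max_channel_error are hypothetical helper names.

```python
def check_capacity(pixel_count, payload_length, bits=2):
    """Return the spare capacity in bits, or raise ValueError if it won't fit.

    `payload_length` is in bytes; one extra byte is reserved for the
    zero terminator that the question's code appends.
    """
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8 for 8-bit channels")
    needed = (payload_length + 1) * 8
    available = pixel_count * bits
    if needed > available:
        raise ValueError(
            "payload needs %d bits but the image holds only %d"
            % (needed, available))
    return available - needed


def max_channel_error(bits):
    """Largest possible change to a channel when its low `bits` are replaced."""
    return (1 << bits) - 1


spare = check_capacity(1600 * 1200, 1000, bits=2)
worst = max_channel_error(2)
```

For the 1600 by 1200 image from the question and bits=2, the total capacity is 1600 * 1200 * 2 = 3,840,000 bits, or 480,000 bytes including the terminator, and the red value of a pixel changes by at most 3 out of 255. Whether that difference is perceptible in a uniform patch is a separate question that this arithmetic does not settle.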

pep8 is a great little tool to ensure your code is idiomatic Python. For example, by convention Python variable names are under_scored (new_data rather than newData).

l0b0 (9,117 reputation, 22 silver badges, 36 bronze badges) · Accepted Answer · 2017-06-10 08:12:14Z

