Slow image generation

Question 1

I build a Tower of Hanoi solver which print the solution as an image

It works as expected but generating the image is relatively slow compared to the time to calculate the answer.

Here is the code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import argparse
from PIL import Image
def hanoi(disks, source, helper, target, steps):
 if disks > 0:
 hanoi(disks - 1, source, target, helper, steps)
 target.append(source.pop())
 steps.append([SOURCE[:], HELPER[:], TARGET[:]])
 hanoi(disks - 1, helper, source, target, steps)
def save_image(name):
 print('\nSaving image {}.png'.format(name))
 data = []
 peg = args.disks * 2
 cells = peg * 3 + 40 # 40 is to put some spaces between pegs and the border
 for step in steps:
 for _ in range(5): # White space
 data.append([1 for _ in range(cells)])
 src = step[0]
 hlp = step[1]
 trg = step[2]
 size = max(len(src), len(hlp), len(trg))
 for _ in range(size - len(src)):
 src.append(0)
 for _ in range(size - len(hlp)):
 hlp.append(0)
 for _ in range(size - len(trg)):
 trg.append(0)
 src.reverse()
 hlp.reverse()
 trg.reverse()
 for s, h, t in zip(src, hlp, trg):
 blanksrc = peg - 2 * s
 blankhlp = peg - 2 * h
 blanktrg = peg - 2 * t
 row = [1 for _ in range(10)]
 row += [1 for _ in range(blanksrc // 2)]
 row += [0 for _ in range(s * 2)]
 row += [1 for _ in range(blanksrc // 2)]
 row += [1 for _ in range(10)]
 row += [1 for _ in range(blankhlp // 2)]
 row += [0 for _ in range(h * 2)]
 row += [1 for _ in range(blankhlp // 2)]
 row += [1 for _ in range(10)]
 row += [1 for _ in range(blanktrg // 2)]
 row += [0 for _ in range(t * 2)]
 row += [1 for _ in range(blanktrg // 2)]
 row += [1 for _ in range(10)]
 data.append(row)
 for _ in range(5): # White space
 data.append([1 for _ in range(cells)])
 data.append([0 for _ in range(cells)]) # Black line to separate steps
 da = [bit for row in data for bit in row]
 image = Image.new('1', (cells, len(data)))
 image.putdata(da)
 image.save('{}.png'.format(name))
if __name__ == '__main__':
 parser = argparse.ArgumentParser()
 parser.add_argument('-d', '--disks', type=int, default=4,
 help='Number of disks, default 4')
 parser.add_argument('-f', '--filename', help='Filename', required=True)
 args = parser.parse_args()
 if not args.disks > 0:
 raise ValueError('There must be at least one disk')
 SOURCE = list(reversed(range(1, args.disks + 1)))
 TARGET = []
 HELPER = []
 steps = [[SOURCE[:], HELPER[:], TARGET[:]]]
 hanoi(args.disks, SOURCE, HELPER, TARGET, steps)
 save_image(args.filename)

As I add more disks in the problem, compared to the computation of the answer, the time taken to generate the image is longer and longer.

How can I make it faster and why it is so slow?

Question 2

I don't think the code is slow. It's just that there are \$n^2\$ steps

Question 3

@MaartenFabré Is it not possible to reduce the complexity of the image generation?

Question 4

there are certainly steps to improve the performance, readability and testability of the code, by \$n^2\$ remains \$n^2\$

Question 5

Like I said before, the reason why this takes a lot of time is because the number of steps is proportional to the square of the number of disks.

But there are some other improvements to be made to this code.

`range`

list(reversed(range(1, args.disks + 1))) can be done more easily as list(range(disks, 0, -1))

Global variables

Your image saving algorithm uses a lot of global scope (args.filename, steps,...), and with the src.reverce() even modifies these global variables, which is a sure way to introduce difficult to find bugs. If your function needs those parameters, pass them as arguments, and certainly don't change them.

You can prevent SOURCE, HELPER, TARGET to need in global scope, by passing these along in a dictionary. You can also use a namedtuple:

source = list(range(disks, 0, -1))
target = list()
helper = list()
state = dict(
 source=source,
 target=target,
 helper=helper
)
hanoi_gen(disks, source, helper, target, state)

Variable naming

I had to look really well at the code to find out what certain variables were. cells for example is the width of the image, peg is the maximum with of a peg etc. Name your variables clearly. s, h, t in zip(src, hlp, trg) is another example that can do with better names.

Immutable variables

You use lists throughout your code. The advantage of lists is that they are mutable, but this is also the disadvantage. To make sure other code doesn't inadvertently change your data, use immutable containers like tuple where appropriate. So instead of SOURCE[:], I used tuple(...).

Generator

Instead of instantiating all those lists at the same time, you can work with generators:

def hanoi_gen(disks, source, helper, target, state):
 if disks:
 yield from hanoi_gen(disks - 1, source, target, helper, state)
 target.append(source.pop())
 yield tuple(state['source']), tuple(state['target']), tuple(state['helper'])
 yield from hanoi_gen(disks - 1, helper, source, target, state)
def solve_tower(disks):
 source = list(range(disks, 0, -1))
 target = list()
 helper = list()
 yield tuple(source), tuple(target), tuple(helper)
 state = dict(
 source=source,
 target=target,
 helper=helper,
 )
 yield from hanoi_gen(disks, source, helper, target, state)
steps = tuple(solve_tower(2))
assert steps == (
 ((2, 1), (), ()),
 ((2,), (), (1,)),
 ((), (2,), (1,)),
 ((), (2, 1), ()),
)

Magic numbers

There are some magic numbers in your code, 10, 40, 5, ...

Better is to extract global constants from this:

BUFFER_PEG = 10
LINE_WIDTH = 1
BUFFER_STEP = 5
WHITE = 1
BLACK = 0

And use them like:

image_width = disks * 2 * 3 + 4 * BUFFER_PEG

BLACK and WHITE can also be done with an enum.IntEnum:

from enum import IntEnum
class Color(IntEnum):
 BLACK = 0
 WHITE = 1

DRY

Compartmentalize!

There is a lot of repeated code, which makes this hard to maintain and test:

from itertools import repeat
def whitespace(width, image_width):
 return repeat(Color.WHITE, width * image_width)
def line(width, image_width):
 return repeat(Color.BLACK, width * image_width)

Create easy to use generators to add whitespace or black lines.

Pad the disk

def pad_disk(disk_width, num_disks):
 blank_width = num_disks - disk_width
 yield from repeat(Color.WHITE, blank_width)
 yield from repeat(Color.BLACK, disk_width * 2)
 yield from repeat(Color.WHITE, blank_width)

Centrally pads a disk to twice the number of disks in play. This can be easily tested: the portion of a disk of width 1 in a stack of 4 disks:

assert tuple(pad_disk(1, num_disks=4)) == (1, 1, 1, 0, 0, 1, 1, 1)

Format a row

def buffer_peg():
 return repeat(Color.WHITE, BUFFER_PEG)
def format_row(disks, num_disks):
 yield from buffer_peg()
 for disk_width in disks:
 yield from pad_disk(disk_width, num_disks)
 yield from buffer_peg()

This can be easily tested like this:

row = [2, 0, 1]
num_disks = 4
assert tuple(format_row(row, num_disks)) == tuple(chain(
 buffer_peg(),
 (1, 1, 0, 0, 0, 0, 1, 1,),
 buffer_peg(),
 (1, 1, 1, 1, 1, 1, 1, 1,),
 buffer_peg(),
 (1, 1, 1, 0, 0, 1, 1, 1,),
 buffer_peg(),
))

Format individual steps

Here, I use a small helper function to reverse the peg, and pad it with 0s:

def pad_left_reverse(peg, size, fillvalue=0):
 yield from repeat(fillvalue, size - len(peg))
 yield from reversed(peg)

Then all of this:

src = step[0]
hlp = step[1]
trg = step[2]
size = max(len(src), len(hlp), len(trg))
for _ in range(size - len(src)):
 src.append(0)
for _ in range(size - len(hlp)):
 hlp.append(0)
for _ in range(size - len(trg)):
 trg.append(0)
src.reverse()
hlp.reverse()
trg.reverse()

Can be replaced with:

def format_step(step, num_disks):
 pegs = map(
 lambda peg: pad_left_reverse(peg, num_disks, fillvalue=Color.WHITE),
 step
 )

And on the plus-side, this doesn't reverse the original input.

I replaced the size = max(len(src), len(hlp), len(trg)) with the number of disks, to keep all the steps equally high. If you can live with the uneven heights, size = len(max(step, key=len)) is an alternative formulation.

 for row in zip(*pegs):
 # print(row)
 yield from format_row(row, peg_width)

Replaces the next 20 lines.

step = [2, 1], [5,4,3], []
num_disks = 5
step_data = list(format_step(step, num_disks))

Outputs something like:

1111111111111111111111111111111100000011111111111111111111111111111111
1111111111111100111111111111111000000001111111111111111111111111111111
1111111111111000011111111111110000000000111111111111111111111111111111

Format the steps

def format_steps(steps, image_width, num_disks):
 for step in steps:
 yield from whitespace(BUFFER_STEP, image_width)
 yield from format_step(step, num_disks)
 yield from whitespace(BUFFER_STEP, image_width)
 yield from line(LINE_WIDTH, image_width)

This speaks for itself.

Context managers

If you open resources that need closing afterwards, like a file or a PIL.Image, use a with-statement.

The main

if __name__ == '__main__':
 num_disks = 5
 steps = solve_tower(num_disks)
 image_width = num_disks * 2 * 3 + 4 * BUFFER_PEG
 data = list(format_steps(steps, image_width, num_disks))
 with Image.new('1', (image_width, len(data) // image_width)) as image:
 image.putdata(data)
 name = 'my_hanoi.png'
 image.save(name)

All-in-all this code is slightly longer than your code, and will not necessarily be much faster, it is a lot clearer for me, and a lot more parts can be individually tested.

The full code can be found here, and some tests here.

Question 6

Thanks a lot for all this explainations, I learned a lot from them

Maarten Fabré Maarten Fabré 9,3901 gold badge15 silver badges27 bronze badges · Accepted Answer · 2018-07-04 12:07:04Z

Like I said before, the reason why this takes a lot of time is because the number of steps is proportional to the square of the number of disks.

But there are some other improvements to be made to this code.

`range`

list(reversed(range(1, args.disks + 1))) can be done more easily as list(range(disks, 0, -1))

Global variables

Your image saving algorithm uses a lot of global scope (args.filename, steps,...), and with the src.reverce() even modifies these global variables, which is a sure way to introduce difficult to find bugs. If your function needs those parameters, pass them as arguments, and certainly don't change them.

You can prevent SOURCE, HELPER, TARGET to need in global scope, by passing these along in a dictionary. You can also use a namedtuple:

source = list(range(disks, 0, -1))
target = list()
helper = list()
state = dict(
 source=source,
 target=target,
 helper=helper
)
hanoi_gen(disks, source, helper, target, state)

Variable naming

I had to look really well at the code to find out what certain variables were. cells for example is the width of the image, peg is the maximum with of a peg etc. Name your variables clearly. s, h, t in zip(src, hlp, trg) is another example that can do with better names.

Immutable variables

You use lists throughout your code. The advantage of lists is that they are mutable, but this is also the disadvantage. To make sure other code doesn't inadvertently change your data, use immutable containers like tuple where appropriate. So instead of SOURCE[:], I used tuple(...).

Generator

Instead of instantiating all those lists at the same time, you can work with generators:

def hanoi_gen(disks, source, helper, target, state):
 if disks:
 yield from hanoi_gen(disks - 1, source, target, helper, state)
 target.append(source.pop())
 yield tuple(state['source']), tuple(state['target']), tuple(state['helper'])
 yield from hanoi_gen(disks - 1, helper, source, target, state)
def solve_tower(disks):
 source = list(range(disks, 0, -1))
 target = list()
 helper = list()
 yield tuple(source), tuple(target), tuple(helper)
 state = dict(
 source=source,
 target=target,
 helper=helper,
 )
 yield from hanoi_gen(disks, source, helper, target, state)
steps = tuple(solve_tower(2))
assert steps == (
 ((2, 1), (), ()),
 ((2,), (), (1,)),
 ((), (2,), (1,)),
 ((), (2, 1), ()),
)

Magic numbers

There are some magic numbers in your code, 10, 40, 5, ...

Better is to extract global constants from this:

BUFFER_PEG = 10
LINE_WIDTH = 1
BUFFER_STEP = 5
WHITE = 1
BLACK = 0

And use them like:

image_width = disks * 2 * 3 + 4 * BUFFER_PEG

BLACK and WHITE can also be done with an enum.IntEnum:

from enum import IntEnum
class Color(IntEnum):
 BLACK = 0
 WHITE = 1

DRY

Compartmentalize!

There is a lot of repeated code, which makes this hard to maintain and test:

from itertools import repeat
def whitespace(width, image_width):
 return repeat(Color.WHITE, width * image_width)
def line(width, image_width):
 return repeat(Color.BLACK, width * image_width)

Create easy to use generators to add whitespace or black lines.

Pad the disk

def pad_disk(disk_width, num_disks):
 blank_width = num_disks - disk_width
 yield from repeat(Color.WHITE, blank_width)
 yield from repeat(Color.BLACK, disk_width * 2)
 yield from repeat(Color.WHITE, blank_width)

Centrally pads a disk to twice the number of disks in play. This can be easily tested: the portion of a disk of width 1 in a stack of 4 disks:

assert tuple(pad_disk(1, num_disks=4)) == (1, 1, 1, 0, 0, 1, 1, 1)

Format a row

def buffer_peg():
 return repeat(Color.WHITE, BUFFER_PEG)
def format_row(disks, num_disks):
 yield from buffer_peg()
 for disk_width in disks:
 yield from pad_disk(disk_width, num_disks)
 yield from buffer_peg()

This can be easily tested like this:

row = [2, 0, 1]
num_disks = 4
assert tuple(format_row(row, num_disks)) == tuple(chain(
 buffer_peg(),
 (1, 1, 0, 0, 0, 0, 1, 1,),
 buffer_peg(),
 (1, 1, 1, 1, 1, 1, 1, 1,),
 buffer_peg(),
 (1, 1, 1, 0, 0, 1, 1, 1,),
 buffer_peg(),
))

Format individual steps

Here, I use a small helper function to reverse the peg, and pad it with 0s:

def pad_left_reverse(peg, size, fillvalue=0):
 yield from repeat(fillvalue, size - len(peg))
 yield from reversed(peg)

Then all of this:

src = step[0]
hlp = step[1]
trg = step[2]
size = max(len(src), len(hlp), len(trg))
for _ in range(size - len(src)):
 src.append(0)
for _ in range(size - len(hlp)):
 hlp.append(0)
for _ in range(size - len(trg)):
 trg.append(0)
src.reverse()
hlp.reverse()
trg.reverse()

Can be replaced with:

def format_step(step, num_disks):
 pegs = map(
 lambda peg: pad_left_reverse(peg, num_disks, fillvalue=Color.WHITE),
 step
 )

And on the plus-side, this doesn't reverse the original input.

I replaced the size = max(len(src), len(hlp), len(trg)) with the number of disks, to keep all the steps equally high. If you can live with the uneven heights, size = len(max(step, key=len)) is an alternative formulation.

 for row in zip(*pegs):
 # print(row)
 yield from format_row(row, peg_width)

Replaces the next 20 lines.

step = [2, 1], [5,4,3], []
num_disks = 5
step_data = list(format_step(step, num_disks))

Outputs something like:

1111111111111111111111111111111100000011111111111111111111111111111111
1111111111111100111111111111111000000001111111111111111111111111111111
1111111111111000011111111111110000000000111111111111111111111111111111

Format the steps

def format_steps(steps, image_width, num_disks):
 for step in steps:
 yield from whitespace(BUFFER_STEP, image_width)
 yield from format_step(step, num_disks)
 yield from whitespace(BUFFER_STEP, image_width)
 yield from line(LINE_WIDTH, image_width)

This speaks for itself.

Context managers

If you open resources that need closing afterwards, like a file or a PIL.Image, use a with-statement.

The main

if __name__ == '__main__':
 num_disks = 5
 steps = solve_tower(num_disks)
 image_width = num_disks * 2 * 3 + 4 * BUFFER_PEG
 data = list(format_steps(steps, image_width, num_disks))
 with Image.new('1', (image_width, len(data) // image_width)) as image:
 image.putdata(data)
 name = 'my_hanoi.png'
 image.save(name)

All-in-all this code is slightly longer than your code, and will not necessarily be much faster, it is a lot clearer for me, and a lot more parts can be individually tested.

The full code can be found here, and some tests here.

Thanks a lot for all this explainations, I learned a lot from them

Stack Exchange Network

Slow image generation

1 Answer 1

`range`

Global variables

Variable naming

Immutable variables

Generator

Magic numbers

DRY

Pad the disk

Format a row

Format individual steps

Format the steps

Context managers

The main

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Slow image generation

1 Answer 1

range

Global variables

Variable naming

Immutable variables

Generator

Magic numbers

DRY

Pad the disk

Format a row

Format individual steps

Format the steps

Context managers

The main

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

`range`