I need to do the following in Python 3.x:
- Interpret an array of bytes as an array of single-precision floats.
- Then group each four consecutive floats into subarrays, i.e. transform [a,b,c,d,e,f,g,h] into [[a,b,c,d], [e,f,g,h]]. The subarrays are called pixels, and the array of pixels forms an image.
- Flip the image vertically.
Here is what I have now:
floats = array.array('f')
floats.frombytes(tile_data)  # fromstring() was removed in Python 3.9
pix = []
for y in range(tile_h - 1, -1, -1):
    stride = tile_w * 4
    start_index = y * stride
    end_index = start_index + stride
    pix.extend(floats[i:i + 4] for i in range(start_index, end_index, 4))
tile_data is the input array of raw bytes, tile_w and tile_h are respectively the width and height of the image, and pix is the final upside-down image.
While this code works correctly, it takes around 50 ms to complete on my machine for a 256x256 image.
Is there anything obviously slow in this code? Would numpy be a potentially good avenue for optimization?
Edit: here is a standalone program to run the code and measure performance:
import array
import random
import struct
import time

# Size of the problem.
tile_w = 256
tile_h = 256

# Generate input data.
tile_data = []
for f in (random.uniform(0.0, 1.0) for _ in range(tile_w * tile_h * 4)):
    tile_data.extend(struct.pack("f", f))
tile_data = bytes(tile_data)

start_time = time.time()

# Code of interest.
floats = array.array('f')
floats.frombytes(tile_data)  # fromstring() was removed in Python 3.9
pix = []
for y in range(tile_h - 1, -1, -1):
    stride = tile_w * 4
    start_index = y * stride
    end_index = start_index + stride
    pix.extend(floats[i:i + 4] for i in range(start_index, end_index, 4))

print("runtime: {0} ms".format((time.time() - start_time) * 1000))
1 Answer
Would numpy be a potentially good avenue for optimization?
Yes. In general, pushing Python loops into C extensions makes sense.
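As a rough illustration of what that could look like, here is a minimal NumPy sketch. It assumes the bytes hold native-endian 32-bit floats, as array.array('f') does on common platforms. The function name flip_tile is hypothetical, and the result is an (N, 4) float32 array rather than a list of array.array slices, so downstream code may need adjusting.

```python
import numpy as np


def flip_tile(tile_data, tile_w, tile_h):
    """Return the pixels of a vertically flipped tile as an (N, 4) array."""
    # Reinterpret the raw bytes as native-endian 32-bit floats (no copy).
    floats = np.frombuffer(tile_data, dtype=np.float32)
    # View the data as rows x columns x 4 floats per pixel.
    image = floats.reshape(tile_h, tile_w, 4)
    # Reverse the row axis, then flatten back to one pixel per row.
    # The final reshape typically copies, since the reversed view
    # is not contiguous in memory.
    return image[::-1].reshape(-1, 4)
```

If a (tile_h, tile_w, 4) array is acceptable to the caller, returning image[::-1] directly avoids even that copy, since reversing an axis only creates a view.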
You might prefer to start the timer after pix = [], since you're not focused on improving fromstring's performance.
There's nothing obviously slow, beyond bulk data movement and regrouping, and it would take changing the problem if you wanted to introduce a level of indirection to avoid that work. One could hoist the stride constant out of the loop, but that's pretty far down the list of worries.
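For reference, the two small adjustments could be sketched as follows: stride is computed once, and only the regrouping loop is timed. The function name regroup_flipped is hypothetical, the input is all-zero placeholder data, and frombytes stands in for fromstring, which recent Python versions no longer provide.

```python
import array
import time


def regroup_flipped(floats, tile_w, tile_h):
    """Group floats into 4-float pixels, emitting rows bottom to top."""
    stride = tile_w * 4  # floats per row, computed once outside the loop
    pix = []
    for y in range(tile_h - 1, -1, -1):
        start_index = y * stride
        pix.extend(floats[i:i + 4]
                   for i in range(start_index, start_index + stride, 4))
    return pix


if __name__ == "__main__":
    tile_w = tile_h = 256
    floats = array.array('f')
    floats.frombytes(bytes(tile_w * tile_h * 4 * 4))  # placeholder input

    # Start the timer after the bytes-to-floats conversion.
    start_time = time.perf_counter()
    pix = regroup_flipped(floats, tile_w, tile_h)
    elapsed_ms = (time.perf_counter() - start_time) * 1000
    print("loop runtime: {0:.1f} ms".format(elapsed_ms))
```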
Thanks for your input! Regarding "you're not focused on improving fromstring's performance": yes, I actually am. Basically I'm reading pixel data from a child process' stdout, so I need the bytes->floats conversion. It turns out to be insignificant compared to the loop below. As expected, hoisting stride out of the loop has no measurable effect. Thanks again! – François Beaune, Jul 30, 2017 at 20:27