I'm rewriting a full-color Mandelbrot Set explorer in Python using tkinter. For it, I need to be able to convert a Tuple[int, int, int] into a hex string in the form #123456. Here are example uses of the two variants that I came up with:
>>>rgb_to_hex(123, 45, 6)
'#7b2d06'
>>>rgb_to_hex(99, 88, 77)
'#63584d'
>>>tup_rgb_to_hex((123, 45, 6))
'#7b2d06'
>>>tup_rgb_to_hex((99, 88, 77))
'#63584d'
>>>rgb_to_hex(*(123, 45, 6))
'#7b2d06'
>>>rgb_to_hex(*(99, 88, 77))
'#63584d'
The functions I've come up with are very simple, but intolerably slow. This code is a rare case where performance is a real concern. It will need to be called once per pixel, and my goal is to support the creation of images up to 50,000x30,000 pixels (1500000000 in total). 100 million executions take ~300 seconds:
>>>timeit.timeit(lambda: rgb_to_hex(255, 254, 253), number=int(1e8))
304.3993674000001
Which, unless my math is fubared, means this function alone will take 75 minutes in total for my extreme case.
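As a quick sanity check of that estimate, the arithmetic can be reproduced from the figures quoted above. The variable names are only illustrative:

```python
# Figures taken from the question: the timing for 1e8 calls and the target image size.
seconds_per_1e8_calls = 304.3993674000001
total_pixels = 50_000 * 30_000  # 1,500,000,000 pixels

# Scale the measured time linearly to the full pixel count.
seconds_total = seconds_per_1e8_calls * total_pixels / 1e8
minutes_total = seconds_total / 60

print(f"{total_pixels:,} pixels -> about {minutes_total:.0f} minutes")
```

This lands at roughly 76 minutes, in line with the 75-minute figure, assuming the per-call cost stays constant across the whole run.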
I wrote two versions. The latter was to reduce redundancy (and since I'll be handling tuples anyways), but it was even slower, so I ended up just using unpacking on the first version:
# Takes a tuple instead
>>>timeit.timeit(lambda: tup_rgb_to_hex((255, 254, 253)), number=int(1e8))
342.8174099
# Unpacks arguments
>>>timeit.timeit(lambda: rgb_to_hex(*(255, 254, 253)), number=int(1e8))
308.64342439999973
The code:
from typing import Tuple

def _channel_to_hex(color_val: int) -> str:
    raw: str = hex(color_val)[2:]
    return raw.zfill(2)

def rgb_to_hex(red: int, green: int, blue: int) -> str:
    return "#" + _channel_to_hex(red) + _channel_to_hex(green) + _channel_to_hex(blue)

def tup_rgb_to_hex(rgb: Tuple[int, int, int]) -> str:
    return "#" + "".join([_channel_to_hex(c) for c in rgb])
I'd prefer to be able to use the tup_ variant for cleanliness, but there may not be a good way to automate the iteration with acceptable amounts of overhead.
Any performance-related tips (or anything else if you see something) are welcome.
5 Answers
You seem to be jumping through some unnecessary hoops. Just format a string directly:
from timeit import timeit

def _channel_to_hex(color_val: int) -> str:
    raw: str = hex(color_val)[2:]
    return raw.zfill(2)

def rgb_to_hex(red: int, green: int, blue: int) -> str:
    return "#" + _channel_to_hex(red) + _channel_to_hex(green) + _channel_to_hex(blue)

def direct_format(r, g, b):
    return f'#{r:02x}{g:02x}{b:02x}'

def one_word(r, g, b):
    rgb = r<<16 | g<<8 | b
    return f'#{rgb:06x}'

def main():
    N = 100000
    methods = (
        rgb_to_hex,
        direct_format,
        one_word,
    )
    for method in methods:
        hex = method(1, 2, 255)
        assert '#0102ff' == hex

        def run():
            return method(1, 2, 255)

        dur = timeit(run, number=N)
        print(f'{method.__name__:15} {1e6*dur/N:.2f} us')

main()
produces:
rgb_to_hex      6.75 us
direct_format   3.14 us
one_word        2.74 us
That said, the faster thing to do is almost certainly to generate an image in-memory with a different framework and then send it to tkinter.
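To illustrate the in-memory idea without committing to a particular framework, here is a minimal standard-library sketch that writes RGB triples straight into a bytearray and wraps the result as a binary PPM image. The helpers `make_rgb_buffer`, `set_pixel` and `to_ppm` are hypothetical names made up for this example, not part of the answer; a library such as Pillow does the same job with far less work.

```python
def make_rgb_buffer(width: int, height: int) -> bytearray:
    """Allocate a zeroed buffer with 3 bytes (R, G, B) per pixel, row by row."""
    return bytearray(width * height * 3)


def set_pixel(buf: bytearray, width: int, x: int, y: int, rgb: tuple) -> None:
    """Store one pixel's three channel bytes; no string formatting is involved."""
    offset = (y * width + x) * 3
    buf[offset:offset + 3] = bytes(rgb)


def to_ppm(buf: bytearray, width: int, height: int) -> bytes:
    """Prefix the raw bytes with a binary PPM (P6) header."""
    header = f"P6 {width} {height} 255\n".encode("ascii")
    return header + bytes(buf)


width, height = 4, 2
buf = make_rgb_buffer(width, height)
set_pixel(buf, width, 1, 0, (123, 45, 6))
ppm = to_ppm(buf, width, height)
```

The resulting bytes can be written to a .ppm file, which is one of the formats tkinter's PhotoImage can load from disk.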
Good suggestions, but I ended up just learning how to use Pillow (basically your suggestion at the bottom). You can import Pillow image objects directly into tkinter. It ended up being much cleaner. Thank you. – Carcigenicate, Sep 24, 2019 at 15:38
Another approach to generate the hex string is to directly reuse methods of format strings rather than writing your own function.
rgb_to_hex = "#{:02x}{:02x}{:02x}".format # rgb_to_hex(r, g, b) expands to "...".format(r, g, b)
rgb_tup_to_hex = "#%02x%02x%02x".__mod__ # rgb_tup_to_hex((r, g, b)) expands to "..." % (r, g, b)
These are faster (rgb_to_hex_orig is renamed from the rgb_to_hex function in the question):
rgb_tup = (0x20, 0xFB, 0xC2)
%timeit rgb_to_hex_orig(*rgb_tup)
%timeit direct_format(*rgb_tup)
%timeit one_word(*rgb_tup)
%timeit rgb_to_hex(*rgb_tup)
%timeit rgb_tup_to_hex(rgb_tup)
Results:
1.57 μs ± 5.14 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.18 μs ± 5.34 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
704 ns ± 3.35 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
672 ns ± 4.54 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
502 ns ± 7.23 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
rgb_tup_to_hex is the fastest, partly because it takes a tuple directly as its argument and avoids the small overhead of unpacking arguments.
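A small standard-library check that the two bound-method versions agree with the f-string version from the other answer. The timing loop is only an example of how one might compare them; actual numbers depend on the machine and Python version, so none are claimed here.

```python
from timeit import timeit

rgb_to_hex = "#{:02x}{:02x}{:02x}".format   # called as rgb_to_hex(r, g, b)
rgb_tup_to_hex = "#%02x%02x%02x".__mod__    # called as rgb_tup_to_hex((r, g, b))


def direct_format(r, g, b):
    return f"#{r:02x}{g:02x}{b:02x}"


rgb_tup = (0x20, 0xFB, 0xC2)
results = {
    "direct_format": direct_format(*rgb_tup),
    "rgb_to_hex": rgb_to_hex(*rgb_tup),
    "rgb_tup_to_hex": rgb_tup_to_hex(rgb_tup),
}
print(results)

# Rough per-call timing in nanoseconds; results vary by machine.
n = 100_000
for name, call in (
    ("rgb_to_hex(*t)", lambda: rgb_to_hex(*rgb_tup)),
    ("rgb_tup_to_hex(t)", lambda: rgb_tup_to_hex(rgb_tup)),
):
    print(f"{name:20} {1e9 * timeit(call, number=n) / n:.0f} ns")
```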
However, I doubt these improvements would help solve your problem given its magnitude.
Using the Pillow / PIL library, pixel values can be set directly by index using tuples, so converting tuples to strings is not really necessary. Here are examples showing the basics of displaying PIL images in tkinter. This is likely still slow if the changes are done pixel by pixel. For extensive changes, the ImageDraw module or Image.putdata could be used.
memoization
_channel_to_hex is called 3 times per pixel. It only takes 256 different inputs, so a logical first step would be to memoize the results. This can be done with functools.lru_cache:
from functools import lru_cache

@lru_cache(None)
def _channel_to_hex(color_val: int) -> str:
    raw: str = hex(color_val)[2:]
    return raw.zfill(2)
This already reduces the time needed by about a third.
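One way to see why the cache helps is to inspect its statistics: however many pixels are processed, only distinct channel values (at most 256) can ever miss. A minimal sketch, using a small made-up pixel list in place of real Mandelbrot output:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def _channel_to_hex(color_val: int) -> str:
    return hex(color_val)[2:].zfill(2)


def rgb_to_hex(red: int, green: int, blue: int) -> str:
    return "#" + _channel_to_hex(red) + _channel_to_hex(green) + _channel_to_hex(blue)


# Hypothetical sample data: 1000 pixels that reuse a handful of channel values.
pixels = [(i % 7, i % 5, 255) for i in range(1000)]
hexes = [rgb_to_hex(*p) for p in pixels]

# hits counts lookups answered from the cache, misses counts real computations.
info = _channel_to_hex.cache_info()
print(f"hits={info.hits} misses={info.misses}")
```

Each hit still pays for a function call, which is why the plain dict lookup is worth comparing against.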
An alternative is using a dict:
color_hexes = {
    color_val: hex(color_val)[2:].zfill(2)
    for color_val in range(256)
}

def rgb_to_hex_dict(red: int, green: int, blue: int) -> str:
    return "#" + color_hexes[red] + color_hexes[green] + color_hexes[blue]
If the set of color tuples is also limited (256**3 in the worst case), these can be memoized as well:
import itertools

# Map every (r, g, b) tuple to its hex string.
color_tuple_hexes = {
    color_tuple: rgb_to_hex_dict(*color_tuple)
    for color_tuple in itertools.product(range(256), repeat=3)
}
This takes about 15 seconds on my machine, but only needs to be done once.
If only a limited set of tuples is used, you can also use lru_cache:
@lru_cache(None)
def rgb_to_hex_dict(red: int, green: int, blue: int) -> str:
    return "#" + color_hexes[red] + color_hexes[green] + color_hexes[blue]
numpy
If you have your data in a 3-dimensional numpy array, for example:
import numpy as np
color_data = np.random.randint(256, size=(10,10,3))
You could do something like this:
# Red is the most significant byte, so the weights run 256**2, 256**1, 256**0.
coeffs = np.array([256**i for i in (2, 1, 0)])
np_hex = (color_data * coeffs[np.newaxis, np.newaxis, :]).sum(axis=2)
np.vectorize(lambda x: "#" + hex(x)[2:].zfill(6))(np_hex)
Numpy is your best friend.
Given your comment:
The tuples are produced by "color scheme" functions. The functions take the (real, imaginary) coordinates of the pixel and how many iterations it took that pixel to fail, and return a three-tuple. They could return anything to indicate the color (that code is completely in my control), I just thought a three-tuple would be simplest. In theory, I could expect the functions to directly return a hex string, but that's just kicking the can down the road a bit since they need to be able to generate the string somehow.
Create a numpy array for the image you're going to create, then just assign your values into the array directly. Something like this:
import numpy as np
image = np.empty(shape=(final_image.ysize, final_image.xsize, 3), dtype=np.uint8)
# And instead of calling a function, assign to the array simply like this
# (row index first, because the shape is (ysize, xsize, 3)):
image[y_coor, x_coor] = color_tuple
# Or if you really need a function:
image.__setitem__((y_coor, x_coor), color_tuple) # I haven't tested this with numpy, but something like it should work.
You do need to make sure that your arrays are in the same shape as tkinter expects its images, though. And if you can make another shortcut to put the data into the array sooner, take it.
If you're doing an action this often, then you need to cut out function calls and such as often as you can. If possible, make the slice assignments bigger to set whole areas at the same time.
Yes, you'll get a buffer filled with bogus data. But if you fill it yourself anyway, then it doesn't really matter what you use. I just grabbed the first thing that came to mind. Do you think it's bad practice if you fill your array entirely anyway? – Gloweye, Sep 19, 2019 at 14:32
In this case I do not think there is much difference in terms of the functionality or performance. As the doc suggests, np.ndarray is a low-level method and it is recommended to use those high-level APIs instead. – GZ0, Sep 19, 2019 at 14:41
np.empty(...) is a good candidate to express what you are doing in case you want to follow @Gloweye's comment and get rid of np.ndarray(...). – AlexV, Sep 19, 2019 at 15:04
I think you need parentheses around x_coor, y_coor to make it a tuple. – Solomon Ucko, Sep 20, 2019 at 1:32
Python compilers for performance
Nuitka
Nuitka compiles any and all Python code into architecture-specific C++ code, which generally runs faster than the original interpreted code.
Cython
Cython can compile any and all Python code into platform-independent C code. However, where it really shines is that you can annotate your Cython functions with C types and get a performance boost out of that.
PyPy
PyPy is a JIT that compiles pure Python code. Sometimes it can produce good code, but it has a slow startup time. Although PyPy probably won't give you C-like or FORTRAN-like speeds, it can sometimes double or triple the execution speed of performance-critical sections.
However, PyPy is low on developers, and as such, it does not yet support Python versions 3.7 or 3.8. Most libraries are still written with 3.6 compatibility.
Numba
Numba compiles a small subset of Python. It can achieve C-like or FORTRAN-like speeds with this subset; when tuned properly, it can automatically parallelize and automatically use the GPU. However, you won't really be writing your code in Python, but in Numba.
Alternatives
You can write performance-critical code in another programming language. One to consider would be D, a modern programming language with excellent C compatibility.
Python integrates easily with languages like C. In fact, you can load dynamic libraries written in C* into Python with no glue code.
*D should be able to do this with extern(C): and -betterC.
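For the "no glue code" point, the usual standard-library route is ctypes. That is an assumption about what the answer has in mind, since it does not name a mechanism. A minimal sketch that calls abs from the C runtime, assuming a POSIX system where the C library can be located or is already loaded into the process:

```python
import ctypes
import ctypes.util

# Locate the C runtime. If find_library returns None, CDLL(None) on POSIX
# falls back to the symbols already loaded into the current process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: int abs(int).
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

result = libc.abs(-42)
print(result)
```

A library written for this project would be loaded the same way by passing its path to ctypes.CDLL, with argtypes and restype declared for each exported function.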
"It will need to be called once per pixel" - this is suspicious. What code is accepting a hex string? High-performance graphics code should be dealing in RGB byte triples packed into an int32.
number-guessing-game tag. I'm not sure why that was added.