I was recently tasked with determining which hex RGB colors in list color_list are nearest to each hex RGB color in list target_colors, using euclidean distance as the measuring stick, and only returning items that are within 10 distance from each target color. My first pass at this was for each item in target_colors, check the distance between it and everything in color_list, and return the colors at or below the target distance. This works well enough when color_list is small. But when color_list is large, looping over the entire list for each item in target_colors is slow. I am wondering if there is a better way to go about doing this, so that I don't have to loop over the entire list every time.
My first thought was to cache the results, but given that there are 16 million potential colors, that is potentially 2*16m^2 combinations, it would not be very space efficient.
My next thought was that perhaps the source color list can be sorted in a way that would make searching it faster, but I have to loop over all of the colors to get the distance as the thing to sort on, and I would not be in any better of shape.
My last thought was that given an r, g, or b color, perhaps I could eliminate colors that are obviously going to be out of range. If I have a color like #AABBAA, I instinctively know that #001100 is going to be way too far away, so I could sort the source list of colors l, then eliminate anything that smells like its out of range. But, I get back to the "well I would calculate the distance to figure out if it smells out of range", and I'm back to square one.
distance.rb:
TARGET_DISTANCE = 10.0
# calculate the euclidiean distance between two hex colors
def color_distance(c1, c2)
c1_parts = c1.split('').each_slice(2).map{|c| c.join}.to_a
c2_parts = c2.split('').each_slice(2).map{|c| c.join}.to_a
r_diff = c1_parts[0].to_i(16) - c2_parts[0].to_i(16)
g_diff = c1_parts[1].to_i(16) - c2_parts[1].to_i(16)
b_diff = c1_parts[2].to_i(16) - c2_parts[2].to_i(16)
Math.sqrt(r_diff**2 + g_diff**2 + b_diff**2)
end
def random_color
3.times.map{rand(254).to_s(16).rjust(2, '0')}.join()
end
color_list = []
1.upto(10000) { color_list << random_color }
target_colors = []
1.upto(100) { target_colors << random_color }
results = {}
target_colors.each do |t|
results[t] ||= []
color_list.each do |c|
distance = color_distance(c, t)
if distance <= TARGET_DISTANCE
results[t] << [c, distance]
end
end
end
1 Answer 1
As @vnp says, a solution involving an octree would a more efficient way to find neighboring points.
Before you implement one, though, there are some huge time savings you can get by making some trivial changes:
- Computing square roots is computationally intensive. Do it only when you need to.
- Parsing hex values into integers is somewhat slow, especially when you do it using a lot of string and array operations. It would be more efficient to parse the entire 6-hex-digit string into an integer, then use bit manipulation.
- The same goes for generating random colors. I don't know why you used
rand(254)
rather thanrand(256)
. If you allow color componentsFE
andFF
, then it becomes a simple matter of generating a 24-bit random number, and formatting it as hex. (Better yet, avoid converting the colors into their hex representation altogether, and just store them as integers.)
In addition, 1.upto(10000) { color_list << random_color }
is poor Ruby style. Write it using .map
instead (like you did in the implementation of random_colors
).
I don't know why you wrote results[t] ||= []
instead of results[t] = []
.
With just those changes, I'm able to reduce the execution time by 90% already.
def color_dist_sq(c1, c2)
c1 = c1.to_i(16)
c2 = c2.to_i(16)
[0, 8, 16].map do |bits|
((c1 >> bits & 0xff) - (c2 >> bits & 0xff)) ** 2
end.inject(:+)
end
def random_color
'%06x' % rand(1 << 24)
end
TARGET_DISTANCE_SQ = 10**2
color_list = 10000.times.map { random_color }
target_colors = 100.times.map { random_color }
results = {}
target_colors.each do |t|
results[t] = color_list.inject([]) do |result, c|
dist_sq = color_dist_sq(c, t)
result << [c, Math.sqrt(dist_sq)] if dist_sq <= TARGET_DISTANCE_SQ
result
end
end
Explore related questions
See similar questions with these tags.
random_color
usesrand(254)
rather thanrand(256)
? \$\endgroup\$