This problem will take more than two minutes to work through, so if you don't have much time I wouldn't recommend tackling it.
I found this paper on "a new efficient ellipse detection method", so I thought I'd just try it. I programmed it in Python, but my implementation is, in contrast to the title of the paper, really slow: a 325x325 image (with 6000 black pixels) containing multiple circles/ellipses, like the one here, takes on the order of 30 minutes...
Please read through my code and tell me where I can optimize things (I think there are a lot of things to optimize).
I'll briefly explain the 15 steps listed in the paper (if any step is unclear, just take a quick look at the paper):
The steps (as I understood them):

1. Store all edge pixels (black pixels) in an array.
2. Clear the accumulator array (you'll see its use later).
3. Loop through each entry in the edge-pixel array.
4. Loop through each entry again and check whether the distance between the two coordinates (from steps 3 and 4) is between the min radius and max radius of my ellipse (min radius and max radius are just function parameters). If this is true, proceed with steps 5-14.
5. Calculate the center, orientation and assumed length of the major axis (the equations are in the paper, but I don't think they are necessary here).
6. Loop through each entry a third time and check whether the distance between this coordinate and the assumed center is between the min radius and the max radius. If this is true, proceed with steps 7-9.
7. Calculate the length of the minor axis using the equations (look them up in the paper if you need to).
8. Add the assumed parameters of the ellipse to the accumulator array (center, x-width, y-width, orientation).
9. (Repeat from step 6.)
10. Wait for the inner (step 6) `for` loop to finish, then average all values in the accumulator array to find the average parameters of the investigated ellipse.
11. Save the average ellipse parameters.
12. Remove the pixels within the detected ellipse (so you don't check the pixels twice...).
13. Clear the accumulator array.
14. Wait for the outer two `for` loops to finish.
15. Output the found ellipses.
I would be glad if someone could help me speed up the process because it takes a really long time.
import cv2
from PIL import Image
import math


def detectEllipses(filePath, minR, maxR, minAmountOfEdges):
    image = cv2.imread(filePath) # proceed with lower res.
    w, h = len(image[0]), len(image)
    # Ellipse Detection
    output = image.copy()
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    EdgePixel = []
    # Step 1 - Save all Edge-Pixels in an array
    for x in range(gray.shape[0]):
        for y in range(gray.shape[1]):
            if gray[x, y] == 0:
                EdgePixel.append([x, y])
    # Step 2 - Initialize AccumulatorArray and EllipsesFound-Array
    AccumulatorArray = []
    EllipsesFound = []
    # Step 3 - Loop through all pixels
    ignore_indices = set()
    for i in range(len(EdgePixel)):
        if i in ignore_indices:
            continue
        # Step 4 - Loop through all pixels a second time
        for j in range(len(EdgePixel)):
            if j in ignore_indices:
                continue
            if i != j:
                xAverage, yAverage, aAverage, bAverage, orientationAverage = 0, 0, 0, 0, 0
                # (Step 12, clear Accumulator-Array)
                AccumulatorArray = []
                tmp = math.sqrt(abs(math.pow(EdgePixel[i][0]-EdgePixel[j][0], 2)) +
                                abs(math.pow(EdgePixel[i][1]-EdgePixel[j][1], 2)))
                distance = int(tmp)
                # Step 4.1 - Check if the distance is "ok"
                if(distance / 2 > minR and distance / 2 < maxR):
                    # Step 5 - Calculate 4 parameters of the ellipse
                    xMiddle = (EdgePixel[i][0] + EdgePixel[j][0]) / 2
                    yMiddle = (EdgePixel[i][1] + EdgePixel[j][1]) / 2
                    a = tmp / 2
                    if(EdgePixel[j][0] != EdgePixel[i][0]): # To prevent division by 0
                        orientation = math.atan((EdgePixel[j][1]-EdgePixel[i][1])/
                                                (EdgePixel[j][0]-EdgePixel[i][0]))
                    else:
                        orientation = 0
                    # Step 6 - Lop through all pixels a third time
                    for k in range(len(EdgePixel)):
                        if k in ignore_indices:
                            continue
                        if len(AccumulatorArray) > minAmoutOfEdges:
                            continue
                        if k != i and k != j:
                            # Distance from x,y to the middlepoint
                            innerDistance = math.sqrt(abs(math.pow((xMiddle - EdgePixel[k][0]),2)) +
                                                      abs(math.pow((yMiddle - EdgePixel[k][1]),2)))
                            # Distance from x,y to x2,y2
                            tmp2 = math.sqrt(abs(math.pow((EdgePixel[i][0] - EdgePixel[j][0]),2)) +
                                             abs(math.pow((EdgePixel[i][1] - EdgePixel[j][1]),2)))
                            # Distance from x,y and x0,y0 has to be smaller then the distance from x1,y1 to x0,y0
                            if(innerDistance < a and innerDistance > minR and innerDistance < maxR):
                                # Step 7 - Calculate length of minor axis
                                # calculate cos^2(tau):
                                tau = math.pow(((math.pow(a,2)+math.pow(innerDistance,2)-math.pow(tmp2,2))/(2*a*innerDistance)),2)
                                bSquared = (math.pow(a, 2)*math.pow(innerDistance, 2)*(1-tau))/(math.pow(a, 2)-math.pow(innerDistance, 2)*tau)
                                # It follows:
                                b = math.sqrt(bSquared) # this is the minor axis length
                                # Step 8 - Add the parameters to the AccumulatorArray
                                Data = [xMiddle, yMiddle, a, b, orientation]
                                AccumulatorArray.append(Data)
                    # Step 9 (repeat from Step 6 till here)
                    # Step 10 - Check if the algorithm has detected enough Edge-Points and then average the results
                    if len(AccumulatorArray) > minAmoutOfEdges:
                        # Averageing
                        for k in range(len(AccumulatorArray)):
                            tmpData = AccumulatorArray[k]
                            xAverage += tmpData[0]
                            yAverage += tmpData[1]
                            aAverage += tmpData[2]
                            bAverage += tmpData[3]
                            orientationAverage += tmpData[4]
                        xAverage = int(xAverage / len(AccumulatorArray))
                        yAverage = int(yAverage / len(AccumulatorArray))
                        aAverage = int(aAverage / len(AccumulatorArray))
                        bAverage = int(bAverage / len(AccumulatorArray))
                        orientationAverage = int(orientationAverage / len(AccumulatorArray))
                        # Step 11 - Save the found Ellipse-Parameters
                        EllipsesFound.append([xAverage, yAverage, aAverage, bAverage, orientationAverage])
                        # Step 12 - Remove the Pixels on the detected ellipse from the Edge-Array
                        for k in range(len(EdgePixel)):
                            if ((math.pow(EdgePixel[k][0] - xAverage, 2) / math.pow((aAverage+5), 2)) +
                                    (math.pow(EdgePixel[k][1] - yAverage, 2) / math.pow((bAverage+5), 2))) <= 1:
                                ignore_indices.add(k)
    return


detectEllipses("LinkToImage", 5, 15, 100)
Here is a profile of the program; most of the time is spent in `pow2` (which simply multiplies two values) and `math.sqrt`:
65082750 function calls (65080245 primitive calls) in 22.924 seconds

   Ordered by: internal time

   ncalls              tottime  percall  cumtime  percall  filename:lineno(function)
        1              17.666   17.666   22.724   22.724   ElipseDetection.py:16(detectEllipses)
 34239900               2.410    0.000    2.410    0.000   ElipseDetection.py:162(pow2)
 15660662               1.806    0.000    1.806    0.000   {built-in method math.sqrt}
 14612430/14612383      0.699    0.000    0.699    0.000   {built-in method builtins.len}
   488000               0.062    0.000    0.062    0.000   {method 'append' of 'list' objects}
3 Answers
Original points
# Step 2 - Initialize AccumulatorArray and EllipsesFound-Array
AccumulatorArray = []
It's not needed in this scope, and in the scope where it is used the first thing that happens is that it's reinitialised, so this line is completely pointless.
for i in range(len(EdgePixel)):
    if i in ignore_indices:
        continue
    # Step 4 - Loop through all pixels a second time
    for j in range(len(EdgePixel)):
        if j in ignore_indices:
            continue
        if i != j:
This should be one of `if i < j:` or `if j < i:`, because otherwise any pair of points which is rejected as a major axis will be tested a second time with the indices reversed, doubling the workload.
xAverage, yAverage, aAverage, bAverage, orientationAverage = 0, 0, 0, 0, 0
What's the scope of these variables?
tmp = math.sqrt(abs(math.pow(EdgePixel[i][0]-EdgePixel[j][0], 2)) +
                abs(math.pow(EdgePixel[i][1]-EdgePixel[j][1], 2)))
distance = int(tmp)
# Step 4.1 - Check if the distance is "ok"
if(distance / 2 > minR and distance / 2 < maxR):
What are those `abs` for? It looks like you're trying to work around a bug in the standard library, but in that case you should document the bug clearly.

Why take the `sqrt`? As far as I can see `distance` isn't used anywhere else. Since `sqrt` is a monotonically increasing function over the range of interest, you could instead square `minR` and `maxR` once and then compare the squared distances.

FWIW `distance` is also not a useful name in my opinion. `distanceIJ` would explain which of the many distances it is, but better still would be `majorAxis`.
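That squared-distance comparison might look like the following sketch (the helper name `find_axis_pairs` is invented for illustration; `minR`/`maxR` keep their meaning from the question as semi-axis bounds):

```python
def find_axis_pairs(EdgePixel, minR, maxR):
    """Yield index pairs (i, j) whose separation could be a major axis.

    The bounds are squared once, outside the loops, so no sqrt is
    needed per candidate pair.
    """
    minDistSq = (2 * minR) ** 2
    maxDistSq = (2 * maxR) ** 2
    for i, (xi, yi) in enumerate(EdgePixel):
        for j, (xj, yj) in enumerate(EdgePixel):
            if j <= i:  # also halves the workload, see above
                continue
            distSq = (xi - xj) ** 2 + (yi - yj) ** 2
            if minDistSq < distSq < maxDistSq:
                yield i, j
```

Comparing `distSq` directly against the squared bounds is valid because squaring is monotonic for non-negative distances.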
# Step 5 - Calculate 4 parameters of the ellipse
xMiddle = (EdgePixel[i][0] + EdgePixel[j][0]) / 2
yMiddle = (EdgePixel[i][1] + EdgePixel[j][1]) / 2
a = tmp / 2
if(EdgePixel[j][0] != EdgePixel[i][0]): # To prevent division by 0
    orientation = math.atan((EdgePixel[j][1]-EdgePixel[i][1])/
                            (EdgePixel[j][0]-EdgePixel[i][0]))
else:
    orientation = 0
Firstly, your indentation seems broken here. Before pasting code into any StackExchange site you should ensure that it has no tabs or that your tabstop is 4 spaces, because the indentation will be forcibly converted to spaces using that tabstop.
This would be more readable with names for `EdgePixel[i][0]` etc. Could be `ax, ay, bx, by`, or `xp, yp, xq, yq`, or anything simple and consistent.
The correct way to avoid division by zero is to use `atan2` instead of `atan`.
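A minimal sketch of that substitution (the helper name and coordinate arguments are invented, not from the question's code):

```python
import math

def axis_orientation(xi, yi, xj, yj):
    """Orientation of the segment (xi, yi)-(xj, yj).

    atan2 handles a vertical axis (dx == 0) without any special case,
    returning pi/2 instead of raising ZeroDivisionError.
    """
    return math.atan2(yj - yi, xj - xi)
```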
# Step 6 - Lop through all pixels a third time
for k in range(len(EdgePixel)):
    if k in ignore_indices:
        continue
    if len(AccumulatorArray) > minAmoutOfEdges:
        continue
    if k != i and k != j:
Check the spelling.
What's the second skip condition about? I don't see that in the algorithm description, and it seems to reduce the ability to discriminate between multiple candidate ellipses.
Why the inconsistent use of `continue` for two conditions and then a massive nested `if` block for the third? It would seem more consistent to write
for k in range(len(EdgePixel)):
    if k in ignore_indices or k == i or k == j:
        continue
and then lose a level of indentation on the main content of the loop.
# Distance from x,y to the middlepoint
innerDistance = math.sqrt(abs(math.pow((xMiddle - EdgePixel[k][0]),2)) +
                          abs(math.pow((yMiddle - EdgePixel[k][1]),2)))
# Distance from x,y to x2,y2
tmp2 = math.sqrt(abs(math.pow((EdgePixel[i][0] - EdgePixel[j][0]),2)) +
                 abs(math.pow((EdgePixel[i][1] - EdgePixel[j][1]),2)))
# Distance from x,y and x0,y0 has to be smaller then the distance from x1,y1 to x0,y0
if(innerDistance < a and innerDistance > minR and innerDistance < maxR):
Previous comments about `abs` and `sqrt` apply here too.
# Step 8 - Add the parameters to the AccumulatorArray
Data = [xMiddle, yMiddle, a, b, orientation]
AccumulatorArray.append(Data)
What about `k`? If you store that, you greatly simplify the update to `ignore_indices`, because it's simply a case of adding the `k` of the selected parameters from the accumulator rather than repeating the calculation to determine which points are on the ellipse.
# Step 10 - Check if the algorithm has detected enough Edge-Points and then average the results
if len(AccumulatorArray) > minAmoutOfEdges:
    # Averageing
    for k in range(len(AccumulatorArray)):
        tmpData = AccumulatorArray[k]
        xAverage += tmpData[0]
        yAverage += tmpData[1]
        aAverage += tmpData[2]
        bAverage += tmpData[3]
        orientationAverage += tmpData[4]
    xAverage = int(xAverage / len(AccumulatorArray))
    yAverage = int(yAverage / len(AccumulatorArray))
    aAverage = int(aAverage / len(AccumulatorArray))
    bAverage = int(bAverage / len(AccumulatorArray))
    orientationAverage = int(orientationAverage / len(AccumulatorArray))
This is just wrong. The specification calls for you to find the most common minor axis (presumably with some margin of error, although that's not stunningly clear) and to take just the points which correspond to that minor axis as the points on the ellipse.
The details are a bit tricky. You're casting to `int` when producing the output parameters, so do you assume that all minor axes should be integers? If so, that makes the grouping relatively easy. Otherwise the easiest implementation would be to add a parameter `bMargin`, sort the values of `b`, and scan to find the value of `b` which has the most points in the range `[b, b + bMargin]`. Then, if that cluster has `minAmountOfEdges` points, you take the points in the cluster, average their parameters to get the values for `EllipsesFound`, and add their `k` values to `ignore_indices`.
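One possible sketch of that scan, assuming the accumulator stores `[b, k]` pairs as suggested above (`bMargin` and the function name are made up here, and the margin handling is one of several reasonable choices):

```python
def best_minor_axis_cluster(accumulator, bMargin):
    """Largest group of accumulator entries whose minor axes agree to within bMargin.

    accumulator: list of [b, k] pairs (minor semi-axis, edge-pixel index).
    Returns the entries in the winning cluster; the caller can then check
    len(cluster) against minAmountOfEdges, average the b values, and add
    the k values to ignore_indices.
    """
    entries = sorted(accumulator)  # sorts by b first
    best = []
    lo = 0
    for hi in range(len(entries)):
        # shrink window until all b values fit within [b_lo, b_lo + bMargin]
        while entries[hi][0] - entries[lo][0] > bMargin:
            lo += 1
        if hi - lo + 1 > len(best):
            best = entries[lo:hi + 1]
    return best
```

The sort plus sliding window makes the clustering O(n log n) per candidate axis, rather than comparing every pair of `b` values.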
Additional points
tmp2 = math.sqrt(abs(math.pow((EdgePixel[i][0] - EdgePixel[j][0]),2)) +
                 abs(math.pow((EdgePixel[i][1] - EdgePixel[j][1]),2)))
This is another bug. It should be using `k` rather than `i`.
Why accumulate values of `xMiddle`, `yMiddle`, `a` and `orientation` and then take their averages? The only element of `Data` which actually depends on `k` is `b`, so that's the only one worth accumulating and averaging.
By looking at how the results of `sqrt` were used and observing that they're nearly all squared again, it's possible to simplify a lot. This is steps 3 to 9 (admittedly with too few comments, and needing a better name than `foo`):
ignore_indices = set()
for i in range(len(EdgePixel)):
    if i in ignore_indices:
        continue
    px, py = EdgePixel[i][0], EdgePixel[i][1]
    for j in range(len(EdgePixel)):
        if j <= i or j in ignore_indices:
            continue
        qx, qy = EdgePixel[j][0], EdgePixel[j][1]
        majorAxisSq = math.pow(px - qx, 2) + math.pow(py - qy, 2)
        if (majorAxisSq < minAxisSq or majorAxisSq > maxAxisSq):
            continue
        AccumulatorArray = []
        cx, cy = (px + qx) / 2, (py + qy) / 2
        for k in range(len(EdgePixel)):
            if k == i or k == j or k in ignore_indices:
                continue
            rx, ry = EdgePixel[k][0], EdgePixel[k][1]
            crSq = math.pow(cx - rx, 2) + math.pow(cy - ry, 2)
            # Require |C-R| < majorAxis and point not so close to centre that minor axis (< |C-R|) is too small
            if crSq > majorAxisSq or crSq < minAxisSq:
                continue
            # Calculate length of minor axis. TODO Numerical analysis
            fSq = math.pow(rx - qx, 2) + math.pow(ry - qy, 2)
            foo = math.pow(majorAxisSq/4 + crSq - fSq, 2)
            bSquared = (majorAxisSq * crSq * (majorAxisSq * crSq - foo)) / (majorAxisSq - 4 * crSq * foo)
            minorSemiAxis = math.sqrt(bSquared)
            AccumulatorArray.append([minorSemiAxis, k])
Higher level suggestions
Looking at the filters, it has become clear what the biggest speed improvement would be. There's a maximum permitted major axis, and once you're looking for possible edge points there's a constraint that every point must be inside the circle which has `EdgePixel[i]` and `EdgePixel[j]` as its diameter. Rather than scanning every single edge pixel to see whether it meets those criteria, you should find or implement a geometric lookup data structure. It doesn't necessarily have to support circular or semicircular lookups: bounding boxes should be good enough to reduce the number of points tested considerably. As a first implementation, unless there's suitable library support, I'd try a simple quadtree.
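A quadtree is one option; to illustrate the general idea, even a uniform grid of buckets (a sketch with invented names, not a drop-in replacement for the question's loops) avoids scanning every edge pixel:

```python
from collections import defaultdict

def build_grid(points, cell):
    """Bucket point indices by integer grid cell of side `cell` (e.g. 2 * maxR)."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(x // cell, y // cell)].append(idx)
    return grid

def candidates_near(grid, cell, x, y):
    """Indices of points in the 3x3 block of cells around (x, y).

    Guaranteed to contain every point within `cell` of (x, y); exact
    distance checks on the result still filter out the rest.
    """
    cx, cy = x // cell, y // cell
    out = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            out.extend(grid.get((cx + dx, cy + dy), ()))
    return out
```

With `cell = 2 * maxR`, the inner `j` and `k` loops only need to examine `candidates_near(...)` of the current point instead of all 6000 edge pixels, which is where the bulk of the \$N^3\$ work goes.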
- Thanks, I have a few more questions: What do you mean by a geometric lookup data structure? What's wrong with the averaging? What do you mean by "inconsistent use of continue"? Could you maybe explain? And could you explain step 10 in the paper, "If the vote is greater than the required least number for assumed ellipse, one ellipse is detected", because I don't understand it. (Maybe you could add these points to your answer.) – Gykonik, Jan 31, 2017 at 17:45
- @Gykonik, have you seen my first edit (second revision), which expanded on the "inconsistent use of continue"? I've just made a third edit which adds some links to data structures and expands slightly on the other points you mention. – Peter Taylor, Jan 31, 2017 at 18:00
- Perfect, I'll look through this tomorrow! :) – Gykonik, Jan 31, 2017 at 18:16
- And what do you think about caching? :) – Gykonik, Jan 31, 2017 at 18:16
- And could you explain step 10 in the paper? "If the vote is greater than the required least number for assumed ellipse, one ellipse is detected". I don't understand the meaning of this... – Gykonik, Jan 31, 2017 at 19:23
Disclaimer: I don't do Python.
# Step 6 - Lop through all pixels a third time
for k in range(len(EdgePixel)):
    if k in ignore_indices:
        continue
    if len(AccumulatorArray) > minAmoutOfEdges:
        continue
    if k != i and k != j:
        # Distance from x,y to the middlepoint
        innerDistance = math.sqrt(abs(math.pow((xMiddle - EdgePixel[k][0]),2)) +
                                  abs(math.pow((yMiddle - EdgePixel[k][1]),2)))
        # Distance from x,y to x2,y2
        tmp2 = math.sqrt(abs(math.pow((EdgePixel[i][0] - EdgePixel[j][0]),2)) +
                         abs(math.pow((EdgePixel[i][1] - EdgePixel[j][1]),2)))
You are calculating `tmp2` although you might not need it. If the condition `if(innerDistance < a and innerDistance > minR and innerDistance < maxR):` turns out to be false, the calculation of `tmp2` was superfluous.
You are doing a lot of `math.pow()` calls, and all of them raise to the power of 2. I bet implementing a method `pow2(value)` which returns `value * value` will be much faster.
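A sketch of that helper; the speedup mostly comes from skipping `math.pow`'s float conversion and general-exponent machinery, so it's worth measuring with `timeit` on your own Python version rather than taking the claim on faith:

```python
def pow2(value):
    """Square by plain multiplication instead of math.pow(value, 2)."""
    return value * value
```

Unlike `math.pow`, this also keeps integer inputs as exact integers, which is harmless here since the results feed into float arithmetic anyway.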
- Actually, `tmp2` just seems to be `tmp`, so the calculation is unconditionally superfluous. – Peter Taylor, Jan 31, 2017 at 14:48
- Actually, correction, it's buggy. Should use `k` instead of `i`. – Peter Taylor, Jan 31, 2017 at 14:56
- When I think about it again I'm not quite sure that `math.pow()` wouldn't be faster than `pow2()`, because I think the first one is very optimized and maybe also caches some calculations... – Gykonik, Feb 1, 2017 at 12:05
- @Gykonik not quite sure if this is still valid, but you can check for yourself: stackoverflow.com/a/5246964/2655508 – Heslacher, Feb 1, 2017 at 12:13
- I have profiled `math.pow()` and `pow2()` over a range of `(0, 9999999)`, with the result that the first one needed ~5.5 s and the second one ~2.5 s. So this question is solved! :P – Gykonik, Feb 1, 2017 at 12:26
Too much indexing
Python is an interpreted language. Every time something is indexed, the Python interpreter will redo the index lookup; it cannot determine that a value is unchanged and skip the repeated indexing.
Consider just the following statements:
for i in range(len(EdgePixel)):
    ...
    for j in range(len(EdgePixel)):
        ...
        if ...:
            for k in range(len(EdgePixel)):
                if k != i and k != j:
                    innerDistance = math.sqrt(abs(math.pow((xMiddle - EdgePixel[k][0]),2)) +
                                              abs(math.pow((yMiddle - EdgePixel[k][1]),2)))
                    tmp2 = math.sqrt(abs(math.pow((EdgePixel[i][0] - EdgePixel[j][0]),2)) +
                                     abs(math.pow((EdgePixel[i][1] - EdgePixel[j][1]),2)))
Since you have a triple-nested loop, the inner loop executes approximately \$N^3\$ times. In just the `innerDistance` calculation, `EdgePixel[k]` is evaluated twice: the first time to fetch the `EdgePixel[k][0]` value, and then a second time to fetch the `EdgePixel[k][1]` value. This indexing is not a trivial operation. Each time, `EdgePixel` is first looked up in the local dictionary, then `k` is looked up in the local dictionary, and then `PyObject_GetItem(PyObject *o, PyObject *key)` is called with those two values to fetch the value of `EdgePixel[k]`!
Since the values are not changing inside the loops, look them up once as you are looping, in the `for` statement itself:
for i, (xi, yi) in enumerate(EdgePixel):
    ...
    for j, (xj, yj) in enumerate(EdgePixel):
        ...
        if ...:
            for k, (xk, yk) in enumerate(EdgePixel):
                if k != i and k != j:
                    innerDistance = math.sqrt(abs(math.pow((xMiddle - xk),2)) +
                                              abs(math.pow((yMiddle - yk),2)))
                    tmp2 = math.sqrt(abs(math.pow((xi - xj),2)) +
                                     abs(math.pow((yi - yj),2)))
Not only is the above faster, it is easier to read, as `[0]` and `[1]` have been replaced by the more semantically meaningful named variables `x` and `y`.
Move invariants out of loops

`tmp2` doesn't depend on `k` or any other value from the innermost loop. It can be computed outside the `k` loop.
for i, (xi, yi) in enumerate(EdgePixel):
    ...
    for j, (xj, yj) in enumerate(EdgePixel):
        ...
        if ...:
            tmp2 = math.sqrt(abs(math.pow((xi - xj),2)) +
                             abs(math.pow((yi - yj),2)))
            for k, (xk, yk) in enumerate(EdgePixel):
                if k != i and k != j:
                    innerDistance = math.sqrt(abs(math.pow((xMiddle - xk),2)) +
                                              abs(math.pow((yMiddle - yk),2)))
Unless, of course, the calculation is incorrect and is supposed to depend on `k`, as indicated by Peter Taylor. In that case, this movement is merely optimizing incorrect code. ;-)
- ... `python -m cProfile script_name.py` to see where most of the time is being spent. Maybe you could also add a sample image/way to generate one?
- ... three nested `for`s. That's 6000^3 = 216,000,000,000 operations, and you have no caching for all the divisions and power operations you're doing.