3
\$\begingroup\$

I have a list of rectangles as tuples (x, y, w, h). I need to find which ones intersect and if they do, average all the intersecting rectangles into a new rectangle. I also track the resultant averaged rectangle "weight" (the number of intersecting rectangles) as an addition member in the new tuple. (x, y, w, h, weight)

The final return is resultant list of averaged sorted by weight.

I am looking for feedback on how to improve performance and make it even more pythonic. Or, as I have discovered many times in python, there may a completely different and better way to approach the problem (such as with NumPy).

I pass a list of rect tuples to ordered_average_intersecting_rects:

import numpy as np
def rects_intersect(r1, r2):
 hoverlaps = (r1[0] <= r2[0] + r2[2]) and (r1[0] + r1[2] >= r2[0])
 voverlaps = (r1[1] <= r2[1] + r2[3]) and (r1[1] + r1[3] >= r2[1])
 return hoverlaps and voverlaps
def average_rects(rects):
 x = np.average([coord[0] for coord in rects])
 y = np.average([coord[1] for coord in rects])
 w = np.average([coord[2] for coord in rects])
 h = np.average([coord[3] for coord in rects])
 return x, y, w, h, len(rects)
def ordered_average_intersecting_rects(rects):
 intersecting_rect_clusters = []
 while rects: # modified from `while len(rect) > 0:` (janos feedback)
 r1 = rects[0]
 rect_cluster = [r1]
 for r2 in rects[1:]:
 if rects_intersect(r1, r2):
 rect_cluster.append(r2)
 rects.remove(r2)
 rects.remove(r1)
 intersecting_rect_clusters.append(rect_cluster)
 averaged_rects = [average_rects(cluster) for cluster in intersecting_rect_clusters]
 averaged_rects.sort(key=lambda w: w[4], reverse=True)
 return averaged_rects

Here is my unit test to help understand how the code is being used:

def test_ordered_average_intersecting_rects(self):
 # Arrange
 rects = [(150, 55, 40, 20, 0),
 (160, 40, 20, 20, 0),
 (100, 120, 20, 20, 0),
 (45, 10, 15, 30, 0),
 (50, 25, 25, 20, 0),
 (35, 35, 20, 30, 0)]
 # Act
 with test_helpers.Timer('average_intersecting_rects') as t:
 result = ordered_average_intersecting_rects(rects)
 # Assert
 assert len(result) == 3
 cluster1 = result[0]
 assert cluster1[4] == 3
 assert int(cluster1[0]) == 43
 assert int(cluster1[1]) == 23
 assert int(cluster1[2]) == 20
 assert int(cluster1[3]) == 26
 cluster2 = result[1]
 assert cluster2[4] == 2
 assert cluster2[0] == 155
 assert cluster2[1] == 47.5
 assert cluster2[2] == 30
 assert cluster2[3] == 20
 cluster3 = result[2]
 assert cluster3[4] == 1
 assert cluster3[0] == 100
 assert cluster3[1] == 120
 assert cluster3[2] == 20
 assert cluster3[3] == 20
asked Sep 6, 2014 at 20:51
\$\endgroup\$
3
  • \$\begingroup\$ What if the first rect intersects the second and the third, but the second and third rect do not intersect? Then all three rects are put into one "cluster", but the intersection of all three is empty. \$\endgroup\$ Commented Sep 7, 2014 at 8:20
  • \$\begingroup\$ Or 1 and 3 intersect, and 2 and 4 intersect, but 1 doesn't intersect 2 (nor does 3 intersect 4)? So, what do you want to do in the face of multiple sets (or resulting empty intersections)? \$\endgroup\$ Commented Sep 7, 2014 at 10:38
  • \$\begingroup\$ Good questions, I thought of that but in the case of my dataset the clusters are usually pretty tight and the size of the rects are almost always large enough to ensure overlap. So you are right, there might be a very few mis-formed clusters but for my application the accuracy is good enough so I decided to not add the incremental complexity to the algorithm to handle the few times I would run into the cases you mention. Unless you have an idea on how to add those cases more easily then my original attempts. \$\endgroup\$ Commented Sep 7, 2014 at 13:36

1 Answer 1

2
\$\begingroup\$

To improve performance, wait for another review with some clever numpy trick.

To make it more Pythonic, it would be better to represent a rectangle with a Rectangle class instead of a simple tuple, for example:

class Rectangle(object):
 def __init__(self, x, y, width, height):
 self.x = x
 self.y = y
 self.width = width
 self.height = height
 def intersect(self, other):
 hoverlaps = (self.x <= other.x + other.width) and (self.x + self.width >= other.x)
 voverlaps = (self.y <= other.y + other.height) and (self.y + self.height >= other.y)
 return hoverlaps and voverlaps

The intersect method fits naturally into it, and now the code is easier to read.

The averaging method becomes:

def average_rects(rects):
 x = np.average([rect.x for rect in rects])
 y = np.average([rect.y for rect in rects])
 w = np.average([rect.width for rect in rects])
 h = np.average([rect.height for rect in rects])
 return x, y, w, h, len(rects)

The main part of ordered_average_intersecting_rects becomes:

while rects:
 r1 = rects[0]
 rect_cluster = [r1]
 for r2 in rects[1:]:
 if r1.intersect(r2):
 rect_cluster.append(r2)
 rects.remove(r2)

And in your testing method you can create rects with:

rects = [Rectangle(150, 55, 40, 20),
 Rectangle(160, 40, 20, 20),
 Rectangle(100, 120, 20, 20),
 Rectangle(45, 10, 15, 30),
 Rectangle(50, 25, 25, 20),
 Rectangle(35, 35, 20, 30)]

And of course you can (and should) go further:

  • average_rects could return a tuple: Rectangle(x, y, w, h), len(rects)
  • The assertions could use cluster1.x, .y, .width, instead of cryptic [0], [1], ...

You can replace while len(rects) > 0: with simply while rects:, because in Python empty sequences are false. This is recommended on PEP8 too.

answered Sep 7, 2014 at 9:29
\$\endgroup\$
6
  • \$\begingroup\$ I stated with a Rect class but moved to tuples because performance is critical - the data sets can get large. It was my understanding that an object creation and object member lookup was slower than tuple creation and accessing indexed tuple elements. If this is not the case (or is not significant) then I would prefer to go back to the Class for readability. \$\endgroup\$ Commented Sep 7, 2014 at 13:40
  • \$\begingroup\$ If performance is critical, tuples should be faster. My suggestions concern mostly readability. I updated now to make this point clear. \$\endgroup\$ Commented Sep 7, 2014 at 13:52
  • \$\begingroup\$ Thanks, and you are right, "To improve performance,"... I am wait for another review with some clever numpy trick :) added numpy tag \$\endgroup\$ Commented Sep 7, 2014 at 13:55
  • \$\begingroup\$ Also update the while len(rects) > 0: every bit of code simplification counts. \$\endgroup\$ Commented Sep 7, 2014 at 13:59
  • 1
    \$\begingroup\$ You're not supposed to edit your original code, as that invalidates reviews \$\endgroup\$ Commented Sep 7, 2014 at 14:05

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.