Finding Bounding Boxes

Question 1

Given a set of overlapping boxes, and a set of points, I'd like to be able to determine which of the boxes contain each point. The number of points is much larger than the number of boxes, and I'd like to do it faster that brute force, so I'm trying to do it with a map-like structure.

My current attempt is below, it seems to pass my initial tests but I think there are some problems.

There's a few places where I decrement std::map iterators - is it well defined in this case? Should I not be using lower_bound?
I keep an empty map on the BoxTree purely so I can make GetBoxesContaining() return by const reference - is there a better solution?

#include <list>
#include <map>
#include <limits>
struct BoxSample {
 double m_lowerX, m_upperX, m_lowerY, m_upperY;
};
//Templated so can handle either the outer or inner map of BoxTree
//Checks that the map contains an element with the same key
template<typename T, typename Bound>
typename std::map<Bound, T>::iterator VerifyBound(typename std::map<Bound, T>& mapRef, Bound key)
{
 auto lb = mapRef.lower_bound(key);
 if(lb == mapRef.end() || (lb != mapRef.begin() && lb->first != key))
 {
 --lb;
 //Populates the new key with the contents from the immediately lowest keypair
 auto parib = mapRef.insert(std::make_pair(key, lb->second));
 return parib.first;
 } else {
 //Either entry already exists or new key <= minimum
 return lb;
 }
}
template<typename Box, typename Bound>
class BoxTree {
 std::map<Bound, std::map<Bound, std::list<Box>>> m_tree;
 std::list<Box> m_empty;
public:
 BoxTree(Bound min)
 :m_tree() {
 m_tree[min][min] = std::list<Box>();
 }
 void Insert(const Box& t)
 {
 auto xLB = VerifyBound(m_tree, t.m_lowerX);
 auto xUB = VerifyBound(m_tree, t.m_upperX);
 while(xLB != xUB) {
 auto yLB = VerifyBound(xLB->second, t.m_lowerY);
 auto yUB = VerifyBound(xLB->second, t.m_upperY);
 while(yLB != yUB)
 {
 yLB->second.push_front(t);
 ++yLB;
 }
 ++xLB;
 }
 }
 //Should return all boxes A such that A.lowerX <= x < A.lowerX && A.lowerY <= y < A.lowerY
 const std::list<Box>& GetBoxesContaining(const Bound& x, const Bound& y) {
 auto xSet = m_tree.lower_bound(x);
 if(xSet != m_tree.begin())
 {
 --xSet;
 auto ySet = xSet->second.lower_bound(y);
 if(ySet != xSet->second.begin())
 {
 --ySet;
 return ySet->second;
 }
 }
 return m_empty;
 }
};
//Sample usage
class BoxTreeSample {
 BoxTree<BoxSample, double> m_impl;
public:
 BoxTreeSample(): m_impl(-1.0 * std::numeric_limits<double>::infinity()) {
 BoxSample box1 = { 0.0, 10.0, 0.0, 10.0 };
 BoxSample box2 = { 0.0, 10.0, 0.0, 20.0 };
 BoxSample box3 = { 5.0, 10.0, 5.0, 15.0 };
 m_impl.Insert(box1);
 m_impl.Insert(box2);
 m_impl.Insert(box3);
 auto t1 = m_impl.GetBoxesContaining(2.5, 12.5);
 auto t2 = m_impl.GetBoxesContaining(7.5, 12.5);
 }
};
int main() {
 auto zz = BoxTreeSample();
 return 0;
}

Question 2

Should do. I just edited to confirm the headers and add main().

Question 3

Does not compile as is. Missing header for std::numeric_limits, and the auto& cases Barry mentions. (Are you using Visual Studio?)

Question 4

Yes, I am using Visual Studio. Setting /Wall does give warnings for the auto&s, which I've changed to auto. I've added the limits header, and it still compiles in VS2012. I can't opine about whether it compiles in gcc.

Question 5

Can I just point out that posted code must compile is not actually a requirement for Code Review. As long as the OP has confirmed that the code works as intended, that should be enough.

Question 6

It looks like some of the test code (inclusion of <limits> and definition of BoxSample) is mixed in with the library code. These should be extracted to the "sample usage".

The test program is strange - I don't see any need for the BoxTreeSample class, as it doesn't provide anything a plain function doesn't. I'd re-write that as a simple main() (with self-checking added):

int main()
{
 BoxTree<BoxSample, double> boxtree{-std::numeric_limits<double>::infinity()};
 BoxSample box1 = { 0.0, 10.0, 0.0, 10.0 };
 BoxSample box2 = { 0.0, 10.0, 0.0, 20.0 };
 BoxSample box3 = { 5.0, 10.0, 5.0, 15.0 };
 boxtree.Insert(box1);
 boxtree.Insert(box2);
 boxtree.Insert(box3);
 auto t1 = boxtree.GetBoxesContaining(2.5, 12.5);
 auto t2 = boxtree.GetBoxesContaining(7.5, 12.5);
 return (t1.size() != 1)
 + (t2.size() != 2);
}

It took me some time to discern the algorithm from the code. An introductory comment would be helpful:

/*
 We represent the set of rectangles by dividing the area into columns,
 and each column into boxes.
 Each x ordinate in our map represents the column immediately to its
 right; that column is in turn represented by a map of y ordinates to
 the box immediately following (a list of the containing rectangles).
 */

The VerifyBound() function doesn't seem to be generally useful; I'd be inclined to make it a private, static member of the template class rather than sharing with all code. The name is misleading, too - it does more than verify, because it can split an existing column or box into two, copying the associated list of rectangles.

It would be nice if we could make it self-starting, so that we don't need to provide min to BoxTree's constructor:

private:
 template<typename T>
 auto GetOrSplit(typename std::map<Bound, T>& mapRef, Bound key)
 {
 auto lb = mapRef.lower_bound(key);
 if (lb != mapRef.end() && lb->first == key) {
 // already a suitable entry to re-use
 return lb;
 }
 if (lb == mapRef.begin()) {
 // new item before any existing ones
 return mapRef.insert({key, {}}).first;
 }
 // split existing item into two
 --lb;
 return mapRef.insert(std::make_pair(key, lb->second)).first;
 }

Now, turning our attention to BoxTree, we see it's templated on Box, but has quite specific requirements about the members it expects from this type. It's worth writing some constraints:

template<typename Box, typename Bound>
 requires requires(Box b, Bound x) {
 x = b.m_lowerX;
 x = b.m_upperX;
 x = b.m_lowerY;
 x = b.m_upperY;
 }
class BoxTree

The choice of std::list in the representation could be harmful to performance, as it's expensive to copy (as we do in VerifyBound()). I'd venture that std::vector would be more performant (order is not significant, so we only need to append, and we never insert or delete elements).

We might also consider whether we want to store so many copies of our Boxes (unfortunately, our current implementation isn't able to use a std::reference_wrapper as our Box type; consider that for an enhancement).

GetBoxesContaining() looks reasonably straightforward. The only change I'd make is that we don't expect it to modify the BoxTree, so let's make it const. And it's the only place where m_empty is used, so we can reduce the scope of that.

Modified code

#include <map>
#include <vector>
template<typename Box, typename Bound>
 requires requires(Box b, Bound x) {
 x = b.m_lowerX;
 x = b.m_upperX;
 x = b.m_lowerY;
 x = b.m_upperY;
 }
class BoxTree
{
 /*
 We represent the set of rectangles by dividing the area into columns,
 and each column into boxes.
 Each x ordinate in our map represents the column immediately to its
 right; that column is in turn represented by a map of y ordinates to
 the box immediately following (a list of the containing rectangles).
 */
 std::map<Bound, std::map<Bound, std::vector<Box>>> m_tree{};
public:
 void Insert(const Box& t)
 {
 for (auto xLB = GetOrSplit(m_tree, t.m_lowerX), xUB = GetOrSplit(m_tree, t.m_upperX);
 xLB != xUB;
 ++xLB)
 {
 auto& column = xLB->second;
 for (auto yLB = GetOrSplit(column, t.m_lowerY), yUB = GetOrSplit(column, t.m_upperY);
 yLB != yUB;
 ++yLB)
 {
 yLB->second.push_back(t);
 }
 }
 }
 const std::vector<Box>& GetBoxesContaining(const Bound& x, const Bound& y) const noexcept
 {
 static auto const no_boxes = std::vector<Box>{};
 auto xSet = m_tree.lower_bound(x);
 if (xSet == m_tree.begin()) {
 return no_boxes;
 }
 auto const& column = (--xSet)->second;
 auto ySet = column.lower_bound(y);
 if (ySet == column.begin()) {
 return no_boxes;
 }
 return (--ySet)->second;
 }
private:
 template<typename T>
 auto GetOrSplit(typename std::map<Bound, T>& mapRef, Bound key)
 {
 auto lb = mapRef.lower_bound(key);
 if (lb != mapRef.end() && lb->first == key) {
 // already a suitable entry to re-use
 return lb;
 }
 if (lb == mapRef.begin()) {
 // new item before any existing ones
 return mapRef.insert({key, {}}).first;
 }
 // split existing item into two
 --lb;
 return mapRef.insert(std::make_pair(key, lb->second)).first;
 }
};

// Simple test program
#include <limits>
struct BoxSample {
 double m_lowerX, m_upperX, m_lowerY, m_upperY;
};
int main()
{
 BoxTree<BoxSample, double> boxtree;
 BoxSample box1 = { 0.0, 10.0, 0.0, 10.0 };
 BoxSample box2 = { 0.0, 10.0, 0.0, 20.0 };
 BoxSample box3 = { 5.0, 10.0, 5.0, 15.0 };
 boxtree.Insert(box1);
 boxtree.Insert(box2);
 boxtree.Insert(box3);
 auto count_boxes = [&boxtree](double x, double y) {
 return boxtree.GetBoxesContaining(x, y).size();
 };
 return (count_boxes(2.5, 12.5) != 1)
 + (count_boxes(7.5, 12.5) != 2)
 + (count_boxes(7.5, 17.5) != 1)
 + (count_boxes(2.5, -2.5) != 0)
 + (count_boxes(-2.5, 2.5) != 0);
}

The data structure used really only makes sense if the number of queries massively outnumbers the boxes added, as each box can cause copying of up to 4 result lists as its bounds are inserted.

For a situation with a different usage pattern, you would want a different data structure (perhaps something based on a quadtree?).

Toby Speight Toby Speight 87.7k14 gold badges104 silver badges325 bronze badges · Answer 1 · 2022-12-01 16:09:57Z

It looks like some of the test code (inclusion of <limits> and definition of BoxSample) is mixed in with the library code. These should be extracted to the "sample usage".

The test program is strange - I don't see any need for the BoxTreeSample class, as it doesn't provide anything a plain function doesn't. I'd re-write that as a simple main() (with self-checking added):

int main()
{
 BoxTree<BoxSample, double> boxtree{-std::numeric_limits<double>::infinity()};
 BoxSample box1 = { 0.0, 10.0, 0.0, 10.0 };
 BoxSample box2 = { 0.0, 10.0, 0.0, 20.0 };
 BoxSample box3 = { 5.0, 10.0, 5.0, 15.0 };
 boxtree.Insert(box1);
 boxtree.Insert(box2);
 boxtree.Insert(box3);
 auto t1 = boxtree.GetBoxesContaining(2.5, 12.5);
 auto t2 = boxtree.GetBoxesContaining(7.5, 12.5);
 return (t1.size() != 1)
 + (t2.size() != 2);
}

It took me some time to discern the algorithm from the code. An introductory comment would be helpful:

/*
 We represent the set of rectangles by dividing the area into columns,
 and each column into boxes.
 Each x ordinate in our map represents the column immediately to its
 right; that column is in turn represented by a map of y ordinates to
 the box immediately following (a list of the containing rectangles).
 */

The VerifyBound() function doesn't seem to be generally useful; I'd be inclined to make it a private, static member of the template class rather than sharing with all code. The name is misleading, too - it does more than verify, because it can split an existing column or box into two, copying the associated list of rectangles.

It would be nice if we could make it self-starting, so that we don't need to provide min to BoxTree's constructor:

private:
 template<typename T>
 auto GetOrSplit(typename std::map<Bound, T>& mapRef, Bound key)
 {
 auto lb = mapRef.lower_bound(key);
 if (lb != mapRef.end() && lb->first == key) {
 // already a suitable entry to re-use
 return lb;
 }
 if (lb == mapRef.begin()) {
 // new item before any existing ones
 return mapRef.insert({key, {}}).first;
 }
 // split existing item into two
 --lb;
 return mapRef.insert(std::make_pair(key, lb->second)).first;
 }

Now, turning our attention to BoxTree, we see it's templated on Box, but has quite specific requirements about the members it expects from this type. It's worth writing some constraints:

template<typename Box, typename Bound>
 requires requires(Box b, Bound x) {
 x = b.m_lowerX;
 x = b.m_upperX;
 x = b.m_lowerY;
 x = b.m_upperY;
 }
class BoxTree

The choice of std::list in the representation could be harmful to performance, as it's expensive to copy (as we do in VerifyBound()). I'd venture that std::vector would be more performant (order is not significant, so we only need to append, and we never insert or delete elements).

We might also consider whether we want to store so many copies of our Boxes (unfortunately, our current implementation isn't able to use a std::reference_wrapper as our Box type; consider that for an enhancement).

GetBoxesContaining() looks reasonably straightforward. The only change I'd make is that we don't expect it to modify the BoxTree, so let's make it const. And it's the only place where m_empty is used, so we can reduce the scope of that.

Modified code

#include <map>
#include <vector>
template<typename Box, typename Bound>
 requires requires(Box b, Bound x) {
 x = b.m_lowerX;
 x = b.m_upperX;
 x = b.m_lowerY;
 x = b.m_upperY;
 }
class BoxTree
{
 /*
 We represent the set of rectangles by dividing the area into columns,
 and each column into boxes.
 Each x ordinate in our map represents the column immediately to its
 right; that column is in turn represented by a map of y ordinates to
 the box immediately following (a list of the containing rectangles).
 */
 std::map<Bound, std::map<Bound, std::vector<Box>>> m_tree{};
public:
 void Insert(const Box& t)
 {
 for (auto xLB = GetOrSplit(m_tree, t.m_lowerX), xUB = GetOrSplit(m_tree, t.m_upperX);
 xLB != xUB;
 ++xLB)
 {
 auto& column = xLB->second;
 for (auto yLB = GetOrSplit(column, t.m_lowerY), yUB = GetOrSplit(column, t.m_upperY);
 yLB != yUB;
 ++yLB)
 {
 yLB->second.push_back(t);
 }
 }
 }
 const std::vector<Box>& GetBoxesContaining(const Bound& x, const Bound& y) const noexcept
 {
 static auto const no_boxes = std::vector<Box>{};
 auto xSet = m_tree.lower_bound(x);
 if (xSet == m_tree.begin()) {
 return no_boxes;
 }
 auto const& column = (--xSet)->second;
 auto ySet = column.lower_bound(y);
 if (ySet == column.begin()) {
 return no_boxes;
 }
 return (--ySet)->second;
 }
private:
 template<typename T>
 auto GetOrSplit(typename std::map<Bound, T>& mapRef, Bound key)
 {
 auto lb = mapRef.lower_bound(key);
 if (lb != mapRef.end() && lb->first == key) {
 // already a suitable entry to re-use
 return lb;
 }
 if (lb == mapRef.begin()) {
 // new item before any existing ones
 return mapRef.insert({key, {}}).first;
 }
 // split existing item into two
 --lb;
 return mapRef.insert(std::make_pair(key, lb->second)).first;
 }
};

// Simple test program
#include <limits>
struct BoxSample {
 double m_lowerX, m_upperX, m_lowerY, m_upperY;
};
int main()
{
 BoxTree<BoxSample, double> boxtree;
 BoxSample box1 = { 0.0, 10.0, 0.0, 10.0 };
 BoxSample box2 = { 0.0, 10.0, 0.0, 20.0 };
 BoxSample box3 = { 5.0, 10.0, 5.0, 15.0 };
 boxtree.Insert(box1);
 boxtree.Insert(box2);
 boxtree.Insert(box3);
 auto count_boxes = [&boxtree](double x, double y) {
 return boxtree.GetBoxesContaining(x, y).size();
 };
 return (count_boxes(2.5, 12.5) != 1)
 + (count_boxes(7.5, 12.5) != 2)
 + (count_boxes(7.5, 17.5) != 1)
 + (count_boxes(2.5, -2.5) != 0)
 + (count_boxes(-2.5, 2.5) != 0);
}

The data structure used really only makes sense if the number of queries massively outnumbers the boxes added, as each box can cause copying of up to 4 result lists as its bounds are inserted.

For a situation with a different usage pattern, you would want a different data structure (perhaps something based on a quadtree?).

Stack Exchange Network

Finding Bounding Boxes

1 Answer 1

Modified code

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Finding Bounding Boxes

1 Answer 1

Modified code

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions