18

I am looking for a nice way of flattening (splitting) a list of potentially-overlapping numeric ranges. The problem is very similar to that of this question: Fastest way to split overlapping date ranges, and many others.

However, the ranges are not only integers, and I am looking for a decent algorithm that can be easily implemented in Javascript or Python, etc.

Example Data: Example data

Example Solution: enter image description here

Apologies if this is a duplicate, but I am yet to find a solution.

asked May 29, 2014 at 2:25
4
  • How do you determine that green is on top of blue, but under yellow and orange? Are the color ranges applied in order? If that's the case, the algorithm seems obvious; just ...erm, apply the color ranges in order. Commented May 29, 2014 at 3:04
  • 1
    Yes, they are applied in order. But that's the problem—how would you 'apply' the ranges? Commented May 29, 2014 at 3:16
  • 1
    Do you often add/remove colors, or do you need to optimize for query speed? How many "ranges" will you usually have? 3? 3000? Commented May 29, 2014 at 20:52
  • Won't be adding/removing colours very frequently, and there'll be anywhere between 10-20 ranges, with 4+ digit precision. That's why the set method isn't quite suitable, because the sets will have to be 1000+ items long. The method I've gone with is the one I posted in Python. Commented May 30, 2014 at 0:22

5 Answers 5

10

Walk from left to right, using a stack to keep track of what color you're on. Instead of a discrete map, use the 10 numbers in your dataset as the break-points.

Starting with an empty stack, and setting start to 0, loop until we reach the end:

  • If stack is empty:
    • Look for the first color starting at or after start, and push it and all lower-ranked colors onto the stack. In your flattened list, mark the beginning of that color.
  • else (If not empty):
    • Find next beginning point for any higher-ranked color at or after start, and find the end of the current color
      • If the next color starts first, push it and anything else on the way to it onto the stack. Update the end of the current color as the beginning of this one, and add the start of this color to the flattened list.
      • If there are none and the current color ends first, set start to the end of this color, pop it off of the stack, and check the next-highest ranked color
        • If start is within the next color's range, add this color to the flattened list, starting at start.
        • If the stack empties, just continue the loop (go back to the first bullet point).

This is a mental run-through given your example data:

# Initial data.
flattened = []
stack = []
start = 0
# Stack is empty. Look for the next starting point at 0 or later: "b", 0 - Push it and all lower levels onto stack
flattened = [ (b, 0, ?) ]
stack = [ r, b ]
start = 0
# End of "b" is 5.4, next higher-colored start is "g" at 2 - Delimit and continue
flattened = [ (b, 0, 2), (g, 2, ?) ]
stack = [ r, b, g ]
start = 2
# End of "g" is 12, next higher-colored start is "y" at 3.5 - Delimit and continue
flattened = [ (b, 0, 2), (g, 2, 3.5), (y, 3.5, ?) ]
stack = [ r, b, g, y ]
start = 3.5
# End of "y" is 6.7, next higher-colored start is "o" at 6.7 - Delimit and continue
flattened = [ (b, 0, 2), (g, 2, 3.5), (y, 3.5, 6.7), (o, 6.7, ?) ]
stack = [ r, b, g, y, o ]
start = 6.7
# End of "o" is 10, and there is nothing starting at 12 or later in a higher color. Next off stack, "y", has already ended. Next off stack, "g", has not ended. Delimit and continue.
flattened = [ (b, 0, 2), (g, 2, 3.5), (y, 3.5, 6.7), (o, 6.7, 10), (g, 10, ?) ]
stack = [ r, b, g ]
start = 10
# End of "g" is 12, there is nothing starting at 12 or later in a higher color. Next off stack, "b", is out of range (already ended). Next off stack, "r", is out of range (not started). Mark end of current color:
flattened = [ (b, 0, 2), (g, 2, 3.5), (y, 3.5, 6.7), (o, 6.7, 10), (g, 10, 12) ]
stack = []
start = 12
# Stack is empty. Look for the next starting point at 12 or later: "r", 12.5 - Push onto stack
flattened = [ (b, 0, 2), (g, 2, 3.5), (y, 3.5, 6.7), (o, 6.7, 10), (g, 10, 12), (r, 12.5, ?) ]
stack = [ r ]
start = 12
# End of "r" is 13.8, and there is nothing starting at 12 or higher in a higher color. Mark end and pop off stack.
flattened = [ (b, 0, 2), (g, 2, 3.5), (y, 3.5, 6.7), (o, 6.7, 10), (g, 10, 12), (r, 12.5, 13.8) ]
stack = []
start = 13.8
# Stack is empty and nothing is past 13.8 - We're done.
answered May 29, 2014 at 4:34
4
  • what do you mean by "anything else on the way to it onto the stack" ? Commented Jan 15, 2019 at 12:17
  • 1
    @Guillaume07 Anything of ranks between the current and the chosen next start. The sample data doesn't show it, but imagine the yellow was shifted to start before green - you have to push both green and yellow onto the stack so that when yellow ends, the end of green is still at the right place in the stack so it still shows up in the final result Commented Jan 15, 2019 at 15:18
  • Another think I don't understand, please, is why you tell firstly "If stack is empty: Look for the first color starting at or before start," then in the code sample you comment "# Stack is empty. Look for the next starting point at 0 or later". So once it is before and once it is later Commented Jan 15, 2019 at 15:36
  • 1
    @Guillaume07 Yep, a typo, the correct version is in the code block twice (the second being the comment near the bottom that starts "Stack is empty."). I've edited that bullet point. Commented Jan 16, 2019 at 4:38
3

This solution seems the simplest. (Or at least, the easiest to grasp)

All that is needed is a function to subtract two ranges. In other words, something that will give this:

A ------ A ------ A ----
B ------- and B ------ and B ---------
= ---- = ---- = --- --

Which is simple enough. Then you can simply iterate through each of the ranges, starting from the lowest, and for each one, subtract from it all the ranges above it, in turn. And there you have it.


Here is an implementation of the range subtractor in Python:

def subtractRanges((As, Ae), (Bs, Be)):
 '''SUBTRACTS A FROM B'''
 # e.g, A = ------
 # B = -----------
 # result = -- ---
 # Returns list of new range(s)
 if As > Be or Bs > Ae: # All of B visible
 return [[Bs, Be]]
 result = []
 if As > Bs: # Beginning of B visible
 result.append([Bs, As])
 if Ae < Be: # End of B visible
 result.append([Ae, Be])
 return result

Using this function, the rest can be done like this: (A 'span' means a range, as 'range' is a Python keyword)

spans = [["red", [12.5, 13.8]],
["blue", [0.0, 5.4]],
["green", [2.0, 12.0]],
["yellow", [3.5, 6.7]],
["orange", [6.7, 10.0]]]
i = 0 # Start at lowest span
while i < len(spans):
 for superior in spans[i+1:]: # Iterate through all spans above
 result = subtractRanges(superior[1], spans[i][1])
 if not result: # If span is completely covered
 del spans[i] # Remove it from list
 i -= 1 # Compensate for list shifting
 break # Skip to next span
 else: # If there is at least one resulting span
 spans[i][1] = result[0]
 if len(result) > 1: # If there are two resulting spans
 # Insert another span with the same name
 spans.insert(i+1, [spans[i][0], result[1]])
 i += 1
print spans

This gives [['red', [12.5, 13.8]], ['blue', [0.0, 2.0]], ['green', [2.0, 3.5]], ['green', [10.0, 12.0]], ['yellow', [3.5, 6.7]], ['orange', [6.7, 10.0]]], which is correct.

answered May 29, 2014 at 5:41
2
  • Your output at the end doesn't match expected output in the question... Commented Dec 23, 2016 at 3:43
  • @Izkata Gosh, I was careless. That must have been the output from another test. Fixed now, thanks Commented Dec 23, 2016 at 4:12
2

If the data really is similar in scope to your sample data, you could create a map like this:

map = [0 .. 150]
for each color:
 for loc range start * 10 to range finish * 10:
 map[loc] = color

Then just walk through this map to generate the ranges

curcolor = none
for loc in map:
 if map[loc] != curcolor:
 if curcolor:
 rangeend = loc / 10
 make new range
 rangecolor = map[loc]
 rangestart = loc / 10

To work, the values have to be in a relatively small range as in your sample data.

Edit: to work with true floats, use the map to generate a high level mapping and then refer to the original data to create the boundaries.

map = [0 .. 15]
for each color:
 for loc round(range start) to round(range finish):
 map[loc] = color
curcolor = none
for loc in map
 if map[loc] != curcolor:
 make new range
 if loc = round(range[map[loc]].start) 
 rangestart = range[map[loc]].start
 else
 rangestart = previous rangeend
 rangecolor = map[loc]
 if curcolor:
 if map[loc] == none:
 last rangeend = range[map[loc]].end
 else
 last rangeend = rangestart
 curcolor = rangecolor
answered May 29, 2014 at 2:39
2
  • This is a very nice solution, I have come across it before. However, I am looking for an more generic solution that can manage any arbitrary float ranges... (this wouldn't be the best for something like 563.807 - 770.100) Commented May 29, 2014 at 2:52
  • 1
    I think you could generalize it by rounding the values, and generating the map, but marking a location on the edges as having two colors. Then when you see a location with two colors, go back to the original data to determine the boundary. Commented May 29, 2014 at 3:10
2

Here's a relatively straightforward solution in Scala. It shouldn't be too difficult to port to another language.

case class Range(name: String, left: Double, right: Double) {
 def overlapsLeft(other: Range) =
 other.left < left && left < other.right
 def overlapsRight(other: Range) =
 other.left < right && right < other.right
 def overlapsCompletely(other: Range) =
 left <= other.left && right >= other.right
 def splitLeft(other: Range) = 
 Range(other.name, other.left, left)
 def splitRight(other: Range) = 
 Range(other.name, right, other.right)
}
def apply(ranges: Set[Range], newRange: Range) = {
 val left = ranges.filter(newRange.overlapsLeft)
 val right = ranges.filter(newRange.overlapsRight)
 val overlaps = ranges.filter(newRange.overlapsCompletely)
 val leftSplit = left.map(newRange.splitLeft)
 val rightSplit = right.map(newRange.splitRight)
 ranges -- left -- right -- overlaps ++ leftSplit ++ rightSplit + newRange
}
val ranges = Vector(
 Range("red", 12.5, 13.8),
 Range("blue", 0.0, 5.4),
 Range("green", 2.0, 12.0),
 Range("yellow", 3.5, 6.7),
 Range("orange", 6.7, 10.0))
val flattened = ranges.foldLeft(Set.empty[Range])(apply)
val sorted = flattened.toSeq.sortBy(_.left)
sorted foreach println

apply takes in a Set of all the ranges that have already been applied, finds the overlaps, then returns a new set minus the overlaps and plus the new range and the newly split ranges. foldLeft repeatedly calls apply with each input range.

answered May 29, 2014 at 17:52
0

Just keep a set of ranges sorted by start. Add range that covers everything (-oo..+oo). To add a range r:

let pre = last range that starts before r starts
let post = earliest range that starts before r ends
now iterate from pre to post: split ranges that overlap, remove ranges that are covered, then add r
answered May 29, 2014 at 4:59

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.