Can you think of a reason the following extension should not be used in production, or a better way of implementing it:
public extension Range {
@warn_unused_result
public func intersect(other: Range) -> Range {
let ds = startIndex.distanceTo(other.startIndex)
let de = startIndex.distanceTo(other.endIndex)
let s = ds <= 0 ? startIndex : startIndex.advancedBy(ds, limit: endIndex)
let e = de <= 0 ? startIndex : startIndex.advancedBy(de, limit: endIndex)
return s..<e
}
}
The method should pass the following assertions:
(5...7).intersect(9...9) == (8..<8) // ....|||.---..
(5...7).intersect(1...3) == (5..<5) // ....---.|||..
(5...7).intersect(8...9) == (8..<8) // ....|||---...
(5...7).intersect(1...4) == (5..<5) // ....---|||...
(5...7).intersect(7...9) == (7..<8) // ....||+--....
(5...7).intersect(1...5) == (5..<6) // ....--+||....
(5...7).intersect(6...9) == (6..<8) // ....|++-.....
(5...7).intersect(1...6) == (5..<7) // ....-++|.....
(5...7).intersect(6...6) == (6..<7) // ....|+|......
(5...7).intersect(4...8) == (5..<8) // ....-+++-....
Note the first four assertions with an empty intersection: the method is currently returning startIndex..<startIndex
if the other
range is on the "left", or endIndex..<endIndex
if it is on the "right". Is this a reasonable behaviour?
2 Answers 2
There is a general problem: The
distanceTo(_: Self) -> Self.Distance
of the
ForwardIndexType
protocol requires that
start
andend
are part of the same sequence when conforming toRandomAccessSequenceType
.
end
is reachable fromself
by incrementation otherwise.
(and RandomAccessSequenceType
is probably a typo and should be
RandomAccessIndexType
).
Here is a simple example where this conditions are not met, and your code crashes:
let r1 = "fπ©π°oo".rangeOfString("π©π°")!
print(r1) // 1..<5
let r2 = "baπ¨π°r".rangeOfString("π¨π°")!
print(r2) // 2..<6
let r = r1.intersect(r2) // fatal error: cannot increment endIndex
Incrementing the start index of r1
never hits the start index
of r2
.
Here is another example:
struct MyIndex: ForwardIndexType {
let value: Int
func successor() -> MyIndex {
return MyIndex(value: self.value + 1)
}
}
func == (lhs: MyIndex, rhs: MyIndex) -> Bool {
return lhs.value == rhs.value
}
is an index which can only be incremented. Then
let r1 = Range(MyIndex(value: 4) ... MyIndex(value: 6))
let r2 = Range(MyIndex(value: 3) ... MyIndex(value: 5))
let r = r1.intersect(r2)
will loop forever, because incrementing MyIndex(value: 4)
never reaches MyIndex(value: 3)
.
So intersecting ranges makes only sense if both ranges refer
to the same sequence. I don't know if that can be ensured at
compile time with a suitable restriction, I assume that it is
not possible. It should at least be documented (similar to the
above mentioned requirements of ForwardIndexType
).
You could restrict the method to ranges of integer elements
public extension Range where Element : IntegerType { ... }
but of course that reduces its usability considerably.
The implementation itself looks good to me. The behavior in the case
of an empty intersection it a sensible choice. Is similar to that of the clamp()
method of ClosedInterval
:
ClosedInterval(3, 4).clamp(ClosedInterval(5, 6)) // 4 ... 4
ClosedInterval(3, 4).clamp(ClosedInterval(1, 2)) // 3 ... 3
which returns the left or right bound, depending on wether the other interval lies on the left or on the right.
If your ranges just represent intervals and do not refer to indices
of sequences, then ClosedInterval
(and the existing clamp()
method) might be an alternative.
-
\$\begingroup\$ Did you see this:
r1.contains(r2.startIndex) // false
, wherer1
andr2
are from your range of string example? Again, thanks for drawing my attention to this! \$\endgroup\$Milos– Milos2016εΉ΄04ζ19ζ₯ 18:37:22 +00:00Commented Apr 19, 2016 at 18:37
Martin's excellent insight into the nature of the problem got me thinking if we could do any better than extension Range where Element : IntegerType
. Here is one possibility:
public extension Range where Element : Comparable {
@warn_unused_result
public func intersect(other: Range) -> Range {
guard endIndex > other.startIndex else {
return endIndex..<endIndex
}
guard other.endIndex > startIndex else {
return startIndex..<startIndex
}
let s = other.startIndex > startIndex ? other.startIndex : startIndex
let e = other.endIndex < endIndex ? other.endIndex : endIndex
return s..<e
}
}
This at least passes Martin's range of string test.
Of course, when working with strings we must take the usual precautions since:
let s1 = "fπ©π°oo"
let s2 = "baπ¨π°r"
let r1 = s1.characters.indices // 0..<7
r1.count // 4
s1.utf16.count // 7
let i = r1.startIndex // 0
i.successor().successor() // 5 (i therefore holds reference to s1!!!)
This means that even though the intersect
method does what it's supposed to:
import Foundation
let flag1r = s1.rangeOfString("π©π°")! // 1..<5
let flag2r = s2.rangeOfString("π¨π°")! // 2..<6
let intersection = flag1r.intersect(flag2r) // 2..<5
The result may still be surprising if you then use it to:
s1[flag1r] // "π©π°"
s1[flag2r] // "οΏ½π°o"
s1[intersection] // "οΏ½π°"
This is because:
let u = "π©π°".unicodeScalars
.map{ "\\u{\(String(0γγ«.value, radix: 16))}" }
.joinWithSeparator("")
print(u) // \u{1f1e9}\u{1f1f0}
"\u{1f1e9}\u{1f1f0}" // π©π°
"\u{1f1e9}" // π©
"\u{1f1f0}" // π°
Personally, I believe that the String API should guaranty that the following is always true:
string.characters.indices.description == (0..<string.characters.count).description
... and NOT, as it is currently the case:
string.characters.indices.description == (0..<string.utf16.count).description
... simply because Character
in swift represents extended grapheme clusters, not utf code points.
The fundamental problem, though, which goes beyond String API, is that the collection index is not currently a scalar value, but also holds a reference to its collection! Swift 3.0 is set to completely rethink indexing throughout the language, so we are yet to see whether we will no longer have to deal with these complexities (unless we explicitly want to).
Note, however, that this particular issue does not (in itself) make the intersect
method on ranges wrong, though it does require the same care that rangeOfString
requires when you are using it to cast a view into a collection whose index type has a dodgy sense of successor()
.
-
\$\begingroup\$
Element : Comparable
is a good idea, I did not think of that. I think that it can be simplified (using less comparisons). If you want this code reviewed again then you can post it as a follow-up question instead of an answer (compare codereview.stackexchange.com/help/someone-answers for the various options). \$\endgroup\$Martin R– Martin R2016εΉ΄04ζ19ζ₯ 19:07:41 +00:00Commented Apr 19, 2016 at 19:07 -
\$\begingroup\$ Sorry for not responding sooner (I was in transit). Yeah, I typed that out in a hurry with half a brain. I suppose this one is just too easy to see and doesn't really deserve another code review. Would you mind if I update this answer with a more efficient code? \$\endgroup\$Milos– Milos2016εΉ΄04ζ19ζ₯ 20:04:10 +00:00Commented Apr 19, 2016 at 20:04
-
\$\begingroup\$ Why should I? It is your question and self-answers are welcome. Just make sure to meet the standards for an answer: describe what you changed and why. \$\endgroup\$Martin R– Martin R2016εΉ΄04ζ19ζ₯ 20:14:55 +00:00Commented Apr 19, 2016 at 20:14
-
\$\begingroup\$ The value of intersecting "unrelated" ranges is still questionable. In the string example,
r
has no meaning for either of the strings. \$\endgroup\$Martin R– Martin R2016εΉ΄04ζ20ζ₯ 05:31:06 +00:00Commented Apr 20, 2016 at 5:31 -
\$\begingroup\$ Certainly... I appended to the answer some thoughts on the topic. \$\endgroup\$Milos– Milos2016εΉ΄04ζ20ζ₯ 08:20:50 +00:00Commented Apr 20, 2016 at 8:20