If I have a string:
let str = "Hello world"
It seems quite reasonable to be able to extract a character:
let thirdChar = str[2]
However, that's not legal. Instead, I have to use this extremely verbose syntax:
let thirdChar = str[str.index(str.startIndex, offsetBy: 2)]
Similarly, why aren't str[0..<3] or str[0...2] legal? The intent is clear.
It's easy enough to create extensions to String that support those expressions:
extension String {
    // Allow integer subscripting like `string[n]`.
    // Note: O(n), because the index is found by walking from startIndex.
    subscript(index: Int) -> Character {
        return self[self.index(self.startIndex, offsetBy: index)]
    }

    // Allow half-open integer range subscripting like `string[0..<n]`.
    subscript(range: Range<Int>) -> Substring {
        let start = self.index(self.startIndex, offsetBy: range.lowerBound)
        let end = self.index(self.startIndex, offsetBy: range.upperBound)
        return self[start..<end]
    }

    // Allow closed integer range subscripting like `string[0...n]`.
    subscript(range: ClosedRange<Int>) -> Substring {
        let start = self.index(self.startIndex, offsetBy: range.lowerBound)
        let end = self.index(self.startIndex, offsetBy: range.upperBound)
        return self[start...end]
    }
}
Shouldn't that be part of the language? Seems like a no-brainer.
1 Answer
Swift strings are made of Characters, and a Character can be made of any number of Unicode code points. If you have a very long string and you want to access the one-millionth Character, you have to traverse every Character before it, because you cannot know in advance how many code points each one occupies.
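As a minimal sketch of the point above (the sample strings are made up for illustration), the following shows one Character built from two code points, and the index walk needed to reach the third Character:

```swift
// "e" followed by U+0301 COMBINING ACUTE ACCENT renders as a single "é".
let accented = "e\u{301}"
let characterCount = accented.count               // Characters (grapheme clusters)
let scalarCount = accented.unicodeScalars.count   // Unicode scalars (code points)
let utf8Count = accented.utf8.count               // UTF-8 code units

// Reaching the n-th Character means walking forward from startIndex;
// String is not a random-access collection, so this walk is O(n).
let text = "Hello world"
let third = text[text.index(text.startIndex, offsetBy: 2)]

print(characterCount, scalarCount, utf8Count, third)   // prints "1 2 3 l"
```

The three counts differ for the same string, which is why an integer offset cannot be turned into a storage position without scanning.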
For very good reasons, Apple doesn't want to make highly inefficient functionality part of the language. And why would you want to get the third character of "Hello, world"? In practice, that's not something you ever want. You might want "the characters before the comma", which you get by using character ranges.
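For instance, "the characters before the comma" could be obtained roughly like this. It is only a sketch; the variable names are illustrative and the fallback to the whole string when no comma is present is an assumption about the desired behavior:

```swift
let greeting = "Hello, world"

// firstIndex(of:) returns a String.Index, so no integer offset is needed.
// Assumption: when there is no comma, take the whole string.
let commaIndex = greeting.firstIndex(of: ",") ?? greeting.endIndex

// A partial range up to that index yields a Substring of the original.
let beforeComma = greeting[..<commaIndex]

print(beforeComma)   // prints "Hello"
```

The index found by the search is reused directly for slicing, so the string is scanned only once.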
- The code in my example is exactly that, a trivial example. I understand that graphemes in Unicode can be made up of 1, 2, 3 or more Unicode code points, and that for a million-character string, it would be inefficient to try to calculate the 987,654th grapheme. However, it is MUCH, MUCH more common to need to parse a couple of hundred characters, and get the 12th through the 20th, and the 21st through the 25th characters. – Duncan C, Dec 10, 2017 at 1:46
- For that very common case what I'm asking about would be extremely useful and save 100,000 separate developers from having to write and debug the same code. Documenting the time complexity and warning that a given interface is best avoided for very long strings seems like a fair compromise to me. – Duncan C, Dec 10, 2017 at 1:46
- How would the compiler know that you were going to apply it to very long strings? – user1118321, Dec 10, 2017 at 3:45
- The compiler might not. That would be a burden for the developer, just like you shouldn't write a bubble sort for a list of a million items. – Duncan C, Dec 10, 2017 at 13:54
- The subscript operator is documented to have a read complexity of O(1). The String type cannot provide that guarantee so it cannot have that operator. It doesn't matter how common inefficient code is. – Daniel T., Dec 11, 2017 at 2:10
str[3] would be misleading, because such an access cannot be made in constant time. And how does Swift define a character? Is it a Unicode code point? A visual character (grapheme) may be assembled from multiple code points. What happens if your slice includes combining characters but not the base character? Text is complicated. Not offering easy solutions that are likely to be wrong is good design.
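To illustrate the combining-character question, here is a small sketch (the sample word and variable names are chosen for illustration only) that compares the Character view with a slice taken at the code point level:

```swift
let word = "cafe\u{301}"   // "café" spelled with a combining acute accent

// Character view: the accent stays attached to its base letter.
let characterCount = word.count               // 4
let scalarCount = word.unicodeScalars.count   // 5

// Code point view: dropping the first four scalars leaves only the
// combining accent, separated from the "e" it was meant to modify.
var orphanScalars = String.UnicodeScalarView()
orphanScalars.append(contentsOf: word.unicodeScalars.dropFirst(4))
let orphan = String(orphanScalars)

print(characterCount, scalarCount, orphan.unicodeScalars.count)   // prints "4 5 1"
```

Slicing by Character, as String's own indices do, avoids producing such an orphaned accent.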