Why doesn't Swift allow Int String subscripting and integer ranges directly?

Question 1

If I have a string:

let str = "Hello world"

It seems quite reasonable to be able to extract a character:

let thirdChar = str[2]

However, that's not legal. Instead, I have to use the extremely obtuse syntax:

let thirdChar = str[str.index(str.startIndex, offsetBy: 2)]

Similarly, why aren't str[0..<3] and str[0...2] legal? The intent is clear.

It's easy enough to create extensions to String that support those expressions:

extension String {
    // Allow string[Int] subscripting
    subscript(index: Int) -> Character {
        return self[self.index(self.startIndex, offsetBy: index)]
    }

    // Allow half-open integer range subscripting like `string[0..<n]`
    subscript(range: Range<Int>) -> Substring {
        let start = self.index(self.startIndex, offsetBy: range.lowerBound)
        let end = self.index(self.startIndex, offsetBy: range.upperBound)
        return self[start..<end]
    }

    // Allow closed integer range subscripting like `string[0...n]`
    subscript(range: ClosedRange<Int>) -> Substring {
        let start = self.index(self.startIndex, offsetBy: range.lowerBound)
        let end = self.index(self.startIndex, offsetBy: range.upperBound)
        return self[start...end]
    }
}

Shouldn't that be part of the language? Seems like a no-brainer.
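
One caveat about the extension above: each subscript walks the string from the start, and an out-of-range offset traps at runtime. A minimal sketch of a bounds-checked variant follows. The method name `character(atOffset:)` is hypothetical and not part of the standard library; it relies on the standard `index(_:offsetBy:limitedBy:)`.

```swift
extension String {
    /// Hypothetical helper, not part of the standard library.
    /// Returns the Character at `offset`, or nil if the offset is out of range.
    /// Still O(n): it walks the string from startIndex.
    func character(atOffset offset: Int) -> Character? {
        guard offset >= 0,
              let i = index(startIndex, offsetBy: offset, limitedBy: endIndex),
              i < endIndex  // offset == count yields endIndex, which is not subscriptable
        else { return nil }
        return self[i]
    }
}

let str = "Hello world"
let third = str.character(atOffset: 2)     // "l"
let missing = str.character(atOffset: 50)  // nil
```

Returning an optional keeps the cost and the failure case visible at the call site, which a plain `str[2]` would hide.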

Question 2

Is asking "why" about language specs still considered "primarily opinion-based" if it can have official answers?

Question 3

What is the internal encoding for strings in Swift? If it's a variable-length encoding such as UTF-8, then a subscript operator like str[3] would be misleading, because such an access cannot be made in constant time. And how does Swift define a character? Is it a Unicode code point? A visual character (grapheme) may be assembled from multiple code points. What happens if your slice includes combining characters but not the base character? Text is complicated. Not offering easy solutions that are likely to be wrong is good design.
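
To make the code point versus grapheme distinction concrete, here is a small sketch. A Swift `Character` is an extended grapheme cluster, and `String` exposes separate views for Unicode scalars and code units. As far as I know, native Swift strings have been stored as UTF-8 since Swift 5, but the counts below do not depend on that. The sample string is my own.

```swift
// "é" written as "e" followed by U+0301 COMBINING ACUTE ACCENT.
let cafe = "cafe\u{301}"

let characterCount = cafe.count               // 4 Characters (grapheme clusters)
let scalarCount = cafe.unicodeScalars.count   // 5 Unicode scalars
let utf16Count = cafe.utf16.count             // 5 UTF-16 code units
let utf8Count = cafe.utf8.count               // 6 UTF-8 code units

// String comparison uses canonical equivalence, so the precomposed
// form U+00E9 compares equal to the decomposed form above.
let sameText = (cafe == "caf\u{E9}")          // true
```

Because the combining accent is part of the last `Character`, a slice along `Character` boundaries cannot separate it from its base letter.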

Question 4

I think amon is right. I've seen reports of iPhone strings containing null characters, characters made up of multiple code points, and so on.

Question 5

Do you have any intention of accepting an answer here, or was this just intended as a rhetorical question / rant?

Question 6

@Alexander, thanks for the kick in the pants. I lost track of this question when it was moved from SO to SE. I just accepted gnasher's answer.


gnasher729 · Accepted Answer · 2017-12-09 22:41:15Z

Swift strings are made of Characters, and a Character can be made of any number of Unicode code points. If you have a very long string and want to access the one-millionth Character, you have to traverse all the Characters before it, because there is no way to know in advance how many code points each one occupies.

For very good reasons, Apple doesn't want to make highly inefficient functionality part of the language. And why would you want to get the third character of "Hello, world"? In practice, that's not something you ever want. You might want "the characters before the comma", which you get by using character ranges.
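
As a sketch of the "characters before the comma" case: `firstIndex(of:)` returns a `String.Index`, which can be used directly in a range subscript. The function name `splitAtFirstComma` is made up for this illustration.

```swift
/// Hypothetical helper: splits `text` around its first comma.
/// Returns nil when the text contains no comma.
func splitAtFirstComma(_ text: String) -> (before: Substring, after: Substring)? {
    guard let comma = text.firstIndex(of: ",") else { return nil }
    let before = text[..<comma]                   // everything before the comma
    let after = text[text.index(after: comma)...] // everything after the comma
    return (before, after)
}

if let parts = splitAtFirstComma("Hello, world") {
    print(parts.before)  // Hello
    print(parts.after)   // " world", with the leading space
}
```

The index is found once by the search and then reused, so no integer offsets are needed.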


The code in my example is exactly that, a trivial example. I understand that graphemes in Unicode can be made up of 1, 2, 3 or more Unicode code points, and that for a million-character string, it would be inefficient to try to calculate the 987,654th grapheme. However, it is MUCH, MUCH more common to need to parse a couple of hundred characters, and get the 12th through the 20th, and the 21st through the 25th characters.

– Duncan C, Dec 10, 2017 at 1:46

For that very common case what I'm asking about would be extremely useful and save 100,000 separate developers from having to write and debug the same code. Documenting the time complexity and warning that a given interface is best avoided for very long strings seems like a fair compromise to me.

– Duncan C, Dec 10, 2017 at 1:46
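
For the fixed-position case described in this comment, the standard library's `dropFirst(_:)` and `prefix(_:)` already accept integer counts, so no extension is strictly needed. A minimal sketch, with a made-up sample record:

```swift
// Sample 26-character record; the letters make the positions easy to check.
let record = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

// 12th through 20th characters: skip 11, then take 9.
let field1 = record.dropFirst(11).prefix(9)   // "LMNOPQRST"

// 21st through 25th characters: skip 20, then take 5.
let field2 = record.dropFirst(20).prefix(5)   // "UVWXY"

// Both calls clamp instead of trapping when the string is too short.
let tooShort = "short".dropFirst(11).prefix(9) // empty Substring
```

Each call still walks the string from the start, which is acceptable for records of a few hundred characters.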

How would the compiler know that you were going to apply it to very long strings?

– user1118321, Dec 10, 2017 at 3:45

The compiler might not. That would be a burden for the developer, just like you shouldn't write a bubble sort for a list of a million items.

– Duncan C, Dec 10, 2017 at 13:54

The subscript operator is documented to have a read complexity of O(1). The String type cannot provide that guarantee so it cannot have that operator. It doesn't matter how common inefficient code is.

– Daniel T., Dec 11, 2017 at 2:10
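
If constant-time integer subscripting really is needed, one option is to pay the O(n) cost once by copying the Characters into an Array, whose Int subscript is O(1). This is a sketch of that trade-off, not a recommendation from the answer above; it uses extra memory proportional to the string length.

```swift
let text = "Hello world"

// One O(n) pass to split the string into Characters.
let chars = Array(text)

// Array subscripts take Int and run in constant time.
let thirdChar = chars[2]                  // "l"
let firstThree = String(chars[0..<3])     // "Hel"
let closedRange = String(chars[0...2])    // "Hel"
```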
