This is a popular question on LeetCode:
76. Minimum Window Substring
Given two strings s and t of lengths m and n respectively, return the minimum window substring of s such that every character in t (including duplicates) is included in the window. If there is no such substring, return the empty string "".
The testcases will be generated such that the answer is unique.
A substring is a contiguous sequence of characters within the string.
Example:
Input: s = "ADOBECODEBANC", t = "ABC"
Output: "BANC"
Explanation: The minimum window substring "BANC" includes 'A', 'B', and 'C' from string t.
I converted the java solution provided by LeetCode to Swift since this is the language I am practicing in. Here is my code below:
func minWindowSlidingWindow(_ s: String, _ t: String) -> String
{
if s == t
{
return s
}
var uniqueCharacterHashTable: [Character: Int] = [:]
for character in t
{
if let countOfChar = uniqueCharacterHashTable[character]
{
uniqueCharacterHashTable[character] = countOfChar + 1
continue
}
uniqueCharacterHashTable[character] = 1
}
let uniqueCharactersRequired = uniqueCharacterHashTable.keys.count
var uniqueCharactersFormed = 0
var currentWindowCharacterHashTable: [Character: Int] = [:]
var minSequenceSize = Int.max
var minimumSequenceStart = 0
var minimumSequenceEnd = 0
var currentStartIndexInt = 0
var currentEndIndexInt = 0
while currentEndIndexInt < s.count
{
let endIndex = s.index(s.startIndex, offsetBy: currentEndIndexInt)
var currentCharacter = s[endIndex]
if var characterCount = currentWindowCharacterHashTable[currentCharacter]
{
characterCount += 1
currentWindowCharacterHashTable[currentCharacter] = characterCount
}
else
{
currentWindowCharacterHashTable[currentCharacter] = 1
}
if let _ = uniqueCharacterHashTable[currentCharacter],
currentWindowCharacterHashTable[currentCharacter] == uniqueCharacterHashTable[currentCharacter]
{
uniqueCharactersFormed += 1
}
while currentStartIndexInt <= currentEndIndexInt && uniqueCharactersFormed == uniqueCharactersRequired
{
let startIndex = s.index(s.startIndex, offsetBy: currentStartIndexInt)
currentCharacter = s[startIndex]
if minSequenceSize == Int.max || currentEndIndexInt - currentStartIndexInt + 1 < minSequenceSize
{
minSequenceSize = currentEndIndexInt - currentStartIndexInt + 1
minimumSequenceStart = currentStartIndexInt
minimumSequenceEnd = currentEndIndexInt
}
if let characterCountInWindow = currentWindowCharacterHashTable[currentCharacter]
{
currentWindowCharacterHashTable[currentCharacter] = characterCountInWindow - 1
}
if let _ = uniqueCharacterHashTable[currentCharacter],
let currentCharOriginalCount = uniqueCharacterHashTable[currentCharacter],
let charInWindowCount = currentWindowCharacterHashTable[currentCharacter],
currentCharOriginalCount > charInWindowCount
{
uniqueCharactersFormed -= 1
}
currentStartIndexInt += 1
}
currentEndIndexInt += 1
}
if minSequenceSize == Int.max
{
return ""
}
let startIndex = s.index(s.startIndex, offsetBy: minimumSequenceStart)
let endIndex = s.index(s.startIndex, offsetBy: minimumSequenceEnd)
return String(s[startIndex ... endIndex])
}
This works for the basic test cases and gives the desired output (as far as I know from about 10 test cases) but as the string size gets huge like 100,000 for example - it gets super slow even though I use the same data structures (I think) as suggested in the Java solution.
Can anyone point me as to where the bottleneck in this code lies and how could I optimize this further.
-
\$\begingroup\$ Seems like the paste is not yet approved, I have updated with a sample using another tool .. hope this works at your end. \$\endgroup\$Shawn Frank– Shawn Frank2021年12月02日 11:41:24 +00:00Commented Dec 2, 2021 at 11:41
1 Answer 1
The main bottleneck is how characters of a string are accessed. A Swift String
is a collection of Character
s, and a Character
represents a single extended grapheme cluster, which can be one or more Unicode scalars. That makes counting and index operations like
s.count
s.index(s.startIndex, offsetBy: currentEndIndexInt)
s.index(s.startIndex, offsetBy: currentStartIndexInt)
slow. Instead of repeatedly converting integers to string indices it is better to work with string indices directly. For example,
var currentEndIndexInt = 0
while currentEndIndexInt < s.count {
let endIndex = s.index(s.startIndex, offsetBy: currentEndIndexInt)
let currentCharacter = s[endIndex]
// ...
currentEndIndexInt += 1
}
can be replaced by
var currentEndIndex = s.startIndex
while currentEndIndex != s.endIndex {
let currentCharacter = s[currentEndIndex]
// ...
s.formIndex(after: ¤tEndIndex)
}
or even
for currentEndIndex in s.indices {
var currentCharacter = s[currentEndIndex]
// ...
}
Other simplifications: Counting the number of occurrences of the characters in a string
var uniqueCharacterHashTable: [Character: Int] = [:]
for character in t {
if let countOfChar = uniqueCharacterHashTable[character] {
uniqueCharacterHashTable[character] = countOfChar + 1
continue
}
uniqueCharacterHashTable[character] = 1
}
can be done more concisely with reduce(into:_:)
and dictionary subscripts with a default value:
let uniqueCharacterHashTable = t.reduce(into: [:]) {
0ドル[1,ドル default: 0] += 1
}
In the same spirit can
if var characterCount = currentWindowCharacterHashTable[currentCharacter] {
characterCount += 1
currentWindowCharacterHashTable[currentCharacter] = characterCount
} else {
currentWindowCharacterHashTable[currentCharacter] = 1
}
be shortened to
currentWindowCharacterHashTable[currentCharacter, default: 0] += 1
The let _ = ...
test in
if let _ = uniqueCharacterHashTable[currentCharacter],
currentWindowCharacterHashTable[currentCharacter] == uniqueCharacterHashTable[currentCharacter] { ... }
is not necessary because nil
does always compare "not equal" to a non-nil value. The test for Int.max
in
if minSequenceSize == Int.max || endIndexInt - currentStartIndexInt + 1 < minSequenceSize { ... }
is not necessary. Finally the let _ = ...
test in
if let _ = uniqueCharacterHashTable[currentCharacter],
let currentCharOriginalCount = uniqueCharacterHashTable[currentCharacter],
let charInWindowCount = currentWindowCharacterHashTable[currentCharacter],
currentCharOriginalCount > charInWindowCount { ... }
is not needed because the next line already does the optional assignment.
Putting it all together, the code looks like this:
func minWindowSlidingWindow(_ s: String, _ t: String) -> String {
if s == t {
return s
}
let uniqueCharacterHashTable = t.reduce(into: [:]) {
0ドル[1,ドル default: 0] += 1
}
let uniqueCharactersRequired = uniqueCharacterHashTable.keys.count
var uniqueCharactersFormed = 0
var currentWindowCharacterHashTable: [Character: Int] = [:]
var minSequenceSize = Int.max
var minSequenceStart = s.startIndex
var minSequenceEnd = s.startIndex
var currentStartIndex = s.startIndex
var currentWindowLength = 1
for currentEndIndex in s.indices {
var currentCharacter = s[currentEndIndex]
currentWindowCharacterHashTable[currentCharacter, default: 0] += 1
if currentWindowCharacterHashTable[currentCharacter] == uniqueCharacterHashTable[currentCharacter] {
uniqueCharactersFormed += 1
}
while currentStartIndex <= currentEndIndex && uniqueCharactersFormed == uniqueCharactersRequired {
currentCharacter = s[currentStartIndex]
if currentWindowLength < minSequenceSize {
minSequenceSize = currentWindowLength
minSequenceStart = currentStartIndex
minSequenceEnd = currentEndIndex
}
if let characterCountInWindow = currentWindowCharacterHashTable[currentCharacter] {
currentWindowCharacterHashTable[currentCharacter] = characterCountInWindow - 1
}
if let currentCharOriginalCount = uniqueCharacterHashTable[currentCharacter],
let charInWindowCount = currentWindowCharacterHashTable[currentCharacter],
currentCharOriginalCount > charInWindowCount {
uniqueCharactersFormed -= 1
}
s.formIndex(after: ¤tStartIndex)
currentWindowLength -= 1
}
currentWindowLength += 1
}
if minSequenceSize == Int.max {
return ""
}
return String(s[minSequenceStart...minSequenceEnd])
}
Instead of dummy default values
var minSequenceStart = s.startIndex
var minSequenceEnd = s.startIndex
one can also use optionals, which are only assigned a value once a valid window is found.
Otherwise your code is written clearly. I would perhaps use slightly shorter variables names at some places, but that is a matter of taste.
-
1\$\begingroup\$ If anyone wonders why the answer was written only minutes after the question is posted: The question was originally posted on Stack Overflow, so I was prepared :) \$\endgroup\$Martin R– Martin R2021年12月02日 11:07:40 +00:00Commented Dec 2, 2021 at 11:07
-
\$\begingroup\$ Thanks a lot @Martin R .. the improvements are something I would not have though on my own especially the bit about using string indices directly rather than converting the int indices to string indices - got to learn something new today. Submitting with these changes is accepted on LeetCode. Thank you ! \$\endgroup\$Shawn Frank– Shawn Frank2021年12月02日 11:52:43 +00:00Commented Dec 2, 2021 at 11:52
-
\$\begingroup\$ You are welcome. Working with string indices directly works quite well here because the characters are accessed sequentially from start to end. Another option is to convert the string to an array of characters once
let sArray = Array(s)
because array indexing is fast and done with integer indices. That needs extra memory, but can be useful if you need random access to the characters of a string. \$\endgroup\$Martin R– Martin R2021年12月02日 12:13:45 +00:00Commented Dec 2, 2021 at 12:13 -
\$\begingroup\$ @ShawnFrank: I just realized that I had answered a similar question before. Have a look at codereview.stackexchange.com/q/233978/35991 for some tips to make the code even Swiftier :) \$\endgroup\$Martin R– Martin R2021年12月02日 12:25:55 +00:00Commented Dec 2, 2021 at 12:25
-
\$\begingroup\$ Your other answer was really helpful as well. I am practicing for interviews so I keep my code a little less swifty sometimes as I tend to get confused sometimes. Finally, your suggestion for Array(s) is great as it was something I was unaware of. Really useful and thank you for taking your time to help me today .. much appreciated Martin. \$\endgroup\$Shawn Frank– Shawn Frank2021年12月02日 15:14:07 +00:00Commented Dec 2, 2021 at 15:14
Explore related questions
See similar questions with these tags.