Length of Longest Substring Without Repeating Characters
Problem (from Leetcode)
Given a string, find the length of the longest substring without repeating characters.
Examples
Given abcabcbb
, the answer is abc
, which the length is 3
.
bbbbb
, the answer is b
, with the length of 1
.
Given pwwkew
, the answer is wke
, with the length of 3
. Note that the answer must be a substring, pwke
is a subsequence and not a substring.
Discussion
Approach
- Keep track of the substring characters in a
Set
. - Keep track of the evaluated ("seen") characters in a
Queue
. - Iterate through the characters in the input
String
.- If the substring characters include the current character
- Iterate through the head of the seen characters
Queue
until you find a match in theQueue
. - For each non-matching character from the
Queue
, remove it from theQueue
and remove it from theSet
. - Once you find a matching character, remove it from the head of the
Queue
and add it to the tail of theQueue
.
- Iterate through the head of the seen characters
- Else
- Add the current character to the
Set
- Add the current character to the tail of the
Queue
.
- Add the current character to the
- If the
size
of theSet
is greater than the current longest substring length, replace the current longest substring length with theSet
size
.
- If the substring characters include the current character
General Questions
- Naming sucks - open to a better name - there's gotta be something better than
LongestSubstringWithoutRepeatingCharactersLengthIdentifier
...right? - Is there a cleaner implementation that uses other data structures?
Implementation
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.LinkedBlockingQueue;
public class LongestSubstringWithoutRepeatingCharactersLengthIdentifier {
public static int identify(String s) {
int longestSubstringLength = 0;
Set<Character> substringCharacters = new HashSet<>();
Queue<Character> seenCharacters = new LinkedBlockingQueue<>();
for (char currentCharacter : s.toCharArray()) {
if (substringCharacters.contains(currentCharacter)) {
while (seenCharacters.peek() != currentCharacter) {
substringCharacters.remove(seenCharacters.poll());
}
seenCharacters.poll();
} else {
substringCharacters.add(currentCharacter);
}
seenCharacters.add(currentCharacter);
if (substringCharacters.size() > longestSubstringLength) {
longestSubstringLength = substringCharacters.size();
}
}
return longestSubstringLength;
}
}
3 Answers 3
Advice 1
Queue<Character> seenCharacters = new LinkedBlockingQueue<>();
java.util.concurrent.LinkedBlockingQueue
is a concurrent data structure; I suggest you use java.util.ArrayDeque
. Also, I think the better name would be characterWindow
.
Opinion 1
Set<Character> substringCharacters = new HashSet<>();
I would rename the above to substringCharacterFilter
.
Opinion 2
if (substringCharacters.size() > longestSubstringLength) {
longestSubstringLength = substringCharacters.size();
}
I think the above would be more readable like this:
if (longestSubstringLength < substringCharacterFilter.size()) {
longestSubstringLength = substringCharacterFilter.size();
}
Alternative implementation
public static int identify(String s) {
int longestSubstringLength = 0;
Set<Character> substringCharacterFilter = new HashSet<>();
Deque<Character> characterWindow = new ArrayDeque<>();
for (char currentCharacter : s.toCharArray()) {
if (substringCharacterFilter.contains(currentCharacter)) {
while (characterWindow.getFirst() != currentCharacter) {
substringCharacterFilter.remove(characterWindow.removeFirst());
}
characterWindow.removeFirst();
} else {
substringCharacterFilter.add(currentCharacter);
}
characterWindow.addLast(currentCharacter);
if (longestSubstringLength < substringCharacterFilter.size()) {
longestSubstringLength = substringCharacterFilter.size();
}
}
return longestSubstringLength;
}
LinkedBlockingQueue
is not the best choice. It's based on a linked list, which lacks spacial locality. An continuous array-backed queue would be more efficient (there're a few options in the standard library).if (substringCharacters.size() > longestSubstringLength) { longestSubstringLength = substringCharacters.size(); }
I'd use the
Math.max
method here.You can avoid using the queue altogether. You can keep the leftmost index of the current "window" and the set instead.
Queue<Character> seenCharacters = new LinkedBlockingQueue<>();
Why the concurrent version? If you really want this, just use the regular version.
Queue<Character> seenCharacters = new LinkedList<>();
It's not like the method is reentrant. There's no state for threading to confuse.
I'm ignoring the question of whether LinkedList
or ArrayDeque
is better here. Because ...
The truth is that you don't need it. You're just duplicating the string. Instead, iterate over the string by index.
for (int i = 0; i < s.length(); i++) {
char currentCharacter = s.charAt(i);
Also change the Set
to a Map
(or even an array).
Set<Character> substringCharacters = new HashSet<>();
becomes
Map<Character, Integer> lastSeen = new HashMap<>();
and add
int lastDuplicate = 0;
Leading to
if (s.isEmpty()) {
return 0;
}
int longestSubstringLength = 1;
int lastDuplicateIndex = -1;
Map<Character, Integer> lastSeen = new HashMap<>();
lastSeen.put(s.charAt(0), 0);
for (int i = 1; i < s.length(); i++) {
char current = s.charAt(i);
Integer lastIndex = lastSeen.get(current);
if (lastIndex != null && lastDuplicateIndex < lastIndex) {
lastDuplicateIndex = lastIndex;
}
lastSeen.put(current, i);
if (longestSubstringLength < i - lastDuplicateIndex) {
longestSubstringLength = i - lastDuplicateIndex;
}
}
I used a less Hungarian scheme for naming. I.e. no types in the names.
This uses the map to track where the current character was seen last. If that's after the last duplicate, point the last duplicate index to the place where the current character was seen last. That's the new last duplicate.
I tested this on pwwkew
(3), abcabcab
(3), and bbbbb
(1).
-
\$\begingroup\$ hmmm...I'm not sure this would work for
aaa
(which should return1
but I think this method would return3
due to thelongestSubstringLength = i - lastDuplicateIndex
for the firsta
). \$\endgroup\$Jae Bradley– Jae Bradley2017年10月06日 21:17:04 +00:00Commented Oct 6, 2017 at 21:17 -
\$\begingroup\$ It works for
bbbbb
(returns 1). Note that it updateslastDuplicateIndex
for eacha
. I did make a copying error that I've fixed now. \$\endgroup\$mdfst13– mdfst132017年10月07日 01:02:50 +00:00Commented Oct 7, 2017 at 1:02
max...
instead oflongest...
? \$\endgroup\$