This is Leetcode problem 139: "Wordbreak"
Problem: Given a non-empty string
s
and a dictionarywordDict
containing a list of non-empty words, determine ifs
can be segmented into a space-separated sequence of one or more dictionary words.
Note:
The same word in the dictionary may be reused multiple times in the segmentation. You may assume the dictionary does not contain duplicate words.
Example:
- Input: s = "applepenapple", wordDict = ["apple", "pen"]
- Output: true
Explanation: Return true because "applepenapple" can be segmented as "apple pen apple". Note that you are allowed to reuse a dictionary word. my solution
class Solution:
def wordBreak(self, s: str, wordDict: List[str]) -> bool:
"""
"""
def prune(i):
if i == len(s):
return True
elif i >len(s):
return False
else:
for word in wordDict:
m = len(word)
if s[i:i+m] == word and prune(i+m):
return True
return False
wordDict.sort(key = lambda x:-len(x))
return prune(0)
I believe the time complexity for my solution to be O(m^n), where m is the amount of words in wordDict
and n is the length of the string s
. Space wise is just the recursion stack so O(n). I'm having a hard time figuring out how to optimize my solution by either using bfs/dfs or dynamic programming. What would I memoize?
-
2\$\begingroup\$ Include the problem description in your post. Don’t make reviewers search for what "Leetcode 139" is. \$\endgroup\$AJNeufeld– AJNeufeld2019年09月16日 21:53:06 +00:00Commented Sep 16, 2019 at 21:53
-
\$\begingroup\$ As AJNeufeld indicated, There are always people lazier than you are (no offense meant) and I'm one of these people who will not make that effort to try to figure out what this thing above is about. I'm not an expert and I might not be writing some review on this code (and I might) and certainly I won't given the absence of the description. And just a side note: the use of classes is not meant for a personal style (no offense once more) but if there is no need for using a class, use a regular function and that would be enough in this case. \$\endgroup\$user203258– user2032582019年09月16日 23:11:34 +00:00Commented Sep 16, 2019 at 23:11
-
\$\begingroup\$ @AJNeufeld hey, thanks for pointing that out! Sorry I am still new to this community i did not know. \$\endgroup\$neet– neet2019年09月16日 23:45:56 +00:00Commented Sep 16, 2019 at 23:45
-
\$\begingroup\$ @EmadBoctor on LeetCode the template they have you use is a class by default, this is not a choice by OP \$\endgroup\$C.Nivs– C.Nivs2019年09月18日 13:32:58 +00:00Commented Sep 18, 2019 at 13:32
-
\$\begingroup\$ @ C.Nivs yeah, recently this template started to become more common than usual I was wondering why. Thanks for letting me know anyway. \$\endgroup\$user203258– user2032582019年09月18日 13:57:26 +00:00Commented Sep 18, 2019 at 13:57
2 Answers 2
"""DocStrings"""
+1 for adding a doc string to the wordBreak()
method, but -1 for not having any content in it.
PEP-008
Your code diverges from the PEP-008 guidelines in several areas:
- Use
snake_case
for functions and variables, notmixedCase
. - Use a single space around operators (
elif i > len(s):
), not 2 spaces, and then no spaces. - Add a blank line between the inner
prune()
function and the outer function, as well as between theclass Solution
and its first method.
Use pylint
or pyflakes
or similar to ensure your code follows proper conventions.
Variable Names
Variable names like s
are very terse, but since it is given that name in the question, is forgivable.
Variable names like wordDict
are misleading, and should be avoided! wordDict
is actually a list
, not a dict
. Although that name was given in the question, you should correct it to word_list
, or simply words
to avoid confusion. (+1 for the type hint; it helps avoid some confusion. But rename the parameter anyway.)
i
is acceptable as a loop index, but there is no obvious loop. idx
or pos
may improve clarity.
m
is unacceptable. It in no way suggests it is the length of a word. Use a more descriptive variable name.
Dead Code
elif i > len(s):
return False
At what point will this branch every be taken? The only way is if prune(i+m)
is called when i+m
is greater than len(s)
. But if m
is the length of word
, and s[i:i+m] == word
is true, it is impossible for i+m
to exceed len(s)
. This branch is dead code, and can be removed.
Algorithmic Improvements
You are sorting your "dictionary" by length, so that you check the longest words first. However, this doesn't actually guarantee any speed improvements. At any step, if the next possible word is the shortest one, you've guaranteed it will be the last one checked. Ditch the sort.
What you want to do is reduce the number of words from the "dictionary" you are testing at each step. One obvious way would be to turn the "dictionary" into an actual dict
, say, storing the list of words which start with a given letter:
word_dict = {
"a": ["apple"],
"p": ["pen"],
}
Or more programmatically:
word_dict = defaultdict(list, ((word[:1], word) for word in wordDict))
Then at each step, instead of comparing s[i:i+len(word)]
to each and every word
in wordDict
, you would compare to the list of words in word_dict[s[i:i+1]]
, which should be significantly smaller.
This method will degenerate if all the dictionary words start with the same letter (["aaaa", "aaab", "aaac", "aaad"]
), but you could handle this by storing the dictionary as a tree structure instead.
Another alternative would be to separate wordDict
into sets of words of different lengths:
words_by_length = { 3: { "pen" }, 5: { "apple" } }
Then, at each stage, you could do a single set
containment test for each word length:
for word_length, word_list in words_by_length.items():
if s[i:i+word_length] in words_list:
# ...
This task can be accomplished using a regular expression.
As an example, for the input wordDict = ["apple", "pen"]
, one can build a regular expression
r"(apple|pen)+"
, and then match it against the entire input string "applepenapple"
(use re.fullmatch
, not re.match
). The underlying algorithm for regular expression matching is based on a finite automata and is very efficient.
-
\$\begingroup\$ You have presented an alternate solution, but have not provided any review of the OP's code. From the Help Center/Answering "Every answer must make at least one insightful observation about the code in the question. Answers that merely provide an alternate solution with no explanation or justification do not constitute valid Code Review answers and may be deleted." \$\endgroup\$AJNeufeld– AJNeufeld2019年09月17日 05:51:21 +00:00Commented Sep 17, 2019 at 5:51
-
\$\begingroup\$ The OP explicitly asked for an improved solution but not a full review. So I did not attempt to provide it. \$\endgroup\$GZ0– GZ02019年09月17日 05:53:00 +00:00Commented Sep 17, 2019 at 5:53
Explore related questions
See similar questions with these tags.