I'm exceeding the time limit for a 10,000 word test case provided on LeetCode:
Given an array of strings, group anagrams together.
For example, given:
["eat", "tea", "tan", "ate", "nat", "bat"]
,
Return:[ ["ate", "eat","tea"], ["nat","tan"], ["bat"] ]
Note: All inputs will be in lower-case.
Any suggestions on how I could speed this up a little bit?
public HashMap<String, List<Integer>> createMap(String[] strs)
{
HashMap<String, List<Integer>> map = new HashMap<String, List<Integer>>();
for(String s: strs)
{
char[] charArray = s.toCharArray();
Arrays.sort(charArray);
String sortedString = new String(charArray);
if(map.containsKey(sortedString))
{
List<Integer> pos = map.get(sortedString);
pos.add(Arrays.asList(strs).indexOf(s));
map.put(sortedString, pos);
}
else
{
List<Integer> pos = new ArrayList<Integer>();
pos.add(Arrays.asList(strs).indexOf(s));
map.put(sortedString, pos);
}
}
return map;
}
public List<List<String>> groupAnagrams(String[] strs) {
HashMap hm = createMap(strs);
List<List<String>> anagrams = new ArrayList<List<String>>();
for(Object k: hm.keySet())
{
String key = k.toString();
ArrayList pos = (ArrayList)hm.get(key);
List<String> a = new ArrayList<String>();
for(Object p : pos)
{
int idx = Integer.parseInt(p.toString());
a.add(strs[idx]);
}
anagrams.add(a);
}
return anagrams;
}
-
4\$\begingroup\$ Please include the description of the programming-challenge into the question so the reviewers can see what you're trying to do and why. \$\endgroup\$Mast– Mast ♦2017年02月04日 08:24:45 +00:00Commented Feb 4, 2017 at 8:24
3 Answers 3
Why so slow
It's important to consider the time complexity of all the operations in your program.
Take a closer looks at this part I extracted from your code:
for(String s: strs) { // ... pos.add(Arrays.asList(strs).indexOf(s));
For each s
, converting a String[]
to a List<String>
, in order to use the indexOf
method to find the index of s
? That doesn't sound so good. How about making that a counting loop?
for (int i = 0; i < strs.length; i++) {
// ...
pos.add(i);
But why use indexes at all? Why not make this method return a Map<String, List<String>>
instead?
for (String s : strs) {
// ...
pos.add(s);
If you do that, then the implementation of the other method becomes simply:
public List<List<String>> groupAnagrams(String[] strs) {
return new ArrayList<>(createMap(strs).values());
}
Good practices
The code violates many good practices and common conventions:
- Never use raw types, for example use
HashMap<String, List<String>>
instead of justHashMap
without type parameters - Prefer to declare variables with interface types, for example
Map<String, List<String>>
instead ofHashMap<String, List<String>>
- Use the diamond operator
<>
when possible, for examplenew HashMap<>
instead ofnew HashMap<String, List<String>>
- Use consistent indentation, and please place braces they way I did in my examples above
Jianmin commented on my answer to a similar question here: Grouping anagrams and it got me thinking about the problem, and also reading this solution.
Janos has some good comments, but I wanted to point out three additional things:
computeIfAbsent
Java Map instances now have the computeIfAbsent
function. Code like:
if(map.containsKey(sortedString)) { List<Integer> pos = map.get(sortedString); pos.add(Arrays.asList(strs).indexOf(s)); map.put(sortedString, pos); } else { List<Integer> pos = new ArrayList<Integer>(); pos.add(Arrays.asList(strs).indexOf(s)); map.put(sortedString, pos); }
should instead be:
List<Integer> pos = map.computeIfAbsent(sortedString, k -> new ArrayList<>());
pos.add(Arrays.asList(strs).indexOf(s));
Of course, you should be using String values and not Integers, as Janos indicated, but the idea is to only have one call in to the Map. See the docs here: computeIfAbsent
function extraction
Your code to compute the "key" from the word, should be extracted in to a separate function. There should be a function:
public static final String keyOf(String word) {
char[] chars = word.toCharArray();
Arrays.sort(chars);
return new String(chars);
}
It clearly is separate, isolated logic, and should be maintained as such.
Streams
Streams have been around for a while, and you should become familiar with them.
I tool the liberty of "solving" the problem using streams, and the code is relatively neat, and concise:
public static final List<List<String>> groupAnagrams(Stream<String> stream) {
Map<String, List<String>> mapped = stream.collect(Collectors.groupingBy(w -> keyOf(w)));
return new ArrayList<>(mapped.values());
}
I have put that in to an ideone page here: https://ideone.com/qLiA8c (that ideone code also has code to format it "pretty" ... like the examples above).
-
\$\begingroup\$ These are all very good and helpful points. I'm not so familiar with Streams and Generics functions, but I will incorporate these changes. Thank you very much! \$\endgroup\$boltthrower– boltthrower2017年02月05日 22:21:57 +00:00Commented Feb 5, 2017 at 22:21
Make Java code more readable, ready to review
I strongly agree with code review conducted by Janos, code review is not just to share you a workable solution. You also have to learn ways to make code more readable, practice better way using Java language in your case.
Use O(N) solution to generate key, avoid \$O(NLogN)\$ Sorting algorithm
The idea to avoid timeout issue is to generate a key for each anagram group using \$O(N)\$ time complexity, instead of naive one by sorting the string. To take advantage of alphabetic number only has constant of size \26ドル\,ドル go through the string once, one char a time, to record the number of occurrence, like a counting sort. Here is the C# code, pass all test cases on leetcode online judge.
Here is the anagram hashed key algorithm in C#, most of important decision is to choose a more efficient sort - counting sort instead of comparison based sorting:
/* O(n) solution
* abc
* hashed to key:
* 01-11-11
* 0 stands for a, 1 is count of a
* further simplify the key:
* 1-1-1
* first 1 is count of a,
* second 1 is count of b,
* third 1 is count of c
*
*/
public static string ConvertHashKeys(string input)
{
if (input == null || input.Length == 0)
{
return string.Empty;
}
int[] countAlphabetic = new int[26];
foreach (char c in input)
{
countAlphabetic[c - 'a']++;
}
return String.Join("-", countAlphabetic);
}
Explore related questions
See similar questions with these tags.