I created a method for finding the most common character in a string (using HashMap
):
public static char getMaxViaHashmap ( String s) {
HashMap<Character, Integer> map = new HashMap<Character, Integer>();
char maxappearchar = ' ';
for (int i = 0; i < s.length(); i++)
{
if ( map.containsKey(s.charAt(i)))
{
map.put (s.charAt(i), map.get(s.charAt(i)) + 1 );
}
else
{
map.put (s.charAt(i), 1);
}
}
int maxValueInMap=(Collections.max(map.values())); // This will return max value in the Hashmap
for (Entry<Character, Integer> entry : map.entrySet())
{
if (entry.getValue()==maxValueInMap)
{
System.out.println("the max char is : " + entry.getKey() + " and displayed " +maxValueInMap+ " times"); // Print the key with max value
maxappearchar = entry.getKey();
}
}
return maxappearchar;
}
Questions:
- What is the performance cost (in terms of \$O()\$)?
- What is the memory cost (in terms of \$O()\$)?
-
\$\begingroup\$ Welcome to CR! Are you on Java 8? \$\endgroup\$h.j.k.– h.j.k.2015年08月17日 13:47:32 +00:00Commented Aug 17, 2015 at 13:47
-
\$\begingroup\$ hey ! yes i am. \$\endgroup\$Roey Cohen– Roey Cohen2015年08月17日 14:18:31 +00:00Commented Aug 17, 2015 at 14:18
5 Answers 5
Normally operations on a hashmap should be pretty much constant (O(n)=1), so it's something like O(n) = n + 2*m (number of characters in the String plus twice the amount of different characters in the string, since you iterate twice over the map to find the max and the corresponding character).
Memory should also be pretty much linear, since you are storing one int per different character.
As Collections.max
iterates once over the map, iterating again over it to get the corresponding character is pretty much wasted. You could remove the Collections.max
call and start by iterating over the entries and remember the one with the highest count, thus saving the whole Collections.max
iteration ( getting a O(n) = n + m runtime).
Your hashmap will probably need to resize itself once or twice (at least). You can reduce the need for that by choosing a good start size for your hashmap (aprox. the number of different characters in your string, which depends on the language), but normally that shouldn't be that much gain (depending on the length of the sentence and the amount of different characters in the sentence).
HashMap<Character, Integer> map = new HashMap<Character, Integer>();
You'd normally declare this as the interface. That way if you want to change the implementation, you only have to do so in one place.
Map<Character, Integer> map = new HashMap<>();
Also, you don't need to repeat the types in the map in the latest Java. The <>
tells the compiler to figure it out for you.
char maxappearchar = ' ';
You don't need to declare this at the beginning of the method. You can move it between the two for
loops, since it's only used in the second loop. And you can do without it entirely if you prefer (see below).
for (int i = 0; i < s.length(); i++)
You don't have to manually maintain an index variable. You can just say
for (char c : s.toCharArray())
It's also easier to write c
than s.charAt(i)
.
if ( map.containsKey(s.charAt(i))) { map.put (s.charAt(i), map.get(s.charAt(i)) + 1 ); } else { map.put (s.charAt(i), 1); }
It's not necessary to do both containsKey
and get
. You can just say
Integer count = map.get(c);
if ( count == null )
{
count = 0;
}
map.put(c, count + 1);
This also simplifies to just one put
line.
int maxValueInMap=(Collections.max(map.values())); // This will return max value in the Hashmap for (Entry<Character, Integer> entry : map.entrySet()) { if (entry.getValue()==maxValueInMap) { System.out.println("the max char is : " + entry.getKey() + " and displayed " +maxValueInMap+ " times"); // Print the key with max value maxappearchar = entry.getKey(); } } return maxappearchar;
As previously mentioned you can write this without maxappearchar
.
int maxValueInMap = Collections.max(map.values());
for (Entry<Character, Integer> entry : map.entrySet())
{
if (entry.getValue() == maxValueInMap)
{
System.out.println("the max char is : " + entry.getKey() + " and displayed " +maxValueInMap+ " times");
return entry.getKey();
}
}
return ' ';
This also does fewer iterations of the loop, as it will exit early. This won't help you in big O terms, but it may cut the actual run time slightly. I'm not sure whether getting the maximum value manually would beat that or not. You'd have to iterate through the entire loop and do a comparison on each iteration, but getting the maximum requires that already.
It also may be important to you to print all the characters with the maximum number of appearances. This only prints the first one encountered. Both versions only return one character.
Consider having the method return a Character
instead of a char
. That would allow you to return null
on an empty string. As it is, you have it returning a space in that case, which is also a possible answer.
Since you are on Java 8, this will be an ideal exercise for getting familiarized with the new stream-based processing approach. For example, instead of using an explicit for
-loop:
for (int i = 0; i < s.length(); i++) {
if (map.containsKey(s.charAt(i))) {
// ...
}
}
You can call String.chars()
to give you an IntStream
of 'int
zero-extending the char
values', or in other words a stream of characters.
Then, you only need to call Stream.collect()
with a combination of Collectors.groupingBy()
and Collectors.counting()
to give you the resulting Map
instance with the count-per-character:
Map<Character, Long> result = input.chars().boxed().collect(Collectors.groupingBy(
c -> Character.valueOf((char) c.intValue()),
Collectors.counting()));
For the first argument of Collectors.groupingBy()
, we are mapping our Integer
objects (which have been boxed()
from the IntStream
) to Character
instances.
Again, instead of looping on result.entrySet()
, we can apply another set of stream operations on it. First, we can define a Comparator
that will compare an Entry
object by the value:
private static final Comparator<Entry<? extends Object, Long>> RANK_BY_VALUE = Comparator
.comparingLong(Entry::getValue);
Then, we will sort the stream of map entries with the reverse of this Comparator
so that the characters with the most occurrences are ranked first:
Entry<Character, Long> entry = result.entrySet().stream()
.sorted(RANK_BY_VALUE.reversed()).findFirst().get();
Putting all these together into its own method:
private static Entry<Character, Long> getMaxOccurrenceCharacter(String input) {
return input.chars().boxed().collect(
Collectors.groupingBy(
c -> Character.valueOf((char) c.intValue()),
Collectors.counting())).entrySet().stream()
.sorted(RANK_BY_VALUE.reversed()).findFirst().get();
}
And here's an example of how you can make use of this method:
public static void main(String[] args) {
Entry<Character, Long> result = getMaxOccurrenceCharacter("abc");
System.out.printf("The character \"%s\" is repeated %d time(s).\n",
result.getKey(), result.getValue());
}
// result
The character "a" is repeated 1 time(s).
Other pointers from your code:
- Java's bracing convention is to put the opening brace on the same line, instead of what you have used (mostly). More importantly, have a consistent convention in your codebase. :) If you are more inclined to your current way, then do so throughout.
HashMap<Character, Integer> map
should be declared using theMap
interface, for simplification.- The method name
getMaxViaHashmap()
can be improved upon, as there should not be a need to tell callers how the derivation is done (ViaHashmap
). - If your method is meant to be used solely for the derivation, it should not print output to
System.out
. From my example above, I get the results out first, then display it to the user. Even if you want to do the equivalent of this manually:
if (map.containsKey(key)) { map.put(key, map.get(key) + newValue); } else { map.put(key, newValue); }
There are the newer
Map.compute()
orMap.merge()
methods to help you. For example:// input being the String Map<Character, Long> result = new HashMap<>(); for (int i = 0; i < input.length(); i++) { result.merge(Character.valueOf(input.charAt(i)), 1L, (a, b) -> a + b); }
edit: @Misha's answer is a welcome improvement to mine, do take a look at that too.
-
2\$\begingroup\$ I had more suggestions than I could fit into a comment, so I added a separate answer codereview.stackexchange.com/a/101239/70389. See if you agree. \$\endgroup\$Misha– Misha2015年08月18日 00:21:37 +00:00Commented Aug 18, 2015 at 0:21
-
\$\begingroup\$ @Misha your improved answer looks good! :) \$\endgroup\$h.j.k.– h.j.k.2015年08月18日 00:47:45 +00:00Commented Aug 18, 2015 at 0:47
There is already a Java 8 answer (https://codereview.stackexchange.com/a/101202/70389) but I think there are a number of improvements that can be made to it (more than can fit in a comment)
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import static java.util.Map.Entry.comparingByValue;
char mostFrequent = s.chars()
.mapToObj(x -> (char) x) // box to Character
.collect(groupingBy(x -> x, counting())) // collect to Map<Character, Long>
.entrySet().stream()
.max(comparingByValue()) // find entry with largest count
.get() // or throw if source string is empty
.getKey(); // get Character key of the selected entry
If you want to throw a different exception or supply a message, replace .get()
with something like .orElseThrow(() -> new NoSuchElementException("input string empty"))
. x -> x
can be also expressed as Function.identity()
if you find that more readable.
Relative to the other answer, I made the following changes:
- It is unnecessary to first box to
Integer
withboxed()
and then rebox toCharacter
inside the collector. Better box toCharacter
right away withmapToObj(x -> (char) x)
- It is customary to statically import collectors for brevity
- It is better for both CPU and memory to use
.max(...)
rather than.sorted(...).findFirst()
. - Map.Entry has static methods for comparators by value and by key.
Additionally, it's important to keep in mind that processing a String
as a sequence of char
will not work for certain Unicode characters as some of them take up 2 chars. To be fully Unicode-friendly, you will need to use .codePoints()
and operate on int
s rather than char
s
-
2\$\begingroup\$ (makes mental note of
Map.Entry.comparingByValue()
from now on) \$\endgroup\$h.j.k.– h.j.k.2015年08月18日 00:49:18 +00:00Commented Aug 18, 2015 at 0:49
Gathered all in one place.
Using Google Guava's « com.google.common.collect.Multiset<Character> Interface to find repeating Character's.
String str = "how are you doing today";
Multiset<Character> set = HashMultiset.create();
for (char c : str.toCharArray()) {
set.add( c );
}
for( Multiset.Entry<Character> entry : set.entrySet() ) {
System.out.println("Element :-> "+entry.getElement()+" Repeat Cnt :-> "+entry.getCount());
}
java8 « usage of Map.merge function
:
Map<Integer, Integer> map = new HashMap<Integer, Integer>();
map.put(1, 1);
map.merge(1, 5, (prevValue, newValue) -> prevValue + newValue );
// key 1 is present, so new value 5 will be added to previous value 1, [K:V]=[1:6]
map.put(2, 5);
map.merge(2, 10, (prevValue, newValue) -> prevValue < newValue ? prevValue : newValue);
map.merge(1, 10, (prevValue, newValue) -> null);
// Since the BiFunction results in null, the element will be removed from map
Example:
String str = "how are you doing today";
System.out.println("Using LinkedHashMap to get first most repeated char.");
Map<Character, Integer> map = new LinkedHashMap<Character, Integer>();
for (char c : str.toCharArray()) {
map.merge(c, 1, (prevValue, newValue) -> prevValue + newValue );
}
maxValueFromMap( map );
Output:
Using LinkedHashMap to get first most repeated char.
Char ['o'] repeated 4 times
Char [' '] repeated 4 times
To get max repeated Character value count from Map
use Collections.max(listofElements) or stream().max(Comparator).get()
Returns the maximum element of the given collection, according to the natural ordering of its elements. All elements in the collection must implement the Comparable interface.
Map<Character, Integer> map = new LinkedHashMap<Character, Integer>();
map.put('y', 5);
map.put('a', 7);
map.put('s', 6);
map.put('h', 7);
int maxValueInMap = Collections.max( map.values() );
System.out.println( maxValueInMap ); // 7
int maxValueInMap2 = map.values().stream().max( Comparator.comparing( e -> e) ).get();
System.out.println( maxValueInMap2 ); // 7
Example:
public static Character maxValueFromMap( Map<Character, Integer> map) {
Character mostRepeatedChar = ' ';
int maxValueInMap = Collections.max( map.values() );
for (Entry<Character, Integer> entry : map.entrySet()) {
if ( entry.getValue() == maxValueInMap ) {
System.out.format("Char ['%s'] repeated %d times\n", entry.getKey(), maxValueInMap);
mostRepeatedChar = entry.getKey();
}
}
return mostRepeatedChar;
}
using java version below 8. map.containsKey( key )
public static Map<Character, Integer> charCountFromString_Java7(String str) {
Map<Character, Integer> map = new HashMap<>();
for (char c : str.toCharArray()) {
if( map.containsKey( c ) ) {
map.put(c, map.get(c)+1 );
} else {
map.put(c, 1);
}
/*Integer count = map.get(c); // null if this map contains no mapping for the key
if ( count == null ) {
count = 0;
}
map.put(c, count + 1);*/
}
return map;
}
using java 8.
public static Map<Character, Integer> charCountFromString_Java8(String str) {
Map<Character, Integer> map = new HashMap<>();
IntStream chars = str.chars();
//Stream<Integer> boxedInt = chars.boxed(); // Returns a Stream consisting of the elements boxed to an Integer.
Stream<Character> boxedChars = chars.mapToObj( x -> (char) x );
Set<Entry<Character, Long>> enteries =
boxedChars.collect( Collectors.groupingBy( boxedChar -> boxedChar , Collectors.counting() )).entrySet();
Map<Character, Integer> mapObj =
enteries.stream().collect( Collectors.toMap( Map.Entry::getKey, entry -> entry.getValue().intValue() ) );
//mapObj.forEach( (k,v)->{ System.out.println(k+":"+v); } ); // (int)(long)entry.getValue()
map = mapObj;
return map;
}
public class MostRepeatedChar {
public static void main(String[] args) {
String str = "how are you doing today";
Map<Character, Integer> repeatedCharsCount = charCountFromString_Java8(str);
// charCountFromString_Java7(str);
maxValueFromMap( repeatedCharsCount );
}
}
Explore related questions
See similar questions with these tags.