Count number of each char in a String

Question 1

I know there is a simpler way of doing this, but I just really can't think of it right now. Can you please help me out?

String sample = "hello world";
char arraysample[] = sample.toCharArray();
int length = sample.length();
//count the number of each letter in the string
int acount = 0;
int bcount = 0;
int ccount = 0;
int dcount = 0;
int ecount = 0;
int fcount = 0;
int gcount = 0;
int hcount = 0;
int icount = 0;
int jcount = 0;
int kcount = 0;
int lcount = 0;
int mcount = 0;
int ncount = 0;
int ocount = 0;
int pcount = 0;
int qcount = 0;
int rcount = 0;
int scount = 0;
int tcount = 0;
int ucount = 0;
int vcount = 0;
int wcount = 0;
int xcount = 0;
int ycount = 0;
int zcount = 0; 
for(int i = 0; i < length; i++)
{
 char c = arraysample[i];
 switch (c) 
 {
 case 'a': 
 acount++;
 break;
 case 'b': 
 bcount++;
 break;
 case 'c': 
 ccount++;
 break;
 case 'd': 
 dcount++;
 break;
 case 'e':
 ecount++;
 break;
 case 'f':
 fcount++;
 break;
 case 'g':
 gcount++;
 break;
 case 'h':
 hcount++;
 break;
 case 'i':
 icount++;
 break;
 case 'j':
 jcount++;
 break;
 case 'k':
 kcount++;
 break;
 case 'l':
 lcount++;
 break;
 case 'm':
 mcount++;
 break;
 case 'n':
 ncount++;
 break;
 case 'o':
 ocount++;
 break;
 case 'p':
 pcount++;
 break;
 case 'q':
 qcount++;
 break;
 case 'r':
 rcount++;
 break;
 case 's':
 scount++;
 break;
 case 't':
 tcount++;
 break;
 case 'u':
 ucount++;
 break;
 case 'v':
 vcount++;
 break;
 case 'w':
 wcount++;
 break;
 case 'x':
 xcount++;
 break;
 case 'y':
 ycount++;
 break;
 case 'z':
 zcount++;
 break;
 }
}
System.out.println ("There are " +hcount+" h's in here ");
System.out.println ("There are " +ocount+" o's in here ");

Question 2

Oh woah! xD It's just.. woah! What patience you have to write all those variables.

Well, it's Java so you can use a HashMap.

Write something like this:

String str = "Hello World";
int len = str.length();
Map<Character, Integer> numChars = new HashMap<Character, Integer>(Math.min(len, 26));
for (int i = 0; i < len; ++i)
{
 char charAt = str.charAt(i);
 if (!numChars.containsKey(charAt))
 {
 numChars.put(charAt, 1);
 }
 else
 {
 numChars.put(charAt, numChars.get(charAt) + 1);
 }
}
System.out.println(numChars);

We loop over all of the string's characters and save the current char in the charAt variable.
We check whether our HashMap already has charAt as a key:
- If it does, we get the current value and add one; the string has already been found to contain this char.
- If it does not (i.e. we have not seen this char in the string before), we add the key with value 1 because we found a new char.
Done! Our HashMap now contains every char found (the keys) and how many times each one occurs (the values).
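The check-then-put steps described here can also be collapsed into a single call. The following is only a sketch, assuming Java 8 or later (for Map.merge); the class name MergeCounter and the method name count are made up for illustration and do not come from the answer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, not part of the original answer.
class MergeCounter {
    // Counts every char in the input, including spaces and upper-case letters.
    static Map<Character, Integer> count(String text) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : text.toCharArray()) {
            // merge stores 1 for a new key, otherwise adds 1 to the stored value
            counts.merge(c, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("Hello World"));
    }
}
```

The behaviour should match the containsKey version: one entry per distinct char, with the number of occurrences as the value.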

Question 3

As an added bonus, this also supports more than just lower case letters.

Question 4

You might consider a TreeMap instead of a HashMap so that the data, when printed, is ordered by the collection itself (if you were going to print them all). If you are going to stay with a HashMap, you might consider the initialCapacity argument of the constructor, as you're not going to get more than 26 letters in this situation. Best practices and all that.

Question 5

@MichaelT TreeMap, great! I would set the initial capacity to str.length() anyway, since it could be that no char is ever repeated.

Question 6

Then it would be min(str.length(), 26), because if the string is longer than 26 characters there will be duplicated characters... though we're both kind of ignoring spaces. Still, it's only a hint to the system and it will grow if it needs to. The default to start with is 16 (and then it grows to 32, and then to 64). I'd still use one of the NavigableMap implementations for its sorted nature. I kind of like the skip list.

Question 7

Yup, min(str.length(), 26). It's just to avoid the "grow cost" if not needed.
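For the sorted-map suggestion in this thread, a rough sketch could look like the one below. It assumes Java 8 or later; the class name SortedCounter and the method name count are invented for the example. Note that TreeMap has no initial-capacity constructor, so the capacity discussion only applies to HashMap.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, used only to illustrate the TreeMap suggestion.
class SortedCounter {
    static Map<Character, Integer> count(String text) {
        // TreeMap keeps its keys in natural (char value) order
        Map<Character, Integer> counts = new TreeMap<>();
        for (char c : text.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints { =1, d=1, e=1, h=1, l=3, o=2, r=1, w=1}
        System.out.println(count("hello world"));
    }
}
```

The space sorts first because its char value (32) is lower than that of any letter.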

Question 8

A possibly faster, and at least more compact, alternative to a HashMap is a good old integer array. A char can be cast to an int, which gives its UTF-16 code unit value (the same number as the ASCII code for ASCII characters).

// requires: import java.util.Arrays;
String str = "Hello World";
// One slot per possible char value; the + 1 makes room for '\uffff' itself
int[] counts = new int[Character.MAX_VALUE + 1];
// If you are certain you will only have ASCII characters, `new int[128]` is enough (`new int[256]` also covers Latin-1)
for (int i = 0; i < str.length(); i++) {
    char charAt = str.charAt(i);
    counts[(int) charAt]++;
}
System.out.println(Arrays.toString(counts));

As the above output is a bit big, by looping through the integer array you can output just the characters which actually occur:

for (int i = 0; i < counts.length; i++) {
 if (counts[i] > 0)
 System.out.println("Number of " + (char) i + ": " + counts[i]);
}

Question 9

You don't need to cast Character.MAX_VALUE to an int, nor do you need to cast charAt to an int. A char is an integral type and widens to int implicitly, so both work just fine. You can even use char i instead of int i in your printing loop to avoid the cast to char there as well (but only while the array has fewer than 65,536 elements; with a full-size array a char counter wraps around to 0 and the loop never ends). The nice thing about making it an array like this is that it can make for very readable code. You can get the count for any char by just saying counts['a'] to get the count for 'a'.
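A cast-free version of the array approach might look like this. It is a sketch only; the class name ArrayCounter and the method name count are placeholders, and the array is sized Character.MAX_VALUE + 1 so that every char value, including '\uffff', has a slot.

```java
// Hypothetical helper class illustrating array indexing by char.
class ArrayCounter {
    static int[] count(String text) {
        // One slot per possible char value (0 .. 65535)
        int[] counts = new int[Character.MAX_VALUE + 1];
        for (int i = 0; i < text.length(); i++) {
            counts[text.charAt(i)]++; // the char index widens to int implicitly
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = count("Hello World");
        System.out.println("l: " + counts['l']); // prints l: 3
        System.out.println("o: " + counts['o']); // prints o: 2
    }
}
```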

Question 10

Actually, there is an even better structure than maps and arrays for this kind of counting: Multisets. Documentation of Google Guava mentions a very similar case:
The traditional Java idiom for e.g. counting how many times a word occurs in a document is something like:
```
Map<String, Integer> counts = new HashMap<String, Integer>();
for (String word : words) {
 Integer count = counts.get(word);
 if (count == null) {
 counts.put(word, 1);
 } else {
 counts.put(word, count + 1);
 }
}
```
This is awkward, prone to mistakes, and doesn't support collecting a variety of useful statistics, like the total number of words. We can do better.
With a multiset you can get rid of the contains (or if (get(c) != null)) calls, what you need to call is a simple add in every iteration. Calling add the first time adds a single occurrence of the given element.
```
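// requires Guava on the classpath:
// import com.google.common.collect.HashMultiset;
// import com.google.common.collect.Multiset;
String input = "Hello world!";
Multiset<Character> characterCount = HashMultiset.create();
for (char c : input.toCharArray()) {
    characterCount.add(c);
}
for (Multiset.Entry<Character> entry : characterCount.entrySet()) {
    System.out.println(entry.getElement() + ": " + entry.getCount());
}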
```
(See also: Effective Java, 2nd edition, Item 47: Know and use the libraries. The author mentions only the JDK's built-in libraries, but I think the same reasoning applies to other libraries too.)

int length = sample.length();
....
for (int i = 0; i < length; i++) {
 char c = arraysample[i];

You could replace these three lines with a foreach loop:

for (char c: arraysample) {

int length = sample.length();
....
for (int i = 0; i < length; i++) {
 char c = arraysample[i];

You don't need the length variable, you could use sample.length() in the loop directly:

for (int i = 0; i < sample.length(); i++) {

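The JVM is smart: String.length() is a trivial getter, so the JIT compiler will most likely inline the call and you will not pay for it on every iteration.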

```
char arraysample[] = sample.toCharArray();
int length = sample.length();
for (int i = 0; i < length; i++) {
 char c = arraysample[i];
```
It's a little bit confusing that the loop iterates over arraysample but uses sample.length() as the upper bound. Although the two values are the same, it would be cleaner to use arraysample.length as the upper bound.

Question 11

Yes... there is a simpler way. You have two choices, and they are about equally good: use an array or a Map. The more advanced way of doing this would certainly be with a Map.

Think about a map as a type of array where instead of using an integer to index the array you can use anything. In our case here we'll use char as the index. Because chars are ints you could just use a simple array in this case, and just mentally think of 'a' as 0, but we're going to take the larger step today.

String sample = "Hello World!";
// Initialization: one entry per lower-case letter, starting at zero
Map<Character, Integer> counter = new HashMap<Character, Integer>();
for (char c = 'a'; c <= 'z'; c++) {
    counter.put(c, 0);
}
// Populate: characters outside 'a'..'z' have no key and are skipped
for (int i = 0; i < sample.length(); i++) {
    char c = sample.charAt(i);
    if (counter.containsKey(c)) {
        counter.put(c, counter.get(c) + 1);
    }
}

Now, any time you want to know how many of a given character there were, just call this method:

int getCount(Character character){
 if(counter.containsKey(character))
 return counter.get(character);
 else return 0;
}

Note: This only counts the lower-case letters 'a' to 'z'. Upper-case letters, digits, spaces and punctuation are ignored.
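The plain-array alternative mentioned in this answer (thinking of 'a' as index 0) could be sketched as follows. The class name LetterCounter and its method names are hypothetical, and the sketch deliberately counts only the lower-case letters 'a' to 'z'.

```java
// Hypothetical helper class: a 26-slot array where 'a' maps to index 0.
class LetterCounter {
    static int[] count(String text) {
        int[] counts = new int[26];
        for (char c : text.toCharArray()) {
            // Skip anything that is not a lower-case ASCII letter
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            }
        }
        return counts;
    }

    // Returns 0 for characters the array does not track.
    static int getCount(int[] counts, char letter) {
        return (letter >= 'a' && letter <= 'z') ? counts[letter - 'a'] : 0;
    }

    public static void main(String[] args) {
        int[] counts = count("Hello World!");
        System.out.println("l: " + getCount(counts, 'l')); // prints l: 3
        System.out.println("H: " + getCount(counts, 'H')); // prints H: 0
    }
}
```

Compared with the Map, this trades generality for a fixed, very small memory footprint.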

Question 12

Your code will throw NullPointerException for counter.get(s.charAt(i)) + 1 when char is not 'a'..'z' :( null + 1 throws.

Question 13

@SimonAndréForsberg I was just about to fix that, and to add a note that this only counts letters.

Question 14

containsKey(i)? That sounds like a clear mistake.

Question 15

The use of functional programming can even simplify this problem to a great extent.

// requires: import java.util.stream.Collectors;
public static void main(String[] args) {
    String myString = "hello world";
    System.out.println(
        myString
            .chars()
            .mapToObj(c -> String.valueOf((char) c))
            .filter(str -> !str.equals(" "))
            .collect(Collectors.groupingBy(ch -> ch, Collectors.counting()))
    );
}

First we get the stream of integers from the string

myString.chars()

Next we transform the integers into strings

mapToObj(c -> String.valueOf((char) c))

Then we filter out the characters we don't need to consider; in the example above we have filtered out the spaces.

filter(str -> !str.equals(" "))

Then finally we collect them grouping by the characters and counting them

collect(Collectors.groupingBy(ch -> ch, Collectors.counting()))

Question 16

Why do you need to convert the characters to strings? Why don't you use .codePoints() instead of .chars()?
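A sketch of what the codePoints() variant raised in this comment might look like is shown below. It assumes Java 8 or later; the class name CodePointCounter and the method name count are made up for the example. Keys are Unicode code points (Integer) rather than one-character strings, so characters outside the Basic Multilingual Plane are counted once instead of as two surrogate halves.

```java
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class illustrating the codePoints() suggestion.
class CodePointCounter {
    static Map<Integer, Long> count(String text) {
        return text.codePoints()              // IntStream of Unicode code points
                .filter(cp -> cp != ' ')      // skip spaces, as in the answer
                .boxed()                      // IntStream -> Stream<Integer>
                .collect(Collectors.groupingBy(cp -> cp, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Convert each code point back to text only when printing
        count("hello world").forEach((cp, n) ->
                System.out.println(new String(Character.toChars(cp)) + ": " + n));
    }
}
```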

Question 17

What does the variable p stand for?

Question 18

Sorry for that p, changed it to ch denoting character

Accepted answer (the HashMap answer above) by Marco Acierno, answered Mar 12, 2014 at 20:03; edited Sep 12, 2018 at 8:41 by Sᴀᴍ Onᴇᴌᴀ.
Comments on the accepted answer, in order (all Mar 12, 2014): unholysampler at 20:05, user22048 at 21:19, Marco Acierno at 21:20, user22048 at 21:26, Marco Acierno at 21:28.
