3

I have an ArrayList<String> containing dates represented as Strings with the format yyyy-MM-dd, e.g.:

ArrayList<String> dates = new ArrayList<>(); 
dates.add("1991-02-28");
dates.add("1991-02-28");
dates.add("1994-02-21");

I'd like to know the number of times the same String (date) appears in the list. In the example above, I'd like to achieve the following output:

1991-02-28, 2
1994-02-21, 1

I've tried the following code:

 ArrayList<String> dates = new ArrayList<>();
 dates.add("1991-02-28");
 dates.add("1991-02-28");
 dates.add("1994-02-21");
 SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.getDefault());
 HashMap<String, String> dateCount = new HashMap<String, String>();
 String first = dates.get(0);
 int count = 1;
 dateCount.put(first, String.valueOf(count));
 for (int i = 1; i < dates.size(); i++) {
 if (first.equals(dates.get(i))) {
 count++;
 } else {
 first = dates.get(i);
 dateCount.put(dates.get(i), String.valueOf(count));
 count = 0;
 }
 }
 for (String date : dates) {
 String occ = dateCount.get(date);
 System.out.println(date + ", " + occ);
 }

But it prints:

1991-02-28, 1
1991-02-28, 1
1994-02-21, 2

I'm tired, stuck, and turning to SO as a last resort. Any help is appreciated.

asked May 4, 2015 at 22:56
3
  • I might have formulated my problem badly. What I'd like to print is 1991-02-28, 2 and 1994-02-21, 1 @AmirAfghani Commented May 4, 2015 at 23:00
  • I see your edit now. So, effectively this is a groupBy operation. Are you using Java 8? Guava also has a method that does this for you. Commented May 4, 2015 at 23:01
  • Precisely, groupBy would solve it. Unfortunately I'm currently using Java 7. @AmirAfghani Commented May 4, 2015 at 23:02
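
For readers who are on Java 8 or later, the groupBy operation mentioned in the comments above can be sketched with streams. This is only an illustration, not part of the original thread: the class name GroupingCount and the method countOccurrences are hypothetical, and a TreeMap is assumed so that the keys print in sorted order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this sketch (requires Java 8+).
class GroupingCount {

    // Groups equal strings together and counts the size of each group.
    // A TreeMap keeps the keys sorted; yyyy-MM-dd strings sort chronologically.
    static Map<String, Long> countOccurrences(List<String> dates) {
        return dates.stream()
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> dates = Arrays.asList("1991-02-28", "1991-02-28", "1994-02-21");
        for (Map.Entry<String, Long> entry : countOccurrences(dates).entrySet()) {
            System.out.println(entry.getKey() + ", " + entry.getValue());
        }
    }
}
```

Note that Collectors.counting() yields Long values rather than Integer, which is why the map is declared as Map<String, Long>.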

5 Answers

3

I may be missing something, but it looks like you could do something simple here: keep the count of each date in the HashMap, then iterate over the HashMap for the output:

 ArrayList<String> dates = new ArrayList<>();
 dates.add("1991-02-28");
 dates.add("1991-02-28");
 dates.add("1994-02-21");
 SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.getDefault());
 HashMap<String, Integer> dateCount = new HashMap<String, Integer>();
 for (int i = 0; i < dates.size(); i++) {
 String date = dates.get(i);
 Integer count = dateCount.get(date);
 if (count == null){
 dateCount.put(date, 1);
 }
 else{
 dateCount.put(date, count + 1);
 }
 }
 for(String key : dateCount.keySet()){
 Integer occ = dateCount.get(key);
 System.out.println(key + ", " + occ);
 }

Output:

1991-02-28, 2
1994-02-21, 1
answered May 4, 2015 at 23:14

1 Comment

You aren't missing anything; I did the same thing and got the same results. This or an API is the right answer here.
2

I haven't debugged your logic yet, but you can use Google Guava's Multimaps.index method to perform a groupBy.
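
If adding Guava is not an option, the same groupBy idea can be sketched in plain Java 7 by indexing each date into a Map<String, List<String>> and reading off the size of each group. The class name GroupByDate and the method groupByValue are hypothetical names for this sketch; they are not Guava's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name; a plain-Java stand-in for a groupBy, not Guava's API.
class GroupByDate {

    // Indexes each date under itself, so equal dates end up in the same list.
    // LinkedHashMap keeps the keys in first-seen order.
    static Map<String, List<String>> groupByValue(List<String> dates) {
        Map<String, List<String>> groups = new LinkedHashMap<String, List<String>>();
        for (String date : dates) {
            List<String> group = groups.get(date);
            if (group == null) {
                group = new ArrayList<String>();
                groups.put(date, group);
            }
            group.add(date);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> dates = Arrays.asList("1991-02-28", "1991-02-28", "1994-02-21");
        // The size of each group is the number of occurrences of that date.
        for (Map.Entry<String, List<String>> entry : groupByValue(dates).entrySet()) {
            System.out.println(entry.getKey() + ", " + entry.getValue().size());
        }
    }
}
```

Grouping keeps every element, so it uses more memory than a plain counter; it is mainly worth it when the grouped items themselves are needed later.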

answered May 4, 2015 at 23:04

2

Here is the correct solution:

import java.util.ArrayList;
import java.util.HashMap;

public class mainClass {

/**
 * @param args
 */
public static void main(String[] args) {
 // TODO Auto-generated method stub
 ArrayList<String> dates = new ArrayList<>();
 dates.add("1991-02-28");
 dates.add("1991-02-28");
 dates.add("1994-02-21");
 //SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.getDefault());
 HashMap<String, Integer> dateCount = new HashMap<String, Integer>();
 // String first = dates.get(0);
 // int count = 1;
 // dateCount.put(first, String.valueOf(count));
 // for (int i = 1; i < dates.size(); i++) {
 // if (first.equals(dates.get(i))) {
 // count++;
 // } else {
 // first = dates.get(i);
 // dateCount.put(dates.get(i), String.valueOf(count));
 // count = 0;
 // }
 // }
 for(int i= 0; i < dates.size();i++) 
 {
 if(dateCount.containsKey(dates.get(i)))
 {
 dateCount.put(dates.get(i),dateCount.get(dates.get(i))+1); 
 }
 else 
 dateCount.put(dates.get(i),1); 
 }
 for (String date : dates) {
 int occ = dateCount.get(date);
 System.out.println(date + ", " + occ);
 }
}

}

Note, however, that you need to iterate over the HashMap instead of the ArrayList in the final loop to get the desired output; as written, the duplicated date is printed twice.

Hope this helps!

answered May 4, 2015 at 23:19

2

The data structure you're describing is commonly called a Multiset or Bag (and generally uses Integer as the value, not String).

Guava provides a very nice Multiset interface, which makes this operation trivial:

Multiset<String> counts = HashMultiset.create();
for(String date : dates) {
 counts.add(date);
}
System.out.println(counts);
[1991-02-28 x 2, 1994-02-21]

Even without Guava, you can fake a Multiset with a Map<T, Integer> and a little boilerplate:

Map<String, Integer> counts = new HashMap<>();
for(String date : dates) {
 Integer count = counts.get(date);
 if(count == null) {
 count = 0;
 }
 counts.put(date, count+1);
}
System.out.println(counts);
{1991-02-28=2, 1994-02-21=1}
answered May 4, 2015 at 23:19

1

If all that is required is a count of the number of occurrences of each full string in a List<String> collection, there are numerous simple ways to do it in Java 7 (or earlier); they are not necessarily the fastest, but they work.

For example, one can create a Set from the list and iterate over all items in the set, calling Collections.frequency(list, item), where list is the List<String> collection and item is each string of the set iteration.

Here is a simple implementation:

 public class FrequencyCount {
 public static void main(String[] args){
 java.util.ArrayList<String> dates = new java.util.ArrayList<>();
 dates.add("1991-02-28");
 dates.add("1991-02-28");
 dates.add("1994-02-21");
 java.util.Set<String> uniqueDates = new java.util.HashSet<String>(dates);
 for (String date : uniqueDates) {
 System.out.println(date + ", " + java.util.Collections.frequency(dates, date));
 }
 }
 }

Output:

1994-02-21, 1
1991-02-28, 2
answered May 4, 2015 at 23:05

5 Comments

This is a very clean solution which is working as intended, thank you.
Collections.frequency() is an O(n) operation, meaning this solution is O(n^2). Every other answer contains an O(n) solution; this is not how you want to do this.
Sure. I would also suggest a map-based solution like yours as a better alternative. This one was the simplest to demonstrate a working result and the question does not appear to be large-scale, or require high performance, hence I preferred the simplest approach. I have already given you a +1 since yesterday, by the way. :-)
I'm not quite sure that I follow. Could you elaborate on why this answer is inferior to the others? I used a TreeSet instead of a HashSet, if that changes anything :-) @dimo414
@Marcus TreeSet vs. HashSet is irrelevant (though TreeSet is O(log n) on most operations, making it generally worse than HashSet). The issue here is that Collections.frequency() is being called n times, and it has to iterate over the whole dates list each time in order to count each element. That's O(n^2) time. Using a Multiset or Map<T, Integer> will do the same work in O(n) time.
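
To make the complexity point from the comments above concrete, here is a rough side-by-side sketch of the two approaches. The class and method names are hypothetical, and the cost notes in the code comments assume a HashMap with well-distributed hash codes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical class and method names, used only to contrast the two approaches.
class CountComparison {

    // One full scan of the list per distinct value: O(n * d) for d distinct
    // values, which degrades to O(n^2) when most values are unique.
    static Map<String, Integer> countWithFrequency(List<String> dates) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String date : new HashSet<String>(dates)) {
            counts.put(date, Collections.frequency(dates, date));
        }
        return counts;
    }

    // A single scan of the list with expected constant-time map updates: O(n).
    static Map<String, Integer> countWithMap(List<String> dates) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String date : dates) {
            Integer count = counts.get(date);
            counts.put(date, count == null ? 1 : count + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> dates = Arrays.asList("1991-02-28", "1991-02-28", "1994-02-21");
        // Both approaches produce the same counts; only the amount of work differs.
        System.out.println(countWithFrequency(dates).equals(countWithMap(dates))); // prints true
    }
}
```

For a three-element list the difference is negligible; it only starts to matter as the list and the number of distinct dates grow.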
