5

I am trying to calculate the 95th percentile from the data set that I have populated in the ConcurrentHashMap below.

I am interested in finding out how many calls came back in 95th percentile of time

My map looks like this, and it is always sorted in ascending order by key, where:

key - the number of milliseconds
value - the number of calls that took that many milliseconds
Milliseconds Number
0 1702
1 15036
2 14262
3 13190
4 9137
5 5635
6 3742
7 2628
8 1899
9 1298
10 963
11 727
12 503
13 415
14 311
15 235
16 204
17 140
18 109
19 83
20 72

For example, the above data set means:

1702 calls came back in 0 milliseconds

15036 calls came back in 1 millisecond

I can calculate the 95th percentile by plugging the above data set into an Excel sheet, but I would like to calculate the percentile in Java code.

I know the algorithm will look something like this:

Sum all values in the map, calculate 95% of that sum, then iterate over the map keys in ascending order while keeping a running total of the values. When the running total equals or exceeds the previously calculated 95% of the sum, the current key should be the 95th percentile, I guess.

But I am not able to turn this algorithm into Java code. Below is the map that holds the above data set.

Map<Long, Long> histogram = new ConcurrentHashMap<Long, Long>();

I am not sure what the best way to calculate the percentile in Java is. I am also not sure whether my algorithm is correct. I am just trying to find out how many calls came back in 95th percentile of time.

private static void calculatePercentile() {
    for (Long time : CassandraTimer.histogram.keySet()) {

    }
}

Can anyone provide an example of how to do that?

Any help will be appreciated.

Updated code:-

Below is the code I have so far. Let me know whether I got everything correct in calculating the 95th percentile:

/**
 * A simple method to log 95th percentile information
 */
private static void logPercentileInfo() {
    double total = 0;
    for (Map.Entry<Long, Long> entry : CassandraTimer.histogram.entrySet()) {
        long value = entry.getKey() * entry.getValue();
        total += value;
    }
    double sum = 0.95*total;
    double totalSum = 0;
    SortedSet<Long> keys = new TreeSet<Long>(CassandraTimer.histogram.keySet());
    for (long key : keys) {
        totalSum += CassandraTimer.histogram.get(key);
        if(totalSum >= sum) {
            //this is the 95th percentile I guess
            System.out.println(key);
        }
    }
}
asked Apr 22, 2013 at 0:00
  • sum the total time and get the number in which the .95/total time came in from Commented Apr 22, 2013 at 0:04
  • @ratchet freak. Thanks for the suggestion. It would be great if you could provide an example for me. Thanks for the help. Commented Apr 22, 2013 at 0:07
  • Why a ConcurrentHashMap? You might want to look at the ConcurrentSkipListMap or TreeMap which implement SortedMap so that you get the numbers out in order (if order is important). Though noting that the structure is integers from 0-20, you may just want an array without the overhead of the map. Commented Apr 22, 2013 at 2:08
  • @MichaelT Oops, I pasted the wrong code. I have updated the code, which is using SortedMap. Can you please take a look and let me know whether I am calculating the percentile correctly for my problem? Commented Apr 22, 2013 at 2:16
  • "I am interested in finding out how many calls came back in 95th percentile of time" -> isn't that always 95% of the calls? Commented Jun 28, 2024 at 9:54

2 Answers

3

I'm not sure what you're trying to accomplish here, but it's easy to get a percentile. Suppose you have 100 numbers. You sort them and extract the 95th one (if you want the 95th percentile). If you don't have a multiple of 100 numbers you may have to do some interpolation. I assume you know how to do that.
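For raw samples (one number per call rather than a histogram), the "sort them and extract the 95th one" idea could be sketched as below. This is only a minimal nearest-rank sketch without interpolation; the class name `SamplePercentile` and the method `nearestRank` are made up for illustration and are not from the question.

```java
import java.util.Arrays;

final class SamplePercentile {
    // Nearest-rank percentile of raw samples; p is a fraction in (0, 1].
    // Hypothetical helper: the name and signature are illustrative only.
    static long nearestRank(long[] samples, double p) {
        if (samples.length == 0 || p <= 0.0 || p > 1.0) {
            throw new IllegalArgumentException("need at least one sample and 0 < p <= 1");
        }
        long[] sorted = samples.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        // 1-based rank of the smallest element that has at least
        // p * n elements at or below it.
        int rank = (int) Math.ceil(p * sorted.length);
        return sorted[rank - 1];
    }
}
```

Calling `SamplePercentile.nearestRank(samples, 0.95)` on an array of response times would return one of the observed values, never a fractional one.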

EDIT: OK, you already have the numbers in order. First get the total of the column called "Number". Call that Tot. Then enumerate through them, keeping a running sum of the column and call that RS. When RS passes 0.95 * Tot, you've found it. As I said, you might want to do some interpolation so you get a fractional number of milliseconds.
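Applied to the `Map<Long, Long>` from the question, the Tot / RS procedure above might look roughly like this. It is a sketch under a few assumptions: the class `HistogramPercentile` and its method are hypothetical names, the map is copied into a `TreeMap` so the keys are guaranteed to come out in ascending order, and no interpolation is done.

```java
import java.util.Map;
import java.util.TreeMap;

final class HistogramPercentile {
    // Returns the smallest key (milliseconds) whose cumulative call count
    // reaches fraction p of all calls; p = 0.95 gives the 95th percentile.
    // Hypothetical helper: the name and signature are illustrative only.
    static long percentile(Map<Long, Long> histogram, double p) {
        if (histogram.isEmpty() || p <= 0.0 || p > 1.0) {
            throw new IllegalArgumentException("need a non-empty histogram and 0 < p <= 1");
        }
        // Tot: total number of calls (sum of the values, not key * value).
        long tot = 0;
        for (long count : histogram.values()) {
            tot += count;
        }
        double threshold = p * tot;

        // A TreeMap copy guarantees ascending key order, whatever map was passed in.
        Map<Long, Long> sorted = new TreeMap<>(histogram);
        long rs = 0; // RS: running sum of call counts
        for (Map.Entry<Long, Long> entry : sorted.entrySet()) {
            rs += entry.getValue();
            if (rs >= threshold) {
                return entry.getKey(); // stop at the first key that reaches the threshold
            }
        }
        throw new IllegalStateException("not reachable when all counts are non-negative");
    }
}
```

For the table in the question, the calls add up to 72291, so 0.95 * Tot is 68676.45. The running sum is 68529 through 9 ms and 69492 through 10 ms, so `percentile(histogram, 0.95)` returns 10.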

Your question has the right idea. It's not a big deal.

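// Number[i] = number of calls that took i milliseconds, for i = 0 .. n-1
long tot = 0;
for (int i = 0; i < n; i++) tot += Number[i];
long sum = 0;
int i = 0;
while (i < n && sum + Number[i] < 0.95 * tot) sum += Number[i++];
// i is now the first bucket whose running sum reaches 0.95 * tot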
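If a fractional number of milliseconds is wanted, one possible way to do the interpolation mentioned above is sketched here. It rests on an assumption that is not stated in the question: that a key k covers the interval from k up to k + 1 ms and that the calls in that bucket are spread evenly across it. The class name `InterpolatedPercentile` is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

final class InterpolatedPercentile {
    // Linear interpolation inside the bucket where the running sum crosses
    // p * total. ASSUMPTION: key k stands for the interval [k, k + 1) ms and
    // calls are spread evenly within it.
    static double percentile(Map<Long, Long> histogram, double p) {
        if (p <= 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in (0, 1]");
        }
        long tot = 0;
        for (long count : histogram.values()) {
            tot += count;
        }
        if (tot <= 0) {
            throw new IllegalArgumentException("histogram contains no calls");
        }
        double threshold = p * tot;

        Map<Long, Long> sorted = new TreeMap<>(histogram); // ascending keys
        long before = 0; // calls in all buckets strictly below the current key
        for (Map.Entry<Long, Long> entry : sorted.entrySet()) {
            long count = entry.getValue();
            if (before + count >= threshold) {
                // Fraction of this bucket needed to reach the threshold.
                double fraction = (threshold - before) / count;
                return entry.getKey() + fraction;
            }
            before += count;
        }
        throw new IllegalStateException("not reachable when all counts are non-negative");
    }
}
```

With the question's data the crossing bucket is 10 ms: 68529 calls lie below it, the threshold is 68676.45, and the bucket holds 963 calls, so the result is 10 + 147.45 / 963, roughly 10.15 ms under the stated assumption.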
answered Apr 22, 2013 at 0:44
  • I have a very simple use case. In my question above, I am just trying to see how many calls came back in 95th percentile of time, for example "95% of the calls came back within 5 ms". That is the reason I have my data in the map. I believe I know the algorithm, but I am not able to put it together in Java code. If you can provide me an example, I can learn something from it. Thanks for the help. Commented Apr 22, 2013 at 0:56
  • Thanks for the edit. It kind of makes sense, but I am having a hard time turning it into actual code. That is why I asked for an example to help me understand it better. If you can provide one, that will be of great help. Commented Apr 22, 2013 at 1:08
  • @user21973 Just iterate over your array and stop before you get to the 95th percentile of 20 (or whatever your map's size). Then you'll have the sum of the 95th percentile. Do you know how to iterate on a map? Commented Apr 22, 2013 at 1:31
  • Give me a few minutes, I have started coding this. I will update this thread with my solution, and then you can let me know whether I got it right. Commented Apr 22, 2013 at 1:42
  • I have updated my question with the code I just wrote. Let me know whether the way I am doing it is right. Thanks for the help. Commented Apr 22, 2013 at 1:48
-2

In Kotlin:

// The q-th quantile is the smallest value at or below which
// a fraction q of the data falls (nearest-rank method; unlike
// numpy's default, no interpolation between neighbors).
fun numpy_quantile(data: IntArray, quantile: Double): Int {
    require(data.isNotEmpty())
    require(quantile in 0.0..1.0)
    val sortedData = data.sortedArray() // sorted copy; data itself is unchanged
    // 1-based rank of the result; at least 1 so that quantile = 0.0 yields the minimum
    val rank = kotlin.math.ceil(quantile * sortedData.size).toInt().coerceAtLeast(1)
    return sortedData[rank - 1]
}
answered Jun 26, 2024 at 16:40
