Computing the relative magnitude of the values in one double array to another

Question 1

An interview question I received:

You are given two unsorted lists of positive doubles

x contains the results from experiment 1

y which contains the results from experiment 2.

All results from the second experiment y are improvements over x by i percent, where i is an integer. That is to say that given a result from x, the corresponding result from y will be an i percent improvement.

However, the lists are not parallel; that is to say x[0] does not necessarily correspond to y[0]. Find the value of i.

Note: must be written in Java 7

Some examples

Inputs:
y = [1.0]
x = [1.0]
Output:
0
Inputs:
y = [2.2999999999999998, 15.0, 102.40000000000001, 3486.8000000000002]
x = [23.0, 150.0, 1024.0, 34868.0]
Output:
90
Inputs:
y = [23, 11.1, 50.4]
x = [22.2, 46, 100.8]
Output:
50
Note in the last case where 46 -> 23 and 22.2 -> 11.1 but the index of 46 is not the index of 23, etc.

My Solution

import java.util.Arrays;
//Time complexity: O(n)
//Space complexity: O(1)
public class Solution{ 
 public static int solution(double[] secondRun, double[] firstRun) {
 Double firstRunMax = findMax(firstRun),
 secondRunMax = findMax(secondRun);
 return (int)((1.0-secondRunMax/firstRunMax)*100);
 }
 private static double findMax(double[] array){
 double max = -Double.MAX_VALUE;
 for(double d : array){
 if(d > max){
 max = d;
 }
 }
 return max;
 }
}

Strengths: Very simple, as efficient as possible (I think?), ignores most of the data

Weakness: the return line is a little convoluted. Perhaps there is a better way to simplify it also. Also I thought I would be able to find a simple Arrays.Max method or something in Java, but I didnt see any during a quick glance.

Question 2

Just a quick note: use Collections.max(array) to get the max - you don't need the findMax function.

Question 3

I think the answers should be: 0, 900, 100 (instead of 0, 90, 50).

Question 4

@Ryan There is a Collections.max, but it only takes collections (arrays are not collections in Java)

Question 5

@DavidGrinberg just use Arrays.asList in the argument then.

Question 6

@Ryan That would be hugely inefficient. Why would I want to convert the array to list just so I can find the max?

Question 7

You are auto-boxing the results of your function call, only to un-box them later. If everything else is using double (a primitive), there is no reason to declare the local variables in solution() as Double (an object). This doesn't effect the big-O of your solution, but it is extra operations for the JVM to execute.

Give your class and public method better names. If you actually needed this in production code, there would be no way to know what this code did. The names tell you nothing and there is no documentation.

There is a max(), but it works on Collections, not arrays. However, writing your own max method is not that hard, so it is up to you if you want to replace it.

Note: Until seeing you question, I didn't know that Double.MIN_VALUE and Integer.MIN_VALUE do not represent the same concept for the respective type. This isn't your fault, but it is unintuitive. Explination

Question 8

Class/function names were mandated, as well as the signature, but otherwise I agree with you.

Question 9

Oh, I like challenges like this. Looking through your solution, though, you make statements that do not quite ring true, you say it 'ignores most of the data', but that's not true, it inspects all the data, scanning both arrays completely.

On the other hand, that is as efficient as it can be (\$O(n)\$), so that's OK. You just need to realize that you scan it all.

So, I agree, it is close to being as efficient as possible. Certainly, it is as scalable as it could be (good time complexity, and \$O(1)\$ space complexity).

Bugs

I don't believe it gets the best results all the time. Specifically, what if the integer is negative, then the one array will have negative values, and the max(...) from that array will be wrong.
your solution suffers from an off-by-one problem in the event that floating-point operaions are slightly less than accurate (which is typical). For example, you may compute the max of one dataset to be 100.0000, and the other to be 33.33334. When you do the division, the result will be 2.999999 which, when you re-calculate the factor, will truncate the wrong way in the int conversion. You need to round, and not truncate.

Summing, not max

I believe the better solution to this problem would be to sum all the values from each array, and then to compute the ratio afterwards.

The math is simple. If you have the two data arrays [a,b,c,d,...] and the corresponding set [A,B,C,D,...] where you know that A is some factor Q times larger, then you have:

[Qa, Qb, Qc, Qd, ...] == [A, B, C, D, ....]

which implies that:

Q[a, b, c, d, ...] == [A, B, C, D, ...]

So, if you sum the values in both sets, you will have:

Q sum(abcd...) == sum(ABCD...)

and

Q = sum(ABCD...) / sum(abcd...)

This makes it all quite simple...

double sumA = 0.0;
double sumB = 0.0;
for (int i = 0; i < firstRun.length; i++) {
 sumA += firstRun[i];
 sumB += secondRun[i];
}
double q = sumB / sumA;

Now, q is a value which may not be easy to translate directly to an integer. For example, because of floating-point inaccuracies, the intended Q factor may be 5, but the computed one is 4.99999999997 or something.

return (int)Math.round(q);

Note that summing the values instead of finding the max, will be approximately equally efficient. In each case, you are going through all elements just once.

Java 8

Notice, in Java 8, this would be a nice problem....:

return (int)(Math.round(Arrays.stream(secondRun).sum() / Arrays.stream(firstRun).sum());

Question 10

I forgot to add a couple constraints - all values are positive, and must be Java 7. I'll edit question.

Question 11

Since all values are positive, it would be more efficient to find max than to sum, correct? (though both would still be O(n)). Though I do think your sum idea is pretty cool.

Question 12

I don't believe there would be any significant difference between the sum and max. I have identified some other issues too, edited my answer.

Question 13

Also for Bug 3 I think you have it backwards. The second run will always be smaller than the first run (there is always an improvement). Thus it will not be 10.0/1.0 but 1.0/10.0 -> (1.0 - 1.0/10.0) * 100 = 90

Question 14

Adding floating point numbers can be tricky if the values to be added involve very large and very small numbers. So finding the max seems like a more robust solution

Question 15

With such a short piece of code, there is not really much to say.

Some notes:

return (int)((1.0-secondRunMax/firstRunMax)*100);

This is hard to read. Put spaces before and after all the operators, to improve readability:

return (int) ((1.0 - secondRunMax / firstRunMax) * 100);
public static int solution(double[] secondRun, double[] firstRun)

To me, it doesn't make sense to put secondRun before firstRun. Reverse their locations:

public static int solution(double[] firstRun, double[] secondRun)

Question 16

I agree with both points, but for point 2 the signature was provided, so not much choice there.

unholysampler unholysampler 6,19119 silver badges31 bronze badges · Answer 1 · 2015-01-09 04:08:10Z

You are auto-boxing the results of your function call, only to un-box them later. If everything else is using double (a primitive), there is no reason to declare the local variables in solution() as Double (an object). This doesn't effect the big-O of your solution, but it is extra operations for the JVM to execute.

Give your class and public method better names. If you actually needed this in production code, there would be no way to know what this code did. The names tell you nothing and there is no documentation.

There is a max(), but it works on Collections, not arrays. However, writing your own max method is not that hard, so it is up to you if you want to replace it.

Note: Until seeing you question, I didn't know that Double.MIN_VALUE and Integer.MIN_VALUE do not represent the same concept for the respective type. This isn't your fault, but it is unintuitive. Explination

Class/function names were mandated, as well as the signature, but otherwise I agree with you.

rolfl rolfl 98.1k17 gold badges219 silver badges419 bronze badges · Answer 2 · 2015-01-09 04:09:35Z

Oh, I like challenges like this. Looking through your solution, though, you make statements that do not quite ring true, you say it 'ignores most of the data', but that's not true, it inspects all the data, scanning both arrays completely.

On the other hand, that is as efficient as it can be (\$O(n)\$), so that's OK. You just need to realize that you scan it all.

So, I agree, it is close to being as efficient as possible. Certainly, it is as scalable as it could be (good time complexity, and \$O(1)\$ space complexity).

Bugs

I don't believe it gets the best results all the time. Specifically, what if the integer is negative, then the one array will have negative values, and the max(...) from that array will be wrong.
your solution suffers from an off-by-one problem in the event that floating-point operaions are slightly less than accurate (which is typical). For example, you may compute the max of one dataset to be 100.0000, and the other to be 33.33334. When you do the division, the result will be 2.999999 which, when you re-calculate the factor, will truncate the wrong way in the int conversion. You need to round, and not truncate.

Summing, not max

I believe the better solution to this problem would be to sum all the values from each array, and then to compute the ratio afterwards.

The math is simple. If you have the two data arrays [a,b,c,d,...] and the corresponding set [A,B,C,D,...] where you know that A is some factor Q times larger, then you have:

[Qa, Qb, Qc, Qd, ...] == [A, B, C, D, ....]

which implies that:

Q[a, b, c, d, ...] == [A, B, C, D, ...]

So, if you sum the values in both sets, you will have:

Q sum(abcd...) == sum(ABCD...)

and

Q = sum(ABCD...) / sum(abcd...)

This makes it all quite simple...

double sumA = 0.0;
double sumB = 0.0;
for (int i = 0; i < firstRun.length; i++) {
 sumA += firstRun[i];
 sumB += secondRun[i];
}
double q = sumB / sumA;

Now, q is a value which may not be easy to translate directly to an integer. For example, because of floating-point inaccuracies, the intended Q factor may be 5, but the computed one is 4.99999999997 or something.

return (int)Math.round(q);

Note that summing the values instead of finding the max, will be approximately equally efficient. In each case, you are going through all elements just once.

Java 8

Notice, in Java 8, this would be a nice problem....:

return (int)(Math.round(Arrays.stream(secondRun).sum() / Arrays.stream(firstRun).sum());

I forgot to add a couple constraints - all values are positive, and must be Java 7. I'll edit question.
Since all values are positive, it would be more efficient to find max than to sum, correct? (though both would still be O(n)). Though I do think your sum idea is pretty cool.
I don't believe there would be any significant difference between the sum and max. I have identified some other issues too, edited my answer.
Also for Bug 3 I think you have it backwards. The second run will always be smaller than the first run (there is always an improvement). Thus it will not be 10.0/1.0 but 1.0/10.0 -> (1.0 - 1.0/10.0) * 100 = 90
Adding floating point numbers can be tricky if the values to be added involve very large and very small numbers. So finding the max seems like a more robust solution

TheCoffeeCup TheCoffeeCup 9,5164 gold badges38 silver badges96 bronze badges · Answer 3 · 2015-01-09 04:09:06Z

With such a short piece of code, there is not really much to say.

Some notes:

return (int)((1.0-secondRunMax/firstRunMax)*100);

This is hard to read. Put spaces before and after all the operators, to improve readability:

return (int) ((1.0 - secondRunMax / firstRunMax) * 100);
public static int solution(double[] secondRun, double[] firstRun)

To me, it doesn't make sense to put secondRun before firstRun. Reverse their locations:

public static int solution(double[] firstRun, double[] secondRun)

I agree with both points, but for point 2 the signature was provided, so not much choice there.

Stack Exchange Network

Computing the relative magnitude of the values in one double array to another

3 Answers 3

Bugs

Summing, not max

Java 8

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Computing the relative magnitude of the values in one double array to another

3 Answers 3

Bugs

Summing, not max

Java 8

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions