An interview question I received:
You are given two unsorted lists of positive doubles
x
contains the results from experiment 1y
which contains the results from experiment 2.All results from the second experiment
y
are improvements overx
byi
percent, wherei
is an integer. That is to say that given a result fromx
, the corresponding result fromy
will be ani
percent improvement.However, the lists are not parallel; that is to say
x[0]
does not necessarily correspond toy[0]
. Find the value ofi
.Note: must be written in Java 7
Some examples
Inputs:
y = [1.0] x = [1.0]
Output:
0
Inputs:
y = [2.2999999999999998, 15.0, 102.40000000000001, 3486.8000000000002] x = [23.0, 150.0, 1024.0, 34868.0]
Output:
90
Inputs:
y = [23, 11.1, 50.4] x = [22.2, 46, 100.8]
Output:
50
Note in the last case where
46 -> 23
and22.2 -> 11.1
but the index of46
is not the index of23
, etc.
My Solution
import java.util.Arrays;
//Time complexity: O(n)
//Space complexity: O(1)
public class Solution{
public static int solution(double[] secondRun, double[] firstRun) {
Double firstRunMax = findMax(firstRun),
secondRunMax = findMax(secondRun);
return (int)((1.0-secondRunMax/firstRunMax)*100);
}
private static double findMax(double[] array){
double max = -Double.MAX_VALUE;
for(double d : array){
if(d > max){
max = d;
}
}
return max;
}
}
Strengths: Very simple, as efficient as possible (I think?), ignores most of the data
Weakness: the return line is a little convoluted. Perhaps there is a better way to simplify it also. Also I thought I would be able to find a simple Arrays.Max method or something in Java, but I didnt see any during a quick glance.
3 Answers 3
You are auto-boxing the results of your function call, only to un-box them later. If everything else is using double
(a primitive), there is no reason to declare the local variables in solution()
as Double
(an object). This doesn't effect the big-O of your solution, but it is extra operations for the JVM to execute.
Give your class and public method better names. If you actually needed this in production code, there would be no way to know what this code did. The names tell you nothing and there is no documentation.
There is a max()
, but it works on Collection
s, not arrays. However, writing your own max method is not that hard, so it is up to you if you want to replace it.
Note: Until seeing you question, I didn't know that Double.MIN_VALUE
and Integer.MIN_VALUE
do not represent the same concept for the respective type. This isn't your fault, but it is unintuitive. Explination
-
\$\begingroup\$ Class/function names were mandated, as well as the signature, but otherwise I agree with you. \$\endgroup\$David says Reinstate Monica– David says Reinstate Monica2015年01月09日 04:19:19 +00:00Commented Jan 9, 2015 at 4:19
Oh, I like challenges like this. Looking through your solution, though, you make statements that do not quite ring true, you say it 'ignores most of the data', but that's not true, it inspects all the data, scanning both arrays completely.
On the other hand, that is as efficient as it can be (\$O(n)\$), so that's OK. You just need to realize that you scan it all.
So, I agree, it is close to being as efficient as possible. Certainly, it is as scalable as it could be (good time complexity, and \$O(1)\$ space complexity).
Bugs
I don't believe it gets the best results all the time. Specifically, what if the integer is negative, then the one array will have negative values, and the
max(...)
from that array will be wrong.your solution suffers from an off-by-one problem in the event that floating-point operaions are slightly less than accurate (which is typical). For example, you may compute the max of one dataset to be 100.0000, and the other to be 33.33334. When you do the division, the result will be 2.999999 which, when you re-calculate the factor, will truncate the wrong way in the int conversion. You need to round, and not truncate.
Summing, not max
I believe the better solution to this problem would be to sum all the values from each array, and then to compute the ratio afterwards.
The math is simple. If you have the two data arrays [a,b,c,d,...]
and the corresponding set [A,B,C,D,...]
where you know that A
is some factor Q
times larger, then you have:
[Qa, Qb, Qc, Qd, ...] == [A, B, C, D, ....]
which implies that:
Q[a, b, c, d, ...] == [A, B, C, D, ...]
So, if you sum the values in both sets, you will have:
Q sum(abcd...) == sum(ABCD...)
and
Q = sum(ABCD...) / sum(abcd...)
This makes it all quite simple...
double sumA = 0.0;
double sumB = 0.0;
for (int i = 0; i < firstRun.length; i++) {
sumA += firstRun[i];
sumB += secondRun[i];
}
double q = sumB / sumA;
Now, q
is a value which may not be easy to translate directly to an integer. For example, because of floating-point inaccuracies, the intended Q factor may be 5, but the computed one is 4.99999999997 or something.
return (int)Math.round(q);
Note that summing the values instead of finding the max, will be approximately equally efficient. In each case, you are going through all elements just once.
Java 8
Notice, in Java 8, this would be a nice problem....:
return (int)(Math.round(Arrays.stream(secondRun).sum() / Arrays.stream(firstRun).sum());
-
\$\begingroup\$ I forgot to add a couple constraints - all values are positive, and must be Java 7. I'll edit question. \$\endgroup\$David says Reinstate Monica– David says Reinstate Monica2015年01月09日 04:14:41 +00:00Commented Jan 9, 2015 at 4:14
-
\$\begingroup\$ Since all values are positive, it would be more efficient to find max than to sum, correct? (though both would still be O(n)). Though I do think your sum idea is pretty cool. \$\endgroup\$David says Reinstate Monica– David says Reinstate Monica2015年01月09日 04:21:11 +00:00Commented Jan 9, 2015 at 4:21
-
\$\begingroup\$ I don't believe there would be any significant difference between the sum and max. I have identified some other issues too, edited my answer. \$\endgroup\$rolfl– rolfl2015年01月09日 04:24:07 +00:00Commented Jan 9, 2015 at 4:24
-
\$\begingroup\$ Also for Bug 3 I think you have it backwards. The second run will always be smaller than the first run (there is always an improvement). Thus it will not be
10.0/1.0
but1.0/10.0
->(1.0 - 1.0/10.0) * 100 = 90
\$\endgroup\$David says Reinstate Monica– David says Reinstate Monica2015年01月09日 04:29:29 +00:00Commented Jan 9, 2015 at 4:29 -
1\$\begingroup\$ Adding floating point numbers can be tricky if the values to be added involve very large and very small numbers. So finding the max seems like a more robust solution \$\endgroup\$ChrisWue– ChrisWue2015年01月09日 07:53:23 +00:00Commented Jan 9, 2015 at 7:53
With such a short piece of code, there is not really much to say.
Some notes:
return (int)((1.0-secondRunMax/firstRunMax)*100);
This is hard to read. Put spaces before and after all the operators, to improve readability:
return (int) ((1.0 - secondRunMax / firstRunMax) * 100);
public static int solution(double[] secondRun, double[] firstRun)
To me, it doesn't make sense to put
secondRun
beforefirstRun
. Reverse their locations:public static int solution(double[] firstRun, double[] secondRun)
-
\$\begingroup\$ I agree with both points, but for point 2 the signature was provided, so not much choice there. \$\endgroup\$David says Reinstate Monica– David says Reinstate Monica2015年01月09日 04:17:27 +00:00Commented Jan 9, 2015 at 4:17
Explore related questions
See similar questions with these tags.
Collections.max
, but it only takes collections (arrays are not collections in Java) \$\endgroup\$