The prompt that made me put this function together was sourced from CodingBat.
I was curious about cleaner, more correct methods of doing putting this together such as finding a cleaner way to tell if there is or is not a duplicate in the array. I am also trying to be as efficient as possible. I believe this is an \$O(n^2)\$ function. Let me know what I should change.
The code works just fine. I just feel like there is quite a lot I can streamline and I would like to be pointed in the right direction.
public int maxSpan(int[] nums) {
int highestSpan = 0;
int span;
boolean duplicate = false;
if (nums.length == 0)
return highestSpan;
for (int i = 0; i < nums.length; i++) {
for (int j = 0; j < nums.length; j++){
if ((nums[i] == nums[j])&& j != i){ //if duplicate
duplicate = true;
//get the absolute value of j - i
span = j - i + 1; //Add 1 because it needs to count itself
//if it is larger than the highestSpan then record it
if (span > highestSpan)
highestSpan = span;
}
}
}
if (duplicate)
return highestSpan;
else
return 1;
}
6 Answers 6
Getting rid of duplicate
as Janos proposed and then QPaysTaxes wrote is a good step, but this useless variable introduced quite some useless code we should also get rid of. The following code builds up on QPaysTaxes' answer; I'm commenting on changed things and presenting my variant:
public int maxSpan(int[] nums) {
int highestSpan = 0;
This is not the place where span
should be defined. You don't need it here.
The condition nums.length == 0
can go.
for (int i = 0; i < nums.length; i++) {
for (int j = i; j < nums.length; j++) {
Changed the lower bound from i+1
to i
to cover span
equal to 1
. Below removed the i != j
test and used max
to make the code a bit shorter.
if (nums[i] == nums[j]) {
int span = j - i + 1; // Add 1 to count itself
highestSpan = Math.max(highestSpan, span);
}
}
}
No need for a conditional here.
return highestSpan;
}
I guess, I saved some 6 lines without making it any more complicated. Now can also span
be inlined to save 1 more line (but that's not the objective).
--
Now it works exactly according to the explanation by QPaysTaxes in comment without any special tests:
An empty array doesn't have any spans, so clearly it has to return 0. An array of unique elements is rather more confusing, but since the prompt states that a single (i.e. unduplicated) element has a span of 1, then an array of unique (i.e. unduplicated [i.e. single]) elements has a maximum span of the span of every unique number
An O(n)
solution could look like this (untested):
public int maxSpan(int[] nums) {
int highestSpan = nums.length == 0 ? 0 : 1;
Map<Integer, Integer> firstOccurrenceMap = new HashMap<>();
for (int i = 0; i < nums.length; i++) {
Integer firstOccurrence = firstOccurrenceMap.get(nums[i]);
if (firstOccurrence == null) {
firstOccurrenceMap.put(nums[i], i);
} else {
highestSpan = Math.max(highestSpan, i - firstOccurrence + 1);
}
}
return highestSpan;
}
As the first occurrence of a number gets a special treatment, the empty array has to be handled specially, too (see the declaration of highestSpan
).
-
\$\begingroup\$ Man, way to reuse with attribution something knowingly posted under a license that allows reuse with attribution. \$\endgroup\$anon– anon2015年06月22日 17:32:58 +00:00Commented Jun 22, 2015 at 17:32
-
1\$\begingroup\$ Nice O(n) solution, but highestSpan (maxSpan might be a better name) should always be initialized to zero, because an array with no duplicates also has no "spans", thus the max is zero. \$\endgroup\$gen-y-s– gen-y-s2015年08月27日 05:31:06 +00:00Commented Aug 27, 2015 at 5:31
-
\$\begingroup\$ @gen-y-s I fully agree with you, it's just that I was rewriting the OP's solution, not writing my own (e.g., I always call the result just "result"). \$\endgroup\$maaartinus– maaartinus2015年08月27日 21:13:57 +00:00Commented Aug 27, 2015 at 21:13
Avoid unnecessary flag variables
The variable duplicate
is unnecessary, because you could drive the same meaning from highestSpan > 0
. It's an extra hurdle to keep this flag, I suggest to drop it.
Improving the algorithm
I believe this can be solved in \$O(N)\$ time and space, by using a map to record the position of the first occurrences of numbers. If a number was seen already, check the distance and update the max distance if necessary. Otherwise record the current position.
-
\$\begingroup\$ Of course, checking
if (duplicate)
is a lot clearer than checkingif (highestSpan > 0)
. WRT your first point, see the prompt on CodingBat -- "A single value has a span of 1." \$\endgroup\$anon– anon2015年06月21日 23:10:18 +00:00Commented Jun 21, 2015 at 23:10 -
1\$\begingroup\$ Giving something multiple names creates ambiguity, raises questions like what's the difference between the two things at any point in the code, and requires extra maintenance. If highest span already carriers the relevant information, it's outright dangerous to duplicate that, spanning the entire function body. The handling of empty array and all unique elements is not clear to me. At the very least, I think both should return the same value, not different as in the OP \$\endgroup\$janos– janos2015年06月21日 23:23:09 +00:00Commented Jun 21, 2015 at 23:23
-
\$\begingroup\$ Fair point on the multiple names; I didn't consider that. WRT your second point: Read the CodingBat prompt. An empty array doesn't have any spans, so clearly it has to return 0. An array of unique elements is rather more confusing, but since the prompt states that a single (i.e. unduplicated) element has a span of 1, then an array of unique (i.e. unduplicated [i.e. single]) elements has a maximum span of the span of every unique number -- 1. \$\endgroup\$anon– anon2015年06月21日 23:25:59 +00:00Commented Jun 21, 2015 at 23:25
-
\$\begingroup\$ @QPaysTaxes The span explanation is fine, but concerning
duplicate
it was the very first thing which stroke me: "a needless beginner's variable leading to needless beginner'sif
s". Just dropduplicate
, thenums.length==0
andi==j
tests, and you'll get a shorter and cleaner code. \$\endgroup\$maaartinus– maaartinus2015年06月22日 00:29:30 +00:00Commented Jun 22, 2015 at 0:29 -
\$\begingroup\$ @maaartinus "Fair point on the multiple names; I didn't consider that." \$\endgroup\$anon– anon2015年06月22日 00:30:38 +00:00Commented Jun 22, 2015 at 0:30
First: Always use brackets around conditionals, even one-liners. It makes it easier to update later. That's not a big factor with this code, but it may well be with code you write in the future, and it's a good habit to get in to.
Add a comment before return 1;
explaining that 1 is the default amount because... [your reasoning here].
Comments like this:
//if it is larger than the highestSpan then record it
right before code that explains itself are entirely unnecessary.
Always put spaces both before and after every binary operator -- things like &&
.
if ((nums[i] == nums[j])&& j != i){ //if duplicate
The first set of parentheses is unnecessary, but doesn't harm anything. However, you should always try to be consistent -- if you parenthesize one side, parenthesize the other side, too. I decided to remove them altogether.
Newlines should separate logically separate code -- I like to separate the variable declarations from the calculations bits, unless the variables are one-off little things that are just to make things readable. I also like to separate the bit that returns things from the rest. See the changed code that the bottom for what I mean.
As far as your actual algorithm: Just about the only improvement I could offer is that you should initialize j
to i + 1
rather than 0 to avoid extra traversing that you already did, albeit backwards. To clarify, this is what your for
blocks should look like (with the insides snipped):
for (int i = 0; i < nums.length; i++) {
for (int j = i + 1; j < nums.length; j++){
// [snip!]
}
}
If you want, you can put a ternary instead of the if
block at the end; something like this:
return duplicate ? highestSpan : 1;
(Note: I also stole something from janos' answer -- I got rid of duplicate
.)
public int maxSpan(int[] nums) {
int highestSpan = 0;
int span;
if (nums.length == 0) {
return highestSpan;
}
for (int i = 0; i < nums.length; i++) {
for (int j = i + 1; j < nums.length; j++){
if (nums[i] == nums[j] && j != i){ // if duplicate
//get the absolute value of j - i
span = j - i + 1; //Add 1 because it needs to count itself
if (span > highestSpan) {
highestSpan = span;
}
}
}
}
return highestSpan > 1 ? highestSpan : 1;
}
According to CodingBat, it still works.
You can cut the computational complexity nearly in half.
Your nested for
-loop is actually checking each pair of objects twice. Try unwrapping the loop and write out each combination of i
and j
to see why.
Consider this loop:
for (i = 0 to nums.length) {
for (j = i+1 to nums.length) {
...
}
}
If you unwrap that one, you will see we end up with only unique combinations of i
and j
, and it eliminates the possibility that i>j
(which could make your span
variable negative).
-
\$\begingroup\$ Just pointing out that I touched on this in my answer, though nowhere near as in-depth as you did. ;) Also, please don't change code in peoples' answers. I realize that you can't comment yet, but once this starts getting upvotes, you'll have permission to do so, and that's what you should do instead. \$\endgroup\$anon– anon2015年06月21日 23:51:46 +00:00Commented Jun 21, 2015 at 23:51
-
\$\begingroup\$ Sorry about that! \$\endgroup\$jperl– jperl2015年06月24日 06:01:26 +00:00Commented Jun 24, 2015 at 6:01
-
1\$\begingroup\$ It's not a problem; just wanted to let you know, and I couldn't address you directly. It ended up being a good point, and I changed it myself, but that kind of thing really does belong in a comment. Now you have the rep to post them, too! :D \$\endgroup\$anon– anon2015年06月24日 06:02:38 +00:00Commented Jun 24, 2015 at 6:02
You can clearly do this in linear time by writing the array to a multimap. Basically we are going to have a map from value -> list of places where it appears in the array.
So {1, 4, 2, 1, 4, 4, 4} from coding bat would become the map:
1 -> 0,3
2 -> 2
4 -> 1,4,5,6
The span is obviously largest - smallest + 1. You could write a custom map object which just stores the largest and smallest value on a given key.
This is obviously linear. This is similar to the classic trick for finding duplicates, you create a HashSet, and exploit the behaviour of HashSet.add(), which returns true only if add increases the size of the set, so it will return false at the first duplicate.
The obvious one optimization is to start inner loop from the end of array and stop if distance is less than highest span found so far:
public int maxSpan(int[] nums) {
int highestSpan = 0;
for (int i = 0; i < nums.length; i++) {
for (int j = nums.length-1; j >= i+highestSpan ; j--){
if (nums[i] == nums[j]){
highestSpan = Math.max(j - i + 1, highestSpan);
}
}
}
return highestSpan;
}