Finding the maximum inclusive distance between two duplicate numbers in an array

Question 1

The prompt that made me put this function together was sourced from CodingBat.

I was curious about cleaner, more correct methods of doing putting this together such as finding a cleaner way to tell if there is or is not a duplicate in the array. I am also trying to be as efficient as possible. I believe this is an \$O(n^2)\$ function. Let me know what I should change.

The code works just fine. I just feel like there is quite a lot I can streamline and I would like to be pointed in the right direction.

 public int maxSpan(int[] nums) {
 int highestSpan = 0;
 int span;
 boolean duplicate = false;
 if (nums.length == 0)
 return highestSpan;
 for (int i = 0; i < nums.length; i++) {
 for (int j = 0; j < nums.length; j++){
 if ((nums[i] == nums[j])&& j != i){ //if duplicate
 duplicate = true;
 //get the absolute value of j - i
 span = j - i + 1; //Add 1 because it needs to count itself
 //if it is larger than the highestSpan then record it
 if (span > highestSpan)
 highestSpan = span;
 }
 }
 }
 if (duplicate)
 return highestSpan;
 else
 return 1;
}

Question 2

Getting rid of duplicate as Janos proposed and then QPaysTaxes wrote is a good step, but this useless variable introduced quite some useless code we should also get rid of. The following code builds up on QPaysTaxes' answer; I'm commenting on changed things and presenting my variant:

public int maxSpan(int[] nums) {
 int highestSpan = 0;

This is not the place where span should be defined. You don't need it here.

The condition nums.length == 0 can go.

 for (int i = 0; i < nums.length; i++) {
 for (int j = i; j < nums.length; j++) {

Changed the lower bound from i+1 to i to cover span equal to 1. Below removed the i != j test and used max to make the code a bit shorter.

 if (nums[i] == nums[j]) {
 int span = j - i + 1; // Add 1 to count itself 
 highestSpan = Math.max(highestSpan, span);
 }
 }
 }

No need for a conditional here.

 return highestSpan;
}

I guess, I saved some 6 lines without making it any more complicated. Now can also span be inlined to save 1 more line (but that's not the objective).

--

Now it works exactly according to the explanation by QPaysTaxes in comment without any special tests:

An empty array doesn't have any spans, so clearly it has to return 0. An array of unique elements is rather more confusing, but since the prompt states that a single (i.e. unduplicated) element has a span of 1, then an array of unique (i.e. unduplicated [i.e. single]) elements has a maximum span of the span of every unique number

An O(n) solution could look like this (untested):

public int maxSpan(int[] nums) {
 int highestSpan = nums.length == 0 ? 0 : 1;
 Map<Integer, Integer> firstOccurrenceMap = new HashMap<>();
 for (int i = 0; i < nums.length; i++) {
 Integer firstOccurrence = firstOccurrenceMap.get(nums[i]);
 if (firstOccurrence == null) {
 firstOccurrenceMap.put(nums[i], i);
 } else {
 highestSpan = Math.max(highestSpan, i - firstOccurrence + 1);
 }
 }
 return highestSpan;
}

As the first occurrence of a number gets a special treatment, the empty array has to be handled specially, too (see the declaration of highestSpan).

Question 3

Man, way to reuse with attribution something knowingly posted under a license that allows reuse with attribution.

Question 4

Nice O(n) solution, but highestSpan (maxSpan might be a better name) should always be initialized to zero, because an array with no duplicates also has no "spans", thus the max is zero.

Question 5

@gen-y-s I fully agree with you, it's just that I was rewriting the OP's solution, not writing my own (e.g., I always call the result just "result").

Question 6

Avoid unnecessary flag variables

The variable duplicate is unnecessary, because you could drive the same meaning from highestSpan > 0. It's an extra hurdle to keep this flag, I suggest to drop it.

Improving the algorithm

I believe this can be solved in \$O(N)\$ time and space, by using a map to record the position of the first occurrences of numbers. If a number was seen already, check the distance and update the max distance if necessary. Otherwise record the current position.

Question 7

Of course, checking if (duplicate) is a lot clearer than checking if (highestSpan > 0). WRT your first point, see the prompt on CodingBat -- "A single value has a span of 1."

Question 8

Giving something multiple names creates ambiguity, raises questions like what's the difference between the two things at any point in the code, and requires extra maintenance. If highest span already carriers the relevant information, it's outright dangerous to duplicate that, spanning the entire function body. The handling of empty array and all unique elements is not clear to me. At the very least, I think both should return the same value, not different as in the OP

Question 9

Fair point on the multiple names; I didn't consider that. WRT your second point: Read the CodingBat prompt. An empty array doesn't have any spans, so clearly it has to return 0. An array of unique elements is rather more confusing, but since the prompt states that a single (i.e. unduplicated) element has a span of 1, then an array of unique (i.e. unduplicated [i.e. single]) elements has a maximum span of the span of every unique number -- 1.

Question 10

@QPaysTaxes The span explanation is fine, but concerning duplicate it was the very first thing which stroke me: "a needless beginner's variable leading to needless beginner's ifs". Just drop duplicate, the nums.length==0 and i==j tests, and you'll get a shorter and cleaner code.

Question 11

@maaartinus "Fair point on the multiple names; I didn't consider that."

Question 12

First: Always use brackets around conditionals, even one-liners. It makes it easier to update later. That's not a big factor with this code, but it may well be with code you write in the future, and it's a good habit to get in to.

Add a comment before return 1; explaining that 1 is the default amount because... [your reasoning here].

Comments like this:

//if it is larger than the highestSpan then record it

right before code that explains itself are entirely unnecessary.

Always put spaces both before and after every binary operator -- things like &&.

if ((nums[i] == nums[j])&& j != i){ //if duplicate

The first set of parentheses is unnecessary, but doesn't harm anything. However, you should always try to be consistent -- if you parenthesize one side, parenthesize the other side, too. I decided to remove them altogether.

Newlines should separate logically separate code -- I like to separate the variable declarations from the calculations bits, unless the variables are one-off little things that are just to make things readable. I also like to separate the bit that returns things from the rest. See the changed code that the bottom for what I mean.

As far as your actual algorithm: Just about the only improvement I could offer is that you should initialize j to i + 1 rather than 0 to avoid extra traversing that you already did, albeit backwards. To clarify, this is what your for blocks should look like (with the insides snipped):

for (int i = 0; i < nums.length; i++) {
 for (int j = i + 1; j < nums.length; j++){
 // [snip!]
 }
}

If you want, you can put a ternary instead of the if block at the end; something like this:

return duplicate ? highestSpan : 1;

(Note: I also stole something from janos' answer -- I got rid of duplicate.)

public int maxSpan(int[] nums) {
 int highestSpan = 0;
 int span;
 if (nums.length == 0) {
 return highestSpan;
 }
 for (int i = 0; i < nums.length; i++) {
 for (int j = i + 1; j < nums.length; j++){
 if (nums[i] == nums[j] && j != i){ // if duplicate
 //get the absolute value of j - i
 span = j - i + 1; //Add 1 because it needs to count itself
 if (span > highestSpan) {
 highestSpan = span;
 }
 }
 }
 }
 return highestSpan > 1 ? highestSpan : 1;
}

According to CodingBat, it still works.

Question 13

You can cut the computational complexity nearly in half.

Your nested for-loop is actually checking each pair of objects twice. Try unwrapping the loop and write out each combination of i and j to see why.

Consider this loop:

for (i = 0 to nums.length) {
 for (j = i+1 to nums.length) {
 ...
 }
}

If you unwrap that one, you will see we end up with only unique combinations of i and j, and it eliminates the possibility that i>j (which could make your span variable negative).

Question 14

Just pointing out that I touched on this in my answer, though nowhere near as in-depth as you did. ;) Also, please don't change code in peoples' answers. I realize that you can't comment yet, but once this starts getting upvotes, you'll have permission to do so, and that's what you should do instead.

Question 15

Sorry about that!

Question 16

It's not a problem; just wanted to let you know, and I couldn't address you directly. It ended up being a good point, and I changed it myself, but that kind of thing really does belong in a comment. Now you have the rep to post them, too! :D

Question 17

You can clearly do this in linear time by writing the array to a multimap. Basically we are going to have a map from value -> list of places where it appears in the array.

So {1, 4, 2, 1, 4, 4, 4} from coding bat would become the map:

1 -> 0,3
2 -> 2
4 -> 1,4,5,6

The span is obviously largest - smallest + 1. You could write a custom map object which just stores the largest and smallest value on a given key.

This is obviously linear. This is similar to the classic trick for finding duplicates, you create a HashSet, and exploit the behaviour of HashSet.add(), which returns true only if add increases the size of the set, so it will return false at the first duplicate.

Question 18

The obvious one optimization is to start inner loop from the end of array and stop if distance is less than highest span found so far:

public int maxSpan(int[] nums) {
 int highestSpan = 0;
 for (int i = 0; i < nums.length; i++) {
 for (int j = nums.length-1; j >= i+highestSpan ; j--){
 if (nums[i] == nums[j]){
 highestSpan = Math.max(j - i + 1, highestSpan);
 }
 }
 }
 return highestSpan;
}

maaartinus maaartinus 13.6k1 gold badge35 silver badges73 bronze badges · Accepted Answer · 2015-06-22 00:42:23Z

Getting rid of duplicate as Janos proposed and then QPaysTaxes wrote is a good step, but this useless variable introduced quite some useless code we should also get rid of. The following code builds up on QPaysTaxes' answer; I'm commenting on changed things and presenting my variant:

public int maxSpan(int[] nums) {
 int highestSpan = 0;

This is not the place where span should be defined. You don't need it here.

The condition nums.length == 0 can go.

 for (int i = 0; i < nums.length; i++) {
 for (int j = i; j < nums.length; j++) {

Changed the lower bound from i+1 to i to cover span equal to 1. Below removed the i != j test and used max to make the code a bit shorter.

 if (nums[i] == nums[j]) {
 int span = j - i + 1; // Add 1 to count itself 
 highestSpan = Math.max(highestSpan, span);
 }
 }
 }

No need for a conditional here.

 return highestSpan;
}

I guess, I saved some 6 lines without making it any more complicated. Now can also span be inlined to save 1 more line (but that's not the objective).

--

Now it works exactly according to the explanation by QPaysTaxes in comment without any special tests:

An empty array doesn't have any spans, so clearly it has to return 0. An array of unique elements is rather more confusing, but since the prompt states that a single (i.e. unduplicated) element has a span of 1, then an array of unique (i.e. unduplicated [i.e. single]) elements has a maximum span of the span of every unique number

An O(n) solution could look like this (untested):

public int maxSpan(int[] nums) {
 int highestSpan = nums.length == 0 ? 0 : 1;
 Map<Integer, Integer> firstOccurrenceMap = new HashMap<>();
 for (int i = 0; i < nums.length; i++) {
 Integer firstOccurrence = firstOccurrenceMap.get(nums[i]);
 if (firstOccurrence == null) {
 firstOccurrenceMap.put(nums[i], i);
 } else {
 highestSpan = Math.max(highestSpan, i - firstOccurrence + 1);
 }
 }
 return highestSpan;
}

As the first occurrence of a number gets a special treatment, the empty array has to be handled specially, too (see the declaration of highestSpan).

Man, way to reuse with attribution something knowingly posted under a license that allows reuse with attribution.
Nice O(n) solution, but highestSpan (maxSpan might be a better name) should always be initialized to zero, because an array with no duplicates also has no "spans", thus the max is zero.
@gen-y-s I fully agree with you, it's just that I was rewriting the OP's solution, not writing my own (e.g., I always call the result just "result").

Stack Exchange Network

Finding the maximum inclusive distance between two duplicate numbers in an array

6 Answers 6

Avoid unnecessary flag variables

Improving the algorithm

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Finding the maximum inclusive distance between two duplicate numbers in an array

6 Answers 6

Avoid unnecessary flag variables

Improving the algorithm

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions