Write a method named
numUnique
that accepts a sorted array of integers as a parameter and that returns the number of unique values in the array. The array is guaranteed to be in sorted order, which means that duplicates will be grouped together.
I'm mainly concerned with following proper conventions and maximizing readability; both of which I feel I know little about. In the last question I posted, I learned a bit about structuring if-statements. How can I think about these statements so that it's easier for me to write them, as well as easier for others to read them? Is there a sort of PEMDAS of programming logic that would help me understand writing conditional code with precedence? Any feedback is appreciated!
public static int numUnique(int[] array){
if(array.length > 0){
int countUnique = 1;
for(int i = 0; i < array.length - 1; i++){
if(array[i] != array[i+1]){
countUnique++;
}
}
return countUnique;
} else {
return 0;
}
}
5 Answers 5
Just a few minor suggestions:
- If you invert the condition on the length, then you can reduce the level of nesting, which is slightly easier to read
- An array will never have a negative length. It's pure paranoia to check for
array.length <= 0
, I suggest to replace witharray.length == 0
- The spacing is not perfect. Use an IDE to reformat the code
- This maybe a matter of taste, but I would rename
countUnique
to simplycount
, and the method fromnumUnique
tocountUnique
Putting it together:
public static int countUnique(int[] array) {
if (array.length == 0) {
return 0;
}
int count = 1;
for (int i = 0; i < array.length - 1; i++) {
if (array[i] != array[i + 1]) {
count++;
}
}
return count;
}
-
1\$\begingroup\$ [rename] method from
numUnique
tocountUnique
. How aboutcountUniqueMembers
orcountUniqueElements
\$\endgroup\$radarbob– radarbob2015年12月21日 19:41:10 +00:00Commented Dec 21, 2015 at 19:41
We could do a little bit better. It depends on what array
ends up looking like, but what if we had like 100 0s and then 500 1s. Do we really need to compare each one individually?
No, we don't. We can binary search the next element.
private static int upper_bound(int[] array, int lo, int hi, int val) {
int count = hi - lo;
while (count > 0) {
int step = count/2;
int cur = lo + step;
if (val >= array[cur]) {
lo = cur + 1;
count -= step + 1;
}
else {
count = step;
}
}
return lo;
}
And then use that to increment:
public static int countUnique(int[] array) {
int count = 0;
int idx = 0;
while (idx < array.length) {
++count;
// exponential search to find the next elem
int bound = 1;
while (idx+bound < array.length && array[idx+bound] == array[idx]) {
bound *= 2;
}
idx = upper_bound(array,
idx+bound/2,
Math.min(idx+bound, array.length),
array[idx]
);
}
return count;
}
-
2\$\begingroup\$ @JS1 The exponential search bound (just added) should keep it at \$O(n)\$ worst case, though still worse than simply walking the list. \$\endgroup\$Barry– Barry2015年12月15日 21:33:03 +00:00Commented Dec 15, 2015 at 21:33
-
\$\begingroup\$ Add this condition though :
if(idx < array.length-1 && array[idx]!=array[idx+1] || idx == array.length-1) { System.out.println(array[idx]); count++; }
\$\endgroup\$Nilesh Agrawal– Nilesh Agrawal2019年07月22日 01:05:59 +00:00Commented Jul 22, 2019 at 1:05
Expanding @janos's answer a bit:
Instead of:
for (int i = 0; i < array.length - 1; i++) { if (array[i] != array[i + 1]) {
I would usually do:
for (int i = 1; i < array.length; i++) { if (array[i] != array[i - 1]) { // ...
Because, in my opinion, is easier to understand. I like this because I don't like the
-
sign in the condition. Of course, this is a matter of preference.Since you use
array.length
twice, I would make it a variable, but, your choice.I don't really like edge-case checks. I prefer the algorithm to be able to handle all cases, but in this case, it looks impossible. You can do:
public static int countUnique(int[] array) { int length = array.length; int count = 1; for (int i = 1; i < length; i++) { if (array[i] != array[i - 1]) { count++; } } return length == 0 ? 0 : count; }
But the ternary is like edge-case handling anyways. I don't think that's a good solution.
Result:
public static int countUnique(int[] array) {
int length = array.length;
if (length == 0) {
return 0;
}
int count = 1;
for (int i = 1; i < length; i++) {
if (array[i] != array[i - 1]) {
count++;
}
}
return count;
}
Building upon @Barry's answer, Java already has Arrays.binarySearch(int[], int, int, int)
that can perform the binary search without having to write additional code ourselves. :)
Using a very similar approach of modeling the next index \$m\$ as where the next larger value (\$V\$) of the current value (\$arr[n]\$) should be inserted, we have the following BiFunction<int[], Integer, Integer>
implementation:
private static final BiFunction<int[], Integer, Integer> STEPPER = (sorted, i) -> {
int r = Arrays.binarySearch(sorted, i, sorted.length, sorted[i] + 1);
return r < 0 ? Math.abs(r + 1) : r;
};
We need to handle for r < 0
as \$V\$ may not exist in the array, in which case Arrays.binarySearch()
will return (-(insertion point) - 1)
. For pre-Java 8, this can easily be converted into a method as well.
The full method can then be:
private static int countUnique(int[] sortedArray) {
if (sortedArray.length == 0) {
return 0;
}
int counter = 1;
for (int i = STEPPER.apply(sortedArray, 0); i < sortedArray.length;
i = STEPPER.apply(sortedArray, i)) {
counter++;
}
return counter;
}
After taking care of handling an empty array, we initialize counter
as 1
(for the first element) and i
to be the index of the next larger value than the first element. After each loop, i
is then assigned the index of the next larger value than the current iteration's element.
Visualizing the steps taken can be done with a slight modification to STEPPER
:
private static final BiFunction<int[], Integer, Integer> STEPPER = (sorted, i) -> {
int r = Arrays.binarySearch(sorted, i, sorted.length, sorted[i] + 1);
r = r < 0 ? Math.abs(r + 1) : r;
System.out.printf("DEBUG: current (index = %d, value = %d); "
+ "next (index = %d, value = %s)%n",
i, sorted[i], r,
r >= sorted.length ? "END" : sorted[r]);
return r;
};
Given this example:
countUnique(new int[]{0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 4, 4, 4, 5, 5, 5, 5, 5});
// Output
DEBUG: current (index = 0, value = 0); next (index = 4, value = 1)
DEBUG: current (index = 4, value = 1); next (index = 7, value = 2)
DEBUG: current (index = 7, value = 2); next (index = 11, value = 4)
DEBUG: current (index = 11, value = 4); next (index = 14, value = 5)
DEBUG: current (index = 14, value = 5); next (index = 19, value = END)
Input(19): [0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 4, 4, 4, 5, 5, 5, 5, 5]
The answer is the number of DEBUG
lines printed, i.e. 5
. It works by jumping across any indices of the numbers 0 -> 1 -> 2 -> 4 -> 5
. The any index part is a known effect from the method's Javadoc:
If the range contains multiple elements with the specified value, there is no guarantee which one will be found.
This still suffices for our usage here as we only need to hit any of the next larger number. Also, since 3
is not present in the list, 4
is returned after 2
, and this is where the check for r < 0
is required to convert the result of Arrays.binarySearch()
to a positive, usable index.
-
\$\begingroup\$ Nope, the
r < 0
handling takes care of that by turning(-(insertion point) - 1)
toinsertion point
byMath.abs(r + 1)
. \$\endgroup\$h.j.k.– h.j.k.2015年12月16日 13:58:05 +00:00Commented Dec 16, 2015 at 13:58 -
1\$\begingroup\$ Ah, I misunderstood what the error case meant. I'm out of cotes today already, here's a phantom +1 \$\endgroup\$Barry– Barry2015年12月16日 14:11:44 +00:00Commented Dec 16, 2015 at 14:11
-
\$\begingroup\$ I am really impressed by efficient algorithms. But for me it is really difficult to identifiy what this is doing and why. For example I am missing an expression like "array[0] == array[1]" to give a hint whats this algorithm all about. If I really need an efficient solution then I would really use your code. But I would put it under test, trust the name of the function and don't care about how it comes to the solution. \$\endgroup\$oopexpert– oopexpert2015年12月17日 06:17:43 +00:00Commented Dec 17, 2015 at 6:17
-
1\$\begingroup\$ It's using binary search to find what is the next largest number from the current element, let me update my answer... \$\endgroup\$h.j.k.– h.j.k.2015年12月17日 07:59:06 +00:00Commented Dec 17, 2015 at 7:59
-
\$\begingroup\$ Upvote comment for adding explanation. But I still have an issue with readability. \$\endgroup\$oopexpert– oopexpert2015年12月17日 09:35:02 +00:00Commented Dec 17, 2015 at 9:35
Solution 1
Maybe for readability a recursive function is a proper solution in this case. But think about it carefully because it will strain the stack.
public static int getUniqueValueCount(int[] array) {
if (array.length == 0) {
return 0;
} else if (array.length == 1) {
return 1;
} else {
return getUniqueValueCount(Arrays.copyOfRange(array, 1, array.length))
+ ((array[0] == array[1]) ? 0 : 1);
}
}
Pros
- getUniqueValueCount instead of numUnique --> more expressive name
- no for-int-i iteration --> less control structures
- no state involved, nearly functional
- positive formulation of main condition
Cons
- strains the stack
- additional array operations O(n^2)
Solution 2
The following solution is a summary of this discussion with some modifications to improve readability. It uses the binary search approach for efficiency and speed. Personally I am not that convinced of it because there are some specifics about Arrays.binarySearch(..) in the case of application and interpretation of the return value.
The return value can hold (as far as I understand) up to two things. In the normal case (key was found) the index of the (any?) key is returned. In the case when the binary search returns a negative value it is considered as "not found". Furthermore the value can be multiplied by (-1) to receive an insertion index. Finally you have to decrement the insertion index by 1.
private static int getUniqueValueCount(int[] sorted) {
int count = 0;
for (int curentIndx = 0; (curentIndx = nextDeviationIndex(sorted, curentIndx)) != -1;) {
count++;
}
return count;
}
private static int nextDeviationIndex(int[] sorted, int currentIndex) {
if (sorted.length == 1 && currentIndex == 0) {
return 1;
} else if (currentIndex < sorted.length) {
return binarySearchNextDeviationIndex(sorted, currentIndex, sorted[currentIndex] + 1);
} else {
return -1;
}
}
private static int binarySearchNextDeviationIndex(int[] sorted, int startIndex, int key) {
int searchResult = Arrays.binarySearch(sorted, startIndex, sorted.length, key);
// this is very hard to get ...
return searchResult < 0 ? -searchResult - 1 : searchResult;
}
Although the solution inflates to three methods nobody can avoid the responsibilites involved to produce the result:
- Counting while iterating over the interesting elements until there are no left
- Decide how to find the next interesting element and provide the information if there is no element left
- Provide an efficient search for the case of mass data
Solution 3 (OO)
This is an solution that shows how the responsibilities are cut in the OO. I guess this is the most verbose one for the given problem.
public static class DeviationIterator implements Iterator<Integer> {
private int[] sorted;
private int nextIndex;
public DeviationIterator(int[] sorted) {
super();
this.sorted = sorted;
this.nextIndex = calculateNextIndex();
}
@Override
public boolean hasNext() {
return nextIndex != -1;
}
@Override
public Integer next() {
if (nextIndex != -1) {
int currentIndex = nextIndex;
nextIndex = calculateNextIndex();
return currentIndex;
} else {
throw new NoSuchElementException();
}
}
private int calculateNextIndex() {
if (sorted.length == 1 && nextIndex == 0) {
return 1;
} else if (nextIndex < sorted.length) {
return binarySearchNextDeviationIndex(sorted, nextIndex,
sorted[nextIndex] + 1);
} else {
return -1;
}
}
}
private static int getUniqueValueCount(int[] sorted) {
int count = 0;
DeviationIterator deviationIterator = new DeviationIterator(sorted);
while(deviationIterator.hasNext()) {
deviationIterator.next();
count++;
}
return count;
}
- The iterator encapsulates the search algorithm
- the getUniqueValueCount uses the iterator to count
The most interesting part is you can clearly see, where the responsibilities are located. Using the iterator pattern identifies the fragment -1 as the EOF for the iterator. The user of the iterator via contract has to ask for if another element is available, so the "Solution 2". But there -1 is not translated into false so the the caller has to interpret it by himself.
-
\$\begingroup\$ Why downvoted? Explain please ;-). \$\endgroup\$oopexpert– oopexpert2015年12月15日 21:38:46 +00:00Commented Dec 15, 2015 at 21:38
-
\$\begingroup\$ Recursion for this problem is not a good suggestion. \$\endgroup\$Barry– Barry2015年12月15日 22:05:11 +00:00Commented Dec 15, 2015 at 22:05
-
\$\begingroup\$
Arrays.copyOfRange()
is a really bad option. You shouldn't need to copy an array for this simple operation. \$\endgroup\$TheCoffeeCup– TheCoffeeCup2015年12月16日 00:16:55 +00:00Commented Dec 16, 2015 at 0:16 -
\$\begingroup\$ The focus of my suggestion was readablitity. And the problem with "Arrays.copyRange()" I mentioned in the main points. I provided a solution together with its limits. I would understand "not voting" for this answer... but downvoting? But after all doesn't matter. \$\endgroup\$oopexpert– oopexpert2015年12月16日 06:59:16 +00:00Commented Dec 16, 2015 at 6:59
-
\$\begingroup\$ @OOP-Expert You're proposing an \$O(n^2)\$ solution to a problem which has a very simple \$O(n)\$ one, with potential for better. Even if you didn't copy the range (for no reason?), recursion offers no benefits here. This is a poor proposal. And it's not great for readability - you're unnecessarily duplicating the recursive call. \$\endgroup\$Barry– Barry2015年12月16日 13:01:56 +00:00Commented Dec 16, 2015 at 13:01
[3, 2, 3, 2]
doesn't looka sorted array
. Naming is important, and I find it non-trivial here, not likingcountUnique
for almost implying how to, andnumberOfDistinctValues()
looks an alphabetic procession. \$\endgroup\$