Find the non repeating number in the integer array

Question 1

The program finds the non repeated number in a int array. I am using a count for the duplicates and making a Numcounts[] to store the count values. I am assuming the array is sorted. Then I can check the Numcounts[] for 1, and that would be the non-repeating number. The complexity is \$O(n^2)\$.

Can you tell me how to do the same thing using hashmaps? I haven't used hashmaps and I know the complexity can be reduced to \$O(n))\$ using them.

#include<stdafx.h>
#include<stdio.h>
int main()
{
 int n=6;
 int * Numcounts = new int[n];
 int count = 0;
 int a[6] = {1,1,1,2,2,3}; // sort the array before proceeding.Because in the else part count is set to zero. 
 //If the array is unsorted then count will be reset in middle and will not retain the previous count.
 for(int i=0;i<n;i++)
 {
 for(int j=0;j<n;j++)
 {
 if(a[i]==a[j])
 {
 count = count+1;
 Numcounts[i] = count;
 }
 else{
 count = 0;
 }
 }
 }
}

Question 2

If the array is sorted you do not need hashmaps to achieve O(n). If you have a random/unsorted array you can use hashmaps to achieve the purpose, see Paulo's answer for that. Note that sorting requires O(n log(n)).

If you want to know how often each integer is in the original collection you will also need hashmaps, but if you only want to know a nonduplicate number in a sorted array the following code will work:

// N is the number of ints in sorted
int nonduplicate(int sorted[], const int N){
 //To prevent segfaulting when N=1
 if(N==1){ return sorted[0]; };
 for(int i=0;i < N-1;i++){
 if(sorted[i] == sorted[i+1]){
 // Next number in collection is the same so the current number is not unique
 // Set i to the first location of the next different number
 do{
 i++;
 }while(i < N-1 && sorted[i] == sorted[i+1]);
 }else{
 // The next number is different so the current is unique
 return sorted[i];
 }
 }
 // Is the last number unique?
 if(sorted[N-1] != sorted[N-2]){
 return sorted[N-1];
 }
 // No unique numbers
 return -1; // Or some other value you reserve for it
}

Note that this code is faster because it doesn't have the overhead of a hashmap and may stop before traversing the whole array, but does have the same big-O limit.

Question 3

@paul : Thanks much.I had to change this part of insert and it worked. hashMap.insert(std::pair<int,int>(number,1)); I like the way how hashmaps take care of indexing and stuff automatically, instead of having the programmer worry abt that. I am not very good at working with indexing (:-).

Question 4

Since you are using C++ and not C, there are a few things that you could clean up.

First of all, your code leaks memory: you are newing memory but you are not deleteing. In order to avoid this manual memory management, you should use the std::vector class template instead of new[].

Furthermore, stdio.h is a legacy C header. Use cstdio in C++. But in fact, your code doesn’t need that header anyway.

A hash map implementation is available in the std::tr1 namespace. If your compiler supports this (modern compilers do), the following code works:

#include <iostream>
#include <unordered_map>
#include <vector>
int main() {
 std::unordered_map<int, unsigned> occurrences;
 int a[6] = { 1, 1, 1, 2, 2, 3 };
 unsigned const size = sizeof a / sizeof a[0];
 // First pass: count occurrences.
 for (unsigned i = 0; i < size; ++i)
 ++occurrences[a[i]];
 // Second pass: search singleton.
 for (unsigned i = 0; i < size; ++i)
 if (occurrences[a[i]] == 1)
 std::cout << a[i] << " only occurred once." << std::endl;
}

(To enable TR1 support on GCC, pass the std=c++0x flag to the compiler.)

Point of interest: the first pass works because the values inside the unordered map (= hash map) are initialised with the default value, 0. We are using this fact to increment the values directly without bothering to check whether the value existed in the map beforehand. This makes the code more concise and is actually also more efficient.

Question 5

how about for(auto i : a) if you have range based for loops, or if not: for(auto i = std::begin(a); i != std::end(a); ++i)

Question 6

Why are you using unsigned? It's implied to be unsigned int right? Most correct would be size_t since that's the maximum occurrence count you could conceivably have without going out of memory.

Question 7

@Dave I’m using unsigned because negative counts simply don’t make sense. I only use std::size_t when explicitly referring to sizes or offsets of container classes, not when counting natural entities. It wouldn’t be wrong here to use either, per se.

Question 8

I meant why are you using unsigned instead of unsigned int? In this case it represents the count in an array, which does fit the bill for what size_t is for. It's the type of the count of an array. If you used std::array instead of int[] you could use std::array::size_type but that's guaranteed to be size_t... just like real arrays.

Question 9

@Dave When I wrote this answer I was working on a project where the convention was to use unsigned instead of unsigned int. Since either is valid it doesn’t really matter. I now use unsigned int again in my own code.

Question 10

A hashmap is a structure that maps a key (the hash) to a value.
In your example you'd want the key to be the number and the value to be the count of each number. If you are using a compiler that that supports C++0x, you can add

#include <unordered_map>

and then modify the body of your main loop to something like:

std::unordered_map<int, int> hashMap;
for(int i=0;i<n;i++)
{
 int number = a[i];
 // check if this number is already in the map
 if(hashMap.find(number) != hashMap.end())
 {
 // it's already in there so increment the count
 hashMap[number] += 1;
 }
 else
 {
 // it's not in there yet so add it and set the count to 1
 hashMap.insert(number, 1);
 }
}

I wrote this code down from memory so it might not even compile, although it should be straightforward to fix.

As you can see we only run through the array once so you now have the O(n) complexity.

Question 11

Actually, you could do it in \$O(N)\$ using XOR to bitshift things.

This will do it will less overhead and much faster (in java, since that is what I know):

public static void main(String... args) {
 int[] x = new int[] { 5,7,4,9,5,7,4 };
 int result = 0;
 for (int i=0;i<x.length;i++) {
 result ^= x[i];
 }
 System.out.println("result:"+result);
}

Good article on this here: http://kaipakartik.amplify.com/

Question 12

This only works if elements are repeating only twice. In other cases it fails (Coz It is XOR operator). The link is Also broken now.

Question 13

FYI, the link is broken now.

Question 14

This can easily be done in \$O(n)\$: run through the array and find the largest value. Create a vector of this size initialized to zeros. Run through the array and increment the element in the vector indexed by each value by one. Finally, run through the vector and find the element equal to one; the offset of this element in the vector is your answer.

Question 15

Using an array is a bad idea. It would be O(n) time, but also O(m) space, where m is the maximum value. The maximum could be 4 billion. Numbers could also be negative.

dtech dtech 1713 bronze badges · Answer 1 · 2011-03-24 20:13:21Z

If the array is sorted you do not need hashmaps to achieve O(n). If you have a random/unsorted array you can use hashmaps to achieve the purpose, see Paulo's answer for that. Note that sorting requires O(n log(n)).

If you want to know how often each integer is in the original collection you will also need hashmaps, but if you only want to know a nonduplicate number in a sorted array the following code will work:

// N is the number of ints in sorted
int nonduplicate(int sorted[], const int N){
 //To prevent segfaulting when N=1
 if(N==1){ return sorted[0]; };
 for(int i=0;i < N-1;i++){
 if(sorted[i] == sorted[i+1]){
 // Next number in collection is the same so the current number is not unique
 // Set i to the first location of the next different number
 do{
 i++;
 }while(i < N-1 && sorted[i] == sorted[i+1]);
 }else{
 // The next number is different so the current is unique
 return sorted[i];
 }
 }
 // Is the last number unique?
 if(sorted[N-1] != sorted[N-2]){
 return sorted[N-1];
 }
 // No unique numbers
 return -1; // Or some other value you reserve for it
}

Note that this code is faster because it doesn't have the overhead of a hashmap and may stop before traversing the whole array, but does have the same big-O limit.

@paul : Thanks much.I had to change this part of insert and it worked. hashMap.insert(std::pair<int,int>(number,1)); I like the way how hashmaps take care of indexing and stuff automatically, instead of having the programmer worry abt that. I am not very good at working with indexing (:-).

Konrad Rudolph Konrad Rudolph 6,68823 silver badges35 bronze badges · Answer 2 · 2011-03-25 16:21:19Z

Since you are using C++ and not C, there are a few things that you could clean up.

First of all, your code leaks memory: you are newing memory but you are not deleteing. In order to avoid this manual memory management, you should use the std::vector class template instead of new[].

Furthermore, stdio.h is a legacy C header. Use cstdio in C++. But in fact, your code doesn’t need that header anyway.

A hash map implementation is available in the std::tr1 namespace. If your compiler supports this (modern compilers do), the following code works:

#include <iostream>
#include <unordered_map>
#include <vector>
int main() {
 std::unordered_map<int, unsigned> occurrences;
 int a[6] = { 1, 1, 1, 2, 2, 3 };
 unsigned const size = sizeof a / sizeof a[0];
 // First pass: count occurrences.
 for (unsigned i = 0; i < size; ++i)
 ++occurrences[a[i]];
 // Second pass: search singleton.
 for (unsigned i = 0; i < size; ++i)
 if (occurrences[a[i]] == 1)
 std::cout << a[i] << " only occurred once." << std::endl;
}

(To enable TR1 support on GCC, pass the std=c++0x flag to the compiler.)

Point of interest: the first pass works because the values inside the unordered map (= hash map) are initialised with the default value, 0. We are using this fact to increment the values directly without bothering to check whether the value existed in the map beforehand. This makes the code more concise and is actually also more efficient.

how about for(auto i : a) if you have range based for loops, or if not: for(auto i = std::begin(a); i != std::end(a); ++i)
Why are you using unsigned? It's implied to be unsigned int right? Most correct would be size_t since that's the maximum occurrence count you could conceivably have without going out of memory.
@Dave I’m using unsigned because negative counts simply don’t make sense. I only use std::size_t when explicitly referring to sizes or offsets of container classes, not when counting natural entities. It wouldn’t be wrong here to use either, per se.
I meant why are you using unsigned instead of unsigned int? In this case it represents the count in an array, which does fit the bill for what size_t is for. It's the type of the count of an array. If you used std::array instead of int[] you could use std::array::size_type but that's guaranteed to be size_t... just like real arrays.
@Dave When I wrote this answer I was working on a project where the convention was to use unsigned instead of unsigned int. Since either is valid it doesn’t really matter. I now use unsigned int again in my own code.

Paulo Pinto Paulo Pinto 4472 silver badges10 bronze badges · Answer 3 · 2011-03-24 19:51:18Z

A hashmap is a structure that maps a key (the hash) to a value.
In your example you'd want the key to be the number and the value to be the count of each number. If you are using a compiler that that supports C++0x, you can add

#include <unordered_map>

and then modify the body of your main loop to something like:

std::unordered_map<int, int> hashMap;
for(int i=0;i<n;i++)
{
 int number = a[i];
 // check if this number is already in the map
 if(hashMap.find(number) != hashMap.end())
 {
 // it's already in there so increment the count
 hashMap[number] += 1;
 }
 else
 {
 // it's not in there yet so add it and set the count to 1
 hashMap.insert(number, 1);
 }
}

I wrote this code down from memory so it might not even compile, although it should be straightforward to fix.

As you can see we only run through the array once so you now have the O(n) complexity.

Sloan Sloan 391 bronze badge · Answer 4 · 2011-06-22 17:54:29Z

Actually, you could do it in \$O(N)\$ using XOR to bitshift things.

This will do it will less overhead and much faster (in java, since that is what I know):

public static void main(String... args) {
 int[] x = new int[] { 5,7,4,9,5,7,4 };
 int result = 0;
 for (int i=0;i<x.length;i++) {
 result ^= x[i];
 }
 System.out.println("result:"+result);
}

Good article on this here: http://kaipakartik.amplify.com/

This only works if elements are repeating only twice. In other cases it fails (Coz It is XOR operator). The link is Also broken now.

Dale Dale 111 bronze badge · Answer 5 · 2015-09-07 16:59:20Z

This can easily be done in \$O(n)\$: run through the array and find the largest value. Create a vector of this size initialized to zeros. Run through the array and increment the element in the vector indexed by each value by one. Finally, run through the vector and find the element equal to one; the offset of this element in the vector is your answer.

Using an array is a bad idea. It would be O(n) time, but also O(m) space, where m is the maximum value. The maximum could be 4 billion. Numbers could also be negative.

Stack Exchange Network

Find the non repeating number in the integer array

5 Answers 5

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Find the non repeating number in the integer array

5 Answers 5

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions