Does this implementation guarantee execution of \$O(n)\$ time? What is the additional space utilization for this implementation by excluding the original array size? Is it \$O(1)\$?
#include<iostream>
#include<set>
#include<algorithm>
bool ArraysPermute(int array1[],int size1, int array2[], int size2)
{
if( size1 != size2)
return false;
else
{
std::set<int> first_set(array1, array1+size1);
std::set<int> second_set(array2, array2+size2);
std::pair<std::set<int>::iterator,std::set<int>::iterator> myPair=std::mismatch(first_set.begin(),first_set.end(),second_set.begin());
if( *myPair.first != *myPair.second)
return false;
else
return true;
}
}
int main()
{
int array1[] ={1,2,3,5};
int array2[]={1,2,4,3};
if(ArraysPermute(array1,sizeof(array1)/sizeof(int), array2, sizeof(array2)/sizeof(int)))
std::cout<< " Arrays are permutation of each other\n";
else
std::cout<< " Arrays are not permutation of each other\n";
return 0;
}
3 Answers 3
Complexity
You suggest that the performance is \$O(n)\,ドル but it is not. It is \$O(n \log{n})\$ because the set
constructor is a binary-tree building algorithm that has an additional log(n) operations for each value.
Style
While your code is all neat, and well laid out, the variable names are good, etc., there are two issues I see with unnecessary else-condiditions on if-statements, and un-braced 1-liners. Additionally, white-space around punctuation and operators is useful for readability. This code:
if( size1 != size2) return false; else { std::set<int> first_set(array1, array1+size1); std::set<int> second_set(array2, array2+size2); std::pair<std::set<int>::iterator,std::set<int>::iterator> myPair=std::mismatch(first_set.begin(),first_set.end(),second_set.begin()); if( *myPair.first != *myPair.second) return false; else return true; }
should be written:
if( size1 != size2)
{
return false;
}
std::set<int> first_set(array1, array1 + size1);
std::set<int> second_set(array2, array2 + size2);
std::pair<std::set<int>::iterator, std::set<int>::iterator> myPair
= std::mismatch(first_set.begin(), first_set.end(), second_set.begin());
return *myPair.first == *myPair.second;
Bugs
The set
construct keeps just a single entry for all instances of a given value. Thus, for example, given the input [1,2,3,4,4,4,4]
it will have just 4 entries.
This leads to bugs in your code when there are duplicates. For example, your code will identify the following two input arrays as being permutations of each other:
[1,2,3,4,5,5,5]
[1,1,1,2,3,4,5]
-
\$\begingroup\$ Thank you for the answer. To resolve the bug, I may need to use std::vector in place of std::set and sort it before checking the std::mismatch, I will correct it. \$\endgroup\$Steephen– Steephen2015年03月24日 13:00:05 +00:00Commented Mar 24, 2015 at 13:00
-
1\$\begingroup\$ Most of the std::sort implementations are in n(logn), so again second version should ends up in O(nlogn) performance \$\endgroup\$Steephen– Steephen2015年03月24日 13:01:42 +00:00Commented Mar 24, 2015 at 13:01
-
\$\begingroup\$ Reference from the link you mentioned in your answer. " range constructor : Constructs a container with as many elements as the range [first,last), with each element emplace-constructed from its corresponding element in that range." It is obvious running an operation is O(nlogn) with std:;set, but I think construction is O(n) for a range based constructor of std::set \$\endgroup\$Steephen– Steephen2015年03月24日 20:53:22 +00:00Commented Mar 24, 2015 at 20:53
-
1\$\begingroup\$ @Steephen - if you read the documentation again, you will see it says (emphasis mine): For all other cases, linear in the distance between the iterators if the elements are already sorted according to the same criterion. For unsorted sequences, linearithmic (NlogN) in that distance ...* \$\endgroup\$rolfl– rolfl2015年03月24日 22:55:51 +00:00Commented Mar 24, 2015 at 22:55
-
1\$\begingroup\$ @Steephen: Sure if your range is already sorted. But I don't see that as a stipulation about your input arrays. Otherwise it is
O(n.log(n))
\$\endgroup\$Loki Astari– Loki Astari2015年03月25日 00:18:57 +00:00Commented Mar 25, 2015 at 0:18
This code runs in \$O(n \log{n})\$ time because a C++ set is a Red-Black tree, and insertions/deletions/finds run in \$O(\log{n})\$ time. In addition, since you are not using a multiset, your code is not actually correct for arrays with duplicate elements.
In order to resolve this, a simple solution is to use an std::unordered_map
(which is a hash table with \$O(1)\$ operations) with the element as the key and the number of occurrences as the value. Construct a map for the 1st and 2nd arrays, and then iterate over the keys (or the array, complexity will be the same) and verify that the values are the same in both maps.
-
\$\begingroup\$ construction of first_set itself will take O(n) complexity since it is using a range based constructor of std::set. And none of the listed operations of std::set , listed in your answer are not using also here. So how did you calculate its order is O(nlogn)? \$\endgroup\$Steephen– Steephen2015年03月24日 22:52:59 +00:00Commented Mar 24, 2015 at 22:52
-
1\$\begingroup\$ Set construction takes O(nlogn) for unsorted arrays. See cplusplus.com/reference/set/set/set. The construction process could be thought of us a series of n insertions, even if it's not implemented that way. I mostly added the operation information for completeness's sake. \$\endgroup\$mleyfman– mleyfman2015年03月24日 23:21:01 +00:00Commented Mar 24, 2015 at 23:21
Algorithm and Complexity
As reported in other answers, your program most likely executes in \$O(n.log(n))\$ time and uses \$O(n)\$ extra space, probably substantially more than the original space used by the arrays, and may report erroneous results because of a bad choice of class.
Better algorithm
Forget using anything this sophisticated, this problem can be solved in \$O(n)\$ time with astute brute force, while still using \$O(n)\$ extra space:
- make temporary copies of array1 and array2
- sort both temporary arrays
- compare them -> if equal, result is true, otherwise false.
- free the temporary arrays.
You may think the complexity to be \$O(n.log(n))\,ドル and it may be so if you use the standard sort method, but keep in mind these arrays are a special case that can be sorted in \$O(n)\$ time using Radix sort
I am not familiar enough with C++ to tell if you will need to implement radix sort yourself or if the standard array sort will perform correctly.
Style
- indentation is inconsistent;
- use white space more systematically and consistently;
- keep line length under control: a limit of 79 characters is advisable;
- the number of elements in the arrays should be computed in a safer way with
sizeof(array1) / sizeof(array1[0])
instead ofsizeof(array1)/sizeof(int)
-
\$\begingroup\$ How do you propose to sort an array in O(n) time? \$\endgroup\$200_success– 200_success2015年03月24日 22:25:19 +00:00Commented Mar 24, 2015 at 22:25
-
\$\begingroup\$ @200_success: HaHa! look up radix sort: en.wikipedia.org/wiki/Radix_sort \$\endgroup\$chqrlie– chqrlie2015年03月24日 22:27:26 +00:00Commented Mar 24, 2015 at 22:27
-
\$\begingroup\$ That's an iffy claim, as the problem does not state a limit on the size of the elements. \$\endgroup\$200_success– 200_success2015年03月24日 22:30:23 +00:00Commented Mar 24, 2015 at 22:30
-
\$\begingroup\$ The elements are
int
. Runningsizeof(int)
passes of distribution sort over 256 buckets will sort the array in O(n) time \$\endgroup\$chqrlie– chqrlie2015年03月24日 22:32:32 +00:00Commented Mar 24, 2015 at 22:32 -
2\$\begingroup\$ @Steephen: call this defensive programming. If you or the next programmer/maintainer of this piece of code later change the type of
array1
andarray2
to a type with a different size (short
orlong
for example) or if you cut and paste the code to a different function where these types differ, forgetting to update thesizeof(int)
will cause spurious bugs where using an expression independent of the type ofarray1
will not. \$\endgroup\$chqrlie– chqrlie2015年03月25日 07:36:16 +00:00Commented Mar 25, 2015 at 7:36
std::is_permutation
in the standard library since C++11 :) \$\endgroup\$std::is_permutation
has a complexity worst case of O(n2). Unless the template is special cased forint
, it could be much slower. \$\endgroup\$