I need a review of the following code for finding all permutations of a string in C++:
List Permute(const String& string)
{
if (string.length == 0)
return EmptyStrings(1);
List prevList = Permute(SubString(string,1,string.length));
List nextList = EmptyStrings(string.length*prevList.length);
for (int i=0; i<prevList.length; i++)
{
for (int j=0; j<string.length; j++)
{
nextList[i*string.length+j] += SubString(prevList[i],0,j);
nextList[i*string.length+j] += string[0];
nextList[i*string.length+j] += SubString(prevList[i],j,string.length-1);
}
}
return nextList;
}
You may assume the correctness of the following:
class List
class String
List EmptyStrings(int n); // returns a list of 'n' empty strings
String SubString(const String& string,int i,int j); // returns the string 'string[i...j)'
-
1\$\begingroup\$ This is cheating, but I thought it would be nice to mention that <algorithm> has useful functions for this such as next_permutation, prev_permutation, and is_permutation. \$\endgroup\$red_eight– red_eight2014年01月18日 18:17:46 +00:00Commented Jan 18, 2014 at 18:17
1 Answer 1
You may need to remove duplicates. Consider what your output will be if, for example, your input string is "aaa"
.
Code like this is unusual:
// nextList contains an array of empty strings
nextList[i*string.length+j] += SubString(prevList[i],0,j);
IMO this would be more usual:
// nextList is empty
// calculate new string
string nextString = SubString(prevList[i],0,j)
+ string[0]
+ SubString(prevList[i],j,string.length-1);
nextList.Add(nextString);
Returning by value is sometimes necessary but can be inefficient. Your algorithm creates many temporary List objects. A better API might be ...
void Permute(const String& inputString, List& outputList) { ... }
... where one empty List is allocated by the caller, and you add to it.
This statement may be incorrect ...
List prevList = Permute(SubString(string,1,string.length));
... I don't know how returns the string 'string[i...j)'
is defined but perhaps this statement should be ...
List prevList = Permute(SubString(string,1,string.length - 1));
You may run out of memory very quickly. For a string of length n
there are n!
('n factorial') permutations. "what's up doc?"
has 14 characters => 87178291200 (or 0x144C3B2800) i.e. bigger than 232 combinations.
-
2\$\begingroup\$ Never thought of the possibility of duplicates, thanks \$\endgroup\$barak manos– barak manos2014年01月18日 10:51:17 +00:00Commented Jan 18, 2014 at 10:51
-
\$\begingroup\$ As to the 'run out of memory' - that will obviously happen regardless of how efficient my method is in terms of memory consumption, due to the number of permutations itself \$\endgroup\$barak manos– barak manos2014年01月18日 12:11:59 +00:00Commented Jan 18, 2014 at 12:11
-
\$\begingroup\$ It's true that Combinatorial optimization is a non-trivial problem for computer scientists. Perhaps (I don't know) if you search you'll find (previously researched) algorithms for this "permutations of a string" problem. IMO there might be a solution which can work for larger strings, by generating/outputing the list of permutations to a (larger than 4 GB) output file on disk. \$\endgroup\$ChrisW– ChrisW2014年01月18日 12:21:24 +00:00Commented Jan 18, 2014 at 12:21
-
2\$\begingroup\$ Returning by value is unlikely to
inefficient
. RVO and NRVO basically prevent the copy from happening (ie it is elided). So prefer to use the more natural style of return by value. Also with move semantics in C++11 it becomes important to use return by value so that these kick in as expected automatically. \$\endgroup\$Loki Astari– Loki Astari2014年01月18日 13:23:43 +00:00Commented Jan 18, 2014 at 13:23