Character permutations - Revisited

Question 1

Several hours ago I posted this question about Generating character permutations in Python.

User rolfl said that my main problem was not using the right tool for the job, as Python is interpreted it will be much slower than a compiled solution. So I tried to write this in C.

Objective

Generate a file containing all fixed length l permutations of a given set of chars, in this case upper-case alphabet.

i.e.: With l = 5 the output file should look like:

AAAAA
AAAAB
AAAAC
...
ZZZZZ

My solution

I used a recursive for loop. It's my first program in C (after a hello world) so I don't know what it's worth. It runs in about 2.621s on my 2.9 GHz Intel Core i7

#include <stdio.h>
FILE *fp;
char alphabet[26] = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'};
int len = 5;
void recloop(char *pwd, int repeat) {
 int i;
 for (i = 0; i < 26; i++) {
 pwd[len-repeat] = alphabet[i];
 if (repeat == 1) {
 fprintf(fp, "%s\n", pwd);
 }
 else {
 recloop(pwd, repeat - 1);
 }
 }
}
int main(void) {
 char pwd[len];
 fp = fopen("output", "w");
 recloop(pwd, len);
}

Question 2

"I thought it would belong to a new post as I'm dealing with another language." Any time that you revise code, you should make a new post. See here for more info.

Question 3

I have spent a couple of hours struggling with a bit of a performance mystery, but, the long and short of it is that I have essentially produced a manual stack in C, and made it generic for the use-case you are investigating.

The mystery is that compiling with:

gcc -o pwgen -Wall -D_GNU_SOURCE -std=c99 pwgen.c

on my computer is 3 times faster than:

gcc -o pwgen -Wall -D_GNU_SOURCE -std=c99 -O3 pwgen.c

... and that makes no sense...

Moving on....

There are some things which are significant here.

Don't output to a file. Output to stdout and redirect the results.
similar to what vnp has, you create a logical stack, and 'increment' it.
don't use recursion, the additional stack management is not necessary.

I put the code together in a way that makes it more maintainable. Note that there are no global variables, and you can pass the alphabet and generated size in to the function and it will 'just work'.

Writing maintainable code is something you will thank yourself for in the future.

Note the performance of the code:

Java: 1.204u 0.148s 0:01.14 117.5%
C: 0.392u 0.012s 0:00.40 100.0%

So, for a short run, the Java code is 3 times slower. Note that a lot of that time is probably startup overhead for Java.

When I run it with 6 characters, it runs as:

Java: 20.517u 0.524s 0:20.44 102.8%
C: 9.980u 0.092s 0:10.07 100.0%

The C version is almost exactly 26 times slower... which is good.

The Java version is half the speed of the C version for 6 char passwords.... It's in the ballpark of what I expect...

The C code I ran (5 chars) is:

#include <stdio.h>
#include <string.h>
void recloop(const int len, const char* alphabet) {
 char pwd[len + 1];
 int cnt[len];
 const int chars = strlen(alphabet);
 unsigned long long count = 1;
 for (int i = 0; i < len; i++)
 {
 // initialize the arrays, and count the number of permutations
 pwd[i] = alphabet[0];
 cnt[i] = 0;
 count *= chars;
 }
 // set the null terminator.
 pwd[len] = 0;
 //printf("There are %d chars with %llu permutations and %s first value\n", chars, count, pwd);
 while(count--)
 {
 printf("%s\n", pwd);
 int pos = len - 1;
 while (++cnt[pos] == chars)
 {
 cnt[pos] = 0;
 pwd[pos] = alphabet[0];
 pos--;
 }
 pwd[pos] = alphabet[cnt[pos]];
 }
}
int main(void) {
 char *alpha = strdup("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
 recloop(5, alpha);
 return 0;
}

Note that the 6-char password generated 2GB of data in 10 seconds, or 200MB/second. You will be limited by your storage system probably, not the CPU.

Update: Detail on the inner loops

Consider a small alphabet of "ABC", and a small output of only 2 characters.

The outputs are:

AA, AB, AC, BA, BB, BC, CA, CB, CC

We produce these by setting up two arrays, the actual char pwd array, and an index array:

We initialize the arrays as:

cnt -> 0,0
pwd -> A,A,0円

Then, we loop. In the outer loop We know exactly how many loops we will do because it is \3ドル^2\,ドル 9 times. That's how many passwords we generate.

The interesting part is the inner loop. On start up, cnt is [0,0]. We treat this like an index in to the characters, and we increment the right-most one. Let me lay out all the increments for you:

cnt pwd
--- ---
0,0 -> AA
0,1 -> AB
0,2 -> AC
1,0 -> BA
1,1 -> BB
1,2 -> BC
2,0 -> CA
2,1 -> CB
2,2 -> CC

That's the progression we want to follow. We increment the right-most digit. If it becomes 3 (the number of characters), we set it back to 0, and move 1 digit to the left.

If these were decimal numbers, it would be the same as there being 10 characters in the alphabet. You can think of the logic you would go through to roll 999 to 1000. Add 1 to 9, it becomes 0 carry 10. Add that 10 to 90, it becomes 100 carry 1000, and that's the result.

That's what the inner loop does... it starts at the right, and adds one, and carries it back if there's an overflow. The overflow is configurable though, by the number of characters in the alphabet, and the number of digits is how wide the value is. Each time it changes a value in the cnt array, it copies the corresponding char from the alphabet back in to the pwd.

So, if the last password was ABCZZ we will add 1 to the last char, which 'wraps' it back to ABCZA but we also then loop again with the next char, making it ABCAA with another loop making it ABDAA. The D is not an overflow, so that's the last of the inner loops.

Note that, because count is strictly calculated, the inner loop will never have a negative overflow....

Question 4

Thanks it's definitely lightning fast and learnt some good practice. I understand the whole code apart from the nested while loop. I understand what it does but I'm not sure exactly how. What's the logic behind those four lines of code if I may ask?

Question 5

updated with detail on inner loops

Question 6

A few notes:

You aren't declaring alphabet in a very readable and maintainable form. Here's how I would do it:
```
static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
```
As you can see, I also made this variable static and const. static renders variable local to the file, and const makes it so that we can't screw with the array somewhere along the line (and also makes things a tidbit faster). This leads me to my next point...

Also, you don't have to specify the number of elements in the array when you declare it like this, since the compiler can determine that itself (because it is constant).
fp and len are global variables when they shouldn't be. Arguably alphabet shouldn't be one either, but I think it is find as long as you have the static in front of it. Anyways, they should be declared as local variables and then passed to the functions that need them.
I'm not sure your algorithm is the most optimal. I used a method known as "backtracking" myself, which seems to be the most efficient that I could find in practice.
fprintf() isn't going to be as efficient compared to fwrite(). I would consider rewriting your code if you still want to generate a permutation table.

Time test (without compiler optimizations)

mycode.c:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
static const int alphabetSize = sizeof(alphabet) - 1;
static void bruteImpl(char* str, size_t index, size_t maxDepth)
{
 for (size_t i = 0; i < alphabetSize; ++i)
 {
 str[index] = alphabet[i];
 if (index == maxDepth - 1); //printf("%s\n", str);
 else bruteImpl(str, index + 1, maxDepth);
 }
}
void bruteSequential(size_t maxLen)
{
 char* buf = calloc(maxLen + 1, sizeof(char));
 for (size_t len = 1; len <= maxLen; ++len)
 {
 bruteImpl(buf, 0, len);
 }
 free(buf);
}
int main(void)
{
 bruteSequential(5);
}

$ gcc opCode.c # I took out your fprintf call so that it wouldn't be a factor
$ time ./a.out 
real 0m0.170s
user 0m0.139s
sys 0m0.009s
$ gcc mycode.c
$ time ./a.out
real 0m0.043s
user 0m0.039s
sys 0m0.002s

Question 7

Could you please publish your code (see my answer below).

Question 8

@vnp Answer has been updated.

Question 9

Think of the string you are generating as numbers written in 26-base. Then obtaining a next string is a matter of a simple increment:

increment_string(char * s)
{
 while(*s) {
 *s += 1;
 if (*s > 'Z') {
 *s++ = 'A';
 } else {
 return true;
 }
 }
 return false;
}

Question 10

Similar to @vnp's answer, this answer uses a base-N index to pull out all the permutations from a letters string

double fac( int n )
{
 if( !n ) return 1;
 else return n*fac(n-1);
}
int numPerms( int n, int k )
{
 return fac(n)/(fac(n-k));
}
string getPerm( int index, int k, string letters )
{
 int n = letters.size();
 string ans;
 for( int i = 0; i < k; i++ )
 {
 int letterIndex = index % n;
 index /= n;
 ans = letters[letterIndex] + ans;
 }
 return ans;
}
void printPerms()
{
 string letters = "abcdx";
 int n = letters.size();
 int k = 2;
 int perms = numPerms(n,k);
 for( int i = 0; i < perms; i++ )
 {
 puts( getPerm( i, k, letters ).c_str() );
 }
}
int main()
{
 printPerms();
}

Question 11

Note that your factorial computation is not the right formula for permutations, but for combinations.

rolfl rolfl 98.1k17 gold badges219 silver badges419 bronze badges · Accepted Answer · 2014-10-22 01:16:26Z

I have spent a couple of hours struggling with a bit of a performance mystery, but, the long and short of it is that I have essentially produced a manual stack in C, and made it generic for the use-case you are investigating.

The mystery is that compiling with:

gcc -o pwgen -Wall -D_GNU_SOURCE -std=c99 pwgen.c

on my computer is 3 times faster than:

gcc -o pwgen -Wall -D_GNU_SOURCE -std=c99 -O3 pwgen.c

... and that makes no sense...

Moving on....

There are some things which are significant here.

Don't output to a file. Output to stdout and redirect the results.
similar to what vnp has, you create a logical stack, and 'increment' it.
don't use recursion, the additional stack management is not necessary.

I put the code together in a way that makes it more maintainable. Note that there are no global variables, and you can pass the alphabet and generated size in to the function and it will 'just work'.

Writing maintainable code is something you will thank yourself for in the future.

Note the performance of the code:

Java: 1.204u 0.148s 0:01.14 117.5%
C: 0.392u 0.012s 0:00.40 100.0%

So, for a short run, the Java code is 3 times slower. Note that a lot of that time is probably startup overhead for Java.

When I run it with 6 characters, it runs as:

Java: 20.517u 0.524s 0:20.44 102.8%
C: 9.980u 0.092s 0:10.07 100.0%

The C version is almost exactly 26 times slower... which is good.

The Java version is half the speed of the C version for 6 char passwords.... It's in the ballpark of what I expect...

The C code I ran (5 chars) is:

#include <stdio.h>
#include <string.h>
void recloop(const int len, const char* alphabet) {
 char pwd[len + 1];
 int cnt[len];
 const int chars = strlen(alphabet);
 unsigned long long count = 1;
 for (int i = 0; i < len; i++)
 {
 // initialize the arrays, and count the number of permutations
 pwd[i] = alphabet[0];
 cnt[i] = 0;
 count *= chars;
 }
 // set the null terminator.
 pwd[len] = 0;
 //printf("There are %d chars with %llu permutations and %s first value\n", chars, count, pwd);
 while(count--)
 {
 printf("%s\n", pwd);
 int pos = len - 1;
 while (++cnt[pos] == chars)
 {
 cnt[pos] = 0;
 pwd[pos] = alphabet[0];
 pos--;
 }
 pwd[pos] = alphabet[cnt[pos]];
 }
}
int main(void) {
 char *alpha = strdup("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
 recloop(5, alpha);
 return 0;
}

Note that the 6-char password generated 2GB of data in 10 seconds, or 200MB/second. You will be limited by your storage system probably, not the CPU.

Update: Detail on the inner loops

Consider a small alphabet of "ABC", and a small output of only 2 characters.

The outputs are:

AA, AB, AC, BA, BB, BC, CA, CB, CC

We produce these by setting up two arrays, the actual char pwd array, and an index array:

We initialize the arrays as:

cnt -> 0,0
pwd -> A,A,0円

Then, we loop. In the outer loop We know exactly how many loops we will do because it is \3ドル^2\,ドル 9 times. That's how many passwords we generate.

The interesting part is the inner loop. On start up, cnt is [0,0]. We treat this like an index in to the characters, and we increment the right-most one. Let me lay out all the increments for you:

cnt pwd
--- ---
0,0 -> AA
0,1 -> AB
0,2 -> AC
1,0 -> BA
1,1 -> BB
1,2 -> BC
2,0 -> CA
2,1 -> CB
2,2 -> CC

That's the progression we want to follow. We increment the right-most digit. If it becomes 3 (the number of characters), we set it back to 0, and move 1 digit to the left.

If these were decimal numbers, it would be the same as there being 10 characters in the alphabet. You can think of the logic you would go through to roll 999 to 1000. Add 1 to 9, it becomes 0 carry 10. Add that 10 to 90, it becomes 100 carry 1000, and that's the result.

That's what the inner loop does... it starts at the right, and adds one, and carries it back if there's an overflow. The overflow is configurable though, by the number of characters in the alphabet, and the number of digits is how wide the value is. Each time it changes a value in the cnt array, it copies the corresponding char from the alphabet back in to the pwd.

So, if the last password was ABCZZ we will add 1 to the last char, which 'wraps' it back to ABCZA but we also then loop again with the next char, making it ABCAA with another loop making it ABDAA. The D is not an overflow, so that's the last of the inner loops.

Note that, because count is strictly calculated, the inner loop will never have a negative overflow....

Thanks it's definitely lightning fast and learnt some good practice. I understand the whole code apart from the nested while loop. I understand what it does but I'm not sure exactly how. What's the logic behind those four lines of code if I may ask?

Stack Exchange Network

Character permutations - Revisited

Objective

My solution

4 Answers 4

Time test (without compiler optimizations)

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Character permutations - Revisited

Objective

My solution

4 Answers 4

Time test (without compiler optimizations)

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions