Collatz sequence in C

Question 1

First of all, if you're unfamiliar with the Collatz sequence, it recursively applies the following function to an integer:

$$ f(n) = \begin{cases} n/2 & \text{if } n \equiv 0 \pmod{2}, \\ 3n+1 & \text{if } n \equiv 1 \pmod{2}. \end{cases} $$

What I wanted to do was to make a program that would:

Quit if 0 was inputted
Reprompt the user if something invalid was inputted
For positive integers, e.g. 7, output "7 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1"
For negative integers, e.g. -3, where doing the above would output "-3 -8 -4 -2 -1 -2 -1 -2... [forever repeating]", output something like "-3 -8 -4 (-2 -1)", as "-2 -1" repeats forever.

This is what I have, which fits most of the criteria, but could definitely be improved somehow. What do you guys think?:

#include <stdio.h>
#include <stdbool.h>
long long cltz(long long x);
int main() {
 long long n;
 char input[256];
 while (1) {
 printf("Enter a positive or negative integer (0 to quit): ");
 if (fgets(input, sizeof(input), stdin) == NULL) break;
 if (sscanf(input, "%lld", &n) != 1) continue;
 if (n == 0) break;
 bool sign = n < 0;
 long long on = n, slow = n, fast = n;
 while (sign) {
 slow = cltz(slow);
 fast = cltz(cltz(fast));
 if (fast == slow) break;
 }
 if (on == slow && sign) printf("( ");
 printf("%lld ", n);
 while (n != 1) {
 n = cltz(n);
 if (n == slow) break;
 printf("%lld ", n);
 }
 long long c_slow = cltz(slow);
 if (sign && on != slow) {
 printf("( %lld ", slow);
 while (c_slow != slow) {
 printf("%lld ", c_slow);
 c_slow = cltz(c_slow);
 }
 printf(")");
 }
 if (on == slow && sign) printf(")");
 printf("\n");
 }
 return 0;
}
long long cltz(long long x) {
 return (x & 1) ? ((x << 1) + x + 1) : (x >> 1);
}

Question 2

Is there a reason you want the negative loop marked as (-2 -1) but not the positive loop 4 2 1?

Question 3

Does cltz just stand for Collatz? I thought at first it was some bithack helper function since the name is very close to clz (count leading zeros) and ctz (count trailing zeros). Or clear trailing zeros? But that doesn't make sense because clearing means zeroing. Anyway, see Why does C++ code for testing the Collatz conjecture run faster than hand-written asm? re: versions that will compile efficiently for x86, specifically that using an unsigned type (uint64_t) was faster there, using /2 which has to truncate toward 0 for signed, unlike >>1.

Question 4

Let the compile optimize

Rather than (x << 1) + x + 1) : (x >> 1), use (x*3 + 1) : (x/2) to match the algorithm and enable your compiler's optimizations.

If you can optimize better than the compiler, get a better compiler.

Note that x%2 is not the same as x mod 2.
Stay with x & 1 or use x % 2llu.

Detect overflow

cltz() may overflow. Should you want to detect this:

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
// Highest odd value valid for `cltz()`
#define CLTX_ODD_HIGH ((LLONG_MAX - 1)/3)
long long cltz(long long x) {
 // return (x & 1) ? ((x << 1) + x + 1) : (x >> 1);
 if (x & 1) {
 if (llabs(x) > CLTX_ODD_HIGH) {
 fprintf(stderr, "Overflow with %lld\n", x); 
 exit(EXIT_FAILURE);
 }
 return x*3 + 1;
 }
 return x / 2; 
}

Design

Consider bool sign = n < 0; ... if (on == slow && sign) printf(")"); printf("\n"); as a helper function. That is, separate user input from doing the task.

Easier to perform testing and recover from overflow.

Question 5

Do you say that x%2 is not the same as mathematical x mod 2 because the result is different for negative numbers? Whilst that's true in general, we're in a boolean context here (i.e. implicitly x&2 != 0), for which the result should be the same, and a compiler may be able to produce identical code (though I haven't confirmed that the ones I care about actually do).

Question 6

@TobySpeight Agree that if (x%2) is like if (x&1) if x in unsigned or signed (aside from the so to be gone non- 2's complement). IAC, mod and signed % are a general source of errors given -1%2 --> -1.

Question 7

Using x/2 instead of x>>1 would make the code slower unless you also change x from long long to unsigned long long. See Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly?. That would also sidestep the whole %2 issue @TobySpeight discussed. (Although compilers are smart enough to optimize x%2 != 0 the same as x&1 even for signed types, the problem is only when code insists on writing x%2 == 1 that forces compilers to make asm such that it's false for negative odd integers.)

Question 8

Oh, I read the question more carefully and see they are actually playing with negative starting values for the sequence. In general x/2 (truncating toward zero) is not equivalent to x>>1 (rounding toward -inf), but interestingly in this case it's only done on even numbers, so either works. But GCC and clang don't notice that, and do the full rounding correction using shr reg, 63 / add before sar: godbolt.org/z/osT5c94Ph So your version is less efficient. x*3+1 is good, though; compilers know how to split it into x*2 + x for an x86 LEA, no need to do that manually.

Question 9

"If you can optimize better than the compiler, get a better compiler." this should be told in class.

Question 10

Adding to chux's review:

Naming things

Avoid unnecessarily abbreviating variable and function names. Some use of single-letter variables is OK, like i for a loop index, or n for a count.

Since n is not a count but just the current value, I would write current or value. I know that it is used in the mathematical definition of the Collatz sequence, but unless you add a comment explaining this and showing that function, then a reader of your code has no idea what n means.

Instead of cltz and c, write collatz. Instead of o, write old_value or previous (depending on whether you renamed n to value or current).

Don't special-case negative integers

If you rewrite your code to always check if there is a repetition, you don't need to look at the sign bit. I would have a small array of two entries to remember the past two values. This also avoids making more calls to cltz() than necessary: you are calculating fast only to check if there is a repetition, but then you discard it even though you'll recalculate the same value soon afterwards.

Fill the array first, then calculate a new value every step, check if there is a repetition, if not print the oldest value in the array and replace it with the newly calculated one. If there is a repetition, then you can print the parentheses and the two values in the array, as those will be the repeating ones.

chux chux 36.2k2 gold badges43 silver badges96 bronze badges · Accepted Answer · 2023-06-10 03:27:55Z

9

\$\begingroup\$

Let the compile optimize

Rather than (x << 1) + x + 1) : (x >> 1), use (x*3 + 1) : (x/2) to match the algorithm and enable your compiler's optimizations.

If you can optimize better than the compiler, get a better compiler.

Note that x%2 is not the same as x mod 2.
Stay with x & 1 or use x % 2llu.

Detect overflow

cltz() may overflow. Should you want to detect this:

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
// Highest odd value valid for `cltz()`
#define CLTX_ODD_HIGH ((LLONG_MAX - 1)/3)
long long cltz(long long x) {
 // return (x & 1) ? ((x << 1) + x + 1) : (x >> 1);
 if (x & 1) {
 if (llabs(x) > CLTX_ODD_HIGH) {
 fprintf(stderr, "Overflow with %lld\n", x); 
 exit(EXIT_FAILURE);
 }
 return x*3 + 1;
 }
 return x / 2; 
}

Design

Consider bool sign = n < 0; ... if (on == slow && sign) printf(")"); printf("\n"); as a helper function. That is, separate user input from doing the task.

Easier to perform testing and recover from overflow.

Share

edited Jun 12, 2023 at 18:49

answered Jun 10, 2023 at 3:27

chux's user avatar

chux chux

36.2k2 gold badges43 silver badges96 bronze badges

\$\endgroup\$

5

3

\$\begingroup\$ Do you say that x%2 is not the same as mathematical x mod 2 because the result is different for negative numbers? Whilst that's true in general, we're in a boolean context here (i.e. implicitly x&2 != 0), for which the result should be the same, and a compiler may be able to produce identical code (though I haven't confirmed that the ones I care about actually do). \$\endgroup\$

Toby Speight
– Toby Speight

2023年06月10日 07:46:38 +00:00
Commented Jun 10, 2023 at 7:46
1

\$\begingroup\$ @TobySpeight Agree that if (x%2) is like if (x&1) if x in unsigned or signed (aside from the so to be gone non- 2's complement). IAC, mod and signed % are a general source of errors given -1%2 --> -1. \$\endgroup\$

chux
– chux

2023年06月10日 16:45:09 +00:00
Commented Jun 10, 2023 at 16:45
\$\begingroup\$ Using x/2 instead of x>>1 would make the code slower unless you also change x from long long to unsigned long long. See Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly?. That would also sidestep the whole %2 issue @TobySpeight discussed. (Although compilers are smart enough to optimize x%2 != 0 the same as x&1 even for signed types, the problem is only when code insists on writing x%2 == 1 that forces compilers to make asm such that it's false for negative odd integers.) \$\endgroup\$

Peter Cordes
– Peter Cordes

2023年06月11日 02:25:04 +00:00
Commented Jun 11, 2023 at 2:25
1

\$\begingroup\$ Oh, I read the question more carefully and see they are actually playing with negative starting values for the sequence. In general x/2 (truncating toward zero) is not equivalent to x>>1 (rounding toward -inf), but interestingly in this case it's only done on even numbers, so either works. But GCC and clang don't notice that, and do the full rounding correction using shr reg, 63 / add before sar: godbolt.org/z/osT5c94Ph So your version is less efficient. x*3+1 is good, though; compilers know how to split it into x*2 + x for an x86 LEA, no need to do that manually. \$\endgroup\$

Peter Cordes
– Peter Cordes

2023年06月11日 02:36:09 +00:00
Commented Jun 11, 2023 at 2:36
1

\$\begingroup\$ "If you can optimize better than the compiler, get a better compiler." this should be told in class. \$\endgroup\$

fudo
– fudo

2023年06月12日 09:21:52 +00:00
Commented Jun 12, 2023 at 9:21

Add a comment |

Stack Exchange Network

Collatz sequence in C

2 Answers 2

Naming things

Don't special-case negative integers

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Collatz sequence in C

2 Answers 2

Naming things

Don't special-case negative integers

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions