I'm trying to find maximum the Collatz sequence between 1 and 1000000. I wrote the following code below. I guess it is correct but it is extremely slow. Can you give me a few hints to make it faster?
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
bool myfn(int i, int j) { return i<j; }
int collatz(int x);
int main()
{
vector <int> myvector;
for(int i = 1; i < 1000000; i++)
{
myvector.push_back(collatz(i));
}
cout<<*max_element(myvector.begin(),myvector.end(),myfn);
return 0;
}
int collatz(int x)
{
int counter = 1;
while(1)
{
if(x == 1)
break;
if(x % 2 == 0)
{
x = x / 2;
counter++;
}
else
{
x = 3 * x + 1;
counter++;
}
}
return counter;
}
1 Answer 1
Your program is non-portable, since an int
is only guaranteed to hold numbers up to 32767. I would change your data type to unsigned long long
; otherwise, collatz(159487)
will overflow.
Your while
condition is clumsy.
while(1)
{
if(x == 1)
break;
...
}
should just be
while (x != 1)
{
...
}
The key insight is that if you already know, for example, the value of collatz(37)
, then collatz(74)
is just 1 + collatz(74 / 2)
, and you can reuse the previously computed result. Therefore, your collatz()
function should have access to the results vector — either
- pass it in by reference, or
- make the vector an instance variable, and the function a method of the class.
It turns out that recursion works quite well for this problem: every chain will terminate in a few hundred steps, so your function call chains will be at most a few hundred frames deep. Using recursion allows you to cache the result for every number along the chain, which you can't do using iteration.
#include <iostream>
#include <limits>
#include <stdexcept>
#include <vector>
int collatz(unsigned long long n, std::vector<int> &results) {
int c;
if (n == 1) {
return 1;
} else if (n < results.size() && results[n]) {
return results[n];
} else if (n % 2 == 0) {
c = 1 + collatz(n / 2, results);
} else if (n >= (std::numeric_limits<unsigned long long>::max() - 1) / 3) {
throw std::overflow_error("Overflow");
} else {
c = 1 + collatz(3 * n + 1, results);
}
if (n < results.size()) {
results[n] = c;
}
return c;
}
int main() {
const unsigned long long N = 1000000ULL;
std::vector<int> results(N);
int max = 0, max_i;
for (unsigned long long i = 1; i < N; ++i) {
results[i] = collatz(i, results);
// std::cout << "collatz(" << i << ") = " << results[i] << std::endl;
if (max < results[i]) {
max = results[i];
max_i = i;
}
}
std::cout << "Max: " << max_i << " with " << max << " steps" << std::endl;
return 0;
}
By packaging your code into a class, you can let main()
not worry about some of the details:
template <typename T>
class Collatz {
public:
Collatz(T limit) : limit(limit), results(limit) {}
int operator[](T n) {
int c;
if (n == 1) {
return 1;
} else if (n < results.size() && results[n]) {
return results[n];
} else if (n % 2 == 0) {
c = 1 + (*this)[n / 2];
} else if (n >= (std::numeric_limits<T>::max() - 1) / 3) {
throw std::overflow_error("Overflow");
} else {
c = 1 + (*this)[3 * n + 1];
}
if (n < results.size()) {
results[n] = c;
}
return c;
}
const T limit;
private:
std::vector<int> results;
};
int main() {
Collatz<unsigned long long> collatz(1000000ULL);
int max = 0, max_i;
for (unsigned long long i = 1; i < collatz.limit; ++i) {
// std::cout << "collatz(" << i << ") = " << collatz[i] << std::endl;
if (max < collatz[i]) {
max = collatz[i];
max_i = i;
}
}
std::cout << "Max: " << max_i << " with " << max << " steps" << std::endl;
return 0;
}
-
\$\begingroup\$ make the vector an instance variable, and the function a method of the class. Can you make this a little bit more clear? I couldn't understand. Thanks for such a helpful answer by the way. \$\endgroup\$jason– jason2014年01月23日 19:36:48 +00:00Commented Jan 23, 2014 at 19:36
x = x / 2
can be shortened tox /= 2
. \$\endgroup\$