The Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.
Let P be a binary string of length n and T be a binary string of length 2n-1. We can compute the n Hamming distances between P and every n-length substring of T in order from left to right and put them into an array (or list).
Example Hamming distance sequence
Let P = 101 and T = 01100. The sequence of Hamming distances you get from this pair is 2,2,1.
Task
For increasing n starting at n=1, consider all possible pairs of binary strings P of length n and T of length 2n-1. There are 2**(n+2n-1) such pairs and hence that many sequences of Hamming distances. However many of those sequences will be identical . The task is to find how many are distinct for each n.
Your code should output one number per value of n.
Score
Your score is the highest n your code reaches on my machine in 5 minutes. The timing is for the total running time, not the time just for that n.
Who wins
The person with the highest score wins. If two or more people end up with the same score then it is the first answer that wins.
Example answers
For n from 1 to 8 the optimal answers are 2, 9, 48, 297, 2040, 15425, 125232, 1070553.
Languages and libraries
You can use any available language and libraries you like. Where feasible, it would be good to be able to run your code so please include a full explanation for how to run/compile your code in Linux if at all possible.
My Machine The timings will be run on my 64-bit machine. This is a standard ubuntu install with 8GB RAM, AMD FX-8350 Eight-Core Processor and Radeon HD 4250. This also means I need to be able to run your code.
Leading answers
- 11 in C++ by feersum. 25 seconds.
- 11 in C++ by Andrew Epstein. 176 seconds.
- 10 in Javascript by Neil. 54 seconds.
- 9 in Haskell by nimi. 4 minutes and 59 seconds.
- 8 in Javascript by fəˈnɛtɪk. 10 seconds.
5 Answers 5
Haskell, score 9
import Data.Bits
import Data.List
import qualified Data.IntSet as S
main = mapM_ (print . S.size . S.fromList . hd) [1..9]
hd :: Int -> [Int]
hd n = [foldl' (\s e->s*m+e) 0 [popCount $ xor p (shiftR t i .&. m)|i<-[(0::Int)..n-1]] | let m=2^n-1,p<-[(0::Int)..2^n-1],t<-[(0::Int)..2^(2*n-1)-1]]
Compile with -O3. It takes 6min35s on my 6 year old laptop hardware to run up to n=9, so maybe it's under 5min on the reference hardware.
> time ./113785
2
9
48
297
2040
15425
125232
1070553
9530752
real 6m35.763s
user 6m27.690s
sys 0m5.025s
-
1\$\begingroup\$ 6 year laptop? Damn, that's some outdated tech! \$\endgroup\$Matthew Roh– Matthew Roh2017年03月25日 17:43:23 +00:00Commented Mar 25, 2017 at 17:43
-
\$\begingroup\$ @SIGSEGV: maybe it's outdated, but besides counting the number of Hamming distance sequences it does its job quite well. \$\endgroup\$nimi– nimi2017年03月25日 22:31:25 +00:00Commented Mar 25, 2017 at 22:31
JavaScript, score 10
findHamming = m => {
if (m < 2) return 2;
let popcnt = x => x && (x & 1) + popcnt(x >> 1);
let a = [...Array(1 << m)].map((_,i) => popcnt(i));
let t = 0;
let n = (1 << m) - 1;
for (let c = 0; c <= m; c++) {
for (let g = 0; g <= c; g++) {
let s = new Set;
for (let i = 1 << m; i--; ) {
for (let j = 1 << (m - 1); j--; ) {
if (a[i ^ j + j] != c) continue;
for (let k = 1 << m - 1; k--; ) {
if (a[i ^ k] != g) continue;
let f = j << m | k;
let h = 0;
for (l = m - 1; --l; ) h = h * (m + 1) + a[i ^ f >> l & n];
s.add(h);
}
}
}
t += s.size * (g < c ? 2 : 1);
}
}
return t;
};
let d = Date.now(); for (let m = 1; m < 11; m++) console.log(m, findHamming(m), Date.now() - d);
Explanation: Calculating n=10 is difficult because there are over two billion pairs and over 26 billion potential sequences. In order to speed things up I split the calculation up into 121 bins. Because the sequences are invariant under bitwise complement, I can assume without loss of generality that the middle bit of T is zero. This means that I can determine the first and last elements of the sequence independently from the the top n-1 and bottom n-1 bits of T. Each bin corresponds to a different pair of first and last elements; I then loop through all the possible sets of top and bottom bits that correspond to each bin and calculate the remaining elements of the sequence, finally counting the unique sequences for each bin. It then remains to total all 121 bins. Originally taking 45 hours, this now completed in a little under three and a half minutes on my AMD FX-8120. Edit: Thanks to @ChristianSievers for a 50% speedup. Full output:
1 2 0
2 9 1
3 48 1
4 297 2
5 2040 7
6 15425 45
7 125232 391
8 1070553 1844
9 9530752 15364
10 86526701 153699
-
\$\begingroup\$ Your code gives no output currently. \$\endgroup\$user7467– user74672017年03月27日 18:08:54 +00:00Commented Mar 27, 2017 at 18:08
-
\$\begingroup\$ @felipa Not sure what you mean. It's an anonymous function, so you call it (perhaps by assigning it to a variable first and then calling the variable as if it was a function) and pass it
nas a parameter. (Sorry for the bad choice of parameter name there.) \$\endgroup\$Neil– Neil2017年03月27日 18:25:32 +00:00Commented Mar 27, 2017 at 18:25 -
\$\begingroup\$ The question asks for code that prints out answer for n up to the highest value it can get to in 5 minutes. "Your code should output one number per value of n." \$\endgroup\$user7467– user74672017年03月27日 18:35:05 +00:00Commented Mar 27, 2017 at 18:35
-
\$\begingroup\$ It would be great if your code worked up from n = 1 and outputted the timing at each stage. From the question " The timing is for the total running time, not the time just for that n." \$\endgroup\$user9206– user92062017年03月28日 07:50:32 +00:00Commented Mar 28, 2017 at 7:50
-
1\$\begingroup\$ @Lembik Added timing code and also worked around bug for
n=1(don't know offhand why it hangs). \$\endgroup\$Neil– Neil2017年03月28日 08:16:06 +00:00Commented Mar 28, 2017 at 8:16
C++, score (削除) 10 (削除ここまで) 11
This is a translation of @Neil's answer into C++, with some simple parallelization. n=9 completes in 0.4 seconds, n=10 in 4.5 seconds, and n=11 in approximately 1 minute on my 2015 Macbook Pro. Also, thanks to @ChristianSievers. Due to his comments on @Neil's answer, I noticed some additional symmetries. From the original 121 buckets (for n=10), to 66 buckets when accounting for reversal, I've gotten down to just 21 buckets.
#include <iostream>
#include <cstdint>
#include <unordered_set>
#include <thread>
#include <future>
#include <vector>
using namespace std;
constexpr uint32_t popcnt( uint32_t v ) {
uint32_t c = v - ( ( v >> 1 ) & 0x55555555 );
c = ( ( c >> 2 ) & 0x33333333 ) + ( c & 0x33333333 );
c = ( ( c >> 4 ) + c ) & 0x0F0F0F0F;
c = ( ( c >> 8 ) + c ) & 0x00FF00FF;
c = ( ( c >> 16 ) + c ) & 0x0000FFFF;
return c;
}
template<uint32_t N>
struct A {
constexpr A() : arr() {
for( auto i = 0; i != N; ++i ) {
arr[i] = popcnt( i );
}
}
uint8_t arr[N];
};
uint32_t n = ( 1 << M ) - 1;
constexpr auto a = A < 1 << M > ();
uint32_t work( uint32_t c, uint32_t g, uint32_t mult ) {
unordered_set<uint64_t> s;
// Empirically derived "optimal" value
s.reserve( static_cast<uint32_t>( pow( 5, M ) ) );
for( int i = ( 1 << M ) - 1; i >= 0; i-- ) {
for( uint32_t j = 1 << ( M - 1 ); j--; ) {
if( a.arr[i ^ j + j] != c ) {
continue;
}
for( uint32_t k = 1 << ( M - 1 ); k--; ) {
if( a.arr[i ^ k] != g ) {
continue;
}
uint64_t f = j << M | k;
uint64_t h = 0;
for( uint32_t l = M - 1; --l; ) {
h = h * ( M + 1 ) + a.arr[i ^ ( f >> l & n )];
}
s.insert( h );
}
}
}
return s.size() * mult;
}
int main() {
auto t1 = std::chrono::high_resolution_clock::now();
if( M == 1 ) {
auto t2 = std::chrono::high_resolution_clock::now();
auto seconds = chrono::duration_cast<chrono::milliseconds>( t2 - t1 ).count() / 1000.0;
cout << M << ": " << 2 << ", " << seconds << endl;
return 0;
}
uint64_t t = 0;
vector<future<uint32_t>> my_vector;
if( ( M & 1 ) == 0 ) {
for( uint32_t c = 0; c <= M / 2; ++c ) {
for( uint32_t g = c; g <= M / 2; ++g ) {
uint32_t mult = 8;
if( c == M / 2 && g == M / 2 ) {
mult = 1;
} else if( g == c || g == M / 2 ) {
mult = 4;
}
my_vector.push_back( async( work, c, g, mult ) );
}
}
for( auto && f : my_vector ) {
t += f.get();
}
} else {
for( uint32_t c = 0; c <= ( M - 1 ) / 2; ++c ) {
for( uint32_t g = c; g <= M - c; ++g ) {
uint32_t mult = 4;
if( g == c || g + c == M ) {
mult = 2;
}
my_vector.push_back( async( work, c, g, mult ) );
}
}
for( auto && f : my_vector ) {
t += f.get();
}
}
auto t2 = std::chrono::high_resolution_clock::now();
auto seconds = chrono::duration_cast<chrono::milliseconds>( t2 - t1 ).count() / 1000.0;
cout << M << ": " << t << ", " << seconds << endl;
return 0;
}
Use the following script to execute the code:
#!/usr/bin/env bash
for i in {1..10}
do
clang++ -std=c++14 -march=native -mtune=native -Ofast -fno-exceptions -DM=$i hamming3.cpp -o hamming
./hamming
done
The output was as follows: (The format is M: result, seconds)
1: 2, 0
2: 9, 0
3: 48, 0
4: 297, 0
5: 2040, 0
6: 15425, 0.001
7: 125232, 0.004
8: 1070553, 0.029
9: 9530752, 0.419
10: 86526701, 4.459
11: 800164636, 58.865
n=12 took 42 minutes to calculate on a single thread, and gave a result of 7368225813.
-
\$\begingroup\$ How would you compile this in ubuntu using clang? \$\endgroup\$user7467– user74672017年03月27日 17:56:22 +00:00Commented Mar 27, 2017 at 17:56
-
\$\begingroup\$ @felipa I think the answer is
sudo apt-get install libiomp-dev. \$\endgroup\$user9206– user92062017年03月27日 18:17:39 +00:00Commented Mar 27, 2017 at 18:17 -
\$\begingroup\$ It would be great if your code worked up from n = 1 and outputted the timing at each stage. From the question " The timing is for the total running time, not the time just for that n." \$\endgroup\$user9206– user92062017年03月28日 07:50:21 +00:00Commented Mar 28, 2017 at 7:50
-
\$\begingroup\$ Rather than reimplementing it you could probably just use
__builtin_popcount. \$\endgroup\$Neil– Neil2017年03月28日 08:52:50 +00:00Commented Mar 28, 2017 at 8:52 -
\$\begingroup\$ @Lembik: I'll make the changes later today. @Neil: The popcnt function only gets evaluated at compile time, and I don't know how to use
__builtin_popcountin a constexpr context. I could go with the naïve implementation and it wouldn't affect the run time. \$\endgroup\$Andrew Epstein– Andrew Epstein2017年03月28日 12:57:00 +00:00Commented Mar 28, 2017 at 12:57
C++11 (should get to 11 or 12)
At the moment this is single-threaded.
To compile:
g++ -std=c++11 -O2 -march=native feersum.cpp
#include <iostream>
#include <unordered_set>
#include <vector>
#include <unordered_map>
#include <string.h>
using seq = uint64_t;
using bitvec = uint32_t;
using seqset = std::unordered_set<seq>;
using std::vector;
#define popcount __builtin_popcount
#define MAX_N_BITS 4
bitvec leading(bitvec v, int n) {
return v & ((1U << n) - 1);
}
bitvec trailing(bitvec v, int n, int total) {
return v >> (total - n);
}
bitvec maxP(int n) {
return 1 << (n - 1); // ~P, ~T symmetry
}
void prefixes(int n, int pre, int P, seqset& p) {
p.clear();
for (bitvec pref = 0; pref < (1U << pre); pref++) {
seq s = 0;
for (int i = 0; i < pre; i++) {
seq ham = popcount(trailing(pref, pre - i, pre) ^ leading(P, pre - i));
s |= ham << i * MAX_N_BITS;
}
p.insert(s);
}
}
vector<seqset> suffixes(int n, int suf, int off) {
vector<seqset> S(maxP(n));
for (bitvec P = 0; P < maxP(n); P++) {
for (bitvec suff = 0; suff < (1U << suf); suff++) {
seq s = 0;
for (int i = 0; i < suf; i++) {
seq ham = popcount(leading(suff, i + 1) ^ trailing(P, i + 1, n));
s |= ham << (off + i) * MAX_N_BITS;
}
S[P].insert(s);
}
}
return S;
}
template<typename T>
void mids(int n, int npre, int nsuf, int mid, bitvec P, T& S, const seqset& pre) {
seq mask = (1ULL << (npre + 1) * MAX_N_BITS) - 1;
for(bitvec m = 0; m < 1U << mid; m++) {
int pc = popcount(P ^ m);
if(pc * 2 > n) continue; // symmetry of T -> ~T : replaces x with n - x
seq s = (seq)pc << npre * MAX_N_BITS;
for(int i = 0; i < npre; i++)
s |= (seq)popcount(trailing(P, n - npre + i, n) ^ leading(m, n - npre + i)) << i * MAX_N_BITS;
for(int i = 0; i < nsuf; i++)
s |= (seq)popcount(leading(P, mid - 1 - i) ^ trailing(m, mid - 1 - i, mid)) << (npre + 1 + i) * MAX_N_BITS;
for(seq a: pre)
S[(s + a) & mask].insert(P | (s + a) << n);
}
}
uint64_t f(int n) {
if (n >= 1 << MAX_N_BITS) {
std::cerr << "n too big";
exit(1);
}
int tlen = 2*n - 1;
int mid = n;
int npre = (tlen - mid) / 2;
if(n>6) --npre;
int nsuf = tlen - npre - mid;
seqset preset;
auto sufs = suffixes(n, nsuf, npre + 1);
std::unordered_map<seq, std::unordered_set<uint64_t>> premid;
for(bitvec P = 0; P < maxP(n); P++) {
prefixes(n, npre, P, preset);
mids(n, npre, nsuf, mid, P, premid, preset);
}
uint64_t ans = 0;
using counter = uint8_t;
vector<counter> found((size_t)1 << nsuf * MAX_N_BITS);
counter iz = 0;
for(auto& z: premid) {
++iz;
if(!iz) {
memset(&found[0], 0, found.size() * sizeof(counter));
++iz;
}
for(auto& pair: z.second) {
seq s = pair >> n;
uint64_t count = 0;
bitvec P = pair & (maxP(n) - 1);
for(seq t: sufs[P]) {
seq suff = (s + t) >> (npre + 1) * MAX_N_BITS;
if (found[suff] != iz) {
++count;
found[suff] = iz;
}
}
int middle = int(s >> npre * MAX_N_BITS) & ~(~0U << MAX_N_BITS);
ans += count << (middle * 2 != n);
}
}
return ans;
}
int main() {
for (int i = 1; ; i++)
std::cout << i << ' ' << f(i) << std::endl;
}
-
\$\begingroup\$ Get's to 11 in less than 30 seconds! \$\endgroup\$user9206– user92062017年04月02日 11:52:31 +00:00Commented Apr 2, 2017 at 11:52
-
\$\begingroup\$ In case it's of interest:
feersum.cpp:111:61: warning: shifting a negative signed value is undefined [-Wshift-negative-value] int middle = int(s >> npre * MAX_N_BITS) & ~(~0 << MAX_N_BITS);\$\endgroup\$user9206– user92062017年04月02日 16:52:22 +00:00Commented Apr 2, 2017 at 16:52
JavaScript 2,9,48,297,2040,15425,125232,1070553,9530752
Run in console:
console.time("test");
h=(w,x,y,z)=>{retVal=0;while(x||y){if(x%2!=y%2)retVal++;x>>=1;y>>=1}return retVal*Math.pow(w+1,z)};
sum=(x,y)=>x+y;
for(input=1;input<10;input++){
hammings=new Array(Math.pow(input+1,input));
for(i=1<<(input-1);i<1<<input;i++){
for(j=0;j<1<<(2*input);j++){
hamming=0;
for(k=0;k<input;k++){
hamming+=(h(input,(j>>k)%(1<<input),i,k));
}
hammings[hamming]=1;
}
}
console.log(hammings.reduce(sum));
}
console.timeEnd("test");
Or as a Stack Snippet:
h=(w,x,y,z)=>{retVal=0;while(x||y){if(x%2!=y%2)retVal++;x>>=1;y>>=1}return retVal*Math.pow(w+1,z)};
function findHamming() {
console.time("test");
input = parseInt(document.getElementById("input").value);
hammings=new Array(Math.pow(input+1,input));
for(i=1<<(input-1);i<1<<input;i++){
for(j=0;j<1<<(2*input);j++){
hamming=0;
for(k=0;k<input;k++){
hamming+=(h(input,(j>>k)%(1<<input),i,k));
}
hammings[hamming]=1;
}
}
document.getElementById("output").innerHTML = hammings.reduce((x,y)=>x+y);
console.timeEnd("test");
}
<input type="text" id="input" value="6">
<button onclick="findHamming()"> Run </button>
<pre id="output">
The code preinitializes the array to make adding 1s to the array much faster
The code finds all the hamming distance sequences and treating them as numbers base (input+1), uses them to place 1s in an array. As a result, this generates an array with the n 1s where n is the number of unique hamming distance sequences. Finally, the number of 1s is counted using array.reduce() to sum all of the values in the array.
This code will not be able to run for input of 10 as it hits memory limits
This code runs in O(2^2n) time because that's how many elements it generates.
-
1\$\begingroup\$ Unsurprisingly, trying to create a 26*10^9 element array doesn't work \$\endgroup\$fəˈnɛtɪk– fəˈnɛtɪk2017年03月24日 18:55:59 +00:00Commented Mar 24, 2017 at 18:55
-
\$\begingroup\$
n = 9takes 5 minutes and 30 seconds for me using node.js so is just too slow. \$\endgroup\$user9206– user92062017年03月24日 20:38:54 +00:00Commented Mar 24, 2017 at 20:38 -
\$\begingroup\$ @Lembik
n = 8originally took 24 seconds on my PC, but I was able to optimise the code so thatn = 8took 6 seconds. I then triedn = 9and that took 100 seconds. \$\endgroup\$Neil– Neil2017年03月24日 21:37:56 +00:00Commented Mar 24, 2017 at 21:37 -
\$\begingroup\$ @Neil You should submit an answer ! \$\endgroup\$user9206– user92062017年03月24日 21:49:34 +00:00Commented Mar 24, 2017 at 21:49
-
\$\begingroup\$ It would be great if your code worked up from n = 1 and outputted the timing at each stage. From the question " The timing is for the total running time, not the time just for that n." \$\endgroup\$user9206– user92062017年03月28日 07:50:42 +00:00Commented Mar 28, 2017 at 7:50
fastest-codeleaves more space for optimizations through both code level optimizations and a good algorithm. So I do think thatfaster-codeis better thanfaster-algorithm. \$\endgroup\$