Count the number of Hamming distance sequences

Question 1

The Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.

Let P be a binary string of length n and T be a binary string of length 2n-1. We can compute the n Hamming distances between P and every n-length substring of T in order from left to right and put them into an array (or list).

Example Hamming distance sequence

Let P = 101 and T = 01100. The sequence of Hamming distances you get from this pair is 2,2,1.

Task

For increasing n starting at n=1, consider all possible pairs of binary strings P of length n and T of length 2n-1. There are 2**(n+2n-1) such pairs and hence that many sequences of Hamming distances. However many of those sequences will be identical . The task is to find how many are distinct for each n.

Your code should output one number per value of n.

Score

Your score is the highest n your code reaches on my machine in 5 minutes. The timing is for the total running time, not the time just for that n.

Who wins

The person with the highest score wins. If two or more people end up with the same score then it is the first answer that wins.

Example answers

For n from 1 to 8 the optimal answers are 2, 9, 48, 297, 2040, 15425, 125232, 1070553.

Languages and libraries

You can use any available language and libraries you like. Where feasible, it would be good to be able to run your code so please include a full explanation for how to run/compile your code in Linux if at all possible.

My Machine The timings will be run on my 64-bit machine. This is a standard ubuntu install with 8GB RAM, AMD FX-8350 Eight-Core Processor and Radeon HD 4250. This also means I need to be able to run your code.

Leading answers

11 in C++ by feersum. 25 seconds.
11 in C++ by Andrew Epstein. 176 seconds.
10 in Javascript by Neil. 54 seconds.
9 in Haskell by nimi. 4 minutes and 59 seconds.
8 in Javascript by fəˈnɛtɪk. 10 seconds.

Question 2

.. any available free* languages?

Question 3

fastest-code? not fastest-algorithm? You know, people could go with language with a darn fast interpreter and make a significant difference in time, but time complexity is always same, so It somewhat makes it fair.

Question 4

Related.

Question 5

@SIGSEGV fastest-code leaves more space for optimizations through both code level optimizations and a good algorithm. So I do think that faster-code is better than faster-algorithm.

Question 6

Haskell, score 9

import Data.Bits
import Data.List
import qualified Data.IntSet as S
main = mapM_ (print . S.size . S.fromList . hd) [1..9]
hd :: Int -> [Int]
hd n = [foldl' (\s e->s*m+e) 0 [popCount $ xor p (shiftR t i .&. m)|i<-[(0::Int)..n-1]] | let m=2^n-1,p<-[(0::Int)..2^n-1],t<-[(0::Int)..2^(2*n-1)-1]]

Compile with -O3. It takes 6min35s on my 6 year old laptop hardware to run up to n=9, so maybe it's under 5min on the reference hardware.

> time ./113785
2
9
48
297
2040
15425
125232
1070553
9530752
real 6m35.763s
user 6m27.690s
sys 0m5.025s

Question 7

6 year laptop? Damn, that's some outdated tech!

Question 8

@SIGSEGV: maybe it's outdated, but besides counting the number of Hamming distance sequences it does its job quite well.

Question 9

JavaScript, score 10

findHamming = m => { 
 if (m < 2) return 2;
 let popcnt = x => x && (x & 1) + popcnt(x >> 1);
 let a = [...Array(1 << m)].map((_,i) => popcnt(i));
 let t = 0;
 let n = (1 << m) - 1;
 for (let c = 0; c <= m; c++) {
 for (let g = 0; g <= c; g++) {
 let s = new Set;
 for (let i = 1 << m; i--; ) {
 for (let j = 1 << (m - 1); j--; ) {
 if (a[i ^ j + j] != c) continue;
 for (let k = 1 << m - 1; k--; ) {
 if (a[i ^ k] != g) continue;
 let f = j << m | k;
 let h = 0;
 for (l = m - 1; --l; ) h = h * (m + 1) + a[i ^ f >> l & n];
 s.add(h);
 }
 }
 }
 t += s.size * (g < c ? 2 : 1);
 }
 }
 return t;
};
let d = Date.now(); for (let m = 1; m < 11; m++) console.log(m, findHamming(m), Date.now() - d);

Explanation: Calculating n=10 is difficult because there are over two billion pairs and over 26 billion potential sequences. In order to speed things up I split the calculation up into 121 bins. Because the sequences are invariant under bitwise complement, I can assume without loss of generality that the middle bit of T is zero. This means that I can determine the first and last elements of the sequence independently from the the top n-1 and bottom n-1 bits of T. Each bin corresponds to a different pair of first and last elements; I then loop through all the possible sets of top and bottom bits that correspond to each bin and calculate the remaining elements of the sequence, finally counting the unique sequences for each bin. It then remains to total all 121 bins. Originally taking 45 hours, this now completed in a little under three and a half minutes on my AMD FX-8120. Edit: Thanks to @ChristianSievers for a 50% speedup. Full output:

1 2 0
2 9 1
3 48 1
4 297 2
5 2040 7
6 15425 45
7 125232 391
8 1070553 1844
9 9530752 15364
10 86526701 153699

Question 10

Your code gives no output currently.

Question 11

@felipa Not sure what you mean. It's an anonymous function, so you call it (perhaps by assigning it to a variable first and then calling the variable as if it was a function) and pass it n as a parameter. (Sorry for the bad choice of parameter name there.)

Question 12

The question asks for code that prints out answer for n up to the highest value it can get to in 5 minutes. "Your code should output one number per value of n."

Question 13

It would be great if your code worked up from n = 1 and outputted the timing at each stage. From the question " The timing is for the total running time, not the time just for that n."

Question 14

@Lembik Added timing code and also worked around bug for n=1 (don't know offhand why it hangs).

Question 15

C++, score (削除) 10 (削除ここまで) 11

This is a translation of @Neil's answer into C++, with some simple parallelization. n=9 completes in 0.4 seconds, n=10 in 4.5 seconds, and n=11 in approximately 1 minute on my 2015 Macbook Pro. Also, thanks to @ChristianSievers. Due to his comments on @Neil's answer, I noticed some additional symmetries. From the original 121 buckets (for n=10), to 66 buckets when accounting for reversal, I've gotten down to just 21 buckets.

#include <iostream>
#include <cstdint>
#include <unordered_set>
#include <thread>
#include <future>
#include <vector>
using namespace std;
constexpr uint32_t popcnt( uint32_t v ) {
 uint32_t c = v - ( ( v >> 1 ) & 0x55555555 );
 c = ( ( c >> 2 ) & 0x33333333 ) + ( c & 0x33333333 );
 c = ( ( c >> 4 ) + c ) & 0x0F0F0F0F;
 c = ( ( c >> 8 ) + c ) & 0x00FF00FF;
 c = ( ( c >> 16 ) + c ) & 0x0000FFFF;
 return c;
}
template<uint32_t N>
struct A {
 constexpr A() : arr() {
 for( auto i = 0; i != N; ++i ) {
 arr[i] = popcnt( i );
 }
 }
 uint8_t arr[N];
};
uint32_t n = ( 1 << M ) - 1;
constexpr auto a = A < 1 << M > ();
uint32_t work( uint32_t c, uint32_t g, uint32_t mult ) {
 unordered_set<uint64_t> s;
 // Empirically derived "optimal" value
 s.reserve( static_cast<uint32_t>( pow( 5, M ) ) );
 for( int i = ( 1 << M ) - 1; i >= 0; i-- ) {
 for( uint32_t j = 1 << ( M - 1 ); j--; ) {
 if( a.arr[i ^ j + j] != c ) {
 continue;
 }
 for( uint32_t k = 1 << ( M - 1 ); k--; ) {
 if( a.arr[i ^ k] != g ) {
 continue;
 }
 uint64_t f = j << M | k;
 uint64_t h = 0;
 for( uint32_t l = M - 1; --l; ) {
 h = h * ( M + 1 ) + a.arr[i ^ ( f >> l & n )];
 }
 s.insert( h );
 }
 }
 }
 return s.size() * mult;
}
int main() {
 auto t1 = std::chrono::high_resolution_clock::now();
 if( M == 1 ) {
 auto t2 = std::chrono::high_resolution_clock::now();
 auto seconds = chrono::duration_cast<chrono::milliseconds>( t2 - t1 ).count() / 1000.0;
 cout << M << ": " << 2 << ", " << seconds << endl;
 return 0;
 }
 uint64_t t = 0;
 vector<future<uint32_t>> my_vector;
 if( ( M & 1 ) == 0 ) {
 for( uint32_t c = 0; c <= M / 2; ++c ) {
 for( uint32_t g = c; g <= M / 2; ++g ) {
 uint32_t mult = 8;
 if( c == M / 2 && g == M / 2 ) {
 mult = 1;
 } else if( g == c || g == M / 2 ) {
 mult = 4;
 }
 my_vector.push_back( async( work, c, g, mult ) );
 }
 }
 for( auto && f : my_vector ) {
 t += f.get();
 }
 } else {
 for( uint32_t c = 0; c <= ( M - 1 ) / 2; ++c ) {
 for( uint32_t g = c; g <= M - c; ++g ) {
 uint32_t mult = 4;
 if( g == c || g + c == M ) {
 mult = 2;
 }
 my_vector.push_back( async( work, c, g, mult ) );
 }
 }
 for( auto && f : my_vector ) {
 t += f.get();
 }
 }
 auto t2 = std::chrono::high_resolution_clock::now();
 auto seconds = chrono::duration_cast<chrono::milliseconds>( t2 - t1 ).count() / 1000.0;
 cout << M << ": " << t << ", " << seconds << endl;
 return 0;
}

Use the following script to execute the code:

#!/usr/bin/env bash
for i in {1..10}
do
 clang++ -std=c++14 -march=native -mtune=native -Ofast -fno-exceptions -DM=$i hamming3.cpp -o hamming
 ./hamming
done

The output was as follows: (The format is M: result, seconds)

1: 2, 0
2: 9, 0
3: 48, 0
4: 297, 0
5: 2040, 0
6: 15425, 0.001
7: 125232, 0.004
8: 1070553, 0.029
9: 9530752, 0.419
10: 86526701, 4.459
11: 800164636, 58.865

n=12 took 42 minutes to calculate on a single thread, and gave a result of 7368225813.

Question 16

How would you compile this in ubuntu using clang?

Question 17

@felipa I think the answer is sudo apt-get install libiomp-dev.

Question 18

It would be great if your code worked up from n = 1 and outputted the timing at each stage. From the question " The timing is for the total running time, not the time just for that n."

Question 19

Rather than reimplementing it you could probably just use __builtin_popcount.

Question 20

@Lembik: I'll make the changes later today. @Neil: The popcnt function only gets evaluated at compile time, and I don't know how to use __builtin_popcount in a constexpr context. I could go with the naïve implementation and it wouldn't affect the run time.

Question 21

C++11 (should get to 11 or 12)

At the moment this is single-threaded.

To compile:

g++ -std=c++11 -O2 -march=native feersum.cpp

#include <iostream>
#include <unordered_set>
#include <vector>
#include <unordered_map>
#include <string.h>
using seq = uint64_t;
using bitvec = uint32_t;
using seqset = std::unordered_set<seq>;
using std::vector;
#define popcount __builtin_popcount
#define MAX_N_BITS 4
bitvec leading(bitvec v, int n) {
 return v & ((1U << n) - 1);
}
bitvec trailing(bitvec v, int n, int total) {
 return v >> (total - n);
}
bitvec maxP(int n) {
 return 1 << (n - 1); // ~P, ~T symmetry
}
void prefixes(int n, int pre, int P, seqset& p) {
 p.clear();
 for (bitvec pref = 0; pref < (1U << pre); pref++) {
 seq s = 0;
 for (int i = 0; i < pre; i++) {
 seq ham = popcount(trailing(pref, pre - i, pre) ^ leading(P, pre - i));
 s |= ham << i * MAX_N_BITS;
 }
 p.insert(s);
 }
}
vector<seqset> suffixes(int n, int suf, int off) {
 vector<seqset> S(maxP(n));
 for (bitvec P = 0; P < maxP(n); P++) {
 for (bitvec suff = 0; suff < (1U << suf); suff++) {
 seq s = 0;
 for (int i = 0; i < suf; i++) {
 seq ham = popcount(leading(suff, i + 1) ^ trailing(P, i + 1, n));
 s |= ham << (off + i) * MAX_N_BITS;
 }
 S[P].insert(s);
 }
 }
 return S;
}
template<typename T> 
void mids(int n, int npre, int nsuf, int mid, bitvec P, T& S, const seqset& pre) {
 seq mask = (1ULL << (npre + 1) * MAX_N_BITS) - 1;
 for(bitvec m = 0; m < 1U << mid; m++) {
 int pc = popcount(P ^ m);
 if(pc * 2 > n) continue; // symmetry of T -> ~T : replaces x with n - x
 seq s = (seq)pc << npre * MAX_N_BITS;
 for(int i = 0; i < npre; i++)
 s |= (seq)popcount(trailing(P, n - npre + i, n) ^ leading(m, n - npre + i)) << i * MAX_N_BITS;
 for(int i = 0; i < nsuf; i++)
 s |= (seq)popcount(leading(P, mid - 1 - i) ^ trailing(m, mid - 1 - i, mid)) << (npre + 1 + i) * MAX_N_BITS;
 for(seq a: pre)
 S[(s + a) & mask].insert(P | (s + a) << n);
 }
}
uint64_t f(int n) {
 if (n >= 1 << MAX_N_BITS) {
 std::cerr << "n too big";
 exit(1);
 }
 int tlen = 2*n - 1;
 int mid = n;
 int npre = (tlen - mid) / 2;
 if(n>6) --npre;
 int nsuf = tlen - npre - mid;
 seqset preset;
 auto sufs = suffixes(n, nsuf, npre + 1);
 std::unordered_map<seq, std::unordered_set<uint64_t>> premid;
 for(bitvec P = 0; P < maxP(n); P++) {
 prefixes(n, npre, P, preset);
 mids(n, npre, nsuf, mid, P, premid, preset);
 }
 uint64_t ans = 0;
 using counter = uint8_t;
 vector<counter> found((size_t)1 << nsuf * MAX_N_BITS);
 counter iz = 0;
 for(auto& z: premid) {
 ++iz;
 if(!iz) {
 memset(&found[0], 0, found.size() * sizeof(counter));
 ++iz;
 }
 for(auto& pair: z.second) {
 seq s = pair >> n;
 uint64_t count = 0;
 bitvec P = pair & (maxP(n) - 1);
 for(seq t: sufs[P]) {
 seq suff = (s + t) >> (npre + 1) * MAX_N_BITS;
 if (found[suff] != iz) {
 ++count;
 found[suff] = iz;
 }
 }
 int middle = int(s >> npre * MAX_N_BITS) & ~(~0U << MAX_N_BITS);
 ans += count << (middle * 2 != n);
 }
 }
 return ans;
}
int main() {
 for (int i = 1; ; i++)
 std::cout << i << ' ' << f(i) << std::endl;
}

Question 22

Get's to 11 in less than 30 seconds!

Question 23

In case it's of interest:

feersum.cpp:111:61: warning: shifting a negative signed value is undefined [-Wshift-negative-value] int middle = int(s >> npre * MAX_N_BITS) & ~(~0 << MAX_N_BITS);

Question 24

JavaScript 2,9,48,297,2040,15425,125232,1070553,9530752

Run in console:

console.time("test");
h=(w,x,y,z)=>{retVal=0;while(x||y){if(x%2!=y%2)retVal++;x>>=1;y>>=1}return retVal*Math.pow(w+1,z)};
sum=(x,y)=>x+y;
for(input=1;input<10;input++){
 hammings=new Array(Math.pow(input+1,input));
 for(i=1<<(input-1);i<1<<input;i++){
 for(j=0;j<1<<(2*input);j++){
 hamming=0;
 for(k=0;k<input;k++){
 hamming+=(h(input,(j>>k)%(1<<input),i,k));
 }
 hammings[hamming]=1;
 }
 }
 console.log(hammings.reduce(sum));
}
console.timeEnd("test");

Try it online!

Or as a Stack Snippet:

h=(w,x,y,z)=>{retVal=0;while(x||y){if(x%2!=y%2)retVal++;x>>=1;y>>=1}return retVal*Math.pow(w+1,z)};
function findHamming() {
 console.time("test");
 input = parseInt(document.getElementById("input").value);
 hammings=new Array(Math.pow(input+1,input));
 for(i=1<<(input-1);i<1<<input;i++){
 for(j=0;j<1<<(2*input);j++){
 hamming=0;
 for(k=0;k<input;k++){
 hamming+=(h(input,(j>>k)%(1<<input),i,k));
 }
 hammings[hamming]=1;
 }
 }
 document.getElementById("output").innerHTML = hammings.reduce((x,y)=>x+y);
 console.timeEnd("test");
}

<input type="text" id="input" value="6">
<button onclick="findHamming()"> Run </button>
<pre id="output">

The code preinitializes the array to make adding 1s to the array much faster

The code finds all the hamming distance sequences and treating them as numbers base (input+1), uses them to place 1s in an array. As a result, this generates an array with the n 1s where n is the number of unique hamming distance sequences. Finally, the number of 1s is counted using array.reduce() to sum all of the values in the array.

This code will not be able to run for input of 10 as it hits memory limits

This code runs in O(2^2n) time because that's how many elements it generates.

Question 25

Unsurprisingly, trying to create a 26*10^9 element array doesn't work

Question 26

n = 9 takes 5 minutes and 30 seconds for me using node.js so is just too slow.

Question 27

@Lembik n = 8 originally took 24 seconds on my PC, but I was able to optimise the code so that n = 8 took 6 seconds. I then tried n = 9 and that took 100 seconds.

Question 28

@Neil You should submit an answer !

Question 29

It would be great if your code worked up from n = 1 and outputted the timing at each stage. From the question " The timing is for the total running time, not the time just for that n."

nimi 36k4 gold badges35 silver badges100 bronze badges · Accepted Answer · 2017-03-24 21:08:18Z

Haskell, score 9

import Data.Bits
import Data.List
import qualified Data.IntSet as S
main = mapM_ (print . S.size . S.fromList . hd) [1..9]
hd :: Int -> [Int]
hd n = [foldl' (\s e->s*m+e) 0 [popCount $ xor p (shiftR t i .&. m)|i<-[(0::Int)..n-1]] | let m=2^n-1,p<-[(0::Int)..2^n-1],t<-[(0::Int)..2^(2*n-1)-1]]

Compile with -O3. It takes 6min35s on my 6 year old laptop hardware to run up to n=9, so maybe it's under 5min on the reference hardware.

> time ./113785
2
9
48
297
2040
15425
125232
1070553
9530752
real 6m35.763s
user 6m27.690s
sys 0m5.025s

@SIGSEGV: maybe it's outdated, but besides counting the number of Hamming distance sequences it does its job quite well.

Stack Exchange Network

Count the number of Hamming distance sequences

5 Answers 5

Haskell, score 9

JavaScript, score 10

C++, score (削除) 10 (削除ここまで) 11

C++11 (should get to 11 or 12)

JavaScript 2,9,48,297,2040,15425,125232,1070553,9530752

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions