The Hamming distance between two strings of equal length is the number of positions at which the corresponding characters are different. If the strings are not of equal length, the Hamming distance is not defined.
Challenge
Write a program or function that finds the largest Hamming distance from among all pairs of strings from a list of strings, padded as required according to the rules described below.
The characters will be from within a-zA-Z0-9.
The strings may not be equal in length, so for each comparison the shorter string has to be padded as follows:
- wrap the string from the beginning as many times as needed to match the required length
- change the cases of the letters each odd time wrapping (1st, 3rd, 5th, etc.)
- leave things outside
a-zA-Zunchanged when wrapping
For example, let's say you need to pad the 5 character string ab9Cd so that it ends up with 18 characters. You would end up with:
ab9CdAB9cDab9CdAB9
^^^^^ ^^^
with ^ added underneath the 1st and 3rd wraps to highlight to case changes.
Input/Output
Input/output format is flexible. You can assume the input has at least two strings, and that all strings will have at least one character.
The output is an integer.
Rules
This is code-golf. Standard rules apply.
Test cases
[ "a", "b" ] => 1
[ "a", "b", "c" ] => 1
[ "a", "a", "c" ] => 1
[ "abc", "abcd" ] => 1
[ "abc12D5", "abC34d3", "ABC14dabc23DAbC89d"] => 17
[ "a", "Aaa", "AaaA", "aAaAa", "aaaaaaaaaaaaaa", "AAaAA", "aAa" ] => 8
["AacaAc", "Aab"] => 2
Reference implementation
I tested the examples with (completely ungolfed) R code that you can try here to compare any other examples you might try out with your code.
9 Answers 9
Jelly, 20 bytes
Not really happy with it. Should be golfable, even to ~15 bytes perhaps.
LÞŒcμṁ/sḢL$ŒsÐeFn)§Ṁ
Explanation
LÞŒcμṁ/sḢL$ŒsÐeFn)§Ṁ Full program or monadic link. N = input. | Example: ["abc12D5", "abC34d3", "ABC14dabc23DAbC89d"] LÞ Sort N by length. | [['a', 'b', 'c', '1', '2', 'D', '5'], ['a', 'b', 'C', '3', '4', 'd', '3'], ['A', 'B', 'C', '1', '4', 'd', 'a', 'b', 'c', '2', '3', 'D', 'A', 'b', 'C', '8', '9', 'd']] (in Jelly, strings are list of characters) Œc Unordered pairs: [x, y] for all distinct x, y in N. | [[['a', 'b', 'c', '1', '2', 'D', '5'], ['a', 'b', 'C', '3', '4', 'd', '3']], [['a', 'b', 'c', '1', '2', 'D', '5'], ['A', 'B', 'C', '1', '4', 'd', 'a', 'b', 'c', '2', '3', 'D', 'A', 'b', 'C', '8', '9', 'd']], [['a', 'b', 'C', '3', '4', 'd', '3'], ['A', 'B', 'C', '1', '4', 'd', 'a', 'b', 'c', '2', '3', 'D', 'A', 'b', 'C', '8', '9', 'd']]] Here, by distinct, I mean at different positions. | μ ) Map with a monadic link. | ṁ/ Mold x like y. That is, cycle x until it reaches length y. | [['a', 'b', 'c', '1', '2', 'D', '5'], ['a', 'b', 'c', '1', '2', 'D', '5', 'a', 'b', 'c', '1', '2', 'D', '5', 'a', 'b', 'c', '1'], ['a', 'b', 'C', '3', '4', 'd', '3', 'a', 'b', 'C', '3', '4', 'd', '3', 'a', 'b', 'C', '3']] sḢL$ Split into chunks of x's length. | [[['a', 'b', 'c', '1', '2', 'D', '5']], [['a', 'b', 'c', '1', '2', 'D', '5'], ['a', 'b', 'c', '1', '2', 'D', '5'], ['a', 'b', 'c', '1']], [['a', 'b', 'C', '3', '4', 'd', '3'], ['a', 'b', 'C', '3', '4', 'd', '3'], ['a', 'b', 'C', '3']]] ŒsÐe Swap the case of even-indexed chunks (1-indexed). | [[['a', 'b', 'c', '1', '2', 'D', '5']], [['a', 'b', 'c', '1', '2', 'D', '5'], ['A', 'B', 'C', '1', '2', 'd', '5'], ['a', 'b', 'c', '1']], [['a', 'b', 'C', '3', '4', 'd', '3'], ['A', 'B', 'c', '3', '4', 'D', '3'], ['a', 'b', 'C', '3']]] F Flatten. | [['a', 'b', 'c', '1', '2', 'D', '5'], ['a', 'b', 'c', '1', '2', 'D', '5', 'A', 'B', 'C', '1', '2', 'd', '5', 'a', 'b', 'c', '1'], ['a', 'b', 'C', '3', '4', 'd', '3', 'A', 'B', 'c', '3', '4', 'D', '3', 'a', 'b', 'C', '3']] n Vectorized inequality with y. | [[[0, 0, 1, 1, 1, 1, 1]], [[1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], [[1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]]] § After ending the map, sum each bool (0 or 1) array. | [[5], [17], [14]] Ṁ Maximum. | 17
-
\$\begingroup\$ I'm totally not familiar with Jelly, but I think you can omit
LÞand still get the same maximum at the end. \$\endgroup\$Chas Brown– Chas Brown2018年08月13日 21:41:32 +00:00Commented Aug 13, 2018 at 21:41 -
\$\begingroup\$ @ChasBrown Ugh, no, I should need that. Otherwise, instead of padding the shortest to the length of the longest one,
ṁ/would instead trim the longest to the length of the shortest one in some cases, which is not what we want.... I guess the test cases are too well chosen (and this is a rather unfortunate coincidence)... \$\endgroup\$Mr. Xcoder– Mr. Xcoder2018年08月13日 21:48:29 +00:00Commented Aug 13, 2018 at 21:48 -
\$\begingroup\$ @ChasBrown As an example, try
["AacaAc", "Aab"]. \$\endgroup\$Mr. Xcoder– Mr. Xcoder2018年08月13日 21:57:12 +00:00Commented Aug 13, 2018 at 21:57 -
\$\begingroup\$ Ah yes, I see... I need to learn me some more Jelly... :) \$\endgroup\$Chas Brown– Chas Brown2018年08月13日 22:02:12 +00:00Commented Aug 13, 2018 at 22:02
Python 2, 86 bytes
lambda a:max(sum(x!=y for x,y in zip((s+s.swapcase())*len(t),t))for s in a for t in a)
Given two strings, s,t, zip((s+s.swapcase())*len(t),t)) will be a list of tuples of length len(t) since zip truncates to the shortest iterable. If len(s)<len(t), then this "pads out" s with the desired case swapping and we calculate the sum of differing characters.
If len(t)<=len(s), then the resulting sum will be less than or equal to the sum if we were evaluating t,s; so it has no effect on the resulting max in that case.
-
\$\begingroup\$ You can use
y!=instead of!=yto save 1 byte \$\endgroup\$Mr. Xcoder– Mr. Xcoder2018年08月13日 20:43:09 +00:00Commented Aug 13, 2018 at 20:43 -
\$\begingroup\$ @Mr. Xcoder: Thx, but I ended up drastically reworking my solution... \$\endgroup\$Chas Brown– Chas Brown2018年08月13日 21:10:41 +00:00Commented Aug 13, 2018 at 21:10
JavaScript (Node.js), 111 bytes
a=>a.map(m=S=>a.map(s=>B(S).map((c,k)=>m=(c^(c=B(s)[k%(l=s.length)])^(k/l&c>9)<<5&&++x)<m?m:x,x=0)),B=Buffer)|m
Jelly, 19 bytes
WṁŒsÐeF=ċ0
LÞŒcç/€Ṁ
LÞŒcç/€Ṁ
LÞ Sort by length
Œc unordered pairs
€ to each of the pairs
ç/ find the hamming distance with molding and swapping case (helper link)
Ṁ maximum
WṁŒsÐeF=ċ0
W wrap the shorter string
ṁ repeat this string once for each character in the second string
Ðe for every other repeated string
Œs swap case
F flatten
= characterwise equality check between the two strings. If the first
string is longer, the leftover characters are appended to the result.
e.g. 'abABab' and 'xbA' give [0,1,1,'B','a','b']
ċ0 count the number of 0s, giving the Hamming distance.
Ruby, (削除) 89 (削除ここまで) 82 bytes
Creates the cross-product of the input list against itself before calculating the Hamming distance of each pair, using a duplication method similar to Chas Brown's answer. Ruby can't zip strings together or add booleans without additional overhead, though, so it becomes necessary to iterate through the pair of strings manually instead.
-7 bytes from GB.
->a{a.product(a).map{|s,t|(0...w=t.size).count{|i|(s+s.swapcase)[i%w]!=t[i]}}.max}
-
1
Java 10, (削除) 748 (削除ここまで) (削除) 740 (削除ここまで) (削除) 667 (削除ここまで) (削除) 666 (削除ここまで) 616 bytes
This has to be the most dense and unreadable, yet the longest golf I ever came up with.
Call method h(String[]) with an explicit array (no var args): eg,
h(new String[] {"a", "b", "c"});
returns 1.
char e(boolean w,char c){return(char)(w&(64<c&c<91|96<c&c<123)?c^32:c);}String p(String s,int l){var p="";int m=s.length(),n=l/m,r=l%m,i=0,j=0;var w=1<0;for(;i<n;++i,w=!w)for(char c:s.toCharArray())p+=e(w,c);for(;j<r;)p+=e(w,s.charAt(j++));return p;}int d(String s,String t){int l=s.length(),n=0,i=0;for(;i<l;)if(s.charAt(i)!=t.charAt(i++))++n;return n;}int h(String s,String t){int l=s.length(),m=t.length();return l>m?d(s,p(t,l)):l<m?d(p(s,m),t):d(s,t);}int h(String[]s){int l=s.length,i=0,j;int[]n=new int[l*l];for(;i<l;++i)for(j=i;++j<l;)n[i*l+j]=h(s[i],s[j]);return java.util.Arrays.stream(n).max().getAsInt();}
You can Try It Online!
Ungolfed and commented:
// Encode the character (swap case)
char e(boolean w, char c) {
return (char) (w & (64 < c & c < 91 | 96 < c & c < 123) ? c ^ 32 : c);
}
// Pad the string to desired length
String p(String s, int l) {
var p = "";
int m = s.length(), n = l / m, r = l % m, i = 0, j = 0;
var w = 1 < 0;
for (; i < n; ++i, w = !w)
for (char c : s.toCharArray())
p += e(w, c);
for (; j < r;)
p += e(w, s.charAt(j++));
return p;
}
// Calculate the actual hamming distance between two same-length strings
int d(String s, String t) {
int l = s.length(), n = 0, i = 0;
for (; i < l;)
if (s.charAt(i) != t.charAt(i++))
++n;
return n;
}
// Pad the strings as needed and return their hamming distance
int h(String s, String t) {
int l = s.length(), m = t.length();
return l > m ? d(s, p(t, l)) : l < m ? d(p(s, m), t) : d(s, t);
}
// Dispatch the strings and gather their hamming distances, return the max
int h(String[] s) {
int l = s.length, i = 0, j;
int[] n = new int[l * l];
for (; i < l; ++i)
for (j = i; ++j < l;)
n[i * l + j] = h(s[i], s[j]);
return java.util.Arrays.stream(n).max().getAsInt();
}
I know a better solution can be achieved, especially for the string pairing part.
EDIT: shave off 8 bytes by changing the size of the int array in hammingDistance() to the square of the numbe of strings given. It also fixes an ArrayIndexOutOfBounds thrown in one of the test cases.
EDIT 2: Saved 33 bytes thanks to Kevin Cruijssen's comments: class declaration removed, names shortened to 1 char, operators changed, etc.
EDIT 3: Save 1 byte and reach Satan-approved score by changing method with var-arg to array.
EDIT 4: Save another 50 bytes thanks to Kevin Cruijssen, again: update Java version from 8 to 10 to use var keyword, removed StringBuilder instance, etc.
-
1\$\begingroup\$ I don't have a lot of time, but some basic things to golf: Drop the class, only the methods is enough. Change all method and variable names to single bytes.. So instead of
hammingDistanceusedor some other unused variable. Most of your&&can be&and||can be|.c^' 'can bec^32.boolean w = false;can beboolean w=0>1;.i=0in the loop initialization can be removed and change the,i,jto,i=0,j.++jcan be removed and++can be added to the.charAt(j++)..toString()can be+"".for(j=i+1;j<l;++j)can befor(j=0;++j<l;). Etc. etc. \$\endgroup\$Kevin Cruijssen– Kevin Cruijssen2018年08月14日 13:38:29 +00:00Commented Aug 14, 2018 at 13:38 -
1\$\begingroup\$ Tips for golfing in Java and Tips for golfing in <all languages> might be interesting to read through as well. :) \$\endgroup\$Kevin Cruijssen– Kevin Cruijssen2018年08月14日 13:41:05 +00:00Commented Aug 14, 2018 at 13:41
-
\$\begingroup\$ Thanks! That's some nice byte-lifting. Thanks for the links too, I'm taking a look at it and will edit asap! \$\endgroup\$joH1– joH12018年08月14日 13:47:44 +00:00Commented Aug 14, 2018 at 13:47
-
1\$\begingroup\$ Upvoted for the Satan-approved score. xD Some more small things:
StringBuildercan beStringBuffer(if you switch to Java 10 it could bevar b=new StringBuffer(l);. Thebooleanandcharcan then also bevar. If you don't have Java 10 locally, it is available on TIO). In addition,for(;i<n;++i){for(char c:s.toCharArray())b.append(e(w,c));w=!w;}can befor(;i++<n;w=!w)for(char c:s.toCharArray())b.append(e(w,c));. And I'm pretty sure you can remove theStringBuffercompletely and just useStringand+=instead ofappend. \$\endgroup\$Kevin Cruijssen– Kevin Cruijssen2018年08月14日 15:13:03 +00:00Commented Aug 14, 2018 at 15:13 -
\$\begingroup\$ Man, some many months of clean code and good coding practices made me forget how to even golf! I'll update my answer and include TIO. \$\endgroup\$joH1– joH12018年08月16日 07:15:05 +00:00Commented Aug 16, 2018 at 7:15
05AB1E, (削除) 33 (削除ここまで) 29 bytes
Ćü)€é©εćDš«s`g∍}®€¤‚ø€ζ€€Ë_Oà
Try it online or verify all test cases.
Can most likely be halved in byte-count, but it works..
Explanation:
Ć # Enclose the input-list (adding the first item to the end of the list)
# i.e. ['ABC1','abcD','abCd32e'] → ['ABC1','abcD','abCd32e','ABC1']
ü) # Pair-vectorize each of them
# i.e. ['ABC1','abcD','abCd32e','ABC1']
# → [['ABC1','abcD'],['abcD','abCd32e'],['abCd32e','ABC1']]
ێ # Sort each pair by length
# i.e. [['ABC1','abcD'],['abcD','abCd32e'],['abCd32e','ABC1']]
# → [['ABC1','abcD'],['abcD','abCd32e'],['ABC1','abCd32e']]
© # Store this list in the register to re-use later on
ε } # Map each inner list in this list to:
ć # Head extracted
# i.e. ['abcD','abCd32e'] → 'abcD' and ['abCd32e']
Dš # Duplicate it, and swap the capitalization of the copy
# i.e. 'abcD' → 'ABCd'
« # Then merge it together
# i.e. 'abcD' and 'ABCd' → 'abcDABCd'
s` # Swap so the tail-list is at the top of the stack, and get it's single item
# i.e. ['abCd32e'] → 'abCd32e'
g # Get the length of that
# i.e. 'abCd32e' → 7
∍ # Extend/shorten the string to that length
# i.e. 'abcDABCd' and 7 → 'abcDABC'
® # Get the saved list from the register again
€¤ # Get the tail from each
# i.e. [['ABC1','abcD'],['abcD','abCd32e'],['abCd32e','ABC1']]
# → ['abcD','abCd32e','abCd32e']
‚ # Pair it with the other list
# i.e. ['ABC1','abcDABC','ABC1abc'] and ['abcD','abCd32e','abCd32e']
# → [['ABC1','abcDABC','ABC1abc'],['abcD','abCd32e','abCd32e']]
ø # Zip it, swapping rows / columns
# i.e. [['ABC1','abcDABC','ABC1abc'],['abcD','abCd32e','abCd32e']]
# → [['ABC1','abcD'],['abcDABC','abCd32e'],['ABC1abc','abCd32e']]
€ζ # And then zip each pair again
# i.e. [['ABC1','abcD'],['abcDABC','abCd32e'],['ABC1abc','abCd32e']]
# → [['Aa','Bb','Cc','1D'],['aa','bb','cC','Dd','A3','B2','Ce'],['Aa','Bb','CC','1d','a3','b2','ce']]
€ # Then for each inner list
€ # And for each inner string
Ë # Check if all characters are the same
# i.e. 'aa' → 1
# i.e. 'cC' → 0
_ # And inverse the booleans
# i.e. [['Aa','Bb','Cc','1D'],['aa','bb','cC','Dd','A3','B2','Ce'],['Aa','Bb','CC','1d','a3','b2','ce']]
# → [[1,1,1,1],[0,0,1,1,1,1,1],[1,1,0,1,1,1,1]]
O # Then sum each inner list
# i.e. [[1,1,1,1],[0,0,1,1,1,1,1],[1,1,0,1,1,1,1]] → [4,5,6]
à # And take the max as result
# i.e. [4,5,6] → 6
Java 11, 387 bytes
a->{int l=a.length,i=l,t,j=0,C[]=new int[l];var p=new String[l][2];for(;i-->0;p[i][0]=a[t>0?i:j],p[i][1]=a[t>0?j:i])t=a[i].length()<a[j=-~i%l].length()?1:0;i=0;for(var P:p){var s="";for(var x:P[0].getBytes())s+=(char)(x>64&x<91|x>96&x<123?x^32:x);for(P[0]=repeat(P[0]+s,t=P[1].length()).substring(0,t);t-->0;)if(P[0].charAt(t)!=P[1].charAt(t))C[i]++;i++;}for(int c:C)j=c>j?c:j;return j;}
Try it online. (NOTE: Since Java 11 isn't on TIO yet, String.repeat(int) has been emulated as repeat(String,int) for the same byte-count.)
Explanation:
a->{ // Method with String-array parameter and integer return-type
int l=a.length, // Length of the input-array
i=l, // Index-integer, starting at the length
t,j=0, // Temp-integers
C[]=new int[l]; // Count-array the same size as the input
var p=new String[l][2]; // String-pairs array the same size as the input
for(;i-->0 // Loop `i` in the range [`l`, 0)
; // After every iteration:
p[i][0]= // Set the first String of the pair at index `i` to:
a[t>0?i:j],// The smallest of the `i`'th or `j`'th Strings of the input-array
p[i][1]= // And set the second String of the pair at index `i` to:
a[t>0?j:i])// The largest of the `i`'th or `j`'th Strings of the input-array
t=a[i].length()< // If the length of the `i`'th item is smaller than
a[j=-~i%l].length()?// the length of the `i+1`'th item
// (and set `j` to this `i+1` with wrap-around to 0 for the last item
1 // Set `t` to 1 as flag
: // Else:
0; // Set `t` to 0 as flag
// We've now created the String pairs, where each pair is sorted by length
i=0; // Reset `i` to 0
for(var P:p){ // Loop over the pairs
var s=""; // Temp-String starting empty
for(var x:P[0].getBytes())
// Loop over the characters of the first String of the pair
s+= // Append the temp-String with:
(char)(x>64&x<91|x>96&x<123?
// If the current character is a letter:
x^32 // Swap it's case before appending it to `s`
: // Else (not a letter):
x); // Append it to `s` as is
for(P[0]= // Now replace the first String with:
repeat(P[0]+s, // The first String appended with the case-swapped first String
t=P[1].length())
// Repeated `t` amount of times,
// where `t` is the length of the second String of the pair
.substring(0,t); // And then shorten it to length `t`
t-->0;) // Inner loop over the character of the now same-length Pairs
if(P[0].charAt(t)!=P[1].charAt(t))
// If the characters at the same indices in the pair are not equal
C[i]++; // Increase the counter for this pair by 1
i++;} // After every iteration of the outer loop,
// increase `i` by 1 for the next iteration
for(int c:C) // Now loop over the calculated counts
j=c>j?c:j; // And set `j` to the maximum
return j;} // And finally return this maximum `j` as result
R, 173 bytes
function(x,U=utf8ToInt,N=nchar)max(combn(x,2,function(z,v=z[order(N(z))])sum(U(substr(Reduce(paste0,rep(c(v[1],chartr('A-Za-z','a-zA-Z',v[1])),n<-N(v[2]))),1,n))!=U(v[2]))))
@ngm : I tried my best to golf your code (with my heavy customizations of course) but, as you well know, R is not very golfy in manipulating strings :P
-
\$\begingroup\$ I bet this can be sub 150 bytes, but I'm not sure how just yet. \$\endgroup\$Giuseppe– Giuseppe2018年08月15日 17:50:04 +00:00Commented Aug 15, 2018 at 17:50
-
\$\begingroup\$ @Giuseppe: I suspect that too... but I'm not really good in writing short strings manipulation codes and R doesn't help me very much either :D \$\endgroup\$digEmAll– digEmAll2018年08月15日 18:56:40 +00:00Commented Aug 15, 2018 at 18:56
-
\$\begingroup\$ @digEmAll I'm not going to try to solve my own challenge, but few possibilities might include
outerto get all the combinations, and doing modular arithmetic on the code points in lieu ofchartr. \$\endgroup\$ngm– ngm2018年08月16日 00:33:27 +00:00Commented Aug 16, 2018 at 0:33 -
\$\begingroup\$ @ngm: possible...I discarded the arithmetic approach because I couldn't find a short solution/formula to change the case for letters without touching the numbers... \$\endgroup\$digEmAll– digEmAll2018年08月16日 06:18:51 +00:00Commented Aug 16, 2018 at 6:18
["AacaAc", "Aab"] => 2. A purposed golf to my Jelly answer would have failed that case, but would have passes all the other ones. \$\endgroup\$