Background
Two strings s and t are called k-Abelian equivalent (shortened to k-equivalent in the following) for a positive integer k if the following conditions hold:
- The length-
k-1prefixes ofsandtare equal. - The length-
k-1suffixes ofsandtare equal. - The strings
sandthave the same multisets of length-kcontiguous substrings. In other words, every string of lengthkoccurs an equal number of times insandt. Overlapping occurrences are counted as distinct.
Note that k-equivalent strings are always k-1-equivalent if k > 1, and the third condition implies that s and t have the same length.
It is also known, and possibly useful, that the three conditions above are equivalent to the following one:
- Every non-empty string of length
kork-1occurs an equal number of times insandt.
An Example
For example, consider the strings s = "abbaaabb" and t = "aabbaabb".
Because the strings have the same number of each character, 4 of a and 4 of b, they are 1-equivalent.
Consider then 2-equivalence.
Their first and last characters are the same (both strings begin with a and end with b), so the first two conditions are satisfied.
The length-2 substrings of s are aa (occurs twice), ab (twice), ba (once), and bb (twice), and those of t are exactly the same.
This means that the strings are 2-equivalent.
However, since their second letters are different, the strings are not 3-equivalent.
Input
Two alphanumeric strings s and t of the same length n > 0.
Output
The largest integer k between 1 and n (inclusive) such that s and t are k-equivalent.
If no such k exists, the output is 0.
In the above example, the correct output would be 2.
Rules
You can write a full program or a function. The lowest byte count wins, and standard loopholes are disallowed. Crashing on malformed input is perfectly fine.
Test Cases
"abc" "def" -> 0
"abbaabbb" "aabbbaab" -> 0
"abbaaabb" "aabbbaab" -> 2
"abbaaabb" "aabbaabb" -> 2
"aaabbaabb" "aabbaaabb" -> 3
"aaabbaabb" "aaabbaabb" -> 9
"abbabaabbaababbabaababbaabbabaab" "abbabaabbaababbaabbabaababbabaab" -> 9
"yzxyxx" "yxyzxx" -> 2
"xxxxxyxxxx" "xxxxyxxxxx" -> 5
2 Answers 2
Pyth: (削除) 26 (削除ここまで) 22 bytes
lfq_TTCmmSm<>dbhkldldQ
Input are two strings like "abc","def". Try it online: Pyth Compiler/Executor
Explanation:
Quite some complicated code. But I'll start easy.
First we want to create all substrings of length k of the string d.
Sm<>dbhkld
m ld apply a function ... to all elements b
in the list [0, 1, 2, ..., len(d)-1]
>db take the string d starting from the position b
< hk but only hk = k + 1 characters
S Sort the resulting list.
For the string "acbde" and k = 3 this results in ['abc', 'bde', 'cbd', 'de', 'e']. Notice, that 'de' and 'e' are part of this list. This will be quite important in a bit.
I want the substrings for each k, one map again.
mSm<>dbhkldld
m ld apply a function (create all substrings of lenght k) to
all elements k in the list [0, 1, 2, ..., len(d)-1]
For the string "acbd" this results in [['a', 'b', 'c', 'd'], ['ac', 'bd', 'cb', 'd'], ['acb', 'bd', 'cbd', 'd'], ['acbd', 'bd', 'cbd', 'd']].
I want this for both strings, another map.
mmSm<>dbhkldldQ
m Q apply the function (see above) for all d in input()
[[['a', 'b', 'c'], ['ab', 'bc', 'c'], ['abc', 'bc', 'c']], [['d', 'e', 'f'], ['de', 'ef', 'f'], ['def', 'ef', 'f']]] for the input "abc","def", and after a
zip C, this looks [(['a', 'b', 'c'], ['d', 'e', 'f']), (['ab', 'bc', 'c'], ['de', 'ef', 'f']), (['abc', 'bc', 'c'], ['def', 'ef', 'f'])].
Now comes the exciting part: We are of course only interest in the ks, where the substrings of both words are the same.
lfq_TTCmmSm<>dbhkldldQ
f I filter the list for elements T, where
q_TT reversed(T) == T
(basically T[0] == T[1])
l print the lenght of the resulting list.
(= the number of times, where the substrings of length k
of string 1 are equal to the substrings of
length k of string 2)
Since we also have the suffixes of length < k in Ts, this also checks, if the suffixes are equal.
The only thing, which I don't check is, if the prefixes are equal. But as it turns out, this isn't necessary. Lemma 2.3 (4<=>5) of the linked paper says: If the substrings of length k are the same, than pref_{k-1}(string 1) = pref_{k-1}(string 2) is equivalent to suff_{k-1}(string 1) = suff_{k-1}(string 2). Since the suffixes are equal, the prefixes are also equal.
-
\$\begingroup\$ the
'ab'ing"abbab"3is from the last two characters, not the prefix. So no, you are not correctly following the spec as you do not check for k-1 length prefix at all \$\endgroup\$Optimizer– Optimizer2015年02月06日 17:46:19 +00:00Commented Feb 6, 2015 at 17:46 -
\$\begingroup\$ @Optimizer Lemma 2.3 in the linked paper does the job. I'll add some more explanation. \$\endgroup\$Jakube– Jakube2015年02月06日 18:00:27 +00:00Commented Feb 6, 2015 at 18:00
-
\$\begingroup\$ Are you saying that checking for k-1 length prefix is not at all required ? \$\endgroup\$Optimizer– Optimizer2015年02月06日 18:04:23 +00:00Commented Feb 6, 2015 at 18:04
-
\$\begingroup\$ @Optimizer Exactly \$\endgroup\$Jakube– Jakube2015年02月06日 18:09:54 +00:00Commented Feb 6, 2015 at 18:09
-
\$\begingroup\$ @Zgarb Thanks for accepting my answer. Too sad, this question didn't get more recognition. I found this was one of the best challenges in a while, and am quite proud of my solution. \$\endgroup\$Jakube– Jakube2015年02月23日 09:46:48 +00:00Commented Feb 23, 2015 at 9:46
CJam, (削除) 51 49 47 41 (削除ここまで) 34 bytes
WqN/:Q0=,,{Qf{_,,\f>\)f<$}~=},+W=)
Just to get something started. This can surely be golfed a lot. Looking at a different algorithm too.
Input goes as the two strings without quotes on separate lines.
Example:
aaabbaabb
aabbaaabb
Output:
3
"abbaaabb" "aabbbaab"= 1? The prefixesa,aand the suffixesb,bare the same, and the substringsaa,ab,bbappear each twice, and the substringbaappears once in each string. \$\endgroup\$k-1prefixes of s and t are equal." \$\endgroup\$