Given a set of input strings, how do I find a shortest string S so that all input strings appear as a substring of S?
for example:
foobar + bar -> foobar
foo + bar + rfo -> barfoo
I guess it's related to graphs, but don't know how to solve it.
- first check all the "substrings", for example, "bar" to "foobar". Remove "bar".
- the more two string share head and tail, the more likely they will be joined together, like nodes in a graph.
background:
GCC seems to do an optimization where it combines string literals that share the same suffix.
https://stackoverflow.com/questions/51551068/
So I want to broaden this problem.
-
1$\begingroup$ What did you try? $\endgroup$Nathaniel– Nathaniel2023年05月24日 19:27:56 +00:00Commented May 24, 2023 at 19:27
-
$\begingroup$ Please edit your post to specify the problem more precisely. Please define what "includes" means. Does it mean as a substring (consecutively?) or as a subsequence (letters not necessarily consecutive)? Where did you encounter this problem? $\endgroup$D.W.– D.W. ♦2023年05月25日 00:25:17 +00:00Commented May 25, 2023 at 0:25
-
$\begingroup$ Please state the problem in the body of the post. Reading the body of the post (without reading the title), it should make sense. The title is not just the first sentence of the body of the post. $\endgroup$D.W.– D.W. ♦2023年05月26日 06:06:35 +00:00Commented May 26, 2023 at 6:06
1 Answer 1
This is called the shortest common superstring problem. It is NP-hard. Therefore, you should not expect any efficient algorithm that works for all possible inputs.
Like many NP-hard problems, I imagine you could solve it with a SAT solver or ILP solver, or use an approximation algorithm (see the Wikipedia article for one approximation algorithm).