What is the correct time complexity of the substring generation algorithm?

Question 1

The naive algorithm for generating all unique substrings enumerates every pair of start and end positions in a string of length $n$, so it generates $\mathcal{O}(n^2)$ substrings. It also has to maintain a list of all substrings generated so far and check each new one against that list for repetitions.
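To make the extra work concrete, here is a minimal sketch of the naive approach as I understand it. The function name `unique_substrings_naive` and the comparison counter are my own additions for illustration, not part of any reference implementation.

```python
def unique_substrings_naive(s):
    """Enumerate all substrings of s, keeping a plain list of those seen so far.

    Returns (unique_substrings, comparisons), where comparisons counts how many
    string-to-string equality checks were performed.
    """
    seen = []          # plain list, so a membership test is a linear scan
    comparisons = 0
    n = len(s)
    for i in range(n):
        for j in range(i + 1, n + 1):
            candidate = s[i:j]       # the slice itself copies j - i characters
            is_new = True
            for previous in seen:
                comparisons += 1     # one equality check; not constant time in general
                if previous == candidate:
                    is_new = False
                    break
            if is_new:
                seen.append(candidate)
    return seen, comparisons


substrings, comparisons = unique_substrings_naive("aba")
print(substrings)    # ['a', 'ab', 'aba', 'b', 'ba']
print(comparisons)   # 11
```

The two outer loops alone give the quadratic number of candidates; the inner scan over `seen` is the additional cost I am asking about.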

So the total time can't be $\mathcal{O}(n^2)$: the uniqueness check adds work, because I have to maintain a list of all previously generated substrings.

How, then, can it be correct to say that the algorithm takes only $\mathcal{O}(n^2)$ time?

There is obviously an added cost at each substring generation.

As Raphael pointed out, the actual time complexity should be something like the following:

$\mathcal{O}(n^2) + (\mathcal{O}(m^2) \cdot \mathcal{O}(n))$

where $m = n(n+1)/2$ is the total number of substrings generated, and comparing two strings takes time proportional to the length of the shorter one, assuming the strings are compared character by character.

If the above is correct, what confuses me is whether the comparison cost should be added or multiplied.


Answer 1

To be blunt, the Stack Overflow question and its answers illustrate why you want to ask about such things here.

The proposed algorithm clearly does not run in quadratic time.
Since the length $n$ of the original string is our input size, and also the length of the longest possible result string, we cannot assume that string comparisons run in constant time.
The bound you give is therefore also wrong, since you cannot check in constant time whether a string is in a list of size $L$.

Instead, the running time of the algorithm is more complicated and depends on several factors (new variable names).

The length $n$ of the input string.
The number $m$ of output strings. $m \in \Omega(n) \cap O(n^2)$.
The total length $M$ of all output strings. $M \in \Omega(n^2) \cap O(n^3)$.
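To see both ends of these ranges, a small sketch can compute $m$ and $M$ directly for two extreme inputs. The helper name `output_size` is hypothetical, and the use of Python's built-in hash-based `set` is an assumption made only for convenience of counting; it says nothing about the running time of the algorithm under discussion.

```python
def output_size(s):
    """Return (m, M): the number of distinct substrings of s and their total length."""
    unique = {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}
    m = len(unique)
    M = sum(len(t) for t in unique)
    return m, M


# One repeated character: only one distinct substring per length.
print(output_size("aaaa"))   # (4, 10)

# All characters distinct: every substring is unique.
print(output_size("abcd"))   # (10, 20)
```

For $n = 4$ the first input gives $m = n$ and $M = n(n+1)/2$, while the second gives $m = n(n+1)/2$ and $M = n(n+1)(n+2)/6$, matching the lower and upper ends of the stated ranges.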

You should be able to derive rough bounds using standard approaches. Remember to fix the dictionary implementation you use for the "list"! Using tries, for example, lookup and insertion costs are bounded by the length of the string at hand.
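As an illustration of the trie suggestion, here is a minimal sketch in which the "list" is replaced by a trie. The names `TrieNode`, `insert`, and `unique_substrings_trie` are hypothetical, and storing children in a Python `dict` is an assumption; this is one possible way to do it, not a definitive implementation.

```python
class TrieNode:
    """One trie node: children keyed by character, plus an end-of-word flag."""
    __slots__ = ("children", "terminal")

    def __init__(self):
        self.children = {}
        self.terminal = False


def insert(root, word):
    """Insert word into the trie; return True if it was not already present.

    The loop runs once per character, so the cost is bounded by len(word)
    (treating each dict operation as a single step).
    """
    node = root
    for ch in word:
        child = node.children.get(ch)
        if child is None:
            child = TrieNode()
            node.children[ch] = child
        node = child
    was_new = not node.terminal
    node.terminal = True
    return was_new


def unique_substrings_trie(s):
    """Generate the unique substrings of s, using a trie as the dictionary."""
    root = TrieNode()
    result = []
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            word = s[i:j]
            if insert(root, word):
                result.append(word)
    return result


print(unique_substrings_trie("aba"))   # ['a', 'ab', 'aba', 'b', 'ba']
```

With this dictionary, every one of the $n(n+1)/2$ generated substrings costs at most its own length to look up and insert, so the total is bounded by the sum of their lengths, $n(n+1)(n+2)/6 \in O(n^3)$ character steps, independent of how many of them turn out to be duplicates.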

Raphael · Answer 1 · 2017-09-19 11:38:28Z

