Let $A_P = (Q,\Sigma,\delta,0,\{m\})$ the string matching automaton for pattern $P \in \Sigma^m,ドル that is
- $Q = \{0,1,\dots,m\}$
- $\delta(q,a) = \sigma_P(P_{0,q}\cdot a)$ for all $q\in Q$ and $a\in \Sigma$
with $\sigma_P(w)$ the length of the longest prefix of $P$ that is a Suffix of $w,ドル that is
$\qquad \displaystyle \sigma_P(w) = \max \left\{k \in \mathbb{N}_0 \mid P_{0,k} \sqsupset w \right\}$.
Now, let $\pi$ the prefix function from the Knuth-Morris-Pratt algorithm, that is
$\qquad \displaystyle \pi_P(q)= \max \{k \mid k < q \wedge P_{0,k} \sqsupset P_{0,q}\}$.
As it turns out, one can use $\pi_P$ to compute $\delta$ quickly; the central observation is:
Assume above notions and $a \in \Sigma$. For $q \in \{0,\dots,m\}$ with $q = m$ or $P_{q+1} \neq a,ドル it holds that
$\qquad \displaystyle \delta(q,a) = \delta(\pi_P(q),a)$
But how can I prove this?
For reference, this is how you compute $\pi_P$:
m ← length[P ]
π[0] ← 0
k ← 0
for q ← 1 to m − 1 do
while k > 0 and P [k + 1] =6 P [q] do
k ← π[k]
if P [k + 1] = P [q] then
k ← k + 1
end if
π[q] ← k
end while
end for
return π
-
3$\begingroup$ Can you provide a little more detail? What is the pattern? What precisely is the prefix function -- I didn't see a prefix function on the link you provided? Is the automaton deterministic or nondeterministic? $\endgroup$Dave Clarke– Dave Clarke2012年05月05日 10:35:12 +00:00Commented May 5, 2012 at 10:35
-
3$\begingroup$ Isn't that the definition of the KMP transition function? You might find these notes helpful (but see footnote 1). $\endgroup$JeffE– JeffE2012年05月06日 15:25:30 +00:00Commented May 6, 2012 at 15:25
-
$\begingroup$ I edited the question with the prefix function of KMP $\endgroup$Bob– Bob2012年05月07日 08:20:35 +00:00Commented May 7, 2012 at 8:20
-
1$\begingroup$ So, your question is basically how to prove that the code above computes the prefix function of KMP. $\endgroup$rgrig– rgrig2012年05月07日 14:31:52 +00:00Commented May 7, 2012 at 14:31
-
2$\begingroup$ @Raphael: I find the edited question much less readable. $\endgroup$JeffE– JeffE2012年05月16日 02:55:03 +00:00Commented May 16, 2012 at 2:55
1 Answer 1
First of all, note that by definition
- $\delta(q,a) = \sigma_P(P_{0,q}\cdot a) =: s_1$ and
- $\delta(\pi_P(q),a) = \sigma_P(P_{0,\pi_P(q)}\cdot a) =: s_2$.
Let us investigate $s_1$ and $s_2$ in a sketch:
sketch
[source]
Now assume $s_2 > s_1$; this contradicts the maximal choice of $s_1$ directly. If we assume $s_1 > s_2$ we contradict the fact that both $s_2$ and $\pi_P(q)$ are chosen maximally, in particular because $\pi_P(q) \geq s_1 - 1$. As both cases cases lead to contradictions $s_1=s_2$ holds, q.e.d.
As requested, a more elaborate version of the proof:
Now we have to show $s_1=s_2$; we do this by showing that the opposite leads to contradictions.
- Assume $s_2 > s_1$. Note that $P_{0,s_2} \sqsupset P_{0,q}\cdot a$ because $P_{0,s_2} \sqsupset P_{0,\pi_P(q)}\cdot a$ and $P_{0,\pi_P(q)} \sqsupset P_{0,q}$ by definition of $s_2$. Therefore, $P_{0,s_2}$ -- a prefix of $P$ and a suffix of $P_{0,q}\cdot a$ -- is longer than $P_{0,s_1},ドル which is by definition of $s_1$ the longest prefix of $P$ that is a suffix of $P_{0,q}\cdot a$. This is a contradiction.
Before we continue with the other case, let us see that $\pi_P(q) \geq s_1 - 1$. Observe that because $P_{0,s_1} \sqsupset P_{0,q}\cdot a,ドル we have $P_{0,s-1} \sqsupset P_{0,q}$. Assuming that $\pi_P(q) < s_1 - 1$ immediately contradicts the maximal choice of $\pi_P(q)$ ($s_1 - 1$ is in the set $\pi_P(q)$ is chosen from).
- Assume $s_1 > s_2$. We have just shown $|P_{0,\pi_P(q)}\cdot a| \geq s_1,ドル and remember that $P_{0,\pi_P(q)}\cdot a \sqsupset P_{0,q} \cdot a$. Therefore, $s_1 > s_2$ contradicts the maximal choice of $s_2$ ($s_1$ is in the set $s_2$ is chosen from).
As neither $s_1 > s_2$ nor $s_2 > s_1$ can hold, we have proven that $s_1 = s_2,ドル q.e.d.
-
$\begingroup$ Can you please justify your answer for $q=m$ or $Pq+1\neq a$ ? $\endgroup$user5246– user52462012年12月30日 16:34:47 +00:00Commented Dec 30, 2012 at 16:34
-
$\begingroup$ @SCO: What do you mean by "justify"? This is just a condition in the statement that ensures that the "longest prefix of P that is a Suffix of w" is used. $\endgroup$Raphael– Raphael2013年01月01日 14:40:35 +00:00Commented Jan 1, 2013 at 14:40
Explore related questions
See similar questions with these tags.