
In many programming problems involving strings, we often need to search for occurrences of a pattern inside a text. A classic approach like the naive string matching algorithm checks for the pattern at every index of the text, leading to a time complexity of O(n·m), where n is the length of the text and m is the length of the pattern. This is inefficient for large inputs.
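
For comparison, the naive approach described above can be sketched as follows. This is a minimal illustration rather than an optimized implementation, and the name naive_search is chosen here for illustration only.

```python
def naive_search(text, pattern):
    """Return all 0-based indices where pattern occurs in text."""
    n, m = len(text), len(pattern)
    matches = []
    # Try every possible starting index in the text.
    for i in range(n - m + 1):
        j = 0
        # Up to m character comparisons per starting index: O(n*m) overall.
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:
            matches.append(i)
    return matches


print(naive_search("aabxaabxaa", "aab"))  # [0, 4]
```

On inputs such as a long run of one repeated character, the inner loop runs almost m times for nearly every start index, which is where the O(n·m) bound comes from.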

To address this, several efficient string matching algorithms exist. One of them is the Z-Algorithm, which performs pattern matching in linear time: it finds all matches in O(n + m) time.

What Is the Z-Array?

The Z-Algorithm revolves around computing something called the Z-array.

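Let’s say we have a string s of length n. The Z-array is an array Z of length n in which Z[i] is the length of the longest substring starting at index i that is also a prefix of s.

In simpler terms: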

Z[i] tells us how many characters from position i onwards match with the beginning of the string.

Let’s take a small example to understand this better before we move on to the construction of the Z-array.

Example:
s = "aabxaab".

Z[0] = 0 (by definition; we don't compare the full string with itself)

Now, we compute the remaining values from index 1 to 6.

Z[1] = 1 → Only the first character 'a' matches with the prefix.

Z[2] = 0 → 'b' does not match the first character of the prefix 'a'.

Z[3] = 0 → 'x' does not match the first character of the prefix 'a'.

Z[4] = 3 → The substring "aab" matches the prefix "aab".

Z[5] = 1 → Only 'a' matches with the first character of the prefix.

Z[6] = 0 → 'b' does not match the first character of the prefix 'a'.

Final Z-array

Z[] = [0, 1, 0, 0, 3, 1, 0]

Calculation of Z Array

The naive approach to compute the Z-array checks, for each index i, how many characters from s[i...] match the prefix starting at s[0]. This can lead to O(n²) time in the worst case.
However, using the Z-algorithm, we can compute all Z[i] values in O(n) time.
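
The naive computation mentioned above can be sketched directly from the definition. This is an illustrative baseline only (the name z_naive is not from the article), useful for checking the faster version against small inputs.

```python
def z_naive(s):
    """Compute the Z-array by direct comparison; O(n^2) in the worst case."""
    n = len(s)
    z = [0] * n  # z[0] stays 0, following the convention used in this article
    for i in range(1, n):
        # Extend the match between the prefix and s[i:] one character at a time.
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
    return z


print(z_naive("aabxaab"))  # [0, 1, 0, 0, 3, 1, 0]
```

For a string like "aaaa...a", every index i rescans almost the whole remaining suffix, which is what produces the quadratic behaviour.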

While computing the Z-array, we maintain a window [l, r], known as the Z-box. It is the segment s[l..r] that matches a prefix of the string and has the largest right end r found so far.

This window helps us reuse previous computations to optimize Z-array construction.

Why is Z-box helpful?

When processing index i, there are two possibilities:

  1. If i > r (outside the Z-box):
    => Start comparing the prefix with the substring beginning at i.
    => Count the number of matching characters and store this length in Z[i].
    => If the match is non-empty, update the window [l, r] to [i, i + Z[i] - 1].
  2. If i ≤ r (inside the Z-box):
    => Let k be the position corresponding to i within the prefix (k = i - l).
    => Use the value Z[k] as a reference.
    => If Z[k] is strictly less than the remaining length of the window (r - i + 1), assign Z[i] = Z[k].
    => Otherwise, set Z[i] = r - i + 1 and compare characters beyond r to extend the match.
    => After extending, update the window [l, r] if the match now ends past r.
C++
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;

vector<int> zFunction(string s) {
    int n = s.length();
    vector<int> z(n);
    int l = 0, r = 0;
    for (int i = 1; i < n; i++) {
        if (i <= r) {
            int k = i - l;

            // Case 2: reuse the previously computed value
            z[i] = min(r - i + 1, z[k]);
        }
        // Try to extend the Z-box beyond r
        while (i + z[i] < n && s[z[i]] == s[i + z[i]]) {
            z[i]++;
        }
        // Update the [l, r] window if extended
        if (i + z[i] - 1 > r) {
            l = i;
            r = i + z[i] - 1;
        }
    }
    return z;
}

int main() {
    string s = "aabxaab";

    vector<int> z = zFunction(s);

    for (int i = 0; i < z.size(); ++i) {
        cout << z[i] << " ";
    }
}
Java
import java.util.ArrayList;
import java.util.Arrays;

public class GfG {
    public static ArrayList<Integer> zFunction(String s) {
        int n = s.length();
        ArrayList<Integer> z = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            z.add(0);
        }
        int l = 0, r = 0;
        for (int i = 1; i < n; i++) {
            if (i <= r) {
                int k = i - l;

                // Case 2: reuse the previously computed value
                z.set(i, Math.min(r - i + 1, z.get(k)));
            }
            // Try to extend the Z-box beyond r
            while (i + z.get(i) < n &&
                   s.charAt(z.get(i)) == s.charAt(i + z.get(i))) {
                z.set(i, z.get(i) + 1);
            }
            // Update the [l, r] window if extended
            if (i + z.get(i) - 1 > r) {
                l = i;
                r = i + z.get(i) - 1;
            }
        }
        return z;
    }

    public static void main(String[] args) {
        String s = "aabxaab";

        ArrayList<Integer> z = zFunction(s);

        for (int x : z) {
            System.out.print(x + " ");
        }
    }
}
Python
def zFunction(s):
    n = len(s)
    z = [0] * n
    l, r = 0, 0
    for i in range(1, n):
        if i <= r:
            k = i - l

            # Case 2: reuse the previously computed value
            z[i] = min(r - i + 1, z[k])
        # Try to extend the Z-box beyond r
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        # Update the [l, r] window if extended
        if i + z[i] - 1 > r:
            l = i
            r = i + z[i] - 1
    return z

if __name__ == "__main__":
    z = zFunction("aabxaab")
    print(" ".join(map(str, z)))
C#
using System;
using System.Collections.Generic;

public class GfG {
    public static List<int> zFunction(string s) {
        int n = s.Length;
        List<int> z = new List<int>(new int[n]);
        int l = 0, r = 0;
        for (int i = 1; i < n; i++) {
            if (i <= r) {
                int k = i - l;

                // Case 2: reuse the previously computed value
                z[i] = Math.Min(r - i + 1, z[k]);
            }
            // Try to extend the Z-box beyond r
            while (i + z[i] < n && s[z[i]] == s[i + z[i]]) {
                z[i]++;
            }
            // Update the [l, r] window if extended
            if (i + z[i] - 1 > r) {
                l = i;
                r = i + z[i] - 1;
            }
        }
        return z;
    }

    public static void Main() {
        string s = "aabxaab";
        List<int> result = zFunction(s);
        Console.WriteLine(string.Join(" ", result));
    }
}
JavaScript
function zFunction(s) {
    let n = s.length;
    let z = new Array(n).fill(0);
    let l = 0, r = 0;
    for (let i = 1; i < n; i++) {
        if (i <= r) {
            let k = i - l;

            // Case 2: reuse the previously computed value
            z[i] = Math.min(r - i + 1, z[k]);
        }
        // Try to extend the Z-box beyond r
        while (i + z[i] < n && s.charAt(z[i]) === s.charAt(i + z[i])) {
            z[i]++;
        }
        // Update the [l, r] window if extended
        if (i + z[i] - 1 > r) {
            l = i;
            r = i + z[i] - 1;
        }
    }
    return z;
}

// Driver Code
const z = zFunction("aabxaab");
console.log(z.join(" "));

Output
0 1 0 0 3 1 0 

Time Complexity: O(n)
Auxiliary Space: O(n)

Why This Works in Linear Time
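
A sketch of the usual amortized argument: every successful character comparison in the extension loop pushes r one position to the right, r never decreases and never exceeds n - 1, and each index i incurs at most one failed comparison. The total number of comparisons therefore stays below 2n. The instrumented variant below checks this empirically; z_with_count is a hypothetical helper name introduced here, not part of the article's code.

```python
def z_with_count(s):
    """Compute the Z-array and count character comparisons performed."""
    n = len(s)
    z = [0] * n
    l, r = 0, 0
    comparisons = 0
    for i in range(1, n):
        if i <= r:
            # Reuse the value already known for the mirrored prefix position.
            z[i] = min(r - i + 1, z[i - l])
        while i + z[i] < n:
            comparisons += 1
            if s[z[i]] != s[i + z[i]]:
                break  # at most one failed comparison per index i
            z[i] += 1  # each success ends up moving r to the right
        if i + z[i] - 1 > r:
            l, r = i, i + z[i] - 1
    return z, comparisons


for sample in ["a" * 1000, "ab" * 500, "aabxaab"]:
    _, count = z_with_count(sample)
    print(len(sample), count, count <= 2 * len(sample))
```

The last column should print True for every sample, including the all-'a' string on which the naive method is quadratic.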

How Z-array Helps in Pattern Matching

Given two strings, text and pattern, consisting of lowercase English letters, find all 0-based starting indices where pattern occurs as a substring of text.

Example:

Input: text = "aabxaabxaa", pattern = "aab"
Output: [0, 4]
Explanation: The pattern "aab" occurs in the text starting at indices 0 and 4.

[Image: KMP-Algorithm-for-Pattern-Searching]

The key idea is to preprocess a new string formed by combining the pattern and the text, separated by a special delimiter (e.g., $) that doesn’t appear in either string. This avoids accidental overlaps.
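
To see why the delimiter matters, the sketch below runs the same search with and without it. The sep parameter and the function names are assumptions made for this illustration. Without a separator, a match can extend from the pattern part into the text, so Z-values may exceed the pattern length and the equality check misses occurrences.

```python
def z_function(s):
    """Linear-time Z-array, following the construction described earlier."""
    n = len(s)
    z = [0] * n
    l, r = 0, 0
    for i in range(1, n):
        if i <= r:
            z[i] = min(r - i + 1, z[i - l])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] - 1 > r:
            l, r = i, i + z[i] - 1
    return z


def search(text, pattern, sep="$"):
    """Report text indices where the Z-value equals the pattern length."""
    m = len(pattern)
    z = z_function(pattern + sep + text)
    start = m + len(sep)  # index in the combined string where the text begins
    return [i - start for i in range(start, len(z)) if z[i] == m]


print(search("aaa", "aa"))          # [0, 1]  (with '$' as separator)
print(search("aaa", "aa", sep=""))  # [1]     (index 0 is missed)
```

With no separator the combined string is "aaaaa", and the Z-value at the first text position is 3 rather than 2, so the check Z[i] == m fails there.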

We construct a new string as:

s = pattern + '$' + text

We then compute the Z-array for this combined string.

The Z-array at any position i tells us the length of the longest prefix of the pattern that matches the substring of the text starting at that position (adjusted for offset due to the pattern and separator).

So, whenever we find a position i such that:

Z[i] == length of pattern

it means the entire pattern matches the text at a position:

match position = i - (pattern length + 1)
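
As a small worked example of this index mapping, the sketch below prints the Z-array of the combined string for the example input and converts each qualifying index back to a text position. The helper names are illustrative assumptions.

```python
def z_function(s):
    """Linear-time Z-array (Z[0] is left as 0)."""
    n = len(s)
    z = [0] * n
    l, r = 0, 0
    for i in range(1, n):
        if i <= r:
            z[i] = min(r - i + 1, z[i - l])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] - 1 > r:
            l, r = i, i + z[i] - 1
    return z


def match_positions(text, pattern):
    """Return (combined_index, text_index) pairs for every full match."""
    m = len(pattern)
    z = z_function(pattern + "$" + text)
    # A full match at combined index i starts at i - (m + 1) in the text.
    return [(i, i - (m + 1)) for i in range(m + 1, len(z)) if z[i] == m]


combined = "aab" + "$" + "aabxaabxaa"
print(z_function(combined))  # [0, 1, 0, 0, 3, 1, 0, 0, 3, 1, 0, 0, 2, 1]
print(match_positions("aabxaabxaa", "aab"))  # [(4, 0), (8, 4)]
```

The Z-value 3 appears at combined indices 4 and 8; subtracting the pattern length plus one for the separator gives text positions 0 and 4.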

Illustrations:

C++
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;

// Z-function to compute Z-array
vector<int> zFunction(string &s) {
    int n = s.length();
    vector<int> z(n);
    int l = 0, r = 0;
    for (int i = 1; i < n; i++) {
        if (i <= r) {
            int k = i - l;

            // Case 2: reuse the previously computed value
            z[i] = min(r - i + 1, z[k]);
        }
        // Try to extend the Z-box beyond r
        while (i + z[i] < n && s[z[i]] == s[i + z[i]]) {
            z[i]++;
        }
        // Update the [l, r] window if extended
        if (i + z[i] - 1 > r) {
            l = i;
            r = i + z[i] - 1;
        }
    }
    return z;
}

// Function to find all occurrences of pattern in text
vector<int> search(string &text, string &pattern) {
    string s = pattern + '$' + text;
    vector<int> z = zFunction(s);
    vector<int> pos;
    int m = pattern.size();
    for (int i = m + 1; i < z.size(); i++) {
        if (z[i] == m) {
            // pattern match starts here in text
            pos.push_back(i - m - 1);
        }
    }
    return pos;
}

int main() {
    string text = "aabxaabxaa";
    string pattern = "aab";
    vector<int> matches = search(text, pattern);
    for (int pos : matches)
        cout << pos << " ";
    return 0;
}
Java
import java.util.ArrayList;
import java.util.Arrays;

public class GfG {

    // Z-function to compute Z-array
    static ArrayList<Integer> zFunction(String s) {
        int n = s.length();
        ArrayList<Integer> z = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            z.add(0);
        }
        int l = 0, r = 0;

        for (int i = 1; i < n; i++) {
            if (i <= r) {
                int k = i - l;

                // Case 2: reuse the previously computed value
                z.set(i, Math.min(r - i + 1, z.get(k)));
            }

            // Try to extend the Z-box beyond r
            while (i + z.get(i) < n &&
                   s.charAt(z.get(i)) == s.charAt(i + z.get(i))) {
                z.set(i, z.get(i) + 1);
            }

            // Update the [l, r] window if extended
            if (i + z.get(i) - 1 > r) {
                l = i;
                r = i + z.get(i) - 1;
            }
        }

        return z;
    }

    // Function to find all occurrences of pattern in text
    static ArrayList<Integer> search(String text, String pattern) {
        String s = pattern + '$' + text;
        ArrayList<Integer> z = zFunction(s);
        ArrayList<Integer> pos = new ArrayList<>();
        int m = pattern.length();

        for (int i = m + 1; i < z.size(); i++) {
            if (z.get(i) == m) {

                // pattern match starts here in text
                pos.add(i - m - 1);
            }
        }
        return pos;
    }

    public static void main(String[] args) {
        String text = "aabxaabxaa";
        String pattern = "aab";

        ArrayList<Integer> matches = search(text, pattern);

        for (int pos : matches)
            System.out.print(pos + " ");
    }
}
Python
def zFunction(s):
    n = len(s)
    z = [0] * n
    l, r = 0, 0
    for i in range(1, n):
        if i <= r:
            k = i - l
            # Case 2: reuse the previously computed value
            z[i] = min(r - i + 1, z[k])
        # Try to extend the Z-box beyond r
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        # Update the [l, r] window if extended
        if i + z[i] - 1 > r:
            l = i
            r = i + z[i] - 1
    return z

def search(text, pattern):
    s = pattern + '$' + text
    z = zFunction(s)
    pos = []
    m = len(pattern)
    for i in range(m + 1, len(z)):
        if z[i] == m:

            # pattern match starts here in text
            pos.append(i - m - 1)
    return pos

if __name__ == '__main__':
    text = 'aabxaabxaa'
    pattern = 'aab'
    matches = search(text, pattern)
    for pos in matches:
        print(pos, end=' ')
C#
using System;
using System.Collections.Generic;

public class GfG {

    // Z-function to compute Z-array
    static List<int> zFunction(string s) {
        int n = s.Length;
        List<int> z = new List<int>(new int[n]);
        int l = 0, r = 0;
        for (int i = 1; i < n; i++) {
            if (i <= r) {
                int k = i - l;
                // Case 2: reuse the previously computed value
                z[i] = Math.Min(r - i + 1, z[k]);
            }
            // Try to extend the Z-box beyond r
            while (i + z[i] < n && s[z[i]] == s[i + z[i]]) {
                z[i]++;
            }
            // Update the [l, r] window if extended
            if (i + z[i] - 1 > r) {
                l = i;
                r = i + z[i] - 1;
            }
        }
        return z;
    }

    // Function to find all occurrences of pattern in text
    static List<int> search(string text, string pattern) {
        string s = pattern + '$' + text;
        List<int> z = zFunction(s);
        List<int> pos = new List<int>();
        int m = pattern.Length;
        for (int i = m + 1; i < z.Count; i++) {
            if (z[i] == m) {

                // pattern match starts here in text
                pos.Add(i - m - 1);
            }
        }
        return pos;
    }

    public static void Main() {
        string text = "aabxaabxaa";
        string pattern = "aab";
        List<int> matches = search(text, pattern);
        foreach (int pos in matches)
            Console.Write(pos + " ");
    }
}
JavaScript
function zFunction(s) {
    let n = s.length;
    let z = new Array(n).fill(0);
    let l = 0, r = 0;
    for (let i = 1; i < n; i++) {
        if (i <= r) {
            let k = i - l;
            // Case 2: reuse the previously computed value
            z[i] = Math.min(r - i + 1, z[k]);
        }
        // Try to extend the Z-box beyond r
        while (i + z[i] < n && s[z[i]] === s[i + z[i]]) {
            z[i]++;
        }
        // Update the [l, r] window if extended
        if (i + z[i] - 1 > r) {
            l = i;
            r = i + z[i] - 1;
        }
    }
    return z;
}

function search(text, pattern) {
    let s = pattern + '$' + text;
    let z = zFunction(s);
    let pos = [];
    let m = pattern.length;
    for (let i = m + 1; i < z.length; i++) {
        if (z[i] === m) {

            // pattern match starts here in text
            pos.push(i - m - 1);
        }
    }
    return pos;
}

// Driver Code
let text = 'aabxaabxaa';
let pattern = 'aab';
let matches = search(text, pattern);
console.log(matches.join(" "));

Output
0 4 

Time Complexity: O(n + m), where n is the length of the text and m is the length of the pattern, since the combined string and Z-array are processed linearly.
Auxiliary Space: O(n + m), used to store the combined string and the Z-array for efficient pattern matching.

Advantages of Z-Algorithm

Real-Life Applications

