Z算法用于查找线性时间内字符串中某个模式的出现。假设字符串的长度为n并且要搜索的模式的大小为m,则求解所需的时间为O(m + n)。
Z算法使用Z数组查找模式的出现。
它是与字符串长度相同的数组。z数组的每个元素都由字符串的最长子字符串的长度(从I开始)组成,可以用作字符串本身的前缀。
在该算法中,给定长度为n的字符串S和长度为m的待搜索模式p。
我们将创建一个z数组。现在,该算法将i = 1到n-1的字符串的所有字符循环。现在,我们将创建一个子字符串s [LR],它是一个前缀子字符串,使得1≤L≤I≤R。
现在,以正确的时间间隔为i-1和直到i-1的所有z值创建子字符串[L,R]。使用以下步骤计算z [i]和新间隔[L,R]-
Step1: if i > R, no larger prefix-substring is possible. So, we will compute new interval bt comparing S[0] to S[i] i.e. string starting from index 0 i.e. from start with substring starting from index i. And find z[i] using z[i] = R - L + 1. Step 2: Else if, i ≤ R, [L, R] can be extended to i. For k = i-L, z[i] ≥ min(Z[k], R-i+1). Step 2.1: If, Z[k] < R-i+1, no longer prefix substring s[i] exist. Step 2.2: If Z[k] ≥ R-i+1, then there can be a longer substring. So, we will update [L, R] by changing L = i and changing R by matching from S[R+1].
此过程在一次迭代中找到所有Z值(仅循环一次)。
程序展示算法的实现-
#include<iostream> using namespace std; void createZarray(string str, int Z[]){ int n = str.length(); int L, R, k; L = R = 0; for (int i = 1; i < n; ++i){ if (i > R){ L = R = i; while (R<n && str[R-L] == str[R]) R++; Z[i] = R-L; R--; } else { k = i-L; if (Z[k] < R-i+1) Z[i] = Z[k]; else { L = i; while (R<n && str[R-L] == str[R]) R++; Z[i] = R-L; R--; } } } } void zAlgorithm(string text, string pattern){ string str = pattern+"$"+text; int len = str.length(); int Z[len]; createZarray(str, Z); for (int i = 0; i < len; ++i){ if (Z[i] == pattern.length()) cout<<(i-pattern.length()-1)<<"\t"; } } int main(){ string str = "Hello! Welcome To (cainiaojc.com) programming tutorial"; string pattern = "tutorial"; cout<<"The patter ' "<<pattern<<" ' is found in the string '"<<str<<" ' at index \t"; zAlgorithm(str, pattern); return 0; }
输出结果
The patter ' tutorial ' is found in the string 'Hello! Welcome To (cainiaojc.com) programming tutorial ' at index 18 46