# KMP

Time: 2021-5-8

## Basic concepts

### Intuitive understanding of the match function

1. The match array records the position in the pattern to fall back to when a mismatch occurs.

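2. The function match(j) can be expressed as follows:

• Consider the part of the pattern that ends where the pointer points, p[0..j].

• match(j) is the largest i < j such that the substring starting from the head, p[0..i], equals the substring of equal length ending at p[j], p[j-i..j]. Requiring i < j excludes the whole string itself.

• For example, in abcabca (j = 0 ~ 6) the leading abca and the trailing abca are equal, so match(6) = 3.
• match(j) = -1 if there is no such substring.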
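The definition above can be checked directly with a brute-force sketch. It is only meant to make the definition concrete: `naiveMatch` and `checkExample` are hypothetical names, the code uses `std::string` rather than the `char *` of the template, and it takes roughly cubic time, so it is not a replacement for the efficient construction.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: computes match[] straight from the definition.
// match[j] = largest i < j with p[0..i] == p[j-i..j], or -1 if none exists.
std::vector<int> naiveMatch(const std::string &p) {
    int m = static_cast<int>(p.size());
    std::vector<int> match(m, -1);
    for (int j = 0; j < m; j++) {
        // Try the longest candidate first; i < j excludes the whole string.
        for (int i = j - 1; i >= 0; i--) {
            // Compare the prefix p[0..i] with the equal-length string ending at p[j].
            if (p.compare(0, i + 1, p, j - i, i + 1) == 0) {
                match[j] = i;
                break;
            }
        }
    }
    return match;
}

// Usage example: the abcabca case from the text, inside a longer pattern.
void checkExample() {
    std::vector<int> match = naiveMatch("abcabcacab");
    assert(match[6] == 3);   // abca == abca within p[0..6]
    assert(match[7] == -1);  // no prefix equals a string ending at p[7]
}
```

For the pattern abcabcacab this gives the match values -1 -1 -1 0 1 2 3 -1 0 1.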

### Building match efficiently

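Suppose all the previous match values have already been calculated. To compute match[j], we check whether p[j] and p[match[j-1] + 1] are the same character, that is, whether the two identical strings found for j-1 can each be extended by one character and remain identical.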

1. They are the same

• match[j] = match[j-1] + 1

• Note: by reusing the previous result, we avoid searching for the longest matching string from scratch.

2. They are different

• Intuitively, match[j] has to become smaller. The question is by how much.

• On a mismatch, we look at match[j-1] (the purple block in the figure): the prefix p[0..match[j-1]] equals the string of the same length ending at p[j-1].
• That prefix has its own matching string, given by match[match[j-1]] (the green block in the figure).

• Note: the match values for smaller subscripts have already been calculated.

• This gives a shorter string ending at p[j-1] (the second purple string in the figure) that still equals a prefix of the pattern.

• Then we compare p[j] with the character right after the leading green string, that is:

• p[match[match[j-1]] + 1] == p[j] ?
• If they are equal, proceed as in case 1; if not, repeat case 2.
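The two cases can be traced with a small sketch. It is an illustration under assumptions, not the template itself: `buildMatch` restates the recurrence with `std::string` and `std::vector`, and `fallbackChain` is a hypothetical helper that records every candidate i tried while computing match[j].

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of the recurrence: start from i = match[j-1] and fall back through
// i = match[i] until p[i + 1] == p[j] or no candidate is left.
std::vector<int> buildMatch(const std::string &p) {
    int m = static_cast<int>(p.size());
    std::vector<int> match(m, -1);
    for (int j = 1; j < m; j++) {
        int i = match[j - 1];
        while (i >= 0 && p[i + 1] != p[j]) i = match[i];  // case 2: shrink
        if (p[i + 1] == p[j]) match[j] = i + 1;            // case 1: extend by one
        // otherwise match[j] keeps its initial value -1
    }
    return match;
}

// Hypothetical helper: the candidates i visited while computing match[j].
std::vector<int> fallbackChain(const std::string &p,
                               const std::vector<int> &match, int j) {
    assert(j >= 1 && j < static_cast<int>(p.size()));
    std::vector<int> tried;
    int i = match[j - 1];
    tried.push_back(i);
    while (i >= 0 && p[i + 1] != p[j]) {
        i = match[i];
        tried.push_back(i);
    }
    return tried;
}
```

For abcabcacab and j = 7, the chain is 3, 0, -1: p[4] and p[1] both differ from p[7], so match[7] ends up as -1. For j = 6 the chain is just 2, because p[3] equals p[6] immediately and case 1 applies.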

### References

KMP.pdf

Data Structures, Zhejiang University: https://www.icourse163.org/learn/ZJU-93001

## Template

### Grandma Chen Yue's template

```cpp
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
const ll maxn = 2e6 + 7;
typedef int Position;
#define NotFound -1
int match[maxn];
void BuildMatch(char *pattern) { // build the match function
    Position i, j;
    int m = strlen(pattern); // length of the pattern
    match[0] = -1; // the first character has no proper matching substring
    for (j = 1; j < m; j++) {
        i = match[j - 1];
        // On a mismatch, i falls back: match[j - 1], then match[match[j - 1]], and so on
        while ((i >= 0) && (pattern[i + 1] != pattern[j]))
            i = match[i];
        if (pattern[i + 1] == pattern[j])
            match[j] = i + 1; // the string matched for j - 1 is extended by one character
        else match[j] = -1; // no prefix equals a string ending at j
    }
}
Position KMP(char *string, char *pattern) {
    int n = strlen(string);
    int m = strlen(pattern);
    Position s, p;
    if (n < m) return NotFound;
    BuildMatch(pattern);
    s = p = 0;
    while (s < n && p < m) { // continue while neither pointer has reached the end
        if (string[s] == pattern[p]) { // characters are the same: advance both pointers
            s++;
            p++;
        } else if (p > 0) p = match[p - 1] + 1; // mismatch, not at the first pattern character: slide the pattern right so that p becomes match[p - 1] + 1
        else s++; // mismatch at the first pattern character: move s forward by one
    }
    return (p == m) ? (s - m) : NotFound;
}
char String[maxn];
char pattern[maxn];
int main() {
    //    char string[] = "This is a simple example.";
    //    char pattern[] = "si";
    //    Position p = KMP(string, pattern);
    //    if (p == NotFound) printf("Not Found.\n");
    //    else {
    //        printf("%s\n", string + p);
    //        printf("%d\n", p);
    //    }
    ll n;
    cin >> String;
    cin >> n;
    while (n--) {
        cin >> pattern;
        BuildMatch(pattern);
        Position p = KMP(String, pattern);
        if (p == NotFound) printf("Not Found\n");
        else {
            printf("%s\n", String + p);
            // p is the array index where the matched substring begins, so this prints everything from the start of the match to the end of the string
            // Input
            // abcabcabcabcacabxy
            // 3
            // abcabcacab
            // cabcabcd
            // abcabcabcabcacabxyz
            // Output
            // abcabcacabxy