PROBLEM LINK:
Author: Sergey Kulik
Tester: Yanpei Liu
Editorialist: Pawel Kacprzak
DIFFICULTY:
Medium-Hard
PREREQUISITES:
Strings, Pattern matching, KMP
PROBLEM:
The problem is stated really simple. You are given a string S of N lowercase Latin letters. In addition, you are given a wildcard string T, of length M, consisting of lowercase Latin characters and asterisk symbols ‘*’.
We said that T generates a string P, if and only if, P can be obtained from T by replacing all the asterisks with any, even empty, lowercase Latin letter strings.
The task is, for each suffix S[i..N] of S, compute the position of the first occurrence in this suffix, of a string P, which can be generated from T.
QUICK EXPLANATION:
Split T on asterisks to get a list L of patterns. Find occurences of all patterns from L in S. For each suffix S[i..N] of S, find in S[i..N], the leftmost chain of non overlapping occurrences of all consecutive patterns from L.
EXPLANATION:
The first observations, which we can make, is that, we can replace any run of more than one asterisk with a single one.
After that, we can split T on asterisks and get a list L of K patterns, where K is not greater that the number of asterisks + 1.
The general idea is to use a pattern matching algorithm to find all occurrences of each pattern from L in S. Since we know that the number of asterisks is at most 500, we can compute the array A[i][j], where A[i][j] = b, if and only if, the first occurrence of the j^{th} pattern from L in S[i..N] is at index b in S.
Based on the above idea, we can iterate over all suffixes of S, and for each of them, using the A array, we can easily compute the leftmost, non-overlapping occurrences of all consecutive patterns from L, which is equivalent to the leftmost occurrence of a string generated by wildcard string T.
Time Complexity
Notice that, the time complexity of this approach depends on the method used to find occurrences of patterns from L in S.
Naive pattern matching can be sufficient to pass some subtasks, but if you want a full credit, and we hope you do, you have to use any linear pattern matching algorithm, like KMP for example. If you do that, the total time complexity of a single test case is O(K \cdot N), because we use linear pattern matching to find occurrences of K patterns in a string of length N, and next, for each of N suffixes, we make at most K jumps over A array.
AUTHOR’S AND TESTER’S SOLUTIONS:
Author’s solution can be found here.
Tester’s solution can be found here.