Given a string S with length up to 10^5. You have to find a prefix of S that occurs the maximum number of times as a substring in S (among all prefixes). If there’s a tie, pick the longest prefix.
Clearly, the prefix consisting of the first letter alone occurs the maximum number of times among all prefixes. Because whatever is the answer prefix, the first letter is always a prefix of it. However, it’s not necessarily the longest one.
To find the longest one, let’s maintain a list of all appearances of our string’s first letter. Let’s say the prefix consisting of the first 2 letters has maximum occurrences as well, what condition should hold?
Answer: Every appearance of the string’s first letter must be followed immediately by the string’s second letter. If one of these appearances was followed by a different letter, then clearly the prefix consisting of 2 letters occurs fewer times. So what are the condition for lengths > 2? (think about it).
If our answer’s length is K then all appearances of the first letter must be followed by exactly the same K-1 letters. So after we maintain the list of appearances indices, we keep moving them forward simultaneously one letter at each step (scanning substrings which present a prefix) until some mismatch happens or the last index reaches the end of the string.
This solution runs in O(|S|)
We can prove that according to the fact, that our segments built by moving our pointers cannot ever intersect. If 2 of them intersected, it means that at least one of our pointers points to some index with corresponding letter equal to the string’s first letter. BUT when we started a substring from the last appearance of our string’s first letter, there’s no other appearance in front of it, so it’s guaranteed that there will be a mismatch (or it reached the end of the string).