REPSTR - Editorial

admin · November 22, 2012, 7:43pm

PROBLEM LINKS

DIFFICULTY

MEDIUM

EXPLANATION

At first, it is easy to prove that if there exists a substring with the length of K (K > 1) appearing the most times, there must exist another substring with the length of K-1 appearing the same times (it is such a prefix of the substring length K). So we will find the most appearing substring with the length of L to determine the most times T (first result). Then to find the longest substring appearing T times (second result), we just use binary search to solve it.

The key of both steps is counting the number of times that a substring appears. Theoretically we can use a SUFFIX TREE to solve it (easy to find documents over the internet). However I suggest another way which is easier to code and also understand but still usable to solve this. It is using HASH. As we know it is possible to use a MAP to directly store the appearing times of any substring, but it takes a long time to find and compare the strings. So I choose to encode any given substring in the base of 26 (because we have only 26 different characters), then take the modulo of a prime key (because the value of encoding may be very large, search for documents about the technique of HASH) then use it as the key to use in mapping. Another problem is, it may be wrong comparing different substring sharing the same key, so to make the keys completely different, I use a pair of prime keys to take the modulo of. And it is enough to pass this problem.

SETTER’S SOLUTION

Can be found here.

redd · June 2, 2014, 7:56pm

“So we will find the most appearing substring with the length of L to determine the most times T (first result). Then to find the longest substring appearing T times (second result), we just use binary search to solve it.”

can someone explain this line.