Editorial of http://www.codechef.com/COOK43/problems/UNIQUE problem is not found

jangwa · February 17, 2014, 6:53pm

Admin can you please upload the editorial for the problem http://www.codechef.com/COOK43/problems/UNIQUE

betlista · February 17, 2014, 7:09pm

I’m also waiting for this one

xellos0 · February 17, 2014, 7:46pm

You can solve it using a suffix array.

Imagine you listed all suffixes and sorted them in alphabetical order. A unique substring is a prefix of one of those suffixes, which isn’t a prefix of the suffixes preceding and succeeding it. It’s clear that for a fixed suffix: if this property holds for some prefix, it also holds for any longer one, and we can bin-search the shortest such prefix by comparing hashes. Denote the length of this prefix by l_x (x is the starting index of our suffix).

Take a fixed index i. We want the shortest such substring; if it start at the index j <= i, 2 cases can appear:

j+l_j >= i, and the shortest unique prefix of this suffix covers i; we only need the largest such j
j+l_j < i, and the shortest unique prefix of this suffix that covers i is that which ends at i; we also need the largest such j

So, we need to find the shortest substrings for every i and for each of the 2 cases separately (for each case and i, all substrings have different lengths, so lexicographic order is not an issue); then, we can just pick the better solution for each i.

Let’s try it from a different direction: we fix j and vary i. We already know that the first case corresponds to i >= j+l_j, and the second to j <= i < j+l_j; for all indices between j and j+l_j-1, we increase the “second case j” to this j (if we go by increasing j), and for all indices >= j+l_j, we increase the “first case j” to this j.

Aside from a bazooka like lazy-loading interval tree, the first case only needs one array A (where we perform operations A[i+1] =max(A[i+1],A[i]) at the end) and the second case can be done with deque-based sliding window or just two sets, in which you remember the pairs (value,interval) and (interval end,interval), and remove the intervals that have already ended - whatever you prefer.

Lexicographic comparison can be done easily on suffix arrays, and the length of the result is given implicitly by case 1 (length l_j) or 2 (ends at i).

A simple implementation of suffix arrays + bin-search and the consequent traversal can be done in O(N log^2 N) time and should pass easily.