### PROBLEM LINK:

**Author:** Sergey Kulik

**Testers:** Vasya Antoniuk and Kevin Atienza

**Translators:** Vasya Antoniuk (Russian), Team VNOI (Vietnamese) and Hu Zecong (Mandarin)

**Editorialist:** Kevin Atienza

### DIFFICULTY:

Medium

### PREREQUISITES:

Segment tree

### PROBLEM:

Given a string s of length N \le 10^5 generated uniformly randomly and independently, support Q \le 10^5 operations of two kinds:

- Given a position j and a character c, replace the j th character of s by c. c is generated uniformly randomly and independently.
- Given i, j, and p, Find the p th character in the minimal cyclic shift of s[i\ldots j].

### QUICK EXPLANATION:

Let C be the length of the longest repeated substring *at any point during the operations*. Then with very very high probability, C is very small, say, C \le 24. So we assume we can compare two suffixes by only comparing their first C characters. Similarly, we can compare cyclic shifts by comparing the first C characters, assuming the cutoff point of the cyclic shift isn’t within the last C positions. Finally, there are only C cyclic shifts where the cutoff point is within the last C positions.

This means that we can set up a *segment tree* to compute *range minimum queries*. Specifically, to find the index of the minimal cyclic shift in the range [i,j], find the minimum suffix in the range [i,j-C], and then manually compare this minimum with the last C cyclic shifts. This runs in O(C^2 + C \log N). Finally, during updates where we replace the j th character, we simply update the suffixes in the index range [j-C,j] in the segment tree. This runs in O(C^2 \log N).

### EXPLANATION:

# Randomness

The fact that the string and the updates are generated *uniformly randomly* and *independently* is important: It means we can expect not to see any “non-random” structures in the input, like long runs of a single letter `...aaaaaaaaaaa...`

and other simple patterns `...aabaabaabaa...`

. There are still some nonzero probability that these appear, but this probability is vanishingly small for really long patterns.

Given this, let’s see how a naïve algorithm will fare. Note that in the naïve algorithm, modifications are trivial. The only slow part is finding the minimal cyclic shift, which takes O(N^2) in the worst case.

In this pseudocode, we let s[i] be the i th character of s. Note that in the substring s[i\ldots j], the p th character in the cyclic shift that starts at k (i \le k \le j) is s[i + (k - i + p - 1) \bmod (j - i + 1)]. Also, be careful: this code uses one-indexing.

```
def s_cycle(i, j, k, p):
// the p'th character in the cyclic shift of s[i..j] starting at k
return s[i + (k - i + p - 1) % (j - i + 1)]
def s_cycle_compare(i, j, k1, k2):
// compare the cyclic shifts starting at k1 and k2
if k1 == k2:
// if k1 == k2, then we already know they're equal
return 0
for p=1..j-i+1:
c1 = s_cycle(i, j, k1, p)
c2 = s_cycle(i, j, k2, p)
if c1 < c2: return -1
if c1 > c2: return +1
// all characters are equal
return 0
def find_minimal_cyclic_shift(i, j, p):
k_min = i
for k=i..j:
if s_cycle_compare(i, j, k_min, k) > 0:
k_min = k
return s_cycle(i, j, k_min, p)
```

Since s is generated uniformly randomly and independently, it means the comparison `s_cycle_compare`

will most likely halt within just a couple of comparisons. Specifically, if `k1`

and `k2`

are different, then the probability of `c1`

being equal to `c2`

(and thus continuing the loop) is equal to 1/26. So the probability that the loop runs C times is 1/26^C, which is a rapidly-decaying function! For C = 24, this probability is already < 1.1\times 10^{-34}. This is a very low number. To illustrate, notice that in a query, the most number of times `s_cycle_compare`

will be called is 10^5. There are at most 10^5 queries, so there are at most 10^{10} calls all in all in a single test case. Thus, the probability that the loop inside `s_cycle_compare`

runs at least 24 times is *approximately* 1.1\times 10^{-34}\times 10^{10} = 1.1\times 10^{-24}, which is still very small. Even if we run this solution against one billion test cases, the probability of it exceeding 24 comparisons is still approximately 1.1\times 10^{-24}\times 10^9 = 1.1\times 10^{-15} which is still very small! (These calculations are not 100% accurate, but that’s okay; they only serve to give some intuition to the smallness of 1.1\times 10^{-34}. If for some weird reason this isn’t small enough for you, you can always replace 24 with something higher, say 30 or 40.)

Thus we can expect the loop in `s_cycle_compare`

to run at most 24 times. In fact, we can even safely modify the `s_cycle_compare`

implementation code above to the following:

```
def s_cycle_compare(i, j, k1, k2):
...
for p=1..min(24,j-i+1):
...
...
```

This is of course incorrect in some cases, but there is a very, very low probability that it will fail, so we’re still good!

But notice that if we’re only comparing the first 24 characters, then for most pairs of cyclic shifts in [i,j], their comparison is actually the same as the comparison between the *suffixes* that start in the same location! Specifically, for cyclic shifts that start in the indices [i,j-24], we can expect that comparing them will finish before the indices start to *wrap around* the substring s[i\ldots j].

To explain further, we also modify the remaining parts of the pseudocode above:

```
def s_cycle(i, j, k, p):
// the p'th character in the cyclic shift of s[i..j] starting at k
return s[i + (k - i + p - 1) % (j - i + 1)]
def s_suffix_compare(k1, k2):
// compare the suffixes starting at k1 and k2
// assume the string 's' is padded with a null character
// (or some unique dummy character) at the end
// so the loop below doesn't go out of bounds.
if k1 == k2:
return 0
for p=1..24:
c1 = s[k1+p-1]
c2 = s[k2+p-1]
if c1 < c2: return -1
if c1 > c2: return +1
return 0
def s_cycle_compare(i, j, k1, k2):
// compare the cyclic shifts starting at k1 and k2
if k1 == k2:
return 0
for p=1..min(24,j-i+1):
c1 = s_cycle(i, j, k1, p)
c2 = s_cycle(i, j, k2, p)
if c1 < c2: return -1
if c1 > c2: return +1
return 0
def find_minimal_cyclic_shift(i, j, p):
k_min = i
cutoff = max(i,j-24)
for k=i..cutoff:
if s_suffix_compare(k_min, k) > 0:
k_min = k
for k=cutoff+1..j:
if s_cycle_compare(i, j, k_min, k) > 0:
k_min = k
return s_cycle(i, j, k_min, p)
```

Notice that in the range [i,j-24], we’re using the `s_suffix_compare`

function instead of `s_cycle_compare`

. Again, this fails with very very low probability.

# Faster range minimum

Unfortunately, the code above is still very slow, because the range [i,j] can be large and there are a lot of queries. Specifically, it runs in O(QNC) time in the worst case, where C is the constant we chose. (24 in the code above.) To improve it, notice that the first loop in `find_minimal_cyclic_shift`

above is actually a **range minimum query**. We’re finding the “minimum” suffix in the range [i,\max(i,j-C)], where comparison is string comparison. But range minimum queries are standard operations which can be solved in a variety of ways!

In this case, we can use **segment trees** because there are also updates. Specifically, we set up a segment tree over the suffixes of the string, but we only store the indices, and for comparison we use `s_suffix_compare`

. Then when we update some character, say s[j], notice that only the suffixes with indices [j-C+1,j] will be affected, because of our assumption. This means we only need to update up to C elements of the segment tree!

Let’s analyze the running time. Segment tree point updates and range queries need O(\log N) comparisons, but each operation takes O(C) time, so a segment tree operation runs in O(C \log N) time. The worst queries are updates, where we perform C segment tree updates, so it takes O(C^2 \log N) time. This means that overall, the running time is O(QC^2 \log N)! There’s also an O(NC)-time construction of the segment tree.

The overall running time is O(NC + QC^2 \log N) which is fast enough, but if O(QC^2 \log N) looks a bit slow and bothers you, then you can consider replacing the C updates with a single *range update*, but this time it’s okay to do it *without* lazy propagation. At the very least it will save you some number of comparisons, particularly near the root.

### Time Complexity:

O(C (N + Q C \log N)) where C is an upper bound to the longest length of any repeated substring