CHEFNUMK - Editorial



Author: Misha Chorniy
Tester: Tuan Anh Tran Dang
Translators: Sergey Kulik (Russian), Team VNOI (Vietnamese) and Hu Zecong (Mandarin)
Editorialist: Tuan Anh Tran Dang




Mo’s algorithm


Given an array of N integers A[1…N] you have to answer Q queries of the form L R K which requires you to find how many number inside the sub-array A[L…R] that appears no more than K time in that sub-array.


The query in this problem would be wierd to who are not familiar with Mo’s algorithm. If you don’t know about this algorithm please take a look at the mentioned blog. This will help you solve a class of query problems with O(Nsqrt(N)) when segment tree cannot handle.

The core of Mo’s algorithm is that you maintain the query value of the range after adding/removing elements. The frequencies of numbers can be easily maintained in this fashion. We just need an cnt[] array to keep the frequencies of each numbers. Note that while the number can be as large as 10^9 we can do some precalculation to convert them into the range [1…n]. So now let assume that A[i] ≤ N and cnt[i] is the frequency of the number i in the current range.

Recall that in this problem we need to find the number of numbers that have frequency no more than K. That value equals to the number of numbers in cnt[] array that is less than or equal to K. Now it comes to the tricky part. Let above[] array where above[i] is the number of numbers in cnt[] that is larger than i. We can see that when we add/remove elements in the current range in Mo’s Algorithm we will change the value of one element in cnt[]. More specifically we need to increase or decrease an element of cnt[] by one when adding/removing one element into the current range. Let’s see how this effect the above[] array.

  • When increase cnt[i] by one: let’s call the value of cnt[i] before the increment is v then after the increment there is one more number in cnt[] that is larger then v so we just need to increase above[v] or basically we need to increase above[cnt[i]] before do so with cnt[].
  • Similary when decrease cnt[i] we need to decrease above[cnt[i]] after that because there is now one less number in cnt[] that is larger than cnt[i].

Now we have cnt[] and above[] all being maintained corresponding to the current range. Now one question is left - What is the answer for the current query? Some of you might see that it is already there: above[0] - above[k]. above[0] is the number of numbers in cnt[] that is larger than 0 which means the number of distinct values in the current range. above[k] is the number of distinct values that is have frequency more than k in the range.




Another solution uses Segment tree and takes O(logN) per removal/insertion of element into the current range.
As already described above make cnt[] array to store frequency of numbers after compressing the values.

Also make a frequency[] array, where frequency[i] store the number of elements having frequency i.

build a segment tree over this array, to perform range-sum query.

  1. Now for adding an element x into the range first decrement the frequency[cnt[x]] also simultaneously update the SegmentTree, then increment cnt[x] and then increment frequency[cnt[x]]

  2. for removing an element x decrement frequency[cnt[x]], decrement cnt[x] and increment frequency[cnt[x]]

both operation takes O(logN) time, due to update in SegmentTree

By doing this we have maintained frequency[] corresponding to the current range
Solution to the query should be obvious now, its just the range sum of frequency[1, K], which will take O(logN), givin us an O(N * ROOT(N) * logN).

This solution would most probably give you TLE, but knowing alternatives never killed anybody.

1 Like

AUTHOR’S AND TESTER’S SOLUTIONS: links are not working