MEXDIV - Editorial

admin · August 17, 2017, 3:53am

DIFFICULTY

Easy-Medium

PREREQUISITES

Dynamic Programming

PROBLEM

Compute number of good partitions an array A of n elements into contiguous subarrays. A partition is good if the mex of each subarray in the partition is \le k. Constraints: n \le 5 \cdot 10^5 and k, A_i \le 10^9.

EXPLANATION

Note that there are many possible solutions to this problem. In this editorial, I will explain one which makes use of the Segment/Fenwick data structure.

There are a couple of observations to make before we proceed:

Firstly, observe that the mex of an array can only increase as you add more elements to it. Formally, mex(A_i, A_{i + 1}, \dots, A_j) \le mex(A_i, A_{i + 1}, \dots, A_k) \forall k \ge j.
Secondly, note that by Pigeonhole Principle the mex of any subarray can never exceed n. Formally, the mex of a list of x elements can never be more than x. This means we don’t need to worry about the case where k > n, since in that case all partitions are valid and the answer is 2^{n - 1}.

These observations are enough for us to be able to come up with a polynomial time dynamic programming solution. Define dp[i] as the number good partitions of A_1, A_2 \dots A_i. To compute dp[i], we can iterate over all possible cuts j < i, and add dp[j] to dp[i] if mex(A_{j + 1}, A_{j + 2}, \dots, A_i) \le k. Formally,

dp[i] = \sum_{\substack{j < i \\ mex(A_{j + 1}, A_{j + 2}, \dots A_i) \le k}} dp[j].

The answer we’re looking for is dp[n]. This would be O(n^2) or O(n^3) based on how you implement it.

Now we use our first observation to speed up this dynamic program. For a fixed i, the set of j that will contribute to dp[i] would be contiguous and a suffix of A_1, A_2, \dots, A_i. So, if we know the length of this suffix, we’ll know which values to add and we can do that in constant time. So the problem that remains is finding how long this suffix is.

The reduced problem can be restated as the following: Given an array A of n elements and an integer k, compute an array suffix. Here, suffix[i] is the maximum integer j such that j \le i and mex(A_{j}, A_{j + 1}, A_{j + 2}, \dots, A_i) = k. If no such j exists, suffix[i] = -1. Intuitively, every integer from 0 to k should appear atleast once in A_{j}, A_{j + 1}, \dots, A_i. To compute this efficiently, we’ll use a Segment/Fenwick tree. The procedure is as follows:

Iterate from 1 to n. For each i, we will compute dp[i].
Let last[x] denote the latest occurrence of the value x seen till index i. Initialize last[x] = -1 for all x.
If A[i] \le k, update position A[i] in the tree to store last[A[i]] = i.
Query the tree to compute min(last[0], last[1], \dots, last[k]). Let this value be v.
Mark suffix[i] = v.

After computing suffix, the problem becomes trivial. Since we query and update the tree atmost n times, the complexity is O(n \log n).

NOTE

There exists an O(n) solution as well, which does not require any fancy data structures. However, it is very useful to have basic knowledge of segment tree, as you really require no thinking to optimize such dynamic programs from O(n^2) to O(n \log n) and is often easier to code.

COMPLEXITY

O(n \log{n})