SEREJA - Editorial

PROBLEM LINKS

Practice

Contest

DIFFICULTY

MEDIUM

PREREQUISITES

Simple Math

PROBLEM

You are given a sequence of random integers A1, A2, …, AN. Compute the product of max(Ai, Ai+1, …, Aj) - min(Ai, Ai+1, …, Aj) over all 1 ≤ i < j ≤ N, modulo 1,000,000,007.

QUICK EXPLANATION

Let ans[j] be the answer to the problem for subsequence A1, A2, …, Aj; i.e., ans[j] = diff(1, j) × diff(2, j) × … diff(j-1, j), where diff(i, j) = max(Ai, Ai+1, …, Aj) - min(Ai, Ai+1, …, Aj). Then, the actual answer is ans[1] × ans[2] × … × ans[N]. The value of ans[j+1] can be calculated from ans[j] in O(1) amortized time since the array is random.

EXPLANATION

In this solution, we will add the elements of A one-by-one. Let’s consider subsequence A1, A2, …, Aj and solve it. To do this, we will maintain two additional arrays M[] and m[], where:

  • M[i] = max(Ai, Ai+1, ..., Aj)
  • m[i] = min(Ai, Ai+1, ..., Aj)

For example, let A = (3, 6, 5, 1, 5, 3, 4). (Here, j = 7.) Then,

  • M = (6, 6, 5, 5, 5, 4, 4)
  • m = (1, 1, 1, 1, 3, 3, 4)

It is obvious that the elements of M are sorted in non-increasing order, while the elements of m are sorted in non-decreasing order.

Let diff(i, j) = max(Ai, Ai+1, …, Aj) - min(Ai, Ai+1, …, Aj). Suppose we know the answer for this subproblem; i.e., we have calculated ans[j] = the product of diff(i2, j2) for all 1 ≤ i2 < j2 ≤ j. Let’s add element Aj+1. Now, we want calculate the subproblem A1, A2, …, Aj+1. It can be seen that

ans[j+1] = ans[j] × {diff(1, j+1) × diff(2, j+1) × … × diff(j, j+1)}

The value of {diff(1, j+1) × diff(2, j+1) × … × diff(j, j+1)} can be calculated efficiently using the M and m arrays above. First, as we now have a new element Aj+1, let’s update M and m but do not add new elements to them yet. For example, suppose Aj+1 = 5. Then,

  • M = (6, 6, 5, 5, 5, 5, 5)
  • m = (1, 1, 1, 1, 3, 3, 4)

On the other hand, suppose Aj+1 = 2. Then,

  • M = (6, 6, 5, 5, 5, 4, 4)
  • m = (1, 1, 1, 1, 2, 2, 2)

In the other words, when we introduce a new element Aj+1, either the last several elements of M or m will be updated. Let k be the index of the first element of either M or m that is updated; i.e., if Aj+1 = 5 then k = 6, while if Aj+1 = 2 then k = 5 as shown on the above examples. Then, diff(i, j) = diff(i, j+1) for all i < k since the minimum and the maximum will remain equal. Therefore,

ans[j+1] = ans[j] × {diff(1, j) × diff(2, j) × … × diff(k-1, j)} × {diff(k, j+1) × diff(k+1, j+1) × … × diff(j-1, j+1)}

The value of {diff(1, j) × diff(2, j) × … × diff(k-1, j)} can be calculated in O(1) time by precalculating and maintaining the prefix product prod[i] = diff(1, j) × diff(2, j) × … × diff(i, j). The value of {diff(k, j+1) × diff(k+1, j+1) × … × diff(j, j+1)} can be calculated in a single loop form k to j as:

diff(k, j+1) × diff(k+1, j+1) × … × diff(j, j+1) = (M[k] - m[k]) × (M[k+1] - m[k+1]) × … × (M[j] - m[j]).

However, since the input array is random, one can prove that the total number operations for calculating the above expression for 1 ≤ j ≤ N is O(N lg N), not O(N2). Therefore, the overall complexity of this algorithm is O(N lg N).

Here is a pseudocode of this algorithm. Note that all operations should be performed modulo 1,000,000,007.

read(N)
M[0] = infinity
m[0] = -infinity
prod[0] = 1
res = 1
for j = 1; j ≤ N; j++:
    read(A[j])
    k = j-1
    // update M
    while A[j] > M[k]:
        M[k] = A[j]
    k--
    // update m
    while A[j] < m[k]:
        m[k] = A[j]
        k--
    k++
    // update prod
    for i = k; i ≤ j; i++:
        prod[i] = prod[i-1] × (M[k] - m[k])
    // update M and m
    M[j] = m[j] = A[j]
    // update answer
    res = res * prod[j];
    

println(res)

SETTER’S SOLUTION

Can be found here.

TESTER’S SOLUTION

Can be found here.

The tester uses different algorithm that will be explained below.

RMQ Data Structure

In this solution, we need to find the maximum and the minimum element of Ai, Ai+1, …, Aj efficiently. We could use segment tree data structure. However, since the sequence is static we can use RMQ (Range Min/Max Query) that is simpler to code.

Suppose we want to build Range Max Query data structure (the Range Min Query version is similar). Let rmq[i][j] = x such that Ax = max(Ai, Ai+1, …, Ai+2j-1). This RMQ table can be filled as follows.

  • rmq[i][0] = i
  • rmq[i][j] = rmq[i][j-1] if Armq[i][j-1] > Armq[i+2j-1][j-1], or rmq[i+2j-1][j-1] otherwise

This RMQ table can be built in O(N lg N) time and space complexity.

Now, the maximum element of Ai, Ai+1, …, Aj can be found in O(1) time. Let k = maximum integer such that 2k ≤ j-1+1. Then,

max(Ai, Ai+1, …, Aj) = max(Armq[i][k], Armq[j-2k+1][k]).

The value of k for each value j-i+1 can be precomputed in at most O(N lg N) time so that it can be retrieved in O(1) time.

Modular Multiplicative Inverse

In this solution, we need a division modulo M = 1,000,000,007. Since M is a prime number, we can use Fermat’s little theorem:

A / B mod M = A × B-1 mod M = A × BM-2 mod M.

The value of BM-2 mod M can be calculated in O(lg M) time using exponentiation by squaring.

Solution

Now let’s discuss the actual solution. We will use dynamic programming approach. Let dp[i][j] = the answer for the subproblem Ai, Ai+1, …, Aj, i.e. the product of max(Ai2, Ai2+1, …, Aj2) - min(Ai2, Ai2+1, …, Aj2) over all i ≤ i2 < j2 ≤ j. Use the RMQ data structure to retrieve indices p and q such that

  • Ap = max(Ai, Ai+1, ..., Aj)
  • Aq = min(Ai, Ai+1, ..., Aj)

Suppose s = max(p, q) and t = min(p, q). We can illustrate the positions of the indices in the below ASCII art.

+-+-+-+     +-+     +-+     +-+-+-+-+
| | | | ... | | ... | | ... | | | | | 
+-+-+-+     +-+     +-+     +-+-+-+-+
 i           t       s             j

Let’s count the number of ranges [i2, j2] that covers indices p and q. Those ranges will then have the maximum value of Ap and the minimum value of Aq. From the above image, the valid ranges [i2, j2] are such that i ≤ i2 ≤ t; s ≤ j2 ≤ j. There are (t-i+1) × (j-s+1) such ranges. Therefore, the value (Ap - Aq) will be multiplied (t-i+1) × (j-s+1) times.

There are remaining ranges that have not been considered yet, namely ranges such that i ≤ i2 < j2 ≤ s-1 and ranges such that t+1 ≤ i2 < j2 ≤ j. But they are just subproblems of the original problems! The values are dp[i][s-1] and dp[t+1][j], respectively. However, we will consider the ranges such that t+1 ≤ i2 < j2 ≤ s-1 twice. We can fix this by dividing the result by dp[t+1][s-1].

In summary,

dp[i][j] = (Ap - Aq)(j-s+1) × (t-i+1) × dp[i][s-1] × dp[t+1][j] / dp[t+1][s-1].

Because we only consider ranges that have at least 2 elements, the base case is dp[i][i] = multiplicative identity = 1 for all 1 ≤ i ≤ N.

Random Array

Now, it seems that we will need O(N2) memory to store the DP table, which is too much as N can be up to 100,000. However, there is an important constraint: all elements are generated independently and uniformly at random. It can be proved that the number of states for such arrays is at most 2 × N. Therefore, we can use top-down approach for filling the DP table and store only visited states in a map or hash table. The time complexity of this solution is therefore O(N lg N) and the space complexity is O(N).

4 Likes

after reading the editorial for this problem, all i can do is to just laugh at myself ;-).
and today i got what ‘randomised’ mean in terms of programming - “dont consider worst case, most of the time worst case doesnt occur, so O(N^2) can be considered O(N*logN)”.

1 Like

I started with this question 3hours left, and just could not find a bug in my solution. :wink:

what is M[k] and m[k] element how do we know that A[j]>M[k] or A[j]<m[k] you just know M[0] and m[0]

In my solution, I used All Nearest Smaller Values, which is O(N), instead of range minimum queries.

2 Likes

It seems there are many approaches to this problem. What I did was find the minimum element; then find the increasing subsequence going right, and the increasing subsequence going left. Due to the randomness, each of these has expected length sqrt(N).

You can then loop over ~ sqrt(N) * sqrt(N) intervals; for each one you know the min (found originally) and max (max(left point, right point)). This handles all intervals containing the min, and you can then recursively solve the left and right halves.

This works out to f(N) = 2f(N/2) + N, which is like merge sort.

6 Likes

I implemented the tester’s algorithm during the contest, but i can’t seem to find the bug in my code.
Please check this solution and let me know where it can go wrong.
http://www.codechef.com/viewsolution/1600999

I tested my answers with brute force method for small n ~= 1000 and all answers matched.

I too was considering the same logic as O(N^2)…just learn today that it can be O(N*logN) when the array is generated randomly…

i cannot understand this ascii chart. Please explain.

Hmmm my solution is much simpler, so I thought I would share.

“However, since the input array is random, one can prove that the total number operations for calculating the above expression for 1 ≤ j ≤ N is O(N lg N)”…How can one prove this ??

I could get the answer using the first process however , i am not able to understand the second process and how the RMQ is initially set up .I Am a new guy .So any help on better understanding of the second process will be appreciated :slight_smile:

In the pseudo code for the 1st algorithm for the part ‘update prod’,
isn’t this line wrong:
prod[i] = prod[i-1] × (M[k] - m[k])
It should be:
prod[i] = prod[i-1] × (M[i] - m[i])

Will there be any difference in time and space complexity by using Segment Tree data structure instead of RMQ Data Structure ?

Time complexity will become O(log N) per query, but you will use only O(N) space.