Problem Link:
Difficulty:
Medium
Pre-requisites:
Data Structure (rank-queries on dictionary type D.S.)
Problem:
You are given a NxN board, with each cell initially 0. You are also given queries of the form
**RowSet i x:
ColSet i x:**
which set the i’th row/col to x (0 or 1).
Also, you are given queries of the form
**RowQuery i:
ColQuery i:**
which ask for how many 0’s are there in the i’th row/col.
Quick Explanation:
Imagine you are solving the problem for just RowQueries (the solution for ColQueries works by same logic when you take the transpose of the board). So you are asked for the number of 0’s in a particular row.
Now, lets see what happens: when you get a colset query, then ALL rows’ values get shifted to what it is set to. So by and large, your row is going to look like ???x??? (with the x in the corresponding column).
When you get a rowset query, that entire row gets overwritten. The point therefore is, you would need to consider only future colset queries for this particular row.
Hence, we need to ask; given a rowquery i:
(a) what was the last time that rowset i was called.
(b) what happened in terms of colset after that time.
Therefore, we store Time-Stamps.
For each [Row|Col]Set i [0|1] we store a timestamp. We also delete the previous stored timestamp of that row/col.
Now, given a RowQuery i, we first ask: was that row last set to 0 or to 1.
Then, if it was set to 0, at time ‘t’, then we ask, How many ColSet ‘1’ exist after time ‘t’. This can be done using Binary Indexed Trees in O(logN) time. The answer is then N - #1-cols.
Similarly, if it was last set to 1, then we query how many columns were set to 0, and answer #0-cols.
As mentioned earlier, combining both RowQuery and ColQuery can be done using the exact same logic when you consider the board as its transpose.
Explanation:
Let us first consider 2 simple versions of the problem.
The first simple version asks for the solution using only RowSet and RowQuery. Now, given a RowSet i x call, you would change the entire row i to x. Hence, it looks like xxxxxxxxx now. And you can easily tell by what the last call was whether your answer is 0 or N.
The second simple version case asks for the solution using only ColSet and RowQuery. Now, a ColSet i x is going to make the i’th column x and the row that you query for will look like ???x??? with the x at the i’th position. This is over all the rows, and hence irrespective of row, when you are asked for the count, you will know what the row looks like just by which all columns are set, and can hence answer the query easily here too.
The problem however, asks for given both RowSet and ColSet queries which are interleaved between each other, and then you are given the RowQuery i, what will the answer be?
Let us concentrate on row i and look at its history. Row i will be affected whenever: “Rowset i x” is called, or whenever “Colset j x” is called for ANY j. Further, a single “Rowset i x” overwrites ALL “Colset j x”. However, once we are given the last “Rowset i x” query, then this case pretty much boils down to the second simple version of the problem. Hence, the important thing to store, is the last point we made an update to row i.
Hence, if we store the timestamp when we make updates, then we will be able to tell what queries we are too consider, and what to discard. Thus, let us see what information we can glean by just storing all possible time stamps.
Consider the arrays RS0TS[], RS1TS[], CS0TS[], CS1TS[] (abbreviations for “RowSet0TimeStamp” etc.) which store the time points where things were last changed. Here, RS0TS[i] stores when was the last time that you got a “Rowset i 0”. Now, given RowQuery i, we look at row i and we know.
If RS0TS[i] > RS1TS[i], then the row was last completely set to 0, else it was set to 1.
If it was last completely set to 0, at say time point t (= RS0TS[i]), we then need to count, how many columns were set to 1 after this point. That is, we need to know, for how many ‘j’, do we have CS1TS[j] > CS0TS[j] and CS1TS[j] > t. Subtracting this count from N is the answer to our query here. Similarly, we need to take cases for RS1TS[i] > RS0TS[i], and for ColQuery i etc.
The main issue with this is : finding the count of how many ‘j’ satisfy the above property, if implemented naively, is O(N) time. We need to do this faster.
Using BIT as a rank-query dictionary-type DS:
Suppose on top of storing arrays of time-stamps, what if we used something like a STL-set as well. So we have now say sets RS0, RS1, CS0, CS1. When we get a RowQuery i, we can do the check for whether RS0TS[i] > RS1TS[i] and find whether it was 0 or 1 last, and then we can binary search in the set CS1 for the time-stamp value RS0TS[i]. We now just need how many values are there between this and the end of the set CS1.
This is called the Rank-Query, where we are required to ask how many elements are there in a “set” whose value is <= x. In a STL set though we can binary search for x’s position, taking the difference with the beginning (or end) of the set is an O(N) operation.
This can however be accomplished by using a Binary Indexed Tree (abbrv. BIT). To see this, let us look at our set as an underlying frequency array. That is, freq[i] = number of times element i occurs in our set. Now, rank-query (x) is just summation (i <= x) freq[i].
Thus, if we were to instead construct a BIT on this underlying frequency array, we would be able to do all that an STL-set offers (inserting i is simply freq[i]++, searching for i is simply freq[i] > freq[i-1], deleting i is simply freq[i]–), as well the rank-query.
Thus, we now have our complete solution that takes O(logN) time for each RowSet, ColSet, RowQuery, ColQuery operation. Hence, time complexity = O(QlogN).
Algorithm:
On RowSet i 0, at query T:
if RS0TS[i] > RS1TS[i]
RS0.increment(RS0TS[i], -1); //delete the previously stored time-stamp of RS0TS[i]
else
RS1.increment(RS1TS[i], -1); //delete the previously stored time-stamp of RS1TS[i]
RS0.increment(T, 1); //insert time point T into RS0
RS0TS[i] = T;
Similar code holds for ColSet i 0, at query T, and for RowSet i 1 and ColSet i 1.
Now, for RowQuery i, at query T:
if RS0TS[i] > RS1TS[i]
T0 = RS0TS[i]
ones = CS1.query(INF) - CS1.query(T0)
return N - ones
else
T0 = RS1TS[i]
zeros = CS0.query(INF) - CS0.query(T0)
return zeros
And similarly for ColQuery i, at query T.
Practice on Rank-Query:
The spoj problem ORDERSET is a very classic problem that uses just rank-query. You could play around using STL set, segment trees, BITs, maps/vectors etc. on this problem.
Setter’s Solution:
Can be found here
Tester’s Solution:
Can be found here