MOSTDIST - Editorial

kostya_by · November 24, 2014, 12:05am

Difficulty: Easy

Pre-requisites: Binary heap, Geometry, Manhattan distance

Problem:

Let’s consider a set of points S. Your task is to implement a data structure that can process the following queries efficiently:

“+ X Y” - add a new point P with coordinates (X, Y) to S;
“- N” - remove the point that was added during the N’th adding query from S;
“? X Y” - calculate the maximal Manhattan distance between a point P with coordinates (X, Y) and any point from S.

It’s required to implement an on-line solution, i.e. which processes queries in the order of their appearance.

Explanation:

Let’s transform the formula of Manhattan distance a little bit:

dist(P1, P2) = |X1 - X2| + |Y1 - Y2| = max(X1 - X2, X2 - X1) + max(Y1 - Y2, Y2 - Y1) = max((X1 + Y1) - (X2 + Y2), (X1 - Y1) - (X2 - Y2), (-X1 + Y1) - (-X2 + Y2), (-X1 - Y1) - (-X2 - Y2)) = max(f₁(P1) - f₁(P2), f₂(P1) - f₂(P2), f₃(P1) - f₃(P2), f₄(P1) - f₄(P2)), where

f₁§ = X + Y;
f₂§ = X - Y;
f₃§ = -X + Y;
f₄§ = -X - Y.

In other words, dist(P1, P2) equals to the maximum of four values, that don’t contain abs() function anymore.

It’s time to make the key observation of this problem:

Let’s suppose that we are to find the maximal Manhattan distance between a point P and any point from S. The key fact is that there are only four candidates to be the most distant one: the ones with the maximal f₁§, f₂§, f₃§ and f₄§ from S.

So, we can maintain S as four separate binary heaps H₁, H₂, H₃ and H₄, each point of S belongs to each of H_i. The points are ordered according to f_i§ in heap H_i.

Now, if we want to add a new point to S, then we add it to each of H_i. If we want to calculate the maximal Manhattan distance between a point P and any point from S, then we find the most distant point from S(as the most distant one among ones with the maximal f₁§, f₂§, f₃§ and f₄§).

What about removing points from S? As we know, a binary heap doesn’t provide any kind of removing except removing from the top of the heap. But that’s OK, we won’t remove them for real until they affect the answers. We can just mark removed points somehow and every time, when a ‘removed’ point is on the top of one of the binary heaps, we remove it from this heap for real. The reason why we are not removing a point from S until it is not on the top is that such a point doesn’t affect the answers, since only the ones on the tops are considered as candidite for being the most distant point every time we are interested to know that information.

The total complexity of the solution is O(Q log Q).

Please, check out Setter’s and Tester’s solutions for your better understanding.

Setter’s Solution: link

Tester’s Solution: link

pranavr93 · November 24, 2014, 1:19am

Codechef should try and fix their Time limits.

My first solution printing answers using cout:
http://www.codechef.com/viewplaintext/5453096

Second solution printing answers using printf:
http://www.codechef.com/viewplaintext/5453133

First one TLE and second one passed

betlista · November 24, 2014, 1:29am

Wow, with sync_with_stdio(0)… Noticed, thanks

xellos0 · November 24, 2014, 1:29am

I don’t think it’s about cout, but endl. When you use endl instead of “\n”, it flushes the output, which is incredibly slow. (Also, sometimes the output is flushed when reading with cin, so you should put cin.tie(0) at the beginning of your code to avoid it in situations such as this one.)

flaminrage · November 24, 2014, 1:32am

Not surprised since cout is usually more slower than printf.

prashantmahesh · November 24, 2014, 11:08am

The candidates for distant point are from the maximum of each of the functions. It will be true if we find the minimum too right? Just for a better understanding. Or can someone explain how it is the maximum? I assumed from the formula written for 2 points as f1(P1) - f1(P2) … , we take P2 to be the point given and P1 the points on the set.

EDIT: I used the minimum instead of maximum and it accepted. Hence i could just use a map and do deletions easily too.

ashwini007k · November 24, 2014, 12:01pm

can you explain how you have made the transformation and its validity? I can’t seem to understand it and it’s bugging me a lot.

pranjalranjan · November 24, 2014, 12:10pm

yea, maybe its because of endl…! cin, cout with sync_with_stdio(0) works faster than scanf printf. Endl seems to be the only issue here.

pranavr93 · November 25, 2014, 2:49am

Point noted. Though cin.tie(0) still doesn’t solve the issue. Either way ,should try and avoid endl on codechef. One point which I would like to raise would be that online judges like codeforces never have such problems. In codechef, even if you get the complexities right, there is a possibility of getting a TLE. Why involve border line time limits in the first place? I guess reducing the time limit a bit wouldn’t hurt or let higher complexity solutions pass.

betlista · November 25, 2014, 3:00am

Transformation is as follows.

Hopefully when shown that way, all steps are clear enough…

Just two possibilities for each absolute value

alt text

…tried both all possibilities…

alt text

…reordered and grouped…

alt text

f₁…f₄ are defined in editorial above, we can skip f_x§ at the end, because it is the same for all points in S…

beroul · November 25, 2014, 3:38am

Actually, max for F2 is min for F3. And max for F1 is min for F4…
But my solution doesn’t pass. Weird NZEC…Grmph

xellos0 · November 25, 2014, 8:09am

Yeah well, if you only use cin.tie(0) but don’t replace endl by “\n”, it will have no effect - that’s why I listed the steps to take in order of priority. Just copy-paste the first 2 lines from my main() and don’t use endl (except for debugging), and you’ll be fine from the vast majority of cin-related problems.

ashwini007k · November 25, 2014, 3:08pm

@betlista thanks

gauravsc · November 26, 2014, 12:31pm

There are no accepted solution to this problem in python. I implemented the log(Q) per query approach, but it still gives TLE error. Can you check if the problems time constraints are too strict for python?

betlista · November 26, 2014, 9:07pm

It might be I/O problem, if there is one + 0 0 and 500,000 times ? 1000000000 1000000000 (10⁹ both) how quick you got response on your local PC?

cristhianr · November 26, 2014, 11:07pm

@ashwini007k, another way of seeing all of this, which made the understanding easier for me, is that you can consider each fi function as maintaining the most distant point of a particular quadrant from the center.
Although all of the points coordinates are positive, and then are in the same quadrant, the f1 function returns the point which is in the most upper-right side of the plane, whereas the f2 function returns the point which is in the most bottom-right side of the plane, and so on.
From that you could assume that the most distant point from P is some point which is in one of the four corners of the plane.

gauravsc · November 28, 2014, 12:16pm

I take each query using raw_input() command and give the input as you asked above. It takes 38.27s for the code to finish. I also try by reading the entire input in one attempt using sys.readlines(), but it takes even longer 86.6s. I now think that the bitwise xor operator is the only cause of delay in the output.

Here is my code http://www.codechef.com/viewsolution/5458355

betlista · November 28, 2014, 12:32pm

almost 40 seconds is too slow, I don’t think it is because of xor, it’s slow I/O I’m afraid…

Other python guys, are there quicker I/O methods?

One more thing you can try, what I’m doing in Java… Instead of print answer for each query, use string concatenation and print this huge string once at the end…

gauravsc · November 28, 2014, 1:26pm

Something interesting, I mask the output of each xor with 0xFFFFFFFF & (query ^ answer) and because of this the time reduces to 4.3s on my PC, but it still says TLE on codechef.

gauravsc · November 28, 2014, 6:44pm

This problem definitely has too strict time constraints for python. I think the time constraint of 3*1.5=4.5s for python is a little off the mark. It is near impossible for me to bring the time below 5.5s. I think the time limit for python should be at least 6.5s. Please reset the time limit for python or else a lot of other people would end up being frustrated.