MXDIST - EDITORIAL

taran_1407 · January 14, 2019, 4:12pm

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2

Setter: Bogdan Ciobanu
Tester: Xiuhan Wang
Editorialist: Taranpreet Singh

DIFFICULTY:

Hard

PREREQUISITES:

Convex Hull, Rotating Calipers, Range Minimum Query (RMQ) and bucket decomposition.

PROBLEM:

Given N points and Q queries (f,t) specifying range [f,t], find the maximum squared euclidian distance between any pair of points within the specified range.

QUICK EXPLANATION

Divide the points into buckets and build a convex hull for each bucket (preferably using Monotone Chain algorithm).
Build an RMQ sparse table considering each block as one unit, rmq[i][j] representing the convex hull formed by points from blocks in the range [i,i+2^j-1]. Hulls should be merged in O(S) where S represent the size of the hull.
Most distant point pair from a set of points can be easily found by using Rotating Calipers on the convex hull of the given set of points. To answer a query, make a convex hull consisting of points belonging to partially filled buckets and merge it into the appropriate pre-built convex hull from RMQ for intermediate blocks and calculate the answer.

EXPLANATION

Let us solve a trivial problem first. Given a set of points, find the maximum distance between any pair of points.

Lemma: The pair of points having the maximum distance lie on convex hull formed by points.

Proof: Let us assume we have a point pair AB being the most distant point pair, Point A strictly inside the convex hull and point B on or inside convex hull. Lt us extend this line toward A till it intersects the border of the convex hull, say at segment CD. Now, it can be seen, that at least one of |BC| > |BA| or |BD| > |BA| holds, which contradicts the fact that points AB have the maximum distance. This contradicts our assumption. Hence, for the points to be most distant, both points shall lie on the convex hull.

Now, to find the maximal distance, we can use the rotating calipers technique, explained in brief below.

Let us have a pointer variable p and for every point indexed i, move it forward till the distance of (p+1)th point from the line formed by ith and (i+1)th point is more than distance of pth point from same segment, considering distance of pth and (p+1)th point from ith point. whenever moving from ith point to (i+1)th point, consider its distance of both these points from pth point. The maximum of all these is the distance of most distant point pair.

An important point to note is, that binary search is not going to work, as explained well here.

Merging Convex hulls

We merging convex hulls naively (assuming Monotone Chain algorithm is being used). In Monotone Chain algorithm, we find the points having leftmost and rightmost x-coordinate and build lower and upper convex hull separately. So, while merging, we can use merge-sort type technique to merge lower and upper hulls and removing the useless points in time proportional to the size of the convex hull.

Returning to original problem, we can build a RMQ sparse table, where RMQ[i][j] represent the convex hull formed by points in range [i, i+2^j-1] which can be easily rewritten as the convex hull formed by merging RMQ$[i][j-1] and RMQ[i+2^{j-1}][j-1]$. The details of RMQ can be found here.

Now, Let us evaluate the time complexity of this solution. During preprocessing, we have log(N) levels having N convex hulls. We can see, that merging convex Hull take time proportional to the size of convex hull, resulting in preprocessing time being O(N*log(N)*S) after which, each query may take O(S) time as both merging convex hull and calculating most distant point pair take O(S) time.

Now, the question is, what is the size of the convex hull? We have randomly generated points in the second sub-task while worst cases in the third sub-task. It can be seen from research that the expected number of points in convex hull of N randomly generated points is roughly of the order of N^{1/3} while if we generate points carefully in MX*MX grid, we can have up to around MX^{2/3} points in convex hull, MX = 2*10^5 in our case. This explains the difference in time complexity of the same solution for second and third sub-task. See references here and here.

Now, we see, though it may be sufficient with optimizations to pass this solution for the second subtask, it shall be hard to pass the final subtask. So, Let us optimize the preprocessing part to fit the time limit by bucket decomposition.

Let’s first divide all N points into blocks of size SZ and now, build RMQ considering each block as one position in RMQ. This way, to answer queries, we can use RMQ to find convex hull formed by points in intermediate blocks and naively build a temporary convex hull by points which lie in partially filled blocks (can be at most 2 for each query). We can merge these two blocks and calculate the answer for this block.

Now, Let us calculate the time complexity of this solution. If C = \lceil N/SZ \rceil, we can build RMQ on blocks in O(S*C*log(C)) time and answer each query in O(SZ+S) time, resulting in time complexity O(S*(N/SZ)*log(N) + Q*(S+SZ)). It works fast enough for SZ = sqrt(S*N*log(N)/Q).

Editorialist solution

The editorialist who like different solutions, choose to use graham scan as means of building and merging convex hulls as opposed to Monotone Chain algorithm in case of setter and tester solution. This slows down editorialist’s solution by log(S) due to the fact that for Graham scan, sorting by the polar angle is required while in Monotone Chain algorithm, merging convex hull can be done in merge sort type manner in O(S) for both lower and upper hulls. This is the reason Monotone Chain algorithm is preferred to Graham scan in this problem.

An interesting problem to try is GEOCHEAT.

Time Complexity

Overall time complexity is O(S*(N/SZ)*log(N/SZ) + Q*(S+SZ)) where SZ is the block size chosen.

Space Complexity is O(N+S*(N/SZ)*log(N/SZ)).

AUTHOR’S AND TESTER’S SOLUTIONS:

Setter’s solution
Tester’s solution
Editorialist’s solution

Feel free to Share your approach, If it differs. Suggestions are always welcomed.

sdssudhu · January 14, 2019, 9:03pm

How did it matter when points are distributed uniformly randomly (Subtask 2) ? Cause I saw quite a good amount of people get that subtask

rajarshi_basu · January 14, 2019, 9:41pm

I used CH in each node of segtree. During queries, i merged the relevent node’s segtree and computed the CH of this resulant set of ppints. I then computed the farthest point using o(n^2) bruteforce. (Stupid of me… I had thgt that rotating callipers would TLE… )this gets 50 pnts.

shoom · January 14, 2019, 11:31pm

It matter, since the solution time is proportional to the convex hull size. The hull size (as percentage from the total number of points) is smaller for randomly generated points as majority of the points fall inside the convex hull rather than on it.

taran_1407 · January 14, 2019, 11:34pm

See the part of editorial discussing hull size. Time complexity of merging as well as rotating calipers is proportional to hull size.

roll_no_1 · January 14, 2019, 11:42pm

@sdssudhu I googled uniform distribution of points on square and this is what showed up as one of the first few results. If we look at the figure 1 in here, then it clearly shows that in a uniform distribution there is a high probability that the points that are the furthest apart are located closer to the corners of a square. The region where are points are distributed is a square in the problem. So, I chose the 5 best points closest to every point of the square and computed the distance between pairs of points that are diagonally opposite. The points closest to the corners are those that:

Have the largest (x + y) value
Have the smallest(x + y) value
Have the largest (x - y) value
Have the smallest (x - y) value

Now, we pick 5 best points from each category and take pairs of points from 1 and 2 and compute their distances and then choose the best one and then do the same thing with points from 3 and 4.

Note here that increasing no. of points increases the probability of obtaining a solution. So, if this wouldn’t have worked out, we might have had to try say taking 10 points from each category. But the reason it worked was high probability of a solution and I think that was what the setter had in mind.

shoom · January 15, 2019, 12:00am

If the points are carefully placed approximately on the circle, then the worst case for the hull size would include all points, not just MX^(2/3). If the points are consecutively numbered, then no matter how we split them into blocks the merged hulls would include all points in the range. I wonder if there was such test case.

I used a segment tree of convex hulls, which is probably slower that the proposed solution. So I had to make optimization for constants.

amoghk · January 15, 2019, 12:26am

@shoom What I think the post meant, was that given a grid MX * MX, the maximum points I can place such that all the points constitute the convex hull cannot be generally more than MX^(2/3). And I think all these points will be approximately on a circle like you said, but due to integer coordinate constraints, it can never surpass MX^(2/3). This is just a possibility and I could be wrong.

nima101 · January 15, 2019, 2:13am

I also built a segtree with convex hull of points in the subtree and used rotating calipers to find farthest points. I got 50 points. I also tried storing upper hull / lower hull and merging them but it didn’t get me any further. Unfortunately I don’t understand the trick related to bucket decomposition to make it work for the third case. I need to read about it a bit probably.
thanks for posting this editorial!

meooow · January 15, 2019, 2:46am

It can be seen from research that the expected number of points in convex hull of N randomly generated points is roughly of the order of N^{1/3} while if we generate points carefully in MX∗MX grid, we can have up to around MX^{2/3} points in convex hull.

First of all, the bound for randomly generated points in a square in O(\log n), not O(n^{1/3}). This is present in the paper linked in the editorial. No reference for the case of carefully generated worst case is provided. Can you share that @taran_1407?

oleg_b · January 15, 2019, 3:05am

To @meooow:

The worst case is when the points form roughly a circle. However the placement of the points is a bit tricky but here is the idea:

For the case 0 <= x, y <= 200,000 generate all co-prime pairs x,y (x < y) where xx + yy < 79*79. That’s about 1500 of such pairs. Sort them by slope x/y, and also include [0,1] and [1,1]. Consider such x,y pairs as “edges” to use to construct 1/8 of a circle while using them in sorted order, starting with [0,1] and ending with [1,1] (the slope changes from 0 to 1). Repeat the process 8 times by flipping to construct the full circle. That’s about 12,000 points, and all of them are in the convex hull. Then add extra points (to reach 200,000 points) by placing them closely to the original points inside the circle. Then randomly shuffle their order.

oleg_b · January 15, 2019, 3:17am

To clarify the circle construction:

The few first co-prime pairs are [0,1], [1,78], [1,77], [1,76], …, [1/40], [1/39], [2/77], etc.
Then place points at:


x0,y0 = [0, 100000]
x1,y1 = [x0 + 0, y0 + 1] =  [0, 100001]
x2,y2 = [x1 + 1, y1 + 78] = [1, 100079]
x3,y3 = [x2 + 1, y2 + 77] = [2, 100156]
...

carre · January 15, 2019, 10:11am

@nima101 I couldn’t doit during the contest but it can be solved with segment tree of Convex_Hulls and no need of bucket decomposition.
Check the following submition: https://www.codechef.com/viewsolution/22489940 . It uses the Convex Hull class of the Tester Solution as nodes of a segment tree.

taran_1407 · January 15, 2019, 12:17pm

Updated link.

See this blog for details. https://codeforces.com/blog/entry/62183

sdssudhu · January 15, 2019, 1:20pm

Ok thanks.

step_by_step · January 16, 2019, 1:12am

What’s the reason of giving such problems? The main idea to build Segment Tree/Sparse Table of convex hulls was covered by the second subtask, which appeared as a problem on Codechef a year ago (https://www.codechef.com/LTIME59A/problems/DARTSEGM). The last subtask only needs googling a weird statement, or trying to squeeze your solution without knowing that it is actually correct.

mgch · January 16, 2019, 2:16am

Here is no other problems. Moreover, no med-hard+ problems without FFT at all :((

meooow · January 17, 2019, 7:33am

Thank you!

meooow · January 17, 2019, 7:35am

Thanks!

hell_1997 · January 19, 2019, 10:49pm

The solve function in the setter’s solution looks like it is using 2 pointers? If it is indeed 2 pointers, then how is it correct? I saw this comment on the CF blog with the counter to the 2 pointer method to find the diameter of a convex hull.

Counter Example