DARTSEGM - Editorial

PROBLEM LINK:

Div1
Div2
Practice

Setter- Ildar Gainullin
Tester- Hasan Jaddouh
Editorialist- Abhishek Pandey

DIFFICULTY:

HARD

PRE-REQUISITES:

Segment Tree, Convex Hull, Computation Geometry (i.e. the proof mentioned here)

PROBLEM:

N darts are thrown randomly on a {10}^{9} \times {10}^{9} board. We need to answer Q queries where each query asks for the distance between farthest darts where we have to consider only darts thrown in throws [l,r].

QUICK EXPLANATION:

Winning Point- The one who could deduce the statement proved by proof given in pre-requisites can easily do this question like a cakewalk.

Go through the proof in the pre-requisites. It says that, for N randomly generated points in 2-D plane, the expected size of their convex hull is of order O(LogN). (In other words, all the N points will be “enclosed” or “covered” by \approx LogN points.)

It then becomes quite easy. We make a segment tree, where each node stores the convex hull of darts thrown in range [l,r]. The farthest points must be part of the hull (Why?). The expected size of hull is O(LogN), hence brute-force iteration over points can answer the queries in O({Log}^{2}N).

EXPLANATION:

The explanation of this question is surprisingly straightforward. I will assume you guys went through the requested links in the pre-requisites. If not, go back and make sure you grasp the concept, especially the definition of convex hull and the concept of segment trees (along with the proof, of course!).

Despite a straight-forward explanation, the editorial will have 3 parts, each of them highlighting/discussing one of the three key concepts to solve this question. We will be using setter’s solution primarily, for any doubts in implementation.

1. Randomness of Points-

There is a good reason why the setter bothered to mention in black and white that points are randomly generated. This allows us to use the observation of above proof. The convex hull of N randomly generated points will be of (expected) size LogN.

In other words, we can quote it as “Expected number of points on convex hull when points are generated randomly from square K⋅K is O(logK).”

"That is fine Mr. So-So-Editorialist. But whats the use of this?"

The use is, that, the farthest points must be a part of the hull (Why?) (Answer given in my corner :D). Hence, we don’t need to keep account of all the points, but only a selected few which make the hull. If this doesnt strike a bell, look in explanation in below tab.

Click to view

Now, why do we not need to keep a track of all the points? Recall that convex hull is made up of outermost or “enclosing” points. The farthest points must lie ON the hull. Hence, all the points lying (strictly) INSIDE the hull can never have “maximum distance”. Hence, we can safely ignore them, and work on the points making the hull, which are of \approx O(LogN) in number.

"Ya-Ya, thats fine Mr So-So Editorialist. But what about the worst case?"

Yes, I know this is a question among minds of >80\% newbies. To answer this, yes, the solution will fail at the worst case, which happens if all the N points are part of the hull. But again, we must make one thing clear, the points are randomly generated. Hence, the chances of this worst case happening are <{10}^{-9} (Actually even lesser!). So, we can find the expected number of test cases having worst case as 0 (XD). Jokes aside, the real significance is same, that the probability of the worst case happening is way too less, its lesser than probability of us getting hit by a meteorite. We dont often get hit by meteorites, do we? Getting hit by worst case is even more rare! :smiley:

In crux, probabilistic approach is followed here, which rules out worst case happening with a considerable amount of confidence and accuracy.

As a homework, can you at least theoretically explore why the chances of worst case are so less?

2. Building Segment Tree-

The first thing to note in any segment tree problem is, what to store in the nodes. We will be storing the convex hull of points lying in range [l,r] in the respective node. (As usual, tabs if this explanation seems insufficient to you.)

Click to view

In other words, we will be storing only the \approx LogN outermost points which can be the part of answer in the node. Or in easier words, we are storing only the \approx LogN points of convex hull, which are “candidate points” for the the farthest distance.

The implementation is also quite easy. Here is the commented code to aid in implementation-

void build(int v, int l, int r)
{
    if (r - l == 1)//Only 1 point. 
    {
        t[v] = {a[l]};
    }
    else
    {
        int m = (l + r) / 2;
        build(v * 2 + 1, l, m);//Storing of points in range [l,m]
        build(v * 2 + 2, m, r);//Storing of points in range [m,r]
        t[v] = mrg(t[v * 2 + 1], t[v * 2 + 2]);//Merging above two to get points in range [l,r]
        t[v] = hull(t[v]);//Making hull of these points via standard algorithm
    }
}

3. Answering the Queries-

Answering the queries is also, quite simple. I will again stress that, the points with farthest distance must lie of the hull. We have stored the hull of points in range [l,r] in nodes, its time to retrieve them for answer :). Refer to code below-

vector <pt> get(int v, int l, int r, int tl, int tr)//Return a vector (dynamic array) consisting of
//points in hull
{
    if (tl >= r || tr <= l)//Out of range. Return nothing.
    {
        return {};
    }
    if (tl >= l && tr <= r)//Completely within range. Return the stored hull.
    {
        return t[v];
    }
    else
    {
        int tm = (tl + tr) / 2;
//Query for range [l,m] and [m,r] and return the answer obtained after merging them
        return mrg(get(v * 2 + 1, l, r, tl, tm), get(v * 2 + 2, l, r, tm, tr));
    }
}

The above is nothing but a typical segment tree query function. If any part of above (like out-of-range, completely in range etc.) isnt clear, do first check up segment trees in more detail. Else, drop a comment here :slight_smile:

Now, all thats left is to iterate over the points. Since we have \approx LogN points, we can easily do a brute force to answer query in O({Log}^{2}N). Refer to pseudo code below-

for (int i = 0; i < (int) b.size(); i++)//b stores the hull
        {
            for (int j = i + 1; j < (int) b.size(); j++)
            {
                ans = max(ans, abs(b[i] - b[j]));
//Iterate over all pairs of points. Only Log^2 N pairs are possible. Store the maximum distance obtained.
            }
        }

SOLUTION:

For your quick reference, the required codes are pasted in tabs. This is, so that you can refer to code without having to wait for @admin to link them. Editorialist’s solution will be put on demand.

Setter

Click to view
#include <cmath>
#include <iostream>
#include <vector>
#include <algorithm>
#include <string>
#include <set>
#include <map>
#include <list>
#include <time.h>
#include <math.h>
#include <random>
#include <deque>
#include <queue>
#include <cassert>
#include <unordered_map>
#include <unordered_set>
#include <iomanip>
#include <bitset>
#include <sstream>
 
using namespace std;
 
typedef long long ll;
 
mt19937 rnd(228);
 
struct pt
{
    int x, y;
};
 
pt operator - (const pt &a, const pt &b)
{
    return {a.x - b.x, a.y - b.y};
}
 
ll abs(const pt &p)
{
    return p.x * (ll) p.x + p.y * (ll) p.y;
}
 
ll operator * (const pt &a, const pt &b)
{
    return a.x * (ll) b.y - a.y * (ll) b.x;
}
 
bool cmp(const pt &a, const pt &b)
{
    if (a.x != b.x)
    {
        return a.x < b.x;
    }
    else
    {
        return a.y < b.y;
    }
}
 
bool cw(pt a, pt b, pt c)
{
	return a.x * (ll) (b.y - c.y) + b.x * (ll) (c.y - a.y) + c.x * (ll) (a.y - b.y) < 0;
}
 
bool ccw(pt a, pt b, pt c)
{
	return a.x * (ll) (b.y - c.y) + b.x * (ll) (c.y - a.y) + c.x * (ll) (a.y - b.y) > 0;
}
 
vector <pt> p, up, down;
 
vector <pt> mrg(const vector <pt> &a, const vector <pt> &b)
{
    p.resize(a.size() + b.size());
    merge(a.begin(), a.end(), b.begin(), b.end(), p.begin(), cmp);
    return p;
}
 
vector <pt> hull(const vector <pt> &e)
{
    if ((int) e.size() <= 3) return e;
    up.clear();
    down.clear();
    p.clear();
    pt p1 = e[0], p2 = e.back();
    up.push_back(p1);
    down.push_back(p1);
    for (int i = 1; i < (int) e.size(); i++)
    {
        if (i == (int) e.size() - 1 || cw(p1, e[i], p2))
        {
            while ((int) up.size() >= 2 && !cw(up[(int) up.size() - 2], up[(int) up.size() - 1], e[i]))
            {
                up.pop_back();
            }
            up.push_back(e[i]);
        }
        if (i == (int) e.size() - 1 || ccw(p1, e[i], p2))
        {
            while ((int) down.size() >= 2 && !ccw(down[(int) down.size() - 2], down[(int) down.size() - 1], e[i]))
            {
                down.pop_back();
            }
            down.push_back(e[i]);
        }
    }
    down.erase(down.begin());
    down.pop_back();
    return mrg(up, down);
}
 
const int N = 1e5;
 
pt a[N];
vector <pt> t[4 * N];
 
 
void build(int v, int l, int r)
{
    if (r - l == 1)
    {
        t[v] = {a[l]};
    }
    else
    {
        int m = (l + r) / 2;
        build(v * 2 + 1, l, m);
        build(v * 2 + 2, m, r);
        t[v] = mrg(t[v * 2 + 1], t[v * 2 + 2]);
        t[v] = hull(t[v]);
    }
}
 
vector <pt> get(int v, int l, int r, int tl, int tr)
{
    if (tl >= r || tr <= l)
    {
        return {};
    }
    if (tl >= l && tr <= r)
    {
        return t[v];
    }
    else
    {
        int tm = (tl + tr) / 2;
        return mrg(get(v * 2 + 1, l, r, tl, tm), get(v * 2 + 2, l, r, tm, tr));
    }
}
 
int n;
 
vector <pt> get(int l, int r)
{
    vector <pt> b = get(0, l, r + 1, 0, n);
    b = hull(b);
    return b;
}
 
int main()
{
#ifdef ONPC
    freopen("a.in", "r", stdin);
#endif
    ios::sync_with_stdio(0);
    cin.tie(0);
    cin >> n;
    for (int i = 0; i < n; i++)
    {
        cin >> a[i].x >> a[i].y;
    }
    build(0, 0, n);
    int q;
    cin >> q;
    while (q--)
    {
        int l, r;
        cin >> l >> r;
        l--, r--;
        auto b = get(l, r);
        ll ans = 0;
        for (int i = 0; i < (int) b.size(); i++)
        {
            for (int j = i + 1; j < (int) b.size(); j++)
            {
                ans = max(ans, abs(b[i] - b[j]));
            }
        }
        cout << ans << '\n';
    }
}

Tester

Click to view
#include <iostream>
#include <algorithm>
#include <assert.h>
#include <string>
#include <vector>
using namespace std;
 
long long readInt(long long l,long long r,char endd){
	long long x=0;
	int cnt=0;
	int fi=-1;
	bool is_neg=false;
	while(true){
		char g=getchar();
		if(g=='-'){
			assert(fi==-1);
			is_neg=true;
			continue;
		}
		if('0'<=g && g<='9'){
			x*=10;
			x+=g-'0';
			if(cnt==0){
				fi=g-'0';
			}
			cnt++;
			assert(fi!=0 || cnt==1);
			assert(fi!=0 || is_neg==false);
			
			assert(!(cnt>19 || ( cnt==19 && fi>1) ));
		} else if(g==endd){
			assert(cnt>0);
			if(is_neg){
				x= -x;
			}
			assert(l<=x && x<=r);
			return x;
		} else {
			assert(false);
		}
	}
}
string readString(int l,int r,char endd){
	string ret="";
	int cnt=0;
	while(true){
		char g=getchar();
		assert(g!=-1);
		if(g==endd){
			break;
		}
		cnt++;
		ret+=g;
	}
	assert(l<=cnt && cnt<=r);
	return ret;
}
long long readIntSp(long long l,long long r){
	return readInt(l,r,' ');
}
long long readIntLn(long long l,long long r){
	return readInt(l,r,'\n');
}
string readStringLn(int l,int r){
	return readString(l,r,'\n');
}
string readStringSp(int l,int r){
	return readString(l,r,' ');
}
 
 
struct point{
	long long x;
	long long y;
};
bool operator<(point a,point b){
	if(a.x==b.x){
		return a.y<b.y;
	}
	return a.x<b.x;
}
 
bool is_left(point a,point b,point c){
	long long x1=b.x-a.x;
	long long y1=b.y-a.y;
	long long x2=c.x-b.x;
	long long y2=c.y-b.y;
 
	return x1*y2 - x2*y1 >= 0;
}
 
bool is_right(point a,point b,point c){
	long long x1=b.x-a.x;
	long long y1=b.y-a.y;
	long long x2=c.x-b.x;
	long long y2=c.y-b.y;
 
	return x1*y2 - x2*y1 <= 0;
}
 
vector<point> convex(vector<point> a){
	if(a.size() == 1){
		return a;
	}
	sort(a.begin(),a.end());
	vector<point> upper,lower,ret;
	int upr=0,lwr=0;
	int n=a.size();
	for(int i=0;i<n;i++){
		while(upr > 1 && is_left(upper[upr-2],upper[upr-1],a[i])){
			upr--;
			upper.pop_back();
		}
		while(lwr > 1 && is_right(lower[lwr-2],lower[lwr-1],a[i])){
			lwr--;
			lower.pop_back();
		}
		upper.push_back(a[i]);
		lower.push_back(a[i]);
		lwr++;
		upr++;
	}
	upper.pop_back();
	while(lower.size()> 0){
		upper.push_back(lower.back());
		lower.pop_back();
	}
	upper.pop_back();
	return upper;
}
 
 
int n,q;
vector<point> sgt[400400];
point arr[100100];
 
 
void build(int nd,int l,int r){
	if(l==r){
		sgt[nd].push_back(arr[l]);
		return;
	}
	int mid=(r+l)/2;
	build(2*nd,l,mid);
	build(2*nd+1,mid+1,r);
	vector<point> pts=sgt[2*nd];
	/*for(int i=0;i<sgt[2*nd].size();i++){
		pts.push_back(sgt[2*nd][i]);
	}*/
	for(int i=0;i<sgt[2*nd+1].size();i++){
		pts.push_back(sgt[2*nd+1][i]);
	}
	sgt[nd]=convex(pts);
}
 
vector<point> query(int nd,int l,int r,int s,int e){
	if(s<=l && r<=e){
		return sgt[nd];
	}
	int mid=(r+l)/2;
	vector<point> pts;
	if(s<=mid){
		pts=query(2*nd,l,mid,s,e);
	}
	if(mid+1<=e){
		vector<point> tm = query(2*nd+1,mid+1,r,s,e);
		for(int i=0;i<tm.size();i++){
			pts.push_back(tm[i]);
		}
	}
	return convex(pts);
}
 
 
int main(){
	n=readIntLn(1,100000);
	for(int i=1;i<=n;i++){
		arr[i].x = readIntSp(1,1000000000);
		arr[i].y = readIntLn(1,1000000000);
	}
	build(1,1,n);
	q=readIntLn(1,100000);
	while(q--){
		int l,r;
		l=readIntSp(1,n);
		r=readIntLn(l,n);
		vector<point> a=query(1,1,n,l,r);
		long long ans=0;
		for(int i=0;i<a.size();i++){
			for(int j=i+1;j<a.size();j++){
				ans = max(ans,(a[i].x - a[j].x)*(a[i].x - a[j].x) + (a[i].y - a[j].y) * (a[i].y - a[j].y));
			}
		}
		printf("%lld\n",ans);
	}
	assert(getchar()==-1);
} 

Time \space Complexity= O( (N+Q)*{Log}^{2}N)

CHEF VIJJU’S CORNER :smiley:

1. Why the farthest points must lie on hull.

Click to view

Assume we have 2 points. None of them is part of hull. Lets explore the possibility that they can have farthest distance or not.

Say our points are A and B. Lets ignore A for now and focus on B. We say distance between A and B is farthest. However, B is an internal point, meaning it is enclosed by the hull. This means, there must be at least one point in B's direction (w.r.t A) which is enclosing it! Lets call that point C This directly means that distance of |AC|>|AB|. This contradicts our assumption that distance between A and B is farthest.

2. Look at the proof given by me in Point 1. Evaluate it in terms of correctness and soundedness, and hence prove the same in given scenario, "A is an internal point, B is part of hull. Prove that A must be part of hull as well for maximum distance."

3. Refer to code for answering the queries. Then, look at the diagram in the tab below, and answer how does the query function merges them. Try to answer by giving an image for clearer answer, along with your reasoning for the same. Analyze the case for any possible issues with time complexity, if they can exist. If not, argue why not.

Click to view

alt text

4. Setter’s Notes-

Click to view

Both points from diameter are lying on convex hull.Expected number of points on convex hull when points are generated randomly from square K \cdot K is O(log K).

You can read the proof here: On the Expected Complexity of Random Convex Hulls.

Let’s build segment tree, each segment will contain convex hull of points from the segment. So to answer the question, we will split query segment on the O(log n) segments from segment tree, and find convex hull using only points from these hulls.

With Andrew algorithm we can build new hull in O(log n \cdot log K), and then find diameter in O(log K) or O(log^2 K). So, complexity of this algorithm is O((n + q) \cdot log K \cdot log N)

5. Tester’s Notes-

Click to view

Solution for DARTSEGM: Lemma: If we have N points on 2D plane (which are) randomly generated then the expected size of their convex hull will be O(logn). Now the solution is clear we use segment tree in each node we store the convex hull of all points on the node’s range, the number of nodes will not be much.

6. Why does the worst case occur so rarely? Can you give at least intuitions for it? Some grounds of arguments are, “Number of specific arrangements in numerator v/s ALL possible arrangements of all N points in denominator” or arguments on "Distance between X randomly generated points being at least K" &etc.