PROBLEM LINK:
Author: Abdullah Aslam
Tester: Suchan Park
Editorialist: Suchan Park
DIFFICULTY:
Easy-Medium
PREREQUISITES:
How binary search works, greedy argument
PROBLEM:
You are given a pairwise distinct 1D array A of size N. For each element A[i], find the number of minimum swaps required to make the binary search algorithm that trys to find A[i] successfully find that it is in the position i. We are not allowed to swap A[i]. For details, consult to the problem statement. The binary search algorithm is defined as the following:
integer binary_search(array a, integer n, integer x):
integer low, high, mid
low := 1
high := n
while low ≤ high:
mid := (low + high) / 2
if a[mid] == x:
break
else if a[mid] is less than x:
low := mid+1
else:
high := mid-1
return mid
QUICK EXPLANATION:
Let’s solve it for fixed i. From the binary search algorithm that finishes at position i, we can find O(\log n) constraints of the form “A[j] < x” and “A[k] > x”. There will be some j s and k s that doesn’t satisfy the given constraints, so let’s define such set of indices as J' and K'. If j' \in J' and k' \in K', A[j'] > x and A[k'] < x holds. So, if we swap these two elements, A[j'] < x and A[k'] > x will hold, so the constraints involving j' and k' both becomes valid. After repeating this as much as possible, at most one of the two sets will be non-empty. In that case, it is possible to just greedily swap with other elements to make such constraint valid, only if there are enough elements to swap. In conclusion, only \max(|J'|, |K'|) swaps are required or it is impossible. It takes only O(\texttt{\# of constraints}) = O(\log n) time to determine everything, we can just fix i = 1, 2, 3, \cdots, n to solve the whole problem in O(n \log n).
EXPLANATION:
How does the binary search work? It assumes the array A is sorted, and takes an interval [low, high], meaning that there must be an i \in [low, high] such that A[i] = x. Take mid = (low + high) / 2. If A[mid] < x, A[low] < A[low+1] < \cdots < A[mid] < x holds, so the algorithm shrinks the interval to [mid+1, high]. There is no way 1, 2, \cdots, mid can be our desired position. If A[mid] > x, x < A[mid] < A[mid+1] < \cdots < A[high] holds, so the algorithm shrinks the interval to [low, mid-1]. There is no way mid, \cdots, high can be our desired position. One thing we can note here is, if an element is once excluded in the “candidate interval” [low, high], it will never re-appear forever.
For the sake of understanding, let’s draw a decision tree of a binary search, which is a well-known concept. I take a look at the case when n = 9.
Internal nodes denotes the case when the answer is not decided. In that case, it will compare A[mid] with x to shrink the interval or find the answer. Leaf nodes of the tree denotes the case when the answer is decided.
Since this is a tree, we can see that there are some unique constraints required to obtain a certain resulting position. For example, if we want the answer to be 7, we have to visit nodes [1, 10], [6, 10], [6, 7] and [7, 7]. So the necessary conditions required are A[5] < x, A[8] > x, A[6] < x and A[7] = x. There will be at most O(\log n) constraints since this is a binary search decision tree (there are at most O(\log n) comparisons). Also, the important constraints will look like “A[j] < x” or “A[k] > x”. A[i] = x constraint is useless since we seek to find position i with parameter x = A[i].
So, let’s think that we want the resulting position to be i when x = A[i].
First thing to check: if there are s_1 constraints of the form A[j] < x and s_2 constraints of the form A[k] > x, there must be s_1 elements smaller than x and s_2 elements larger than x in the array A. This can be checked by looking at the position of A[i] in the sorted array of A.
Suppose this check is passed. Some constraints will be satisfied, others will not. We have to swap elements that appears in the non-satisfied constraints, so that all constraints could be satisfied. Define J' as a set of indices j such that the constraint A[j] < x exists and isn’t true right now. Similarly, define K' as a set of indices k such that the constraint A[k] > x exists and isn’t true right now.
If j' \in J', A[j'] > x holds. Also, if k' \in K', A[k'] < x holds. This inequality looks exactly the same as other constraints. Therefore, if we swap A[j'] and A[k'], both of the original constraints A[j'] < x and A[k'] > x will be satisfied!
If we apply this swap as long as both sets J' and K' are non-empty, eventually both sets will be empty or only one set will be non-empty. If the former case happens, we are done and this is the optimal strategy (minimizes the number of swaps). Otherwise, WLOG let’s say conditions of the form A[j''] < x are left unsatisfied and let those indices as J''. If there are |J''| or more indices p such that A[p] < x, we can swap A[j''] and A[p] for all j''. We can say the same argument for K'' too.
To summarize: Let A[i] be the v-th smallest element of A. Let J be a set of indices j such that the constraint “A[j] < x” exists, and K be a set of indices k such that the constraint "A[k] > x exists. Then,
- If p < |J| (there are less than |J| elements smaller than x) or N - p + 1 < |K| (there are less then |K| elements larger than x), the answer is -1.
- Otherwise,
- If |J'| = |K'|, the answer is |J'|.
- If |J'| > |K'|, the answer is |J'| if v \ge |J'| - |K'|, otherwise -1.
- If |J'| < |K'|, the answer is |K'| if N - v \ge |K'| - |J'|, otherwise -1.
AUTHOR’S AND TESTER’S SOLUTIONS:
Author’s solution can be found here.
Tester’s solution can be found here.
Editorialist’s solution can be found here.