PROBLEM LINK:
Setter- Anuj Maheshwari
Tester- Misha Chorniy
Editorialist- Abhishek Pandey
DIFFICULTY:
Simple
PRE-REQUISITES:
Array, Looping, Basic Math and Principles of Counting, Frequency Array.
PROBLEM:
Given an array of length N, find number of unordered pairs (i,j) such that their average exists in the array (real division), or in other words, find number of unordered pairs of (i,j) for which we can find an element in array A_k which satisfies-
2*A_k=A_i+A_j where A_i and A_j are i 'th and j 'th element of the array.
Unordered pair means that order of listing elements in pair doesnt matter. It means (2,1) is treated same as (1,2). In ordered pairs, we treat both of them as different.
QUICK EXPLANATION:
We immediately see the low constraints for value of A_i. We can make a frequency array to record frequency of each array element. The only problem is, A_i can be negative as well. This is easily solved by adding a constant K (preferably 1000) to all array elements. With all elements positive, we iterate over all possible pair of values of array elements, i.e. we check all pairs of (a,b) where 0\le a,b \le 2000. We check if a, b and their average exists in array and update answer accordingly.
EXPLANATION:
This editorial will discuss 2 approaches. First is mine (Editorialist), and second one is Misha’s (@mgch), i.e. the tester’s. My approach is a bit more difficult than tester’s because I use some formulas and observations, while tester’s solution is simply basic.
We noticed a lot of Div2 users getting stuck in lots of useless and unnecessary things, hence we will answer how to overcome such things in Chef Vijju’s Corner at the end of editorial.
1. Editorialist’s Approach-
The first thing I did was to add 1000 to each element so that all elements of array are non-negative and between [0,2000]. We can show that if we add any constant to all values of array, then although the average of numbers increases, it has no effect on “existence of average (of 2 array elements) inside array.” Try to prove it if you can. (Answer is in tab below).
Click to view
Given-
2*A_k=A_i+A_j
Adding 2*K on both sides-
\implies 2*A_k+2*K=A_i+A_j+2*K
\implies 2*(A_k+K)=(A_i+K)+(A_j+K)
\because Both the expressions turn out to be same, we see that adding a constant K to all elements in array has no effect on “whether average of two array elements exist in the array or not.”
For now, forget about duplicates. My approach counts even duplicates in the process, and removes them later at the end.
Once we have the frequencies of elements inside array, I iterate from all possible “pairs of values” allowed, i.e. , we know that now values of array elements are in range [0,2000], so we will iterate over all pairs of (a,b) where 0 \le a,b \le 2000 and do the following-
**1.**If (a,b) exist, and if their average (by real division) exists in the array as well, goto 2, else check next pair.
**2.**Is a==b ? If yes, then add (freq[a]-1)*freq[b] to answer, else add freq[a]*freq[b]
To remove duplicates, simply divide the answer by 2 - because each pair is counted twice. The proof for the 2 formulas I used, along with fact that dividing by 2 removes duplicates is left to reader as an exercise. (Answer/Hints are discussed in Chef Vijju’s corner.)
Time Complexity-O({2000}^{2}+N)=O(N+K)=O(N) (where K is a constant).
Tester’s Approach (Easy)-
The first step is the same, he also made all elements non-negative by adding 1000. He deals with duplicates immediately.
1.For each possible value in [0,2000], check if it exists in array. If it does, goto 2, else check the next value.
2.Count all pairs formed by this value and its duplicates by adding cnt[mid] * (cnt[mid] - 1) / 2 to the answer. mid here is the value being investigated.
3.If mid is average of 2 numbers, this means both the numbers are equidistant from mid. Hence, count all pairs possible due to presence of equidistant numbers present at a distant x , where x is in range [0,mid] and mid+x and mid-x are in valid range of [0,2000]. For each such pair, add cnt[mid - x] * cnt[mid + x] to the final answer.
Time complexity is same as my solution in worst case :).
SOLUTION:
CHEF VIJJU’S CORNER:
**1.**Lets first discuss the mistakes. Most of the participants got the formulas right…but they did a tremendous blunder. Suffered from Overflow!!. If you are declaring your frequency array as int, then for larger test cases where N is as large as {10}^{5}, the frequency of elements can be as large. Hence, when we do freq[a]*freq[b], we must make sure to use proper type casting, else the answer will overflow!! Many contestants stored answer in an int variable, and that code is doomed to be wrong even before submission!!
One thing I will tell, if you are getting AC for smaller sub tasks, and WA for larger ones, do check for overflow!!. There is a good 60\% chance that it is overflow issue giving you WA. We were going through the wrong solutions in detail, and our heart sank a little bit each time we saw a solution killed because of overflow
**2.**Some of them mistook |A_i| \le 1000 as 0 \le A_i\le 1000. The difference between the two is that, the first condition allows negative integers in the input. Whenever in such confusion, assume value of A_i to be negative - and ask yourself - Does it satisfy this constraint? If yes, its in the input, else its not.
**3.**Derivation of freq[a]*freq[b] in tab below.
Click to view
Say we have values freq[a]=K and freq[b]=L.
Now, for each of the K values of a in array, we can form a pair with all L values of B. Hence add freq[a]*freq[b] to the final answer.
My code will add freq[a]*freq[b] for both pairs, (a,b) and (b,a). Since this portion gets executed when a \neq b, hence each pair is counted twice. Half proof of why halving the value of answer removes duplicates.
Proof when a==b is on same lines.
**4.**Whenever you see that the max value of array elements is comparable to size of the array, frequency array approach can be used. Its usually asked as a part of question rather than an individual question, like here it was Math (principle of counting) +Frequency array. With some modification, you can also count number of each character in a string using this approach.
5. Frequency Array approach to count number of each element is one of the steps involved in Counting Sort , an O(N) sorting technique.
6. What would we do if this question had constraints on A_i upto |A_i| \le {10}^{5}? It would be a difficult question, but still possible to solve. That problem then, can be solved using FFT. We create a polynomial P(x) such that P(x)=\sum_{0}^{2*{10}^{5}}(freq[i]*{x}^{i}). On squaring P(x), we will have freq[i]*freq[j] for all (i,j) between 1 \le i,j \le N, but we need to check the case when i=j. The next step to do is, if i+j is even, and freq[(i+j)/2]>0 , it has to be added to the final answer. This outline is just meant to introduce beginners that there is something like FFT existing in this universe (xD) and to offer an interesting insight to the solution. Time Complexity is O(|A|Log|A|)
7. Questions to practice frequency array-
(More will be added if I find more, or if the community helps )