TWOFL - Editorial

likecs · June 10, 2018, 1:17pm

Problem Link

Practice

Contest

Author: Stacy Hong

Tester: Misha Chorniy

Editorialist: Bhuvnesh Jain

Difficulty

MEDIUM

Prerequisites

BFS, DFS, DSU, Sorting

Problem

You are given a grid with numbers written on every cell. You are required to find the largest connected component on the grid such that it contains at most 2 distinct numbers in it.

Note that editorial refers to the flowers as numbers.

Explanation

Subtask 1: N, M ≤ 100

A simple brute force which considers every pair of different numbers and performs BFS on the grid to find the connected component and their sizes is enough to pass this subtask. Below is a pseudo-code for it.


	def solve(c1, c2):
		res = 0
		for i in [1, n]:
			for j in [1, m]:
				if vis[i][j] or (a[i][j] != c1 and a[i][j] != c2):
					continue
				# perform bfs to find size of connected component containing c1 and c2
				# update res variable with maximum value
		return res

	ans = 0
	for c1 in [1, 100]:
		for c2 in [i, 100]:
			# also handle case of one color only.
			ans = max(ans, solve(c1, c2))

The time complexity of the above approach is O(n * m * {\text{Max(a[i][j])}}^{2}).

Subtask 2, 3: N, Q ≤ 2000

The solution approach for last 2 subtasks is same. Just difference in implementation and constant factors can lead to passing subtask 2 only.

The full solution idea relies on the brute force solution above. Instead of doing a complete BFS on the grid every time based on the type of colour you need to consider, we store the edges containing different numbers and do BFS/DSU on it. Below are the 2 different approaches for it.

Author/Tester Solution: DSU + Sorting/HashMap

First find all connected components which have the same type of numbers. Store with each component the size of the component and the index of the component as well. This can be easily done with a BFS/DFS. Now create edges between 2 component which are adjacent to each other in the grid i.e. there exists a cell in component 1 which is adjacent to a cell in component 2. Store the edges in the following format:

\text{{Number in component 1, Number in component 2, Index of component 1, Index of component 2}}

To avoid doing the same computation again, store only the edges such that \text{Number in component 1} < \text{Number in component 2}. This will help to reduce the constant factor by 2 in your code.

Now, we traverse each edge based on the numbers they connect and perform DSU to find the maximum connected component. Below is the explanation of sample test case in the problem.

The left side image shows the different components formed after the first BFS is done. The second side image shows the result of performing DSU. So, once the BFS is done, we see that cell (2, 1) containing 3 is adjacent to cell (1, 1) containing 1, so we merge them using DSU. Again cell (4, 1) containing 1 is adjacent to cell (4, 2) containing 3 so, we merge them using DSU. The process continues as cell (4, 3) containing 1 is adjacent to (4, 2) containing 3 (which was already expanded before) and the component increases.

The only issue which needs to be taken care while implementing the above solution is resetting of DSU after every iteration. Doing it in naive manner i.e. resetting all the components to original sizes and parent as themselves will TLE even for subtask 2 as the complexity will be similar to subtask 1. Instead, we only restore the values whose parent or size changed while performing the DSU operation. This can be stored while performing the merge operation.

For more details, you can refer to the author’s or tester’s solution below.

The complexity of the above approach will be O(n * m * log(n * m)) as you will traverse each edge once. The logarithmic factor is due to sorting or using maps to store the edges. If you use hash-maps instead to store the edges, the complexity will be O(n * m) but will large constant factor.

Editorialist Solution: Edge-based Graph traversal

UPDATE: This solution is slow. The author created a counter test against it. The time complexity will be O(n^2*m^2) as pointed out in comments below. Even the test case provided is similar. Please refer to author’s solution above for full problem. Will ask Codechef team to add that case in the practice section too for help.

Instead of forming any components or performing DSU or doing BFS like the normal way, we see simple observation in the brute force approach. If, each edge connecting adjacent cells in the grid, whether having same numbers or different numbers if traversed at most twice while determining the size, the complexity will be O(8 * n * m) overall. Below is the detailed description of the idea:

First, select an edge in the grid. Find the numbers (at most 2) which will form the component. We will try to find the full component which contains this edge and make sure this component is traversed only once. For this, you start a normal BFS from each cell. You go to adjacent cell only if it is one of the 2 required numbers. While adding the node in the BFS queue, we all update the possible ways in which the edge could be traversed as part of the same component. So, cell (3, 2) containing 2 was visited from cell (2, 2) containing 1. But it could be traversed from cell (3, 3) containing 1 in future and will form the same component. So, we update all 4 directions of a cell can be visited in the same component before adding the cell in BFS queue. The BFS for a component is performed only if the edge was never traversed before in any component. The image below shows how the component is expanded in each step of BFS when the starting edge is selected as (1, 2) and (1, 3).

The time complexity of the above approach is O(n^2*m^2) and will pass subtask 1 only.

Once, you are clear with the above idea, you can see the editorialist implementation below for help.

Feel free to share your approach, if it was somewhat different.

Time Complexity

O(n * m)

Space Complexity

O(n * m)

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.

Tester’s solution can be found here.

Editorialist’s solution can be found here. (Will TLE).

panik · June 11, 2018, 4:13pm

Editorialist solution was the best method. I used it in the competition and it took me just 0.7 sec to pass the question with a AC.
Link: https://www.codechef.com/viewsolution/18816600 link text
Please accept if considered correct.

akamoha · June 11, 2018, 4:39pm

I used the method outlined in the editorialist’s solution and got AC:
https://www.codechef.com/viewsolution/18766440

But then I thought of this test case:

1  2  3  4  5  6  7  8  9  10 11
12 1  1  1  1  1  1  1  1  1  13
14 15 16 17 18 1  19 20 21 22 23
24 1  1  1  1  1  1  1  1  1  25
26 27 28 29 30 1  31 32 33 34 35
36 1  1  1  1  1  1  1  1  1  37
38 39 40 41 42 1  43 44 45 46 47
48 1  1  1  1  1  1  1  1  1  49
50 51 52 53 54 1  55 56 57 58 59
60 1  1  1  1  1  1  1  1  1  61
62 63 64 65 66 67 68 69 70 71 72

There’s a 1 connecting each of the rows of 1s, so the entire sea of 1’s should be traversed for each edge that we start a BFS on, right?
That would make this worst case O(N^2 * M^2)?

Maybe I’m missing something here, let me know if it is so. Thanks!

devarshi09 · June 11, 2018, 5:42pm

Woah, I actually tried both the approaches. Got AC(100 pts) with Edge based Graph traversal!

akamoha · June 11, 2018, 5:58pm

I didn’t do anything special to handle such a case, and when I tested my solution on a test case constructed like this with N and M both in the order of 1000, it ran for more than two minutes and I had to quit it eventually… So I’m pretty sure my solution is N^2*M^2. Yet, it passed with 100 points in the contest which may mean the test cases were weak?

hemant_dhanuka · June 11, 2018, 6:03pm

If Question level is Medium then why added to practice easy section instead of adding to practice medium…
otherwise it will demotivate beginner, who is looking to solve easy problems

jung_e_elaan · June 11, 2018, 7:49pm

Please any one can tell me,


[1]
It is showing TLE again and again.
Any Optimization ?


  [1]: https://www.codechef.com/viewsolution/18854836

mindjolt · June 11, 2018, 8:23pm

I agree with akamoha. I was actually trying to pass subtask 1 when I got AC. My code would’ve done akamohas test case in O(n*m), but if one modified the 1 to a number greater than all the numbers 1’s connected to, like 100, it would take O(n^2m^2). This is because I only did the dfs if the second number was greater than the first, just like how it’s mentioned in the editorial.

rishav007 · June 11, 2018, 9:48pm

in the tester solution , there is a function

ull get_c(int c1, int c2) {

if(c1 < c2) return (1ULL*c1 << 32) | c2;

else return (1ULL*c2 << 32) | c1;
}

Can anyone tell me the reason why we are doing this?

hemant_dhanuka · June 11, 2018, 10:50pm

Getting Confused with these lines
"If, each edge connecting adjacent cells in the grid, whether having same numbers or different numbers if traversed at most twice while determining the size, the complexity will be O(8∗n∗m) overall."

what i am underStanding,(i know, i am understanding too much wrong):-
if i have total k types of flowers, then total types pairs will be k(k-1), and for each pair it will traverse each edge atmost twice… and total edges are m * n… then complexity should be (k*(k-1))2m * n…

please help, its getting very difficult to for me understand this editorial/Problem/Solution

below is what i did,…

code link

Output
OutPut ScreenShot
Can i Optimse myCode further…

l_returns · June 12, 2018, 12:38am

agree… I found it medium…

likecs · June 12, 2018, 8:32am

Update: You are correct @akamoha. The test cases were weak. Thanks for a similar test case. Have updated the editorial with a warning too.

sonu_628 · June 12, 2018, 8:36am

can you share the input file ? (like on a G drive or dropbox)

priyatamsai_78 · June 12, 2018, 9:00am

i have seen editorialist solution

https://s3.amazonaws.com/codechef_shared/download/Solutions/JUNE18/editorialist/TWOFL.cpp

lets take test case

1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
2 3 4 5 6 7 8 9 10

if we consider this type of test case, Time complexity will lead to N.N.M or even to M^2.N^2 so i think we cant exactly say complexity for this solution is N.M

likecs · June 12, 2018, 9:51am

This pointed out in the editorial. Please check again.

priyatamsai_78 · June 12, 2018, 10:16am

ya, seen that, thank you!!

soham1234 · June 12, 2018, 11:20am

@likecs I also did edge based traversal and I am pretty confident my solution is O(n * m * log(n * m)) at worst case. My idea is , I first labelled cells, consecutive cells having same color gets same label. Now merging these same labelled cells together to form a single node, I have a graph where each edge is connected to 2 different colored nodes. Also one important thing to notice, a edge will be visited atmost once. Also as an edge is visited atmost once, total number of nodes visited is 2 * no_of_edges. So net complexity remains (n * m * log(n * m)) i guess? Can you please share the counter case or prove anyway my solution is n^2 * m^2? Because it did just pass under time limit after a lot of TLES.

Edit:here is the implementation for the same

sorb1997 · June 12, 2018, 2:23pm

Can anybody explain me what does

if (rand() & 1) {
   swap(xx, yy);
}

mean in Author’s solution? And why is it being used here?

likecs · June 12, 2018, 2:40pm

I will have to ask the Codechef team and author to share the counter test case.

likecs · June 12, 2018, 2:41pm

Union-find algorithm works in amortised O(n) only if it is implemented by “Rank” algorithm, i.e. merging smaller sets to larger ones. But in practice, even random merging works fine and it is tough to generate a test case against it.