PROBLEM LINK:
Author: Misha Chorniy
Tester: Jingbo Shang
Editorialist: Pushkar Mishra
DIFFICULTY:
Easy-Medium
PREREQUISITES:
Hash-Table
PROBLEM:
Given two types of queries:
- Add the number to the set of numbers being maintained.
- Remove the given number from the set if it exists in it.
The numbers to be inserted/deleted are generated according to the following scheme:
S = (a*S_{i-1} + b) mod 2^{32},
where a and b are integers.
The sum of the numbers finally remaining in the set after all the operations is to be reported.
EXPLANATION:
Given the number of operations to be executed, i.e., q = 10^7, it should be clear that the insertions and deletions from the set are to be done in \mathcal{O}(1). Use of STL or pre-written libraries in C++/Java can’t be used since they will definitely time out because of associated overheads.
Hash Table Solution
One way is to write our own hash table. Editorialist’s solution provides a standard implementation of Hash-Table with chaining to handle collision. For chaining, linked list has been used in the given solution. You can read more about it over the internet.
Bitset Solution
A neater and more concise solution is using Bitset. Let’s think about the naive solution to this problem first. If we could simply keep an array of length 2^{32}-1, then direct insert/lookup in the array would have been possible. The the memory constraints don’t allow that big an array. However, let us look at the constraints on the number being inserted/deleted into the set. We know they lie between 0 and 2^{32}-1, i.e., they can be represented by 31 bits. Now, we can use a smart way to index numbers. Let us keep an array of length 2^{26}. This is perfectly fine with the memory limits. Now, for a 31 bit integer x, let the first 26 bits indicate the index in the array where x will reside. The rest 5 bits can be encoded in a 32 bit integer since only different 2^5 possibilities exist. Please read setter’s/tester’s solutions for this approach.
There is one more intricacy here: calculating S_i. If we use the normal mod operation present in some of the programming languages, the solution will exceed time limit. This is because mod is a complex operation and performing it 10^7 times will not work. The observation is that we have to take mod 2^{32}. If we keep the data type of S, a, b to be unsigned, in that case, we will not have to use mod operation because the range of unsigned int is 0 to 2^{32}-1. Thus, if S goes over 2^{32} it will automatically overflow around to greater than 0. In other words, the compiler will perform the mod function implicitly because of the overflow.
COMPLEXITY:
\mathcal{O}(1) per insert and delete operation.