PROBLEM LINK:
Setter- Igor Barenblat
Tester- Misha Chorniy
Editorialist- ???
DIFFICULTY:
MEDIUM-HARD
PRE-REQUISITES:
SOS Dp, Square Root Decomposition, Bitwise Operations
PROBLEM:
Given an array A of N numbers, we have to support the following operations-
- Add a number to it
- Remove a number from it
- Calculate f(x) where f(x) is the maximum of “Bitwise-AND” of all sub-sequences of A with length x
QUICK EXPLANATION:
Key to Success- Deducing the type of question is important. The one who deduced that it is SOS-DP and were able to derive that queries were to be broken by Square-Root Decomposition can get a AC. Such intuition comes with practice as questions on similar concepts are asked before.
Some experience of SOS Dynamic Programming helps here. We will use square-root decomposition along with it. Divide the queries into buckets of size K. Updates of frequency of elements and re-calculation of SOS dp-table takes place for every bucket instead of every query.
We will iterate over bits of answers from highest to lowest. Greedy works here! (Why? Q1). For each bit, we first set it and see if its possible obtain this answer. If it is, we set that bit of answe and move on to next.
EXPLANATION:
I will assume that you guys have at least some idea about SOS-Dp. The editorial will have a three sections, each describing one step towards the final answer. I will try to give my intuition and help in grasping concept wherever possible, but please make sure you fulfill the pre-requisites! Implementation is integrated with these sections- we use tester’s code for that.
1.SOS-Dp
The first question to ask "What is the maximum AND we can obtain if we AND a\&b where a< b? Of course, the answer will be a! Why? Is the maximum value obtainable from AND operation of 2 numbers is the minimum one of them?
Click to view
Remember that NO NEW BIT is set in AND operation. A bit which is 0 (unset) is never set. A bit which is 1 may be unset, or may remain set. In BEST case, the number can remain as it is. Of course, the maximum number will have at least 1 bit which is going to be unset. Hence, best case answer is minimum number of the two.
The function to calculate SOS dp table, along with explanation is below.
void SOS() { // sum-over-subsets
memcpy(foo, cnt, sizeof(foo));//Initialized with cnt or frequency array of elements
for (int bit = 0; bit < k; ++bit) {//Iterating from bit 0 to bit k (0-based index)
for (int mask = 0; mask < 1 << k; ++mask) {
if (~mask & (1 << bit)) {//If mask has '0' at that position
foo[mask] += foo[mask ^ (1 << bit)];
//Number of ways to get mask+=Number of ways to get supermask
}
}
}
}
Initially the dp-table is initialized with frequency array which has count of all the numbers till present query. Using the SOS-dp, we iterate from bit 0 to bit k-1 (0-based indexing!). Now, what are ways we can obtain the given mask? Note that, no new bit is set in AND operation. A bit with value 0 remains 0, and bit with value 1 can get reduced to 0 or may remain 1. Hence, the given mask can only be obtained by super-masks (or numbers which have 1 at ALL places where mask has, and may have 1's at other places where our mask has 0.)
Hence, if mask has 0 at current position, we add number of ways to obtain super-mask (or dp[super-mask]) to dp[mask]. The proof of correctness of this procedure is given in the pre-requisites blog I gave link to. (Well, its standard SOS dp procedure, so hopefully it should be clear )
2. Square Root Decomposition of Queries-
Clearly, updating the SOS-Dp table is very costly! We cannot do that for every query. What we, hence do is, divide the queries into “buckets” or groups, each of size K. SOS dp table is calculated after every bucket (instead of after every query). Tester @mgch chosed this K around 512. Now, what does this mean? This means that, when we are at K'th bucket, our SOS dp-table is updated till K-1 bucket, and we just need to do some manipulations so that we can use that data of SOS dp table, along with data of present bucket to solve queries. What are they?
3. Answering Queries-
First thing is, the SOS dp-table is updated after end of every bucket, but the frequency array of elements can be updated every query as its simply O(1). Frequency array is nothing but count of numbers/elements in the set. Sample code below in the tab-
Click to view
if (t == 1) {//Add element, increase frequency
cnt[x] += 1;
} else if (t == 2) {//Remove element, reduce frequency.
cnt[x] -= 1;
}
We need frequency array to be up-to-date for solving queries of current bucket. Calculating answer after that is actually simple!
Now, for every query asking us to calculate f(x), we first set answer as 0. Then we start from the highest bit. We will see if its possible to set the bit or not. Recall that our SOS dp table (dp[mask]) stored the number of super-masks which have mask as a sub-mask. We have answers for K-1 bucket in SOS-dp table, we have status of queries of present buckets as well stored. HOW can we obtain the final answer?
It will do justice for first giving the code and then explanation.
if (t == 3) {
int ans = 0;
//constructing the answer from the highest bit to lowest one bit-by-bit
//checking if we have 2^(k-1) in the answer,
//
for (int bit = k - 1; bit >= 0; --bit) {
int taked = ans | (1 << bit);//Checking if we can set this bit in
//ans
int cur = foo[taked]; //cur = the number of elements that has (ans |
//2^bit) as submask
//recalculating in O(2^K)
for (int j = i; (j & ((1 << K) - 1)) != 0; --j) {
if (T[j] == 1) {//T[j]=T at j'th query
//X[j]=X at j'th Query
if ((X[j] & taked) == taked) {
//Recall the maximum possible value of AND of 2 numbers. We are trying to
//see if taked is a sub-mask of X[j].
++cur;
//If above is true, and X was inserted==>Increase length of sub-sequence by 1.
}
}
if (T[j] == 2) {
if ((X[j] & taked) == taked) {
--cur;
//If above is true, and X[j] was removed,
//the length of sub-sequence reduces by 1
}
}
}
if (cur >= x) { // checking if this number >= x then the answer for
//this query |= 2^bit
ans |= 1 << bit;//If possible, add 2^(bit-1) to ans
}
//Do we REALLY need to check sub-sequence of length==x?
}
In above code, we first initialized answer to 0. Now, we iterate from highest bit to lowest bit (greedily) and see if we can obtain \ge X (Why not ==X? Q2) numbers such that ans is their sub-mask. We have stats upto K-1 bucket in SOS-dp-table (named foo[mask] in above code). We now simply iterate and brute force over this present bucket. Depending on if a favourable number (number whose sub-mask is ans) was added or removed, we add or subtract 1 from present sub-sequence length. Like this, every query is answered in O(K) or O(\sqrt{N}) whatever you choosed
Try to answer the question I asked at end of code. //Do we REALLY need to check sub-sequence of length==x?
SOLUTIONS:
The codes of setter and tester are pasted below for your convenience- so that you can access them while @admin is linking the solutions to the editorial. It is, just for your convenience.
Click to view
#include <bits/stdc++.h>
using namespace std;
//HEADER START
void init(int,vector<int>);
void insert(int);
void erase(int);
int f(int);
//HEADER END
const int max_N=75000;
const int max_M=75000;
const int max_K=18;
const int bsize=1.66*sqrt(75000);
int q_size;
int q_val[bsize+1];
int q_type[bsize+1];
int cnt[1<<max_K];
int dp[1<<max_K];
/// recalc array dp where dp[i] count of j in A that i&j==i
void recalc()
{
for (int i=0;i<(1<<max_K);i++){
dp[i]=cnt[i];
}
for (int i=0;i<max_K;i++){
for (int mask=(1<<max_K)-1;mask>=0;mask--){
if (!(mask&(1<<i))){
dp[mask]+=dp[mask^(1<<i)];
}
}
}
}
void init(int n,int k,vector<int> a)
{
for (auto i:a){
cnt[i]++;
}
recalc();
}
void insert(int x)
{
if (q_size>=bsize){
q_size=0;
recalc();
}
cnt[x]++;
q_val[q_size]=x;
q_type[q_size]=1;
q_size++;
}
void erase(int x)
{
if (q_size>=bsize){
q_size=0;
recalc(); /// recalc if q bigger than sqrt(m)
}
cnt[x]--;
q_val[q_size]=x;
q_type[q_size]=-1;
q_size++;
}
/// return count of j in A that j&mask==mask
int get_nad_mask(int mask)
{
int res=dp[mask];
for (int i=0;i<q_size;i++){
if ((q_val[i]&mask)==mask){
res+=q_type[i];
}
}
return res;
}
int f(int x)
{
int res=0;
for (int i=max_K-1;i>=0;i--){
if (get_nad_mask(res^(1<<i))>=x){ /// easy to see :)
res+=(1<<i);
}
}
return res;
}
//GRADER START
main()
{
ios_base::sync_with_stdio(0);
cin.tie(0);
cout.tie(0);
srand(time(NULL));
int n,m,k;
cin>>n>>m>>k;
vector<int> a(n);
for (auto& i:a){
cin>>i;
}
init(n,k,a);
vector<int> v;
for (int i=0;i<m;i++){
int type,x;
cin>>type>>x;
if (type==1){
insert(x);
}
else if (type==2){
erase(x);
}
else{
v.push_back(f(x));
}
}
for(auto a : v){
cout << a << "\n";
}
}
Click to view
#include <bits/stdc++.h>
using namespace std;
const int MaxN = (int)4e5 + 10;
const int MOD = (int)1e9 + 7;
const int INF = 1e9;
const int K = 9; //Rebuild SOS every (1 << K) queries
int n, m, k;
int a[MaxN], T[MaxN], X[MaxN];
int cnt[1 << 18], foo[1 << 18];
//cnt[i] denotes the number of elements = i in the array A
//foo[i] array needed for Sum-over-Subsets
void SOS() { // sum-over-subsets
memcpy(foo, cnt, sizeof(foo));
for (int bit = 0; bit < k; ++bit) {
for (int mask = 0; mask < 1 << k; ++mask) {
if (~mask & (1 << bit)) {
foo[mask] += foo[mask ^ (1 << bit)];
}
}
}
}
int main() {
// freopen("input.txt", "r", stdin);
scanf("%d%d%d", &n, &m, &k);
assert (n <= 75000); assert (m <= 75000);
for (int i = 1; i <= n; ++i) {
scanf("%d", &a[i]);
++cnt[a[i]];
}
SOS(); //initial caculating
for (int i = 1; i <= m; ++i) {
int t, x;
scanf("%d%d", &t, &x);
T[i] = t;
X[i] = x;
if (t == 1) {
cnt[x] += 1;
} else if (t == 2) {
cnt[x] -= 1;
}
if ((i & ((1 << K) - 1)) == 0) { // rebuild
SOS(); //help the program
}
if (t == 3) {
int ans = 0;
//constructing the answer from the highest bit to lowest one bit-by-bit
//checking if we have 2^(k-1) in the answer,
//
for (int bit = k - 1; bit >= 0; --bit) {
int taked = ans | (1 << bit);
int cur = foo[taked]; //cur = the number of elements that has (ans | 2^bit) as submask
//recalculating in O(2^K)
for (int j = i; (j & ((1 << K) - 1)) != 0; --j) {
if (T[j] == 1) {
if ((X[j] & taked) == taked) {
++cur;
}
}
if (T[j] == 2) {
if ((X[j] & taked) == taked) {
--cur;
}
}
}
if (cur >= x) { // checking if this number >= x then the answer for this query |= 2^bit
ans |= 1 << bit;
}
}
printf("%d\n", ans);
}
}
return 0;
}
Editorialist solution will be put on demand.
Time Complexity=O(\sqrt{m}*{2}^{k}*k+m*sqrt{m}*k)
CHEF VIJJU’S CORNER
1. Ans of Q1 “Why greedily setting bits works here?”
Click to view
Lets consider the worst case. Say i is the highest bit. If by setting bit i, ALL the lower bits have to remain unset, we observe we still obtain a better answer. Setting i'th bit adds {2}^{i-1} to answer (0-based indexing!). While adding all other lower bits adds {2}^{i-2}+{2}^{i-3}....={2}^{i-1}-1 which is less than current answer obtained by setting i'th bit. Hence, greedy stands proved
2. Ans of Q2- "Do we REALLY need to check sub-sequence of length==x?"
Click to view
NO! Recall that we are storing “How many numbers exist such that ans is their sub-mask”. This means we can simply take any x of them to get answer. So exactly ==x not needed in given implementation.
3. Setter’s Note-
Click to view
" My solution have complexity O(\sqrt{m}*{2}^{k}*k+m*sqrt{m}*k)
Let’s define g(a)=(count of bєA that b\&a==a). Then easy to call f(x)
res=0
for (i=k-1;I>=0;i—)
if (g(res+1<< k)>=x)
res+=1<< k
g(a) will be calculated using sqrt evristic
After \sqrt{m} requests we can recalculate array G with size {2}^{k} with complexity O({2}^{k}*k). Each g(a) we will calculate using array G and O(\sqrt{m}) requests, which not considered in G. "
4. TEST CASE BANK-
Click to view
No small test case used in problem. All of the test cases are huge, and hence wont help in debugging. Please give solutions giving WA for analysis.
Currently Empty. Give me solutions giving WA so I can help you there
Edit- A small test case from setter -
Test:
14 14 5
2 26 15 14 19 1 11 4 13 26 2 31 27 24
3 10
3 7
1 19
3 6
3 5
1 17
3 1
1 18
1 1
1 4
1 13
3 3
3 1
2 19
Answer:
2
10
18
24
31
26
31
5. Related Problems-
- MONSTER - Based on SOS Dp + Square Root. Should practice this question as well