Regarding C++ stl- set.

Hey guys, i need some help/clarification on sets.

(I searched for this but couldnt find anything relevant. Hence i would be grateful if you guys could help :slight_smile: )

Its with reference to problem XORSUB 's solution here which used the method of Gaussian Elimination .

With reference to that solution, there is a comment-

The logic is correct. But there is small flaw in implementation. The above code will give wrong output for following test case

1 
3 1 
12 10 8
This is because of how STL set works. For correct result, just swap these two statements

s.erase(t);
s.insert(t^m);

Why is this in this way? What is the effect of swapping the two statements? What error is caused in this method and how does swapping fix this? The usual pages on sets dont mention this anywhere, and I feel such exceptional cases should be known to the coder. I would be grateful if anyone can explain this :slight_smile:

1 Like

A set stores content in lexicographic order.

Initially you did t = *it, but if you insert a number lesser than *it, t will chance and in the subsequent step you are doing s.erase(t) which thus changes the output.  

I hope this solves your problem ! :slight_smile:

What i simply mean is, pointer pointing to a value may not be the same after insertion to the set. Thanks for the edits btw.

The bug was fixed by doing insertion first and deletion later. Can you explain it a bit more in that reference?

t here is iterator, you can think of it as index of element(like in simple array) in consideration for set.
As the sets always maintain some order(sorted in ascending order by default), by inserting a value in the set may cause the iterator to point to something else.

Example:

S ={5,6,7,8,9};
let iterator t = 3(using decimals for ease), therefore *t = 7 and m = 10; 

case 1:

 s.erase(t) // S = {5,6,8,9}, t is still = 3 but *t = 8
 s.insert(t^m) // 10^8 = 2,  S =  {2,5,6,8,9}. Note: t is still 3, but *t =6

Case 2:

s.insert(t^m) // 10^7 = 13, S = {5,6,7,8,9,13} t = 3, *t =7
s.erase(t)    // S = {5,6,8,9,13}

Clearly the result is different in both the cases.

Also something else to think about:

Case 3:

int val = *t;
s.erase(val);
s.insert(val^m);

Case 4:

int val = *t;
s.insert(val^m);
s.erase(val);

These two look fine right? but!
if m = 0, val = val^m. Then case 3 will have no difference(as you first erase val then insert it). But in case 4, you first insert it and since val is already present and set doesn’t allow repetition it will have no effect but when you erase it you will lose “val” which has present before, so again case 3 and 4 are different.

1 Like

I very well agree with what you say!! But can you give a read to the comment on first answer here - https://discuss.codechef.com/questions/58422/xorsub-editorial

It claims that we insert first and delete later, and that REALLY got me confused. Your explanation is awesome by the way ^^

@vijju123, the code goes something like this

for(auto it=s.begin(); it!=st.end(); it++) {
    t = *it;
    if (t>>z & 1) {
        s.erase(t);
        s.insert(t^m);
    }
}

The issue here is that the iterator it is iterating over the set s, but inside the iterator loop the set itself is being modified, which may cause the iterator to skip over elements or go over certain elements more than once. This is termed concurrent modification. In Java this code would throw a ConcurrentModificationException instead of giving a wrong answer. Reversing the order of two operations appears to fix this for the given testcase but I think it is not guaranteed to work always. To be certain one would have to know the internal implementation of std::set.

To be sure that the code does what I intend, I would write something like

vector toerase, toinsert;
for(auto it=s.begin(); it!=s.end(); it++) {
    t = *it;
    if (t>>z & 1) {
        toerase.push_back(t);
        toinsert.push_back(t^m);
    }
}
for(int x : toerase)
    s.erase(x);
for(int x : toinsert)
    s.insert(x);

PS: Thanks for introducing me to this XOR maximization trick :slight_smile:

2 Likes

So if i get the last part right, separately iterating over the vectors (separately for erasing and separately for inserting) will do the trick??

Thanks a LOT for the explanation bro!! I looked up 2 days before asking this Q, and i was unable to come up with anything except that the fault is coming from iterator.

And i really felt that just reversing will fix it up. Thanks for giving the latter code as well!!

Thanks a lot bro :slight_smile:

Yes, the bottom line is that it is not safe to modify any container that you are currently iterating over if it may change which element the iterator currently points to.
Sometimes this does not happen, such as when you are iterating over a vector and your code performs inserts only at the end using push_back(). Here you can be certain of the behaviour of the iterator, so there is no problem.

As answered by “meooow”, that to avoid this problem we can use separate containers and then use insert and erase method separately. But this idea of taking extra containers is not quite effective.
So, the problem with the erase method is that it invalidates the iterator, so erasing element from container while iterating through container give suspicious results. But Post-Increment operator comes to rescue here. We can rewrite the code as -

for(auto it=s.begin(); it!=st.end(); ) {
t = *it;
if (t>>z & 1) {
    s.erase(it++);
    s.insert(t^m);
}
else ++it;
}

What happens here is that due to Post-Increment operator, iterator will point to the next element in the set but erase method will erase the current element.

Initially Invalidation of iterator used to cause problems that’s why c++11 resolved this issue and now after c++11 versions erase method doesn’t invalidate the iterator, rather it returns an iterator which points to the next element of the set. So, now we don’t need to use Post-Increment operator while erasing.

So, now we can rewrite the same code as-

for(auto it=s.begin(); it!=st.end(); ) {
t = *it;
if (t>>z & 1) {
    s.erase(it);
    s.insert(t^m);
}
else ++it;
}