STRQUERY - Will rope work here?

bcurcio · April 16, 2013, 4:23pm

What as the O(N^1.5) solution?

mugurelionut · April 16, 2013, 5:42pm

@wjmzbmr: Do you mean N=the length of S (or the number of operations performed) or N=the total length of the query strings ? I am curious because one of my initial attempts tried to achieve O(sqrt(total length of the query strings)) per each operation (inserts, deletes, queries) on average. My idea was to keep counts of “small” query strings and to use KMP on the whole of S for “large” query strings (the threshold between “small” and “large” was about sqrt(total length of query strings)). But this turned out to be too slow (but I used STL maps for the counts which increased the complexity).

wjmzbmr · April 16, 2013, 6:13pm

@mugurelionut both can pass.

I have tried the sqrt(total length of query strings) in testing, but it seems I lack of optimization skills, it run very slow. so I just make the TL a bit tight.

wjmzbmr · April 16, 2013, 6:14pm

I think that is too slow.

amitbaranwal53 · April 16, 2013, 7:58pm

@wjmzbmr,can you please tell me what was your expected data structure(rope/generalised suffix tree/extended suffix array) and please give us tutorial for STRQUERY.

anton_lunyov · April 17, 2013, 1:23am

Hi all.

Sorry for delay with the editorial for STRQUERY.
This weekend was full of contests - I’ve participated in 6
Hence only brief editorial for INVBINCF was prepared and no for STRQUERY.
This problem is quite tough and explaining it properly requires some time.
I spend a lot of time only to understand the solution having some short abstract present in author’s code.
Now I completely understand @winger solution and if you wait for one day I will present you a very good editorial so that anyone could understand and implement the solution.

In short, the solution uses dynamic suffix array that supports modification operations at the beginning of the string and count queries. This is achieved by using special balanced binary search tree and it is the toughest part of the problem. Let’s call this data structure as suffix stack.

Then the actual problem can be solved by using 4 suffix stacks - we divide the string into 4 nearly equal pieces such that each modification is performed at the beginning of one of the pieces (two of them are reversed). Also sometimes we should completely rebuild two neighboring suffix stacks (when one of them becomes empty). But this happens only O(log Q) times (the reasoning is the same as in the implementation of disjoint set data structure with linked list).

Also we should count occurrences of query string at the boundary of neighboring pieces. But this is simple. If we have two strings A and B and want to count the number of those occurrences of the string C in concatenation A + B that touch both A and B we create string D of the size at most 2 * |C| by concatenating suffix of A of size min(|A|,|C|-1) and prefix of B of size min(|B|,|C|-1) and then count number of occurrences of C in the string D using any standard algorithm like KMP, Z-Algorithm or hashing in O(|C|+|D|) = O(|C|) time.

It seems that among AC solutions there are many different ideas for mentioned above data structure that supports only modification queries at the beginning, so editorial will cover only one possible solution.

Regards,
Anton.

UPD
Finally editorial is ready. It is THE LONGEST editorial ever. Check it out here.

anton_lunyov · April 17, 2013, 3:36am

To all: My answer was considerably updated.

mugurelionut · April 17, 2013, 5:21am

So, just to see if I got things right… Does the official solution work in the ONLINE mode ? (i.e. it can process a new operation as soon as it reads it) I think many of the AC solutions (mine included) only work in the OFFLINE case (in my solution I initially construct a trie of all the query strings from the input file), so an ONLINE solution would be quite interesting.

wjmzbmr · April 17, 2013, 9:04am

@mugurelionut yes.It could be even extended to something like “how many times does string q occur in s after the k-th operation?” in online mode.

wjmzbmr · April 17, 2013, 9:05am

the tutorial is written by anton_lunyov, it will become available soon.

syker · April 17, 2013, 10:59am

read my comment … was getting problem… i think dynamic allocation is the problem

mugurelionut · April 17, 2013, 2:16pm

@wjmzbmr: I see… So you can also keep the history/versions of the data structure then… Well, I can’t wait for the editorial, then. In case you based your solution on some papers/tutorials available online (e.g. like the paper suggested by @cyberax), then some links to them would also be appreciated.

anton_lunyov · April 17, 2013, 2:21pm

I didn’t think on history/versions so don’t expect this in the editorial
And I don’t know about papers/tutorials that contain this approach.

atulbond · April 17, 2013, 3:33pm

@anton_lunyov : i am waiting for your tutorial eagerly.
i tried to solved this problem using simple approach but unfortunately was getting WA.I checked the output with my custom test case set with the AC solutions but mine and their output were same.
here is my code link(used KMP algo to count occurrence ):-
http://www.codechef.com/viewsolution/2024390
It would be great if you can find error in my code or check it against your custom test case set and let me know.
Thanks in Aadvance

mugurelionut · April 17, 2013, 3:40pm

@anton_lunyov: Yes, no problem about history/versions Maybe @wjmzbmr can mention his own ideas (or hints) regarding history/versions after the editorial is published.