My solution for case 2 (large but positive): you’d always want to pick your range as the entire dishes. so, we only need to find out which dishes have c in [c-k, c+k] and sum up their d (we need to do this in log(n)).
to find the sum of dishes in range [c-k, c+k], we find the sum of dishes with c <= c+k and subtract the sum of dishes with c <= c-k-1.
so, we just need to find the sum of d for dishes with c less than or equal to certain C. To do that, we keep all dishes in a binary search tree (sorted by their c) and at each node, we keep the sum of d for all dishes in that subtree. any time you add a dish, as you go down from root to find the place to put this new node, add its d to all the nodes you visit. then you can also query the sum of d for dishes with c less than or equal to a given C in O(log(n)) time. (BTW, I got lucky and I got this problem without having to balance the BST. perhaps they could have included a sorted case where my solution’s query costs O(n))
Choose a block size of about sqrt(Q). Call it B.
When B dishes have been added to the start or B dishes have been added to the end, then compute some partial answers as follows for this block of size B.
For each possible color c, find the deliciousness of each of the following:
the most delicious sequence of dishes (with colors in range [c-K:c+K]) which includes the first dish in the block
the most delicious sequence of dishes which includes the last dish in the block
the most delicious sequence of dishes in the block
the sum of the deliciousness of all dishes in the block
These answers are in groups. A range of adjacent colors will have the same answers. There are at most 2B+1 possible ranges, so set up an array of the break points between the ranges, and a second array of the partial answers. It takes O(B^2) time to compute these answers for a block of B dishes.
To answer a type “3” query, loop over upto B-1 dishes at the start, then up to B blocks for which we have just calculated the partial answers, and up to B-1 dishes at the end. Finding the partial answer in one of the B blocks takes log(B) steps (use C++ upper_bound function, or similar), and combining the partial answers takes O(B) steps. So total time for one type 3 answer is O(B log(B))
@john_smith_3’s answer seems very helpful. This looks like some dynamic sqrt decomposition, which I haven’t encountered before. Learnt a new concept today :). Thanks for sharing!