I was solving this question https://www.hackerrank.com/contests/quantium/challenges/k-mismatch but sometimes it gives me wrong answer. What is wrong with my approach?
I used suffix array and lcp array.
We know that any prefix of a suffix is a substring of the string itself, so after finding the suffix array and the lcp array I did the following:
I choose two suffixes say i & j, suppose suffix i smaller than j.
Their lcp(i, j) will tell me the number of chars in common until I hit a mismatch or the suffix ends.
if the suffix ends then we r done the number of accepted pairs is the lcp itself.
otherwise if the mismatch doesn’t exceed k then we add lcp(i, j) + 1 to the result, then I start again with new suffixes.
I’ll give an example to make it more clear:
suffix i: abcdef
suffix j: abceeeefff
suppose k is 1
lcp(i, j) = 3 & the first mismatch is d & e. this mismatch doesn’t exceed k so res += (3 + 1) then new suffixes are:
suffix i: ef
suffix j: eeefff
Notice how I removed the lcp and the mismatch.
now lcp(i, j) = 1 & mismatch is f & e but the mismatch will now exceed k so we just count lcp thus res += 1 which yields res = 5
Here is a function that finds solution for i & j;
LL solve(int i, int j, int n, int k) {
int x = n - pos[i]; // Length of suffix i
int y = n - pos[j]; // Length of suffix j
// Put shorter suffix in x and longer in y
if(x > y) swap(x, y), swap(i, j);
LL ret = 0, m = 0;
while(x > 0) {
ret += lcp1[i][j];
// if all suffix i is a match break otherwise we have a mismatch
if(x == lcp1[i][j]) break;
// if this mismatch excceeds k then break otherwise count it
if((m + 1) > k) break;
m++;
// remove the matched chars and mismatched char
x -= (lcp1[i][j] + 1);
y -= (lcp1[i][j] + 1);
// update the position of the new suffixes in the suffix array.
// rank returns the position of suffix starting at position pos[i] + lcp1[i][j] + 1 in //the suffix array
i = rank[pos[i] + lcp1[i][j] + 1];
j = rank[pos[j] + lcp1[i][j] + 1];
}
return ret + m; // return matches + mismatches
}