Given two strings, say A and B, find whether A is a sub-sequence of B, or whether B is a sub-sequence of A.
A sub-sequence is defined as a string obtained from another string, say S, by deleting one or more characters form S, and not changing the order of the remaining characters.
If the length of A is more than the length of B, then A cannot be a sub-sequence of B. This is obvious because you cannot delete characters from B and end up with a string that has more characters than it did orginally.
Thus, if length of A is larger than length of B we can swap them. Now, it only needs to be checked whether A is a sub-sequence of B.
Checking whether A is a sub-sequence of B can be done greedily.
Let us find the first occurence of the first character of A in B.
for i = 1 to B.length if B[i] == A break i++
If we find that i is larger than B.length, then of course the very first character of A doesn’t exist in B. This would mean that it is impossible for A to be a sub-sequence of B.
On the other hand, we have found that the the first character of A occurs in B at position i, first. Now, we can start looking for the second character of A. But, any occurance of the second character of A that occurs before i in B is irrelevant because we cannot perform any operation that changes the order of characters in A (or B for that matter).
Thus, we can resume searching for the second character of A in B, after position i.
for j = i+1 to B.length if B[j] == A break j++
Using the same arguments as above, if j is not more than B.length, we have to resume searching for the third character of A in B, after position j.
When we have found all the characters of A in B, we can safely end the algorithm as well (with a positive). Otherwise we will run out of characters in B and we must return with a negative.
The above algorithm will look like the following pseudo code.
j = 1 for i = 1 to A.length while j < B.length if B[j] == A[i] break j++ if j > B.length return false i++ j++ return true
The complexity of the algorithm is O(|A| + |B|), where |S| is the length of S. If it is not obvious to you why the algorithm isn’t O(|A| * |B|) note that we never decrement the value of j. In every iteration of the above algorithm we always increment i as well as j, and probably increment j more so. Thus, the algorithm must terminate in at most O(|A|) iterations of the outer loop and not more than O(|B|) iterations of the inner loop.
Note how this problem differs from the standard Dynamic Programming problem of finding the largest common sub-sequence between two strings. We could of course solve this problem by finding the longest commong sub-sequence between A and B as well, but doing so requires O(|A| * |B|) which is too slow for the limits set for length of the strings A and B.
Setter’s solution will be updated soon.
Can be found here.