Given a string S of length n, is it possible to compute the hash of its substring [i, j] (from index i to index j, inclusive) in O(1) using some form of precomputation? Maybe with a modification of the rolling hash?
Similar Problem
Problem Statement TACHEMIS
I have seen this used in a similar problem, in which the string was given in compressed form. For example, if the string is "aaabccdeeee"
then the compressed form is:
3 a
1 b
2 c
1 d
4 e
How data was stored
The pairs are stored in an array str[] as:
str[] = [{'a','3'}, {'b','1'}, {'c','2'}....]
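The compressed form above can be produced by a simple run-length encoding pass. The following is a minimal sketch, not taken from the original solutions: the function name `compress` is hypothetical, and it stores each count as an `int` rather than as a character.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: run-length encodes s into (character, run length) pairs.
// Assumption: counts are kept as int, unlike the char counts shown above.
std::vector<std::pair<char, int>> compress(const std::string& s) {
    std::vector<std::pair<char, int>> runs;
    for (char c : s) {
        if (!runs.empty() && runs.back().first == c) {
            ++runs.back().second;     // same character: extend the current run
        } else {
            runs.emplace_back(c, 1);  // different character: start a new run
        }
    }
    return runs;
}
```

Calling `compress("aaabccdeeee")` yields five pairs, starting with ('a', 3) and ending with ('e', 4).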
HASHING Concept that was used in the solutions
Programmers used the following hashing idea to check whether a given substring is a palindrome. Given a substring [i, j] of S, they computed the forward hash of the first half, [i, (i+j)/2], and the reverse hash of the second half, [(i+j+2)/2, j], and checked whether the two were equal. For example, to check whether the substring [2, 5] (1-indexed, i.e. "aaba") of S = "daabac"
is a palindrome, they computed the following:
h1 = forward_hash("aa")
h2 = reverse_hash("ba")
h1 == h2
Code for the Hashing Concept
The hash precomputation was done as follows:
/* Computing the Prefix Hash Table (str[] is 1-indexed, K = len(str)) */
pre_hash[0] = 0;
for (int i = 1; i <= K; i++)
{
    /* Each element contributes two "digits": the character, then its count */
    pre_hash[i] = pre_hash[i-1] * very_large_prime + str[i].first;
    pre_hash[i] = pre_hash[i]   * very_large_prime + str[i].second;
}
/* Computing the Suffix Hash Table: the same hash over str[] read back to front */
suff_hash[0] = 0;
for (int i = 1; i <= K; i++)
{
    suff_hash[i] = suff_hash[i-1] * very_large_prime + str[K-i+1].first;
    suff_hash[i] = suff_hash[i]   * very_large_prime + str[K-i+1].second;
}
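The hash was then computed using the following functions:
/* Calculates the forward hash of substring [i,j] (1-indexed, inclusive).
   POW is assumed to return the power with the same wrapping (mod 2^64) arithmetic. */
unsigned long long CalculateHash(int i, int j)
{
    if (i > j)
        return -1; /* empty range; -1 wraps to ULLONG_MAX and acts as a sentinel */
    /* Each element adds two digits, so pre_hash[i-1] is shifted by 2*(j-i+1) powers */
    unsigned long long ret = pre_hash[j] - POW(very_large_prime, 2*(j-i+1)) * pre_hash[i-1];
    return ret;
}
/* Calculates the reverse hash of [i,j]. Here i and j are positions in the reversed
   order used by suff_hash, so an original range [a,b] maps to [K-b+1, K-a+1]. */
unsigned long long CalculateHash_Reverse(int i, int j)
{
    unsigned long long ret = suff_hash[j] - POW(very_large_prime, 2*(j-i+1)) * suff_hash[i-1];
    return ret;
}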
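To see why the subtraction in CalculateHash isolates the range [i, j], it can help to write the prefix hash as a polynomial. In the sketch below, p stands for very_large_prime, and c_1, ..., c_{2K} is an assumed flattening of str[] (c_{2k-1} = str[k].first, c_{2k} = str[k].second). These symbols are introduced here only for illustration, and all arithmetic is taken modulo 2^64, as it is with unsigned long long.

```latex
\begin{aligned}
\text{pre\_hash}[i] &= \sum_{t=1}^{2i} c_t \, p^{\,2i-t} \\
p^{\,2(j-i+1)} \cdot \text{pre\_hash}[i-1] &= \sum_{t=1}^{2(i-1)} c_t \, p^{\,2j-t} \\
\text{pre\_hash}[j] - p^{\,2(j-i+1)} \cdot \text{pre\_hash}[i-1] &= \sum_{t=2i-1}^{2j} c_t \, p^{\,2j-t}
\end{aligned}
```

The last line depends only on elements i through j, so two ranges with the same contents get the same value wherever they sit in the array. The same argument applies to suff_hash over the reversed order.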
What I am trying to do
I am looking for a general approach to the above concept. Given a pattern P, I want to check whether P occurs in a string S. I know the index i at which it may occur, and I know the length of the pattern, |P|. In short, I want to check in O(1) whether the hash of S[i, i+|P|-1] matches the hash of P, using some form of precomputation on S.
This ignores the time taken to compute the hash of P; counting it, the check would be O(1+|P|).
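A minimal sketch of how this could look for a plain (uncompressed) string, assuming 0-indexed positions and arithmetic that wraps modulo 2^64. The names `PrefixHasher`, `fullHash`, `matchesAt` and the base `kBase` are hypothetical and chosen only for illustration. Equal hashes only indicate a probable match, since collisions are possible.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed base; any large odd constant could be used. Arithmetic wraps mod 2^64.
constexpr std::uint64_t kBase = 1000003ULL;

// Hypothetical helper: prefix hashes of s plus powers of kBase, built in O(n).
struct PrefixHasher {
    std::vector<std::uint64_t> pre;  // pre[k] = hash of the first k characters
    std::vector<std::uint64_t> pw;   // pw[k]  = kBase^k

    explicit PrefixHasher(const std::string& s)
        : pre(s.size() + 1, 0), pw(s.size() + 1, 1) {
        for (std::size_t k = 0; k < s.size(); ++k) {
            pre[k + 1] = pre[k] * kBase + static_cast<unsigned char>(s[k]);
            pw[k + 1] = pw[k] * kBase;
        }
    }

    // Hash of s[i..j], 0-indexed and inclusive, in O(1). Requires i <= j < n.
    std::uint64_t hash(std::size_t i, std::size_t j) const {
        return pre[j + 1] - pre[i] * pw[j - i + 1];
    }

    // True if the hash of s[i..i+patternLen-1] equals patternHash.
    bool matchesAt(std::size_t i, std::uint64_t patternHash,
                   std::size_t patternLen) const {
        std::size_t n = pre.size() - 1;
        if (patternLen == 0 || patternLen > n || i > n - patternLen) return false;
        return hash(i, i + patternLen - 1) == patternHash;
    }
};

// Hash of a whole string under the same scheme, in O(|p|).
std::uint64_t fullHash(const std::string& p) {
    std::uint64_t h = 0;
    for (unsigned char c : p) h = h * kBase + c;
    return h;
}
```

For S = "daabac" and P = "aba", building `PrefixHasher hs(S)` once and calling `hs.matchesAt(2, fullHash(P), 3)` reports a match at index 2, while the same call with index 1 does not. Precomputing the powers in `pw` is what keeps each query O(1); calling a generic power function per query would add a logarithmic factor.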