PROBLEM LINK:
Setter- Hasan Jaddouh
Tester- Jakub Safin
Editorialist- Abhishek Pandey
DIFFICULTY:
EASY
PRE-REQUISITES:
Hashing (or precisely, Identifying Palindromic Sub-string using Hashing)
PROBLEM:
Given a string S , tell how many k-shifts of the string are palindromic. k-shift of a string is, the string obtained on shifting every character right by k positions (circularly).
QUICK EXPLANATION:
We know that we can obtain all cyclic shifts of an array by appending an array to itself and moving the N-sized window over it (to get different cyclic shifts). We use this to avoid having to cyclically shift the string again and again. After that, we use the concept of hashing to check if the strings are palindrome or not.
EXPLANATION:
This editorial will have two parts. The first one will describe the cyclic-shift trick, and the other will describe the hashing part. It is expected that you have gone through the pre-requisites if concept of hashing is new to you. The basics of identifying palindrome string through hashing is given in the link under pre-requisites. Hence, the editorial will not explicitly describe on how to check that.
1.Cyclically Shifting Trick-
Whenever we are given the task to cyclically shift an array, or a string, we can use this trick to avoid having to actually cyclically shift the entire array again and again.
This trick says that, append the string/array to itself. Eg, if S=abc then S'=S+S=abcabc. Now, consider the string from range i to i+N to obtain the i'th cyclic shift of the string.
Eg-
Starting from i=1 to N (1-based indexing) will give you left cyclic shift of string, while decrementing from i=N to i=1 will give you right cyclic shifts of string. Now, I have one question to ask.
Why does this work? What is the foundation principle of this trick?
(Answer will be in Chef Vijjuās corner)
2. Hashing the String-
You can refer to the setterās code here for the concept for now. Please note that the hash function and operations in it can vary from person to person. We will be discussing w.r.t. setterās solution first.
Now, we know the cyclic-shift trick. Using that, we can reduce the problem to "Find how many substrings (contiguous) of length N are palindrome in the string S'(=S+S)"
If youāve read the pre-requisite link, you would have got how we used the hash function of-
h(x)= \sum_{i=0}^{i=n-1} ({101}^{i}*S_i)\%mod where S_i= iāth character of the string.
Setter has used a similar hashing function of-
H(X)= \sum_{i=0}^{i=n-1} ({31}^{i}*(S_i-`a'))\%mod
The steps to check for palindrome are similar. At each step, he checks for a sub-string of length N, modifies the hash function according to what characters are present in the N length window. This step involves, w.r.t. variable hsh in his solution, removal of term with lowest power of {31}^{i} and adding the term {31}^{i}*(S_i-`a') to the function. (If you cannot make sense of this addition removal of lowest/highest power term, please refer again to link in pre-requisites). Can you try working out for hsh2 of setterās code? Answer will be in the tab below-
Click to view
Donāt get intimidated by the hsh2*=p[2]
thing, its nothing but a correction factor. What happens at hsh2 is completely opposite of what happens at hsh , i.e. the highest power (of p) term gets cut-off and the immediate lowest available term is added. Eg- If N=5 ,then we have calculated hashes for terms in range i=0 to i=4. We multiply it by {31}^{2} (making the terms with us having powers between 2 to 6) and remove the term with highest power of {31} (i.e. {31}^{6}). We then add the term with {31}^{1} and check for equality.
Now, what does this computation on hashing actually mean? At each step, hsh is modified to represent the string shifted left (i+1) times, while hsh2 represents the reverse of string S being cyclically shifted right (i+1) times. Compare them in the table below-
Can you actually tell the significance of {31}^{i} in the hash function? As in, why should we take any constant, say p, and make hash function of type H(X)=\sum_{i=0}^{i=n} {p}^{i}*S_i. Why multiply with {p}^{i} ?
The tester uses a similar concept, except that he tries to make his hash function even less error prone by introducing another prime number for \% operation and making hash as H(X)=H_1*p+H_2 where $H_1$and H_2 and calculated similarly as described above. The tester explicitly uses our cyclic-shift trick. His H represents the normal string S, which H_i represents the reverse of string S. If hashes of these two are equal, then that means that the N length sub-string, and its reverse are equal, which means that the string is palindrome. Hashes are updated in a manner, which is similar to āSliding windowā, i.e. add contribution of new element, remove contribution of the oldest element (element to be removed).
A more formal description of testerās H and H_i function will be-
H= \sum_{i=0}^{i=2N-1} {p}^ {i}*(S_i-`a ' + 1) for both H_1 and H_2 and
H_i = \sum_{i=2N-1}^{i=0} {p}^{i}*(S_i-`a'+1), again, for both H_{i1} and H_{i2}
In a gist, we ārepresentā the strings by hash function. If we find that the hash function of the string, and the reverse is equal, it means that the string is palindrome. The only thing to take care of is, your hash function must be good, and not have a high probability of error. Usually its acceptable if its probability of failing is \le 0.001 \%
SOLUTION:
For immediate availability of setter and testerās solution, they are also pasted in the tabs below. This is for your reference, and you can copy code from there to wherever you are comfortable reading them.
Click to view
#include <iostream>
#include <algorithm>
#include <string>
#include <assert.h>
using namespace std;
long long readInt(long long l,long long r,char endd){
long long x=0;
int cnt=0;
int fi=-1;
bool is_neg=false;
while(true){
char g=getchar();
if(g=='-'){
assert(fi==-1);
is_neg=true;
continue;
}
if('0'<=g && g<='9'){
x*=10;
x+=g-'0';
if(cnt==0){
fi=g-'0';
}
cnt++;
assert(fi!=0 || cnt==1);
assert(fi!=0 || is_neg==false);
assert(!(cnt>19 || ( cnt==19 && fi>1) ));
} else if(g==endd){
assert(cnt>0);
if(is_neg){
x= -x;
}
assert(l<=x && x<=r);
return x;
} else {
assert(false);
}
}
}
string readString(int l,int r,char endd){
string ret="";
int cnt=0;
while(true){
char g=getchar();
assert(g!=-1);
if(g==endd){
break;
}
cnt++;
ret+=g;
}
assert(l<=cnt && cnt<=r);
return ret;
}
long long readIntSp(long long l,long long r){
return readInt(l,r,' ');
}
long long readIntLn(long long l,long long r){
return readInt(l,r,'\n');
}
string readStringLn(int l,int r){
return readString(l,r,'\n');
}
string readStringSp(int l,int r){
return readString(l,r,' ');
}
int T;
int n;
int sm_n=0;
string s,s2;
int b=31;
long long mod=1000000007;
long long p2[400400];
/*long long hsh[400400],hsh2[400400];
void hash_init(){
for(int i=1;i<=n;i++){
hsh[i] = (hsh[i-1] + (s[i] - 'a') * p2[i]) % mod;
hsh2[i] = (hsh2[i-1] + (s2[i] - 'a') * p2[i]) % mod;
}
}
long long get_hsh(int l,int r){
return ((((hsh[r] - hsh[l-1]) * p2[n-l])% mod)+mod)%mod;
}
long long get_hsh2(int l,int r){
return ((((hsh2[r] - hsh2[l-1]) * p2[n-l])% mod)+mod)%mod;
}
long long is_pal(int l,int r){
return get_hsh(l,r) == get_hsh2(n-r+1,n-l+1);
}*/
int main(){
//freopen("05.txt","r",stdin);
//freopen("05o.txt","w",stdout);
p2[0]=1;
for(int i=1;i<400400;i++){
p2[i]= (p2[i-1] * b)%mod;
}
T=readIntLn(1,1000);
while(T--){
s=readStringLn(1,200000);
n= s.length();
sm_n += n;
for(int i=0;i<n;i++){
assert('a'<=s[i] && s[i]<='z');
}
s2=s;
reverse(s2.begin(),s2.end());
int sol=0;
long long hsh=0,hsh2=0;
for(int i=0;i<n;i++){
hsh += p2[i] * (s[i] - 'a');
hsh %= mod;
hsh2 += p2[i] *(s2[i] - 'a');
hsh2 %= mod;
}
for(int i=0;i<n;i++){
hsh -= p2[i] * ( s[i] - 'a');
hsh += p2[i+n] * ( s[i] - 'a');
hsh %= mod;
if(hsh < 0 ) hsh += mod;
hsh2 *= p2[2];
hsh2 %= mod;
hsh2 -= p2[n+1+i] * (s2[n-1-i] - 'a');
hsh2 += p2[i+1] * (s2[n-1-i] - 'a');
hsh2 %= mod;
if(hsh2 < 0 ) hsh2 += mod;
if(hsh == hsh2)sol++;
}
cout<<sol<<endl;
}
assert(getchar()==-1);
}
Click to view
#include <bits/stdc++.h>
// iostream is too mainstream
#include <cstdio>
// bitch please
#include <iostream>
#include <algorithm>
#include <cstdlib>
#include <vector>
#include <set>
#include <map>
#include <queue>
#include <stack>
#include <list>
#include <cmath>
#include <iomanip>
#include <time.h>
#define dibs reserve
#define OVER9000 1234567890
#define ALL_THE(CAKE,LIE) for(auto LIE =CAKE.begin(); LIE != CAKE.end(); LIE++)
#define tisic 47
#define soclose 1e-8
#define chocolate win
// so much chocolate
#define patkan 9
#define ff first
#define ss second
#define abs(x) (((x) < 0)?-(x):(x))
#define uint unsigned int
#define dbl long double
#define pi 3.14159265358979323846
using namespace std;
// mylittledoge
using cat = long long;
#ifdef DONLINE_JUDGE
// palindromic tree is better than splay tree!
#define lld I64d
#endif
const cat mod[2] = {1'000'000'007, 1'000'000'009};
const cat p = 999983;
int main() {
cin.sync_with_stdio(0);
cin.tie(0);
cout << fixed << setprecision(10);
int T;
cin >> T;
for(int t = 0; t < T; t++) {
string S;
cin >> S;
int N = S.length();
S = S + S;
vector<cat> H[2], Hi[2];
vector<cat> pw[2];
for(int k = 0; k < 2; k++) {
H[k].resize(2*N+1, 0);
Hi[k].resize(2*N+1, 0);
for(int i = 0; i < 2*N; i++)
H[k][i+1] = (H[k][i] * p + S[i]-'a'+1) % mod[k];
for(int i = 2*N-1; i >= 0; i--)
Hi[k][i] = (Hi[k][i+1] * p + S[i]-'a'+1) % mod[k];
pw[k].resize(2*N+1, 1);
for(int i = 1; i <= 2*N; i++) pw[k][i] = (pw[k][i-1] * p) % mod[k];
}
int ans = 0;
for(int i = 0; i < N; i++) {
cat hash_direct[2], hash_inverse[2];
for(int k = 0; k < 2; k++) {
hash_direct[k] = (H[k][i+N] - pw[k][N] * H[k][i]) % mod[k];
if(hash_direct[k] < 0) hash_direct[k] += mod[k];
hash_inverse[k] = (Hi[k][i] - pw[k][N] * Hi[k][i+N]) % mod[k];
if(hash_inverse[k] < 0) hash_inverse[k] += mod[k];
}
cat hd = hash_direct[0] * mod[1] + hash_direct[1];
cat hi = hash_inverse[0] * mod[1] + hash_inverse[1];
if(hd == hi) ans++;
}
cout << ans << "\n";
}
return 0;}
// look at my code
// my code is amazing
Time Complexity= O(N) for setter and testerās solution.
CHEF VIJJUāS CORNER
1. Foundation of cyclic-shift principle-
Click to view
This principle works mainly because the shift is cyclic, i.e. the last character warps to first characterās place for right shift and vice-versa for left shift. Appending the array takes care of this beautifully.
Talking with respect to left shift - Say we left shifted the string by 1 character. This means,we have to look at characters from i=1 to N (0-based indexing) in our String S'(=S+S). The character at index 0, is already at index N, the sub-string starts from required character, and relative order of every other character is preserved, so we need not care about them
For now, I am tending to leave the formal proof for you guys to google out. Ping me if its not found.
2. Significance of {p}^{i} in hash function.
Click to view
This {p}^{i} assigns an importance/a weight to position where a character occurs. If we simply use \sum_{i=0}^{i=2N-1}(S_i- ` a '+1), then it will be equal to hash of all strings where āsame characters occur, irrsepective of orderā. Meaning, for this has function, string āaabā and ābaaā will be āsameā.
Recall the concept of Hash functions- If the hash function is ideal, then hashes of 2 strings will be equal if and only if both the strings are equal.
3. A view on Hash functions-
Click to view
As we all know, hashing is never 100\% accurate. Most of the hash functions have a set of input where they may fail. Can you briefly analyze the setter and testerās hash function? Can we make a test case against it? Why or Why not?
Hashing says that any hash function can ādo the jobā, but some hash functions can do the job much better than others. It takes a little amount of experimentation (and risk-taking!) to decide what is the perfect hash function for your input data. While hashing may fail at some input out of infinite inputs in the domain, it is possible to get a hash function completely accurate over the set of limited input (Eg- limited test cases). Theres only one thing to watch out forā¦if the setter hates hashing and has some anti-hash test cases against your popularly-used hash function! Hence, its often good to grasp the concept and make one of your own hash function rather than using the popular ones each time.
4. Regarding any doubts which might arise after reading editorial
Click to view
I am well aware that there are/will-still-be a few doubts related to hashing right now, especially if you are new. Iād say, take some time, try to grasp the pre-requisite link first, and then re-read the editorial. And if that doesnt answer, immediately ask so we can address it. All such discussions will be added and compiled in editorial to help enhance it, and to help people with similar doubts
One thing Iād address is, the aim of the editorial is that after reading it, at least one of the approaches (of tester, or of setter) is clear to you. The mathematics behind the hash function may take some time, but your aim should be in seeing how you can make a hash function yourself than copying theirs. Testerās hash function is easier to understand, while there are some references available for concept of setterās hash function. If you are able to get the physical meaning, and what they are doing/intending to do, thats good enough for first few tries. The mathematics to understand what does what, how to manipulate the hash function &etc. comes with practice, so dont panic if thats not immediately clear after reading the editorial.
5. We did have a look at hashing solution of this problem. Does a solution without hashing exist? Its your task to find out! Check out the list of successful submissions . (I will update this section if there I am able to find a good solution without hashing).
6. An alternate perspective for understanding hash function.
Click to view
You can correlate hash-function with polynomials-
H(X)= \sum_{i=0}^{i=2n-1} {P}^{i}*{X_i}^{i+1}
At each iteration, starting from i=0, we take N continuous terms of this polynomial. X_i is nothing but the āvalue we assigned to the character ch (eg- ch-āaā+1 etc.)ā. We know how quickly {P}^{i} and {X}^{i+1} will grow. This means, it emphasizes/strengthens the fact that, two hashes can be equal if and only if the strings are exactly same.
Think a bit, say I have 2 strings S and S' of length 100. Say, both are same, except that S' = swapping positions of any two characters in S. What is the maximum and minimum hash which we can obtain? Can it collide with hash of S?
7. Testerās Notes-
Click to view
The solution for SHIFTPAL is just āduplicate the string, then check for each substring of length N is a palindromeā; checking if a substring is a palindrome is trivial with hashing.
8. Some related problems on hashing-
- Lucy and the Flowers
- Chef and Queries
- SEAPERM2
- Am I Fibonacci - A cakewalk level problem of hashing. Recommended to solve if you have no prior experience and want to get addicted to hashing