NDIFFPAL - Editorial

PROBLEM LINK:

Contest
Practice

Author: Sergey Kulik
Testers: Vasya Antoniuk and Kevin Atienza
Translators: Vasya Antoniuk (Russian), Team VNOI (Vietnamese) and Hu Zecong (Mandarin)
Editorialist: Kevin Atienza

DIFFICULTY:

Easy

PREREQUISITES:

Strings, palindromes, ad hoc, dynamic programming

PROBLEM:

Given N, output a string consisting of lowercase letters such that there are exactly N palindromic substrings. Two substrings are considered different if they have different starting / ending positions.

QUICK EXPLANATION:

Print a string of length N of the following form: abcabcabcabcabcabc.... This string has exactly N palindromic substrings.

EXPLANATION:

This problem admits many kinds of solutions. We’ll first describe the simplest one.

Simple

Notice that any string with length K has at least K palindromic substrings, because every single-character substring is palindromic! Thus, the answer, if it exists, must have a length of \le N. Unfortunately, most strings of length K actually has strictly more than K palindromic substrings. For example, abababab contains 8 one-character palindromic substrings, but it also has the substrings aba, bab, ababa, etc., appearing multiple times. In fact, if there is a character appearing twice consecutively, then you automatically have more than K palindromic substrings. In the extreme case, in a string consisting of only one letter like aaaaa..., all substrings are palindromic.

But in fact there are ways to construct a string of length K having exactly K palindromic substrings. One of the simplest is a string of the form: abcabcabcabcabcabc.... It’s easy to see why it doesn’t have any palindromic substring of length > 1: any substring of length > 1 has at least one of the following substrings: ab, bc, and ca. But the reversals of these substrings don’t appear anywhere in the string!

Thus, a very, very simple solution arises: Simply print a string of the form abcabcabcabcabcabc... of length N!

Short

The above solution is probably the simplest you can ever get. But what if we are conserving letters, i.e., we want the string to be as short as possible?

In that case, we want our string to actually have lots of palindromic substrings of length > 1. As mentioned above, an example of this is a string consisting of only one character like aaaaa.... If the length is K, then there are 1 + 2 + 3 + \ldots + K = K(K+1)/2 palindromic substrings. Thus, to form a string with N palindromic substrings, we simply find a K such that K(K+1)/2 = N. Unfortunately, not all N s can be represented as K(K+1)/2. In fact, such numbers are rare! Below x, there are only approximately \sqrt{2x} such values.

The idea in this solution is to append such single-character strings together to form one large string. For example, consider the strings aaa...aaabbb...bbb. Suppose there are A as and B bs. It’s easy to see that no palindromic substrings containing an a and a b because if so, the substring ab would appear, but the reversal, ba, doesn’t appear anywhere in the string! Thus, there are A(A+1)/2 + B(B+1)/2 palindromic substrings. Similarly, aaa...aaabbb...bbbccc...ccc has A(A+1)/2 + B(B+1)/2 + C(C+1)/2 substrings, etc…

So our goal is to find a representation of N as a sum of triangle numbers where the sum of the bases is as small as possible. We can achieve this using dynamic programming: Let F(N) be the smallest sum of bases for any representation of N as a sum of triangle numbers. Then a simple recurrence can be derived by enumerating all possibities for the last summand. Specifically, if k(k+1)/2 is the last summand in the representation, then:

F(N) = \min_{\frac{k(k+1)}{2} \le N} \left(k + F\left(N - \frac{k(k+1)}{2}\right)\right)

Our base case here is simply F(0). Along with F(N), we also store the k that minimizes the above, so that we can reconstruct the representation!

The following pseudocode illustrates this:

BOUND = 10011
F[0..BOUND]
K[0..BOUND]

for N = 1..BOUND:
    F[N] = INFINITY
    for k=0 increasing where k*(k+1)/2 <= N:
        cost = k + F[N - k*(k+1)/2]
        if F[N] > cost:
            F[N] = cost
            K[N] = k // store the k that minimizes

Now, to find the representation of some N, we simply use the K array:

def find_representation(N):
    rep = []
    while N > 0:
        k = K[N]
        rep.append(k)
        N -= k*(k+1)/2
    return rep

Finally, we can construct the string itself using the representation with something like this:

alphabet = 'abcdefghijklmnopqrstuvwxyz'
def construct(N):
    rep = find_representation(N)
    s = ''
    for i=0..rep.length-1:
        append alphabet[i] to s rep[i] times
    return s

With this answer, it turns out that you only need a string of at most 161 characters, and at most 5 distinct characters!

Investigating the shortest possible strings

The algorithm above produces some really short strings. Unfortunately, they aren’t the shortest possible. The smallest counterexample is N = 26: The algorithm produces a string of length 10: aaaaaabbcd. But in fact a shorter string is possible: aaaaaabaa.

In fact, it doesn’t seem easy to figure out the shortest possible string. So unfortunately I don’t have the answer for that :frowning: If you wish to investigate it, here are the optimal answers for every N \le 55, along with an optimal string. Perhaps you can find a pattern and then derive a formula for the optimal string.

 N opt string
 1   1 a
 2   2 ab
 3   2 aa
 4   3 aab
 5   4 aabc
 6   3 aaa
 7   4 aaab
 8   5 aaabc
 9   5 aaaba
10   4 aaaa
11   5 aaaab
12   6 aaaabc
13   6 aaaaba
14   7 aaaabac
15   5 aaaaa
16   6 aaaaab
17   7 aaaaabc
18   7 aaaaaba
19   8 aaaaabac
20   8 aaaaabab
21   6 aaaaaa
22   7 aaaaaab
23   8 aaaaaabc
24   8 aaaaaaba
25   9 aaaaaabac
26   9 aaaaaabab
27   9 aaaaaabaa
28   7 aaaaaaa
29   8 aaaaaaab
30   9 aaaaaaabc
31   9 aaaaaaaba
32  10 aaaaaaabac
33  10 aaaaaaabab
34  10 aaaaaaabaa
35  11 aaaaaaabaac
36   8 aaaaaaaa
37   9 aaaaaaaab
38  10 aaaaaaaabc
39  10 aaaaaaaaba
40  11 aaaaaaaabac
41  11 aaaaaaaabab
42  11 aaaaaaaabaa
43  12 aaaaaaaabaac
44  12 aaaaaaaabaab
45   9 aaaaaaaaa
46  10 aaaaaaaaab
47  11 aaaaaaaaabc
48  11 aaaaaaaaaba
49  12 aaaaaaaaabac
50  12 aaaaaaaaabab
51  12 aaaaaaaaabaa
52  13 aaaaaaaaabaac
53  13 aaaaaaaaabaab
54  14 aaaaaaaaabaabc
55  10 aaaaaaaaaa

Time Complexity:

O(N)

AUTHOR’S AND TESTER’S SOLUTIONS:

Setter
Tester

4 Likes

Codechef editorials are just amazing! So much detail.

3 Likes

The first method (abcabc…) is giving time limit exceeded error on using JAVA…

I have used something different. I have simply printed abcdef....xyzabcdef.....xyzabc... upto length N. This string has exactly N palindromic substrings with length 1.

Code- https://www.codechef.com/viewsolution/10325809

But I have some doubt. My first submission was WA, though I did the same thing, just with little bit changes. Can anyone figure out the fault?

WA- https://www.codechef.com/viewsolution/10325602

AC- https://www.codechef.com/viewsolution/10325809

1 Like

Not able to understand the formula for F(n) in second method.

Hey @debjitdj cout<< a prints the characters of the string a untill the NULL character(’\0’) occurs in the string.
So, it is important to insert the NULL character(’\0’) in the end of the string.
So, you have to replace only
char a[26]={‘a’,‘b’,‘c’,‘d’,‘e’,‘f’,‘g’,‘h’,‘i’,‘j’,‘k’,‘l’,‘m’,‘n’,‘o’,‘p’,‘q’,‘r’,‘s’,‘t’,‘u’,‘v’,‘w’,‘x’,‘y’,‘z’};
statement by
char a[27]={‘a’,‘b’,‘c’,‘d’,‘e’,‘f’,‘g’,‘h’,‘i’,‘j’,‘k’,‘l’,‘m’,‘n’,‘o’,‘p’,‘q’,‘r’,‘s’,‘t’,‘u’,‘v’,‘w’,‘x’,‘y’,‘z’,’\0’};
OR
char a[]=“abcdefghijklmnopqrstuvwxyz”;
statement to correct your code.
Note that,in the latter statement , ‘\0’ is not necessary, it is inserted automatically.

Have you tested the codes? In both the cases the output remains same (As expected). But I don’t know why it is giving WA in codechef submission.

WA code- https://ideone.com/5tWhzH

AC code- https://ideone.com/RoGIe7

@debjitdj In Codechef Online IDE, both are giving different outputs.

Ooohhh!!..wow!! :stuck_out_tongue: Eurekaaaaaa!!! So, ideone and codechef online ide are giving different outputs!

1 Like

because of null at the last of character array you can see this on codechef code, compile and run

just give n=30 and see the output

I prefer ideone.

WA code- https://ideone.com/5tWhzH

AC code- https://ideone.com/RoGIe7

can you explain now? :smiley:

I describe my approach on my blog too: http://passionsprimitive.blogspot.co.uk/2016/07/codechef-ndiffpal.html

I also instinctively went in for the triangular numbers, without considering the base-case of just using the n-length string of n unique characters (and repeating after exhausting the set).

Here goes:

As a first observation, if we have all characters of a string identical, e.g. “aaa”, we will see that there are 6 different index combo’s ==> (1,1) (1,2) (1,3) (2,2) (2,3) (3,3). After some scribling around for relationship of string length and number of index combo’s, this reduces easily to Triangular numbers.

Now, if we think of all the input to the puzzle that are not triangular numbers, if we reduce that number to a sum of triangular numbers that have appeared before it. e.g. 13 = 6 + 6 + 1 => so, one can represent this as “aaabbbc”. The largest input possible is 10000 => closest floor triangular number to this is 9870 – so the 140 index triangular number.

I think might as well generate all the triangular numbers till we hit the largest input and then decomposition, finally link that to printing unique strings. On second thought that it very inefficient since most of those will not be needed.

What we really need is a way to find the largest triangular number smaller than a given number and then note that and reduce the running number by this triangular number.

An even better hack is to hard-code the 140 triangular numbers and binary search the array. This would be super-quick.

Take-aways:

There were much easier solutions acceptable since this is a very ‘loose’ problem. e.g. just repeating the character set and printing the number of character as the input would suffice as a decent base case.
Figurate Number are an interesting field to explore further – this will probably re-appear a lot in problem solving!
Hard-coding can be king for time limits
ASCII to int and vice-versa

Here is the solution: https://github.com/jubinchheda/codechef/raw/master/ndiffpal.py

@arch_abs this method is working fine in c++.I got an AC on first go.

For Java users, this is working fine-

import java.io.*;
class NDIFFPAL
{
 public static void main(String args[]) throws IOException
 {
     BufferedReader br=new BufferedReader
     (new InputStreamReader(System.in));
     
     int T,N,c,i,j;
     String word,s="abcdefghijklmnopqrstuvwxyz";
     
     System.out.println();
     T=Integer.parseInt(br.readLine());
     
     for(i=0;i<T;i++)
     {
         N=Integer.parseInt(br.readLine());
         word="";
         c=N/26;
         for(j=0;j<c;j++)
         {
             word+=s;
         }
         c=N%26;
         word+=s.substring(0,c);
         System.out.println(word);
      }
    }
}
1 Like