PROBLEM LINK:
Author: Sergey Kulik
Testers: Vasya Antoniuk and Kevin Atienza
Translators: Vasya Antoniuk (Russian), Team VNOI (Vietnamese) and Hu Zecong (Mandarin)
Editorialist: Kevin Atienza
DIFFICULTY:
Easy
PREREQUISITES:
Strings, palindromes, ad hoc, dynamic programming
PROBLEM:
Given N, output a string consisting of lowercase letters such that there are exactly N palindromic substrings. Two substrings are considered different if they have different starting / ending positions.
QUICK EXPLANATION:
Print a string of length N of the following form: abcabcabcabcabcabc...
. This string has exactly N palindromic substrings.
EXPLANATION:
This problem admits many kinds of solutions. We’ll first describe the simplest one.
Simple
Notice that any string with length K has at least K palindromic substrings, because every single-character substring is palindromic! Thus, the answer, if it exists, must have a length of \le N. Unfortunately, most strings of length K actually has strictly more than K palindromic substrings. For example, abababab
contains 8 one-character palindromic substrings, but it also has the substrings aba
, bab
, ababa
, etc., appearing multiple times. In fact, if there is a character appearing twice consecutively, then you automatically have more than K palindromic substrings. In the extreme case, in a string consisting of only one letter like aaaaa...
, all substrings are palindromic.
But in fact there are ways to construct a string of length K having exactly K palindromic substrings. One of the simplest is a string of the form: abcabcabcabcabcabc...
. It’s easy to see why it doesn’t have any palindromic substring of length > 1: any substring of length > 1 has at least one of the following substrings: ab
, bc
, and ca
. But the reversals of these substrings don’t appear anywhere in the string!
Thus, a very, very simple solution arises: Simply print a string of the form abcabcabcabcabcabc...
of length N!
Short
The above solution is probably the simplest you can ever get. But what if we are conserving letters, i.e., we want the string to be as short as possible?
In that case, we want our string to actually have lots of palindromic substrings of length > 1. As mentioned above, an example of this is a string consisting of only one character like aaaaa...
. If the length is K, then there are 1 + 2 + 3 + \ldots + K = K(K+1)/2 palindromic substrings. Thus, to form a string with N palindromic substrings, we simply find a K such that K(K+1)/2 = N. Unfortunately, not all N s can be represented as K(K+1)/2. In fact, such numbers are rare! Below x, there are only approximately \sqrt{2x} such values.
The idea in this solution is to append such single-character strings together to form one large string. For example, consider the strings aaa...aaabbb...bbb
. Suppose there are A a
s and B b
s. It’s easy to see that no palindromic substrings containing an a
and a b
because if so, the substring ab
would appear, but the reversal, ba
, doesn’t appear anywhere in the string! Thus, there are A(A+1)/2 + B(B+1)/2 palindromic substrings. Similarly, aaa...aaabbb...bbbccc...ccc
has A(A+1)/2 + B(B+1)/2 + C(C+1)/2 substrings, etc…
So our goal is to find a representation of N as a sum of triangle numbers where the sum of the bases is as small as possible. We can achieve this using dynamic programming: Let F(N) be the smallest sum of bases for any representation of N as a sum of triangle numbers. Then a simple recurrence can be derived by enumerating all possibities for the last summand. Specifically, if k(k+1)/2 is the last summand in the representation, then:
Our base case here is simply F(0). Along with F(N), we also store the k that minimizes the above, so that we can reconstruct the representation!
The following pseudocode illustrates this:
BOUND = 10011
F[0..BOUND]
K[0..BOUND]
for N = 1..BOUND:
F[N] = INFINITY
for k=0 increasing where k*(k+1)/2 <= N:
cost = k + F[N - k*(k+1)/2]
if F[N] > cost:
F[N] = cost
K[N] = k // store the k that minimizes
Now, to find the representation of some N, we simply use the K
array:
def find_representation(N):
rep = []
while N > 0:
k = K[N]
rep.append(k)
N -= k*(k+1)/2
return rep
Finally, we can construct the string itself using the representation with something like this:
alphabet = 'abcdefghijklmnopqrstuvwxyz'
def construct(N):
rep = find_representation(N)
s = ''
for i=0..rep.length-1:
append alphabet[i] to s rep[i] times
return s
With this answer, it turns out that you only need a string of at most 161 characters, and at most 5 distinct characters!
Investigating the shortest possible strings
The algorithm above produces some really short strings. Unfortunately, they aren’t the shortest possible. The smallest counterexample is N = 26: The algorithm produces a string of length 10: aaaaaabbcd
. But in fact a shorter string is possible: aaaaaabaa
.
In fact, it doesn’t seem easy to figure out the shortest possible string. So unfortunately I don’t have the answer for that If you wish to investigate it, here are the optimal answers for every N \le 55, along with an optimal string. Perhaps you can find a pattern and then derive a formula for the optimal string.
N opt string
1 1 a
2 2 ab
3 2 aa
4 3 aab
5 4 aabc
6 3 aaa
7 4 aaab
8 5 aaabc
9 5 aaaba
10 4 aaaa
11 5 aaaab
12 6 aaaabc
13 6 aaaaba
14 7 aaaabac
15 5 aaaaa
16 6 aaaaab
17 7 aaaaabc
18 7 aaaaaba
19 8 aaaaabac
20 8 aaaaabab
21 6 aaaaaa
22 7 aaaaaab
23 8 aaaaaabc
24 8 aaaaaaba
25 9 aaaaaabac
26 9 aaaaaabab
27 9 aaaaaabaa
28 7 aaaaaaa
29 8 aaaaaaab
30 9 aaaaaaabc
31 9 aaaaaaaba
32 10 aaaaaaabac
33 10 aaaaaaabab
34 10 aaaaaaabaa
35 11 aaaaaaabaac
36 8 aaaaaaaa
37 9 aaaaaaaab
38 10 aaaaaaaabc
39 10 aaaaaaaaba
40 11 aaaaaaaabac
41 11 aaaaaaaabab
42 11 aaaaaaaabaa
43 12 aaaaaaaabaac
44 12 aaaaaaaabaab
45 9 aaaaaaaaa
46 10 aaaaaaaaab
47 11 aaaaaaaaabc
48 11 aaaaaaaaaba
49 12 aaaaaaaaabac
50 12 aaaaaaaaabab
51 12 aaaaaaaaabaa
52 13 aaaaaaaaabaac
53 13 aaaaaaaaabaab
54 14 aaaaaaaaabaabc
55 10 aaaaaaaaaa