LTM40EF - Editorial

pkacprzak · September 24, 2016, 5:35pm

PROBLEM LINK:

Practice
Contest

Author: Misha Chorniy
Tester: Sergey Kulik
Editorialist: Pawel Kacprzak

DIFFICULTY:

MEDIUM

PREREQUISITES:

Meet in the Middle, sorting, binary search

PROBLEM:

The task is to return the K^th smallest word among all words of length M consisting of the first N letters of English alphabet. The rank of a word S is denoted by cost(S) = P_{1, S₁} + P_{2, S₂} + … + P_{M, S_M}, where P is integer matrix of size M×N and S_i is the rank of the i^th letter of S in English alphabet. If two words S and W have equal scores, then S is consider smaller than W if and only if S is lexicographically smaller than W.

QUICK EXPLANATION:

For the full credit, use technique called Meet in the Middle. Split the resulting string in two halves. First, generate all possible strings corresponding to the first half and order them by their scores. Next, do the same for all possible strings corresponding to the second half of the resulting string. Use binary search and these two generated sets in order to compute exact score X of the K^th string. Finally, use strings generated for the halves in order to get the wanted string among all strings with score X. For easier subtasks, more explicit solutions can be used.

EXPLANATION:

Subtask 1

In the first subtask we have K ≤ 10⁴, so the problem is significantly easier to approach. There are perhaps several promising solutions here, one of them may be to generate the set of K smallest words explicitly. It can be done for example by starting with a set only consisting of the word with the smallest score, which can be computed by taking the smallest entries in P matrix and try to extend the set by adding new words by exchanging single letters in words already in the set to the letters with greater cost at corresponding positions. We continue this process until there are at least K words in the set. Then the answer can be computed by just selecting the K^th of them in the sorted order.

Subtask 2

In the second subtask we know that M is at most 5. It follows that there are at most 16⁵ = 1048576 possible strings. This is small enough number to generate all of them. After that, the only remaining thing is to return the K^th of them. This can be done using method called QuickSelect in expected linear time, or by explicitly sorting all the strings, which should also pass the test cases if the whole solution is implemented efficiently.

Subtask 3

In the third subtask any method based on explicitly enumeration of all the strings will result in Time Limit Exceeded, because there are up to 16>sup>10 > 10¹² possible strings. However, a technique called Meet in the Middle can be used there. The general idea is to solve two problems of sizes of of the original problem and combine these results into the final result.

The first step is to divide the resulting string into two halves H₁ and H₂. First corresponding to the first ⌊M/2⌋ letters and the second corresponding to the last ⌈M/2⌉ letters in it. For example, if M = 5 and N = 2 as in the sample, then H₁ corresponds to the first 2 letters of the result and H₂ corresponds to its last 3 letters.

Next step is to generate explicitly all possible string falling into each of the halves along with their scores. Since any of the halves has size of at most 5, there are at most 16⁵ valid strings that can be generated for a single half. For each of the halves we want the strings corresponding to them to be sorted by their scores first and then by lexicographically in case of equal scores.

After than, the goal is to combine these subsequent results into the final result. However, it may be not obvious how to do that even if one is familiar with Meet in the Middle technique. The idea here is to first compute the exact score of the K^th string. This can be done using binary search for the smallest value X, such that there are at least K strings with score less or equal to X. The number of such strings for a fixed value of X can be computed using two-pointer technique over sorted results computed for the halves - this is the reason we want them to be sorted. When the smallest value of X is computed, we know that there are at least K strings with score less or equal to X, and that there are less than K strings with score less than X. It means that the resulting string has the score X and moreover, we know that its rank among all strings with scores of X equals K minus the number of strings with score less that X, which we already know how compute.

The problem now is reduced to finding the K^th string among all strings with score X (note that K is now different than the K in the input and denotes the rank among strings with score X, not among all of them). This task is significantly easier than the original one. There are a few possible approaches to solve it, one of them is to first sort all the strings in H₁ lexicographically. Then, process them from starting from the smallest one in that order. Let’s say that it has a score of Y. It follows that it can be paired with any string in H₂ with score X - Y to be combined for a string of total score of X. Thus, if there are less or equal than K strings with score X - Y in H₂, we combine the string from the second half with the K^th of them to get the result and we are done. Otherwise, we reduce K by the number of strings in H₂ with score X - Y and proceed to the next string in lexicographical order in H₁. Notice that both the number of strings with a given score in H₂ and the string with any rank among them can be computed using binary search. At some point, the result will be formed and the problem is solved. The total time complexity of this method is dominated by sorting phases and if we assume that comparing scores of two strings with respect to their numerical value and their lexicographical relative order takes constant time, then time complexity can be bounded by O(N^M/2 * log(N^M/2)).

AUTHOR’S AND TESTER’S SOLUTIONS:

Setter’s solution can be found here.
Tester’s solution can be found here.