PROBLEM LINK
Practice http://www.codechef.com/INLO2015/problems/MOUSCH01
Contest http://www.codechef.com/INLO2015
Editorialist: Aviral Gupta http://www.codechef.com/users/aviral1310
DIFFICULTY:
Intermediate levels like SUMPLE-EASY, EASY-MEDIUM, MEDIUM-HARD are also possible.PREREQUISITES:
Dynamic Programming.
PROBLEM
To find out the minimum number of moves required transform a given word into another
given word when the following operations are allowed:
? Insert a character at any position
? Remove an existing character
? Modify an existing character
? Swap an existing character
QUICK EXPLANATION
Presented here are two algorithms: the first, simpler one, computes what is known as theoptimal string alignment (sometimes called the restricted edit distance), while the second one
computes the Damerau–Levenshtein distance with adjacent transpositions. Adding
transpositions adds significant complexity. The difference between the two algorithms
consists in that the optimal string alignment algorithm computes the number of edit
operations needed to make the strings equal under the condition that no substring is edited
more than once, whereas the second one presents no such restriction.
Take for example the edit distance between CA and ABC. The Damerau–Levenshtein
distance LD(CA,ABC) = 2 because CA ? AC ? ABC, but the optimal string alignment
distance OSA(CA,ABC) = 3 because if the operation CA ? AC is used, it is not possible to
use AC ? ABC because that would require the substring to be edited more than once, which
is not allowed in OSA, and therefore the shortest sequence of operations is CA ? A ? AB
? ABC. Note that for the optimal string alignment distance, the triangle inequality does not
hold: OSA(CA,AC) + OSA(AC,ABC) < OSA(CA,ABC), and so it is not a true metric.
Optimal string alignment distance
Firstly, let us consider a direct extension of the formula used to calculate Levenshtein
distance. Below is pseudocode for a function OptimalStringAlignmentDistance that takes two
strings, str1 of length lenStr1, and str2 of length lenStr2, and computes the optimal string
alignment distance between them:
int OptimalStringAlignmentDistance(char str1[1…lenStr1], char
str2[1…lenStr2])
// d is a table with lenStr1+1 rows and lenStr2+1 columns
declare int d[0..lenStr1, 0..lenStr2]
// i and j are used to iterate over str1 and str2
declare int i, j, cost
// for loop is inclusive, need table 1 row/column larger than string
length
for i from 0 to lenStr1
d[i, 0] := i
for j from 1 to lenStr2
d[0, j] := j
// pseudo-code assumes string indices start at 1, not 0
// if implemented, make sure to start comparing at 1st letter of
strings
for i from 1 to lenStr1
for j from 1 to lenStr2
if str1[i] = str2[j] then cost := 0
else cost := 1
d[i, j] := minimum(
d[i-1, j ] + 1, // deletion
d[i , j-1] + 1, // insertion
d[i-1, j-1] + cost // substitution
)
if(i > 1 and j > 1 and str1[i] = str2[j-1] and str1[i-1] =
str2[j]) then
d[i, j] := minimum(
d[i, j],
d[i-2, j-2] + cost // transposition
)
return d[lenStr1, lenStr2]
Basically this is the algorithm to compute Levenshtein distance with one additional
recurrence:
if(i > 1 and j > 1 and str1[i] = str2[j-1] and str1[i-1] =
str2[j]) then
d[i, j] := minimum(
d[i, j],
d[i-2, j-2] + cost // transposition
)
Distance with adjacent transpositions
Here is the second algorithm that computes the true Damerau–Levenshtein distance with
adjacent transpositions (ActionScript 3.0); this function requires as an additional parameter
the size of the alphabet ©, so that all entries of the arrays are in 0…(C-1):
static public function damerauLevenshteinDistance(a:Array, b:Array,
C:uint):uint
{
// "infinite" distance is just the max possible distance
var INF:uint = a.length + b.length;
// make and initialize the character array indices
var DA:Array = new Array(C);
for (var k:uint = 0; k < C; ++k) DA[k]=0;
// make the distance matrix H[-1..a.length][-1..b.length]
var H:matrix = new matrix(a.length+2,b.length+2);
// initialize the left and top edges of H
H[-1][-1] = INF;
for (var i:uint = 0; i <= a.length; ++i)
{
H[i][-1] = INF;
H[i][ 0] = i;
}
for (var j:uint = 0; j <= b.length; ++j)
{
H[-1][j] = INF;
H[ 0][j] = j;
}
// fill in the distance matrix H
// look at each character in a
for (var i:uint = 1; i <= a.length; ++i)
{
var DB:uint = 0;
// look at each character in b
for (var j:uint = 1; j <= b.length; ++j)
{
var i1:uint = DA[b[j-1]];
var j1:uint = DB;
var cost:uint;
if (a[i-1] == b[j-1])
{
cost = 0;
DB = j;
}
else
cost = 1;
H[i][j] = Math.min( H[i-1 ][j-1 ] + cost, // substitution
H[i ][j-1 ] + 1, // insertion
H[i-1 ][j ] + 1, // deletion
H[i1-1][j1-1] + (i-i1-1) + 1 + (j-j1-
1));
}
DA[a[i-1]] = i;
}
return H[a.length][b.length];
}