Edit Distance medium
Problem Statement
Given two strings word1
and word2
, return the minimum number of operations required to convert word1
to word2
.
You have the following three operations permitted on a word:
- Insert a character
- Delete a character
- Replace a character
Example 1:
Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation:
horse -> rorse (replace 'h' with 'r')
rorse -> rose (remove 'r')
rose -> ros (remove 'e')
Example 2:
Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation:
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')
Steps and Explanation
This problem can be solved efficiently using dynamic programming. We create a DP table dp
where dp[i][j]
represents the minimum edit distance between the first i
characters of word1
and the first j
characters of word2
.
-
Initialization:
dp[i][0] = i
for alli
, because to convert a string of lengthi
to an empty string, we needi
deletions.dp[0][j] = j
for allj
, because to convert an empty string to a string of lengthj
, we needj
insertions.dp[0][0] = 0
, because the edit distance between two empty strings is 0.
-
Iteration:
- We iterate through the DP table, calculating
dp[i][j]
for each cell. - There are three possibilities to consider:
- Replace: If
word1[i-1] == word2[j-1]
, no operation is needed, sodp[i][j] = dp[i-1][j-1]
. - Replace: If
word1[i-1] != word2[j-1]
, we need a replacement, sodp[i][j] = dp[i-1][j-1] + 1
. - Insert: We can insert
word2[j-1]
intoword1
, sodp[i][j] = dp[i][j-1] + 1
. - Delete: We can delete
word1[i-1]
fromword1
, sodp[i][j] = dp[i-1][j] + 1
.
- Replace: If
- We take the minimum of these three possibilities to find the minimum edit distance.
- We iterate through the DP table, calculating
-
Result:
dp[m][n]
(wherem
andn
are the lengths ofword1
andword2
respectively) will contain the minimum edit distance betweenword1
andword2
.
Code
def minDistance(word1, word2):
m, n = len(word1), len(word2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(m + 1):
dp[i][0] = i
for j in range(n + 1):
dp[0][j] = j
for i in range(1, m + 1):
for j in range(1, n + 1):
if word1[i - 1] == word2[j - 1]:
dp[i][j] = dp[i - 1][j - 1]
else:
dp[i][j] = min(dp[i - 1][j - 1], dp[i - 1][j], dp[i][j - 1]) + 1
return dp[m][n]
Complexity
- Time Complexity: O(m*n), where m and n are the lengths of word1 and word2 respectively. This is due to the nested loops iterating through the DP table.
- Space Complexity: O(m*n), due to the DP table used to store intermediate results. This could be optimized to O(min(m,n)) using space optimization techniques but this adds complexity to the code.