Edit Distance medium
Problem Statement
Given two strings word1
and word2
, return the minimum number of operations required to convert word1
to word2
.
You have the following three operations permitted on a word:
- Insert a character
- Delete a character
- Replace a character
Example 1:
Input: word1 = "horse", word2 = "ros" Output: 3 Explanation: horse -> rorse (replace 'h' with 'r') rorse -> rose (remove 'r') rose -> ros (remove 'e')
Example 2:
Input: word1 = "intention", word2 = "execution" Output: 5 Explanation: intention -> inention (remove 't') inention -> enention (replace 'i' with 'e') enention -> exention (replace 'n' with 'x') exention -> exection (replace 'n' with 'c') exection -> execution (insert 'u')
Steps
The problem can be solved efficiently using dynamic programming. We'll create a DP table dp
where dp[i][j]
represents the minimum edit distance between the first i
characters of word1
and the first j
characters of word2
.
-
Initialization:
dp[i][0] = i
for alli
(because we needi
deletions to transformword1[0...i]
to an empty string).dp[0][j] = j
for allj
(because we needj
insertions to transform an empty string toword2[0...j]
).
-
Iteration:
- Iterate through the DP table, calculating
dp[i][j]
based on the following recursive relation:- If
word1[i-1] == word2[j-1]
:dp[i][j] = dp[i-1][j-1]
(no operation needed) - Otherwise:
dp[i][j] = min(dp[i-1][j] + 1, dp[i][j-1] + 1, dp[i-1][j-1] + 1)
(deletion, insertion, or replacement)
- If
- Iterate through the DP table, calculating
-
Result:
dp[m][n]
(wherem
andn
are the lengths ofword1
andword2
respectively) will contain the minimum edit distance.
Explanation
The dynamic programming approach cleverly avoids redundant calculations by storing the results of subproblems. Each cell in the dp
table represents a subproblem: finding the minimum edit distance between prefixes of the two input strings. By building up the table from smaller subproblems to larger ones, we efficiently find the solution for the overall problem. The recursive relation ensures we consider all possible operations (insert, delete, replace) at each step, choosing the minimum cost.
Code
#include <iostream>
#include <string>
#include <algorithm>
#include <vector>
using namespace std;
int minDistance(string word1, string word2) {
int m = word1.length();
int n = word2.length();
// Create DP table and initialize
vector<vector<int>> dp(m + 1, vector<int>(n + 1));
for (int i = 0; i <= m; i++) {
dp[i][0] = i;
}
for (int j = 0; j <= n; j++) {
dp[0][j] = j;
}
// Iterate and fill the DP table
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (word1[i - 1] == word2[j - 1]) {
dp[i][j] = dp[i - 1][j - 1];
} else {
dp[i][j] = min({dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + 1});
}
}
}
return dp[m][n];
}
int main() {
string word1 = "horse";
string word2 = "ros";
cout << minDistance(word1, word2) << endl; // Output: 3
word1 = "intention";
word2 = "execution";
cout << minDistance(word1, word2) << endl; // Output: 5
return 0;
}
Complexity
- Time Complexity: O(m*n), where m and n are the lengths of
word1
andword2
. This is due to the nested loops iterating through the DP table. - Space Complexity: O(m*n) to store the DP table. The space complexity can be optimized to O(min(m, n)) by using only two rows of the DP table.