Longest Common Subsequence medium
Problem Statement
Given two strings text1
and text2
, return the length of their longest common subsequence. A subsequence of a string is a new string generated from the original string with some characters(can be none) deleted without changing the relative order of the remaining characters. (eg., "ace" is a subsequence of "abcde").
A common subsequence of two strings is a subsequence that is common to both strings.
If there is no common subsequence, return 0.
Example 1
- Input: text1 = "abcde", text2 = "ace"
- Output: 3
- Explanation: The longest common subsequence is "ace" and its length is 3.
Example 2
- Input: text1 = "abc", text2 = "abc"
- Output: 3
- Explanation: The longest common subsequence is "abc" and its length is 3.
Example 3
- Input: text1 = "abc", text2 = "def"
- Output: 0
- Explanation: There is no such common subsequence, so the result is 0.
Steps
The problem can be solved efficiently using dynamic programming. We'll create a 2D array dp
where dp[i][j]
represents the length of the longest common subsequence of text1[0...i-1]
and text2[0...j-1]
.
-
Initialization: Create a
dp
array of size (m+1) x (n+1), where m and n are the lengths oftext1
andtext2
respectively. Initialize the first row and first column to 0. -
Iteration: Iterate through the
dp
array. For each celldp[i][j]
:- If
text1[i-1]
equalstext2[j-1]
, it means we've found a common character. In this case,dp[i][j]
will bedp[i-1][j-1] + 1
. - Otherwise,
dp[i][j]
will be the maximum ofdp[i-1][j]
anddp[i][j-1]
, representing either taking the LCS from the previous row or the previous column.
- If
-
Result: The value in
dp[m][n]
will be the length of the longest common subsequence oftext1
andtext2
.
Explanation
The dynamic programming approach builds up the solution from smaller subproblems. Each cell dp[i][j]
depends only on the cells to its left (dp[i][j-1]
), above (dp[i-1][j]
), and diagonally above (dp[i-1][j-1]
). This overlapping subproblem property makes dynamic programming suitable. The algorithm efficiently avoids redundant computations.
Code (C#)
public class Solution {
public int LongestCommonSubsequence(string text1, string text2) {
int m = text1.Length;
int n = text2.Length;
// Create a DP table to store lengths of LCS of substrings.
// Note: We add an extra row and column to handle base cases (empty substrings).
int[,] dp = new int[m + 1, n + 1];
// Initialize the first row and column to 0 (LCS of empty substring with any other substring is 0).
for (int i = 0; i <= m; i++) {
dp[i, 0] = 0;
}
for (int j = 0; j <= n; j++) {
dp[0, j] = 0;
}
// Iterate through the strings and populate the DP table.
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (text1[i - 1] == text2[j - 1]) {
// If characters match, extend the LCS.
dp[i, j] = dp[i - 1, j - 1] + 1;
} else {
// If characters don't match, take the maximum from the previous row or column.
dp[i, j] = Math.Max(dp[i - 1, j], dp[i, j - 1]);
}
}
}
// The bottom-right cell contains the length of the LCS.
return dp[m, n];
}
}
Complexity
- Time Complexity: O(m*n), where m and n are the lengths of the input strings. We iterate through the entire DP table once.
- Space Complexity: O(m*n), due to the size of the DP table. We could optimize this to O(min(m,n)) using space optimization techniques (keeping only the current and previous row).