Longest Common Subsequence medium
Problem Statement
Given two strings text1
and text2
, return the length of their longest common subsequence. A subsequence of a string is a new string generated from the original string with some characters(can be none) deleted without changing the relative order of the remaining characters. (eg., "ace" is a subsequence of "abcde"). A common subsequence of two strings is a subsequence that is common to both strings.
If there is no common subsequence, return 0.
Example 1
Input: text1 = "abcde", text2 = "ace"
Output: 3
Explanation: The longest common subsequence is "ace" and its length is 3.
Example 2
Input: text1 = "abc", text2 = "abc" Output: 3 Explanation: The longest common subsequence is "abc" and its length is 3.
Steps
The problem can be solved efficiently using dynamic programming. We'll create a 2D array dp
where dp[i][j]
represents the length of the longest common subsequence of text1[0...i-1]
and text2[0...j-1]
.
-
Initialization: Create a
dp
array of size (m+1) x (n+1), where 'm' is the length oftext1
and 'n' is the length oftext2
. Initialize the first row and column to 0. -
Iteration: Iterate through the
dp
array. For each celldp[i][j]
:- If
text1[i-1] == text2[j-1]
, it means we found a common character. Therefore,dp[i][j] = dp[i-1][j-1] + 1
. - Otherwise,
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
. We take the maximum length from either excluding the current character fromtext1
ortext2
.
- If
-
Result: The value at
dp[m][n]
will be the length of the longest common subsequence.
Explanation
The dynamic programming approach builds up the solution incrementally. Each cell dp[i][j]
represents a subproblem: finding the LCS of prefixes of the input strings. By solving these smaller subproblems and storing the results, we avoid redundant computations. The recurrence relation dp[i][j] = max(dp[i-1][j], dp[i][j-1])
handles the case where the characters don't match; we choose the maximum LCS from the possibilities of excluding the current character from either string. If they do match (text1[i-1] == text2[j-1]
), we extend the LCS by 1.
Code
#include <iostream>
#include <string>
#include <algorithm>
#include <vector>
using namespace std;
int longestCommonSubsequence(string text1, string text2) {
int m = text1.length();
int n = text2.length();
// Create a DP table to store lengths of LCS of substrings.
vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));
// Fill the DP table bottom-up
for (int i = 1; i <= m; ++i) {
for (int j = 1; j <= n; ++j) {
if (text1[i - 1] == text2[j - 1]) {
dp[i][j] = dp[i - 1][j - 1] + 1;
} else {
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
}
}
}
return dp[m][n];
}
int main() {
string text1 = "abcde";
string text2 = "ace";
cout << "Length of LCS: " << longestCommonSubsequence(text1, text2) << endl; // Output: 3
text1 = "abc";
text2 = "abc";
cout << "Length of LCS: " << longestCommonSubsequence(text1, text2) << endl; // Output: 3
text1 = "AGGTAB";
text2 = "GXTXAYB";
cout << "Length of LCS: " << longestCommonSubsequence(text1, text2) << endl; //Output 4
return 0;
}
Complexity
- Time Complexity: O(m*n), where 'm' and 'n' are the lengths of the input strings. This is because we iterate through the entire DP table.
- Space Complexity: O(m*n) to store the DP table. We could optimize this to O(min(m,n)) space by only storing the previous row or column during the iteration, but this would make the code slightly more complex.