Longest Common Subsequence medium
Problem Statement
Given two strings text1
and text2
, return the length of their longest common subsequence. If there is no common subsequence, return 0.
A subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters. For example, "ace" is a subsequence of "abcde".
A common subsequence of two strings is a subsequence that is common to both strings.
Example 1:
Input: text1 = "abcde", text2 = "ace"
Output: 3
Explanation: The longest common subsequence is "ace" and its length is 3.
Example 2:
Input: text1 = "abc", text2 = "abc" Output: 3 Explanation: The longest common subsequence is "abc" and its length is 3.
Steps to Solve
The problem can be efficiently solved using dynamic programming. We'll create a 2D array (matrix) dp
where dp[i][j]
represents the length of the longest common subsequence of text1[:i+1]
and text2[:j+1]
.
-
Initialization: Create a
dp
matrix of size (len(text1)+1) x (len(text2)+1) and initialize all elements to 0. The extra row and column are for handling base cases (empty subsequences). -
Iteration: Iterate through the
dp
matrix. For each celldp[i][j]
:- If
text1[i-1] == text2[j-1]
, it means we found a common character. In this case,dp[i][j] = dp[i-1][j-1] + 1
. We extend the LCS by one. - Otherwise,
text1[i-1]
andtext2[j-1]
are different. The LCS length is the maximum of the LCS lengths ending atdp[i-1][j]
(excludingtext1[i-1]
) anddp[i][j-1]
(excludingtext2[j-1]
). So,dp[i][j] = max(dp[i-1][j], dp[i][j-1])
.
- If
-
Result: The bottom-right cell
dp[len(text1)][len(text2)]
will contain the length of the longest common subsequence oftext1
andtext2
.
Explanation
The dynamic programming approach systematically builds up the solution from smaller subproblems. Each cell dp[i][j]
depends only on the values of its top, left, and top-left neighbors. This overlapping subproblems property makes dynamic programming highly efficient. The base cases (first row and column of dp
) represent the LCS lengths when one of the strings is empty (which is always 0).
Code
def longestCommonSubsequence(text1, text2):
m, n = len(text1), len(text2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(1, m + 1):
for j in range(1, n + 1):
if text1[i - 1] == text2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
return dp[m][n]
Complexity
- Time Complexity: O(m*n), where 'm' and 'n' are the lengths of
text1
andtext2
respectively. This is because we iterate through the entiredp
matrix. - Space Complexity: O(m*n) to store the
dp
matrix. We could optimize this to O(min(m,n)) space by using only two rows of the dp matrix.