Regular Expression Matching hard
Problem Statement
Given an input string s
and a pattern p
, implement regular expression matching with support for '.'
and '*'
.
'.'
Matches any single character.'*'
Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
Example 1
Input: s = "aa"
, p = "a*"
Output: true
Explanation: '*'
means zero or more 'a's, so "aa" matches "a*".
Example 2
Input: s = "aab"
, p = "c*a*b"
Output: true
Explanation: c*
matches 0 occurrences of 'c', 'a*' matches 2 occurrences of 'a', and 'b' matches 'b'.
Steps and Explanation
This problem can be efficiently solved using Dynamic Programming. We create a DP table dp
where dp[i][j]
represents whether the first i
characters of s
match the first j
characters of p
.
The base cases are:
dp[0][0] = true
(empty string matches empty pattern)dp[i][0] = false
fori > 0
(non-empty string cannot match empty pattern)dp[0][j] = dp[0][j-2]
ifp[j-1] == '*'
(empty string can match pattern ending with*
if the preceding character also matches zero times)
The recursive relation is:
- If
p[j-1]
is not'*'
:- If
s[i-1] == p[j-1]
orp[j-1] == '.'
,dp[i][j] = dp[i-1][j-1]
(Match single characters) - Otherwise,
dp[i][j] = false
(Mismatch)
- If
- If
p[j-1]
is'*'
:dp[i][j] = dp[i][j-2]
(Match zero occurrences of the preceding element)- If
s[i-1] == p[j-2]
orp[j-2] == '.'
,dp[i][j] = dp[i][j] || dp[i-1][j]
(Match one or more occurrences)
Code (C#)
public class Solution {
public bool IsMatch(string s, string p) {
int m = s.Length;
int n = p.Length;
bool[,] dp = new bool[m + 1, n + 1];
dp[0, 0] = true;
for (int j = 1; j <= n; j++) {
if (p[j - 1] == '*') {
dp[0, j] = dp[0, j - 2];
}
}
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (p[j - 1] == '*') {
dp[i, j] = dp[i, j - 2];
if (s[i - 1] == p[j - 2] || p[j - 2] == '.') {
dp[i, j] = dp[i, j] || dp[i - 1, j];
}
} else if (s[i - 1] == p[j - 1] || p[j - 1] == '.') {
dp[i, j] = dp[i - 1, j - 1];
}
}
}
return dp[m, n];
}
}
Complexity
- Time Complexity: O(mn), where 'm' is the length of string 's' and 'n' is the length of pattern 'p'. This is due to the nested loops iterating through the DP table.
- Space Complexity: O(mn), due to the DP table. While we could optimize the space to O(n) by using only two rows of the DP table, this solution maintains readability and clarity.