Regular Expression Matching hard
Problem Statement
Given an input string s
and a pattern p
, implement regular expression matching with support for '.'
and '*'
.
'.'
Matches any single character.
'*'
Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
Example 1
Input: s = "aa", p = "a*" Output: true Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' zero times, it matches the input string "aa".
Example 2
Input: s = "aab", p = "cab" Output: true Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".
Steps
The problem can be solved efficiently using dynamic programming. We create a DP table dp[m+1][n+1]
, where m
is the length of s
and n
is the length of p
. dp[i][j]
is true if the first i
characters of s
match the first j
characters of p
, and false otherwise.
We iterate through the DP table, filling each cell based on the following rules:
-
Base Case:
dp[0][0] = true
(empty string matches empty pattern).dp[i][0] = false
fori > 0
(non-empty string can't match an empty pattern).dp[0][j]
depends on the pattern: ifp[j-1]
is '*', then it can match zero occurrences of the preceding character. -
Matching a single character: If
s[i-1] == p[j-1]
orp[j-1] == '.'
, thendp[i][j] = dp[i-1][j-1]
. The current characters match, so we check if the previous substrings matched. -
Matching zero or more characters using '*': If
p[j-1] == '*'
, there are two possibilities:- The '*' matches zero occurrences of the preceding element:
dp[i][j] = dp[i][j-2]
- The '*' matches one or more occurrences of the preceding element:
dp[i][j] = dp[i-1][j]
(only ifs[i-1] == p[j-2]
orp[j-2] == '.'
)
- The '*' matches zero occurrences of the preceding element:
Explanation
The dynamic programming approach systematically explores all possible matches. By building up the solution from smaller subproblems, we avoid redundant calculations and ensure correctness. The base cases handle the trivial scenarios of empty strings and patterns. The recursive relationships capture the logic of character matching and the special handling of the '*' character.
Code
#include <iostream>
#include <string>
#include <vector>
using namespace std;
bool isMatch(string s, string p) {
int m = s.length();
int n = p.length();
vector<vector<bool>> dp(m + 1, vector<bool>(n + 1, false));
dp[0][0] = true;
for (int j = 1; j <= n; j++) {
if (p[j - 1] == '*') {
dp[0][j] = dp[0][j - 2];
}
}
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (s[i - 1] == p[j - 1] || p[j - 1] == '.') {
dp[i][j] = dp[i - 1][j - 1];
} else if (p[j - 1] == '*') {
dp[i][j] = dp[i][j - 2]; // Match zero occurrences
if (s[i - 1] == p[j - 2] || p[j - 2] == '.') {
dp[i][j] = dp[i][j] || dp[i - 1][j]; // Match one or more occurrences
}
}
}
}
return dp[m][n];
}
int main() {
string s1 = "aa", p1 = "a*";
string s2 = "aab", p2 = "c*a*b";
string s3 = "mississippi", p3 = "mis*is*p*.";
cout << "Example 1: " << isMatch(s1, p1) << endl; // true
cout << "Example 2: " << isMatch(s2, p2) << endl; // true
cout << "Example 3: " << isMatch(s3, p3) << endl; //false
return 0;
}
Complexity
- Time Complexity: O(mn), where m is the length of
s
and n is the length ofp
. We iterate through the DP table once. - Space Complexity: O(mn), due to the size of the DP table. This can be optimized to O(n) if we only store the previous row of the DP table.