Regular Expression Matching hard
Problem Statement
Given a string s
and a string p
which represents a regular expression, implement a function to check if s
matches p
. The following are the matching rules:
.
matches any single character.*
matches zero or more occurrences of the preceding element.
The matching should cover the entire input string (s
).
Example 1
Input: s = "aa", p = "a*" Output: true Explanation: '' matches zero occurrences of 'a', thus "aa" matches "a".
Example 2
Input: s = "aab", p = "cab" Output: true Explanation: 'c*' matches zero occurrences of 'c', 'a*' matches two occurrences of 'a', and 'b' matches 'b'.
Steps
The problem can be solved efficiently using dynamic programming. We'll create a DP table dp
where dp[i][j]
represents whether the first i
characters of s
match the first j
characters of p
.
-
Initialization:
dp[0][0] = true
(empty string matches empty pattern)dp[i][0] = false
fori > 0
(non-empty string can't match empty pattern)dp[0][j]
needs special handling for the patternp
. Ifp
starts witha*
,b*
, etc., it might match the empty string. We handle this recursively.
-
Iteration: For each
i
andj
, we consider several cases:- If
p[j-1]
is a literal character and it matchess[i-1]
, thendp[i][j] = dp[i-1][j-1]
(matching the previous characters). - If
p[j-1]
is '.', it matches any character, sodp[i][j] = dp[i-1][j-1]
. - If
p[j-1]
is '*', we have two possibilities:- The '' matches zero occurrences of the preceding element. In this case,
dp[i][j] = dp[i][j-2]
(ignore the '' and the preceding element). - The '*' matches one or more occurrences of the preceding element. In this case, we need
p[j-2]
to matchs[i-1]
(or '.' to match anything), anddp[i][j] = dp[i-1][j]
.
- The '' matches zero occurrences of the preceding element. In this case,
- If
-
Result:
dp[s.length()][p.length()]
will indicate whether the entire strings
matches the entire patternp
.
Explanation
The dynamic programming approach breaks down the problem into smaller subproblems. By building the dp
table bottom-up, we avoid redundant calculations and find the solution efficiently. The different cases within the iteration handle the various scenarios presented by literal characters, '.', and ''. The key is understanding how the '' operator affects the matching process—it can either match zero occurrences or one or more occurrences, requiring us to explore both possibilities.
Code
class Solution {
public boolean isMatch(String s, String p) {
int m = s.length();
int n = p.length();
boolean[][] dp = new boolean[m + 1][n + 1];
dp[0][0] = true;
for (int j = 1; j <= n; j++) {
if (p.charAt(j - 1) == '*') {
dp[0][j] = dp[0][j - 2];
}
}
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (p.charAt(j - 1) == '.' || p.charAt(j - 1) == s.charAt(i - 1)) {
dp[i][j] = dp[i - 1][j - 1];
} else if (p.charAt(j - 1) == '*') {
dp[i][j] = dp[i][j - 2]; // Match zero occurrences
if (p.charAt(j - 2) == '.' || p.charAt(j - 2) == s.charAt(i - 1)) {
dp[i][j] = dp[i][j] || dp[i - 1][j]; // Match one or more occurrences
}
}
}
}
return dp[m][n];
}
}
Complexity
- Time Complexity: O(mn), where m is the length of string
s
and n is the length of patternp
. This is because we iterate through the DP table of size (m+1) x (n+1). - Space Complexity: O(mn), due to the DP table we use. We could potentially optimize this to O(min(m,n)) by only storing the current and previous rows of the DP table.