Regular Expression Matching hard

Problem Statement

Given an input string s and a pattern p, implement regular expression matching with support for '.' and '*'.

  • '.' Matches any single character.
  • '*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Example 1

Input: s = "aa", p = "a*" Output: true Explanation: '*' means zero or more 'a's, so "aa" matches "a*".

Example 2

Input: s = "aab", p = "c*a*b" Output: true Explanation: c* matches 0 occurrences of 'c', 'a*' matches 2 occurrences of 'a', and 'b' matches 'b'.

Steps and Explanation

This problem can be efficiently solved using Dynamic Programming. We create a DP table dp where dp[i][j] represents whether the first i characters of s match the first j characters of p.

The base cases are:

  • dp[0][0] = true (empty string matches empty pattern)
  • dp[i][0] = false for i > 0 (non-empty string cannot match empty pattern)
  • dp[0][j] = dp[0][j-2] if p[j-1] == '*' (empty string can match pattern ending with * if the preceding character also matches zero times)

The recursive relation is:

  • If p[j-1] is not '*':
    • If s[i-1] == p[j-1] or p[j-1] == '.', dp[i][j] = dp[i-1][j-1] (Match single characters)
    • Otherwise, dp[i][j] = false (Mismatch)
  • If p[j-1] is '*':
    • dp[i][j] = dp[i][j-2] (Match zero occurrences of the preceding element)
    • If s[i-1] == p[j-2] or p[j-2] == '.', dp[i][j] = dp[i][j] || dp[i-1][j] (Match one or more occurrences)

Code (C#)

public class Solution {
    public bool IsMatch(string s, string p) {
        int m = s.Length;
        int n = p.Length;
        bool[,] dp = new bool[m + 1, n + 1];
        dp[0, 0] = true;

        for (int j = 1; j <= n; j++) {
            if (p[j - 1] == '*') {
                dp[0, j] = dp[0, j - 2];
            }
        }

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (p[j - 1] == '*') {
                    dp[i, j] = dp[i, j - 2];
                    if (s[i - 1] == p[j - 2] || p[j - 2] == '.') {
                        dp[i, j] = dp[i, j] || dp[i - 1, j];
                    }
                } else if (s[i - 1] == p[j - 1] || p[j - 1] == '.') {
                    dp[i, j] = dp[i - 1, j - 1];
                }
            }
        }

        return dp[m, n];
    }
}

Complexity

  • Time Complexity: O(mn), where 'm' is the length of string 's' and 'n' is the length of pattern 'p'. This is due to the nested loops iterating through the DP table.
  • Space Complexity: O(mn), due to the DP table. While we could optimize the space to O(n) by using only two rows of the DP table, this solution maintains readability and clarity.