Find All Anagrams in a String - LeetCode 438 Solution in python

Find All Anagrams in a String medium

Problem Statement

Given two strings s and p, return an array of all the start indices of p's anagrams in s. You may return the answer in any order.

An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

Example 1:

Input: s = "cbaebabacd", p = "abc" Output: [0,6] Explanation: The substring with start index = 0 is "cba", which is an anagram of "abc". The substring with start index = 6 is "bac", which is an anagram of "abc".

Example 2:

Input: s = "abab", p = "ab" Output: [0,1,2] Explanation: The substring with start index = 0 is "ab", which is an anagram of "ab". The substring with start index = 1 is "ba", which is an anagram of "ab". The substring with start index = 2 is "ab", which is an anagram of "ab".

Steps:

Create Frequency Maps: Create a frequency map (dictionary) for the pattern string p to store the count of each character.
Sliding Window: Use a sliding window of the same size as the pattern string p to traverse the string s.
Update Window Frequency: As the window slides, update the frequency map for the current window.
Compare Frequencies: Compare the frequency map of the current window with the frequency map of the pattern. If they are identical, it means an anagram is found. Add the starting index of the window to the result list.
Handle Window Movement: When the window slides, subtract the frequency of the character leaving the window and add the frequency of the character entering the window.

Explanation:

The solution uses a sliding window technique along with frequency maps for efficient comparison. Instead of repeatedly checking for anagrams by sorting substrings (which is O(m log m) for each substring where m is the length of p), we use frequency counts, making the comparison O(k) where k is the size of the alphabet (26 for lowercase English letters). This significantly improves the overall efficiency.

The algorithm maintains a frequency count of characters within the sliding window and compares it against the frequency count of the pattern string. Only when both frequency counts match perfectly is an anagram considered found.

Code:

using System;
using System.Collections.Generic;

public class Solution {
    public IList<int> FindAnagrams(string s, string p) {
        IList<int> result = new List<int>();
        if (s == null || p == null || s.Length < p.Length) return result;

        Dictionary<char, int> pFreq = new Dictionary<char, int>();
        foreach (char c in p) {
            pFreq[c] = pFreq.ContainsKey(c) ? pFreq[c] + 1 : 1;
        }

        Dictionary<char, int> windowFreq = new Dictionary<char, int>();
        int windowStart = 0;
        int matched = 0;

        for (int windowEnd = 0; windowEnd < s.Length; windowEnd++) {
            char rightChar = s[windowEnd];
            if (pFreq.ContainsKey(rightChar)) {
                windowFreq[rightChar] = windowFreq.ContainsKey(rightChar) ? windowFreq[rightChar] + 1 : 1;
                if (windowFreq[rightChar] == pFreq[rightChar]) {
                    matched++;
                }
            }

            if (windowEnd - windowStart + 1 == p.Length) {
                if (matched == pFreq.Count) {
                    result.Add(windowStart);
                }

                char leftChar = s[windowStart];
                if (pFreq.ContainsKey(leftChar)) {
                    if (windowFreq[leftChar] == pFreq[leftChar]) {
                        matched--;
                    }
                    windowFreq[leftChar]--;
                }
                windowStart++;
            }
        }
        return result;
    }
}

Complexity:

Time Complexity: O(s), where s is the length of string s. We iterate through the string s once.
Space Complexity: O(k), where k is the number of unique characters in the pattern p. The space used is dominated by the frequency maps, which store at most the number of unique characters in p. In the worst case, this could be the size of the alphabet.