Design Add and Search Words Data Structure medium

Problem Statement

Design a data structure that supports adding new words and finding if a string matches any previously added string.

Implement the WordDictionary class:

  • WordDictionary(): Initializes the object.
  • void addWord(word): Adds word to the data structure.
  • bool search(word): Returns true if there is any string in the data structure that matches word or false otherwise. word may contain dots '.' where dots can be matched with any letter.

Example 1

Input
["WordDictionary","addWord","addWord","addWord","search","search","search","search"]
[[],["bad"],["dad"],["mad"],["pad"],["bad"],[".ad"],["b.."]]
Output
[null,null,null,null,false,true,true,true]
Explanation
WordDictionary wordDictionary = new WordDictionary();
wordDictionary.addWord("bad");
wordDictionary.addWord("dad");
wordDictionary.addWord("mad");
wordDictionary.search("pad"); // return False
wordDictionary.search("bad"); // return True
wordDictionary.search(".ad"); // return True
wordDictionary.search("b.."); // return True

Example 2

Input
["WordDictionary","addWord","search"]
[[],["a"],["."]]
Output
[null,null,true]
Explanation
WordDictionary wordDictionary = new WordDictionary();
wordDictionary.addWord("a");
wordDictionary.search("."); // return True

Steps

  1. Data Structure: We'll use a Trie (prefix tree) to efficiently store and search words. Each node in the Trie represents a character, and the path from the root to a leaf node represents a word.

  2. addWord(word): Traverse the Trie, adding nodes for each character in the word. If a node already exists for a character, we use it; otherwise, we create a new node. Mark the leaf node as the end of a word.

  3. search(word): Recursively traverse the Trie. If a character is a dot (.), recursively explore all possible child nodes. If a character is a letter, we only explore the child node corresponding to that letter. If we reach the end of the word and the current node marks the end of a word, we return true. If we reach the end of the word and the current node doesn't mark the end of a word or we can't find a matching path, return false.

Explanation

The Trie efficiently handles prefix searches. The search function's recursive nature allows for flexible matching with dots. The use of a boolean isWord flag in each Trie node efficiently indicates the end of a valid word.

Code

using System.Collections.Generic;

public class WordDictionary {
    private class TrieNode {
        public Dictionary<char, TrieNode> children;
        public bool isWord;

        public TrieNode() {
            children = new Dictionary<char, TrieNode>();
            isWord = false;
        }
    }

    private TrieNode root;

    public WordDictionary() {
        root = new TrieNode();
    }

    public void AddWord(string word) {
        TrieNode node = root;
        foreach (char c in word) {
            if (!node.children.ContainsKey(c)) {
                node.children[c] = new TrieNode();
            }
            node = node.children[c];
        }
        node.isWord = true;
    }

    public bool Search(string word) {
        return SearchHelper(word, 0, root);
    }

    private bool SearchHelper(string word, int index, TrieNode node) {
        if (index == word.Length) {
            return node.isWord;
        }

        char c = word[index];
        if (c == '.') {
            foreach (TrieNode child in node.children.Values) {
                if (SearchHelper(word, index + 1, child)) {
                    return true;
                }
            }
            return false;
        } else {
            if (node.children.ContainsKey(c)) {
                return SearchHelper(word, index + 1, node.children[c]);
            } else {
                return false;
            }
        }
    }
}

Complexity

  • Time Complexity:

    • addWord(word): O(M), where M is the length of the word.
    • search(word): In the worst case, O(N * 26^L), where N is the number of words in the Trie, and L is the length of the search word (due to exploring all possible branches with dots). However, in practice, it's often much faster.
  • Space Complexity: O(N * M), where N is the number of words added, and M is the average length of a word (to store the Trie).