Design Add and Search Words Data Structure (Medium)

Problem Statement

Design a data structure that supports adding new words and finding if a string matches any previously added string.

Implement the WordDictionary class:

  • WordDictionary() Initializes the object.
  • void addWord(word) Adds word to the data structure.
  • bool search(word) Returns true if any string in the data structure matches word, or false otherwise. word may contain dots '.', and each dot matches exactly one letter.
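To make the matching rule concrete, here is a minimal sketch of a direct pattern-versus-word check. It is not the solution developed below. The class name DotMatch and the method matches are hypothetical names chosen for illustration. The point is that a dot stands for exactly one letter, so a pattern can only match a word of the same length.

```java
// Hypothetical reference check (illustration only, not the Trie solution):
// does `pattern` match `word` when '.' stands for any single letter?
class DotMatch {
    static boolean matches(String pattern, String word) {
        // A dot consumes exactly one letter, so the lengths must agree.
        if (pattern.length() != word.length()) {
            return false;
        }
        for (int i = 0; i < pattern.length(); i++) {
            char p = pattern.charAt(i);
            // A letter must match exactly; a dot matches whatever is there.
            if (p != '.' && p != word.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matches(".ad", "bad")); // true
        System.out.println(matches("b..", "bad")); // true
        System.out.println(matches("pad", "bad")); // false
        System.out.println(matches("b.", "bad"));  // false (length differs)
    }
}
```

Running this check against every stored word would answer search correctly, but it costs time proportional to the total size of the dictionary on every query.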

Example 1

Input
["WordDictionary","addWord","addWord","addWord","search","search","search","search"]
[[],["bad"],["dad"],["mad"],["pad"],["bad"],[".ad"],["b.."]]
Output
[null,null,null,null,false,true,true,true]
Explanation
WordDictionary wordDictionary = new WordDictionary();
wordDictionary.addWord("bad");
wordDictionary.addWord("dad");
wordDictionary.addWord("mad");
wordDictionary.search("pad"); // return False
wordDictionary.search("bad"); // return True
wordDictionary.search(".ad"); // return True
wordDictionary.search("b.."); // return True

Example 2

Input
["WordDictionary","addWord","search"]
[[],["a"],["."]]
Output
[null,null,true]
Explanation
WordDictionary wordDictionary = new WordDictionary();
wordDictionary.addWord("a");
wordDictionary.search("."); // return True

Steps

  1. Data Structure: We'll use a Trie (prefix tree) to store and search words. Each node holds an array of 26 child references, one per lowercase letter, so a character is encoded by the child's position in its parent's array rather than stored in the node itself. A boolean flag in each node indicates whether the path from the root to that node spells a complete word.

  2. addWord(word): Traverse the Trie, adding nodes as needed for each character in the word. Set the end-of-word flag to true for the last node.

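  3. search(word): This is the more complex part. We recursively traverse the Trie, consuming one character of word per level. For a letter, we follow the single matching child and fail immediately if it does not exist. For a '.', we try every existing child and succeed as soon as one of them leads to a match. When all characters have been consumed, the search succeeds only if the current node's end-of-word flag is true.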

Explanation

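The Trie shares common prefixes between words, so a lookup only follows branches that agree with the pattern. Adding a word takes O(L) time for a word of length L, since we walk one path and create at most L new nodes. A search with no dots also follows a single path and takes O(L), independent of how many words are stored. Each '.' fans out to every existing child of the current node, so the cost grows with the number of dots and with how many stored words share the prefix matched so far. In the worst case (a pattern made mostly of dots that fails only at the end), the search visits every node in the Trie, which is at most N*M nodes, where N is the number of words and M is the average word length. That bound is the same as a linear scan over all words, but whenever the pattern contains letters the Trie prunes every branch that disagrees with them.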
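The effect of dots on search cost can be observed directly. The following is a minimal sketch, assuming the same Trie layout as the solution: a hypothetical CountingTrie class (not part of the original solution) whose search increments a counter for every node it visits, including the root.

```java
// Hypothetical instrumented Trie (illustration only): counts how many
// nodes a search visits, to make the cost of '.' visible.
class CountingTrie {
    private static final class Node {
        final Node[] children = new Node[26]; // assumes lowercase 'a'..'z'
        boolean isWord;
    }

    private final Node root = new Node();
    private int visited; // nodes touched by the most recent search

    void addWord(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (node.children[i] == null) {
                node.children[i] = new Node();
            }
            node = node.children[i];
        }
        node.isWord = true;
    }

    // Runs a search for `pattern` and returns the number of nodes visited.
    int countVisited(String pattern) {
        visited = 0;
        search(pattern, 0, root);
        return visited;
    }

    private boolean search(String pattern, int index, Node node) {
        visited++;
        if (index == pattern.length()) {
            return node.isWord;
        }
        char c = pattern.charAt(index);
        if (c == '.') {
            // Wildcard: try every existing child, stop at the first match.
            for (Node child : node.children) {
                if (child != null && search(pattern, index + 1, child)) {
                    return true;
                }
            }
            return false;
        }
        Node next = node.children[c - 'a'];
        return next != null && search(pattern, index + 1, next);
    }

    public static void main(String[] args) {
        CountingTrie trie = new CountingTrie();
        for (String w : new String[] {"bad", "dad", "mad"}) {
            trie.addWord(w);
        }
        System.out.println(trie.countVisited("bad"));  // 4: root plus one path
        System.out.println(trie.countVisited("....")); // 10: every node in the Trie
    }
}
```

With "bad", "dad", and "mad" stored, the Trie has 10 nodes (the root plus three nodes per word, since no prefixes are shared). The dot-free query touches only one root-to-leaf path, while the four-dot query cannot match any three-letter word and has to exhaust the whole structure before failing.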

Code

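class WordDictionary {
    // Root of the Trie; it corresponds to the empty prefix.
    private TrieNode root;

    public WordDictionary() {
        root = new TrieNode();
    }

    // Inserts word by walking one path from the root, creating nodes as needed.
    // Assumes word contains only lowercase letters 'a'..'z'.
    public void addWord(String word) {
        TrieNode node = root;
        for (char c : word.toCharArray()) {
            int index = c - 'a';
            if (node.children[index] == null) {
                node.children[index] = new TrieNode();
            }
            node = node.children[index];
        }
        // Mark the last node so that prefixes of word are not reported as words.
        node.isWord = true;
    }

    public boolean search(String word) {
        return searchHelper(word, 0, root);
    }

    // Returns true if word[index..] matches some path starting at node
    // and ending at a node flagged as a complete word.
    private boolean searchHelper(String word, int index, TrieNode node) {
        if (index == word.length()) {
            return node.isWord;
        }

        char c = word.charAt(index);
        if (c == '.') {
            // Wildcard: try every existing child and stop at the first match.
            for (TrieNode child : node.children) {
                if (child != null && searchHelper(word, index + 1, child)) {
                    return true;
                }
            }
        } else {
            // Letter: only the child at that letter's slot can match.
            int charIndex = c - 'a';
            if (node.children[charIndex] != null) {
                return searchHelper(word, index + 1, node.children[charIndex]);
            }
        }
        return false;
    }

    private static class TrieNode {
        TrieNode[] children; // one slot per lowercase letter
        boolean isWord;      // true if the path to this node spells a stored word

        public TrieNode() {
            children = new TrieNode[26];
            isWord = false;
        }
    }
}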

Complexity

  • Time Complexity:

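    • addWord: O(L), where L is the length of the word.
    • search: O(L) for a word of length L with no dots. With dots, the worst case is O(N * M), where N is the number of words in the dictionary and M is the average word length, because the search may visit every node in the Trie. Patterns that contain letters prune branches early and visit far fewer nodes.
  • Space Complexity: O(N * L) Trie nodes in the worst case (no shared prefixes), where N is the number of words and L is the average word length. Each node holds a 26-slot child array, and search uses an additional O(L) of recursion stack.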
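
Because every node allocates 26 child slots whether or not they are used, one possible variation is to store children in a map keyed by character. The sketch below is an alternative to the array-based solution, not a replacement for it. The class name MapWordDictionary is hypothetical. Whether it actually saves memory depends on how sparse the nodes are and on the JVM's per-object overhead for HashMap, so it should be measured rather than assumed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant (illustration only): children are stored in a map,
// so a node pays only for the children it actually has.
class MapWordDictionary {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord;
    }

    private final Node root = new Node();

    public void addWord(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            // Create the child on first use, then descend into it.
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.isWord = true;
    }

    public boolean search(String word) {
        return search(word, 0, root);
    }

    private boolean search(String word, int index, Node node) {
        if (index == word.length()) {
            return node.isWord;
        }
        char c = word.charAt(index);
        if (c == '.') {
            // Wildcard: only existing children are stored, so no null checks.
            for (Node child : node.children.values()) {
                if (search(word, index + 1, child)) {
                    return true;
                }
            }
            return false;
        }
        Node next = node.children.get(c);
        return next != null && search(word, index + 1, next);
    }
}
```

The asymptotic bounds are unchanged. The trade-off is a hash lookup per character instead of an array index, in exchange for not reserving unused slots.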