Implement Trie (Prefix Tree) medium

Problem Statement

Implement a trie with insert, search, and startsWith methods.

A trie (pronounced as "try") or prefix tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spell checking.

Implement the Trie class:

  • Trie() Initializes the trie object.
  • void insert(string word) Inserts the string word into the trie.
  • bool search(string word) Returns true if the string word is in the trie (i.e., was inserted before), and false otherwise.
  • bool startsWith(string prefix) Returns true if there is any string in the trie that starts with the given prefix, and false otherwise.

Example 1:

Trie trie = new Trie();
trie.insert("apple");
trie.search("apple");   // return True
trie.search("app");     // return False
trie.startsWith("app"); // return True
trie.insert("app");
trie.search("app");     // return True

Example 2:

Trie trie = new Trie();
trie.insert("app");
trie.insert("ap");
trie.search("ap"); // return True

Steps:

  1. Trie Node Structure: We'll create a structure for each node in the trie. Each node will contain:

    • A boolean isWord to indicate if the node represents the end of a valid word.
    • An array of 26 pointers (children), one for each letter of the alphabet (assuming only lowercase letters).
  2. insert(word): Iterate through the characters of the word. For each character:

    • If a child node for that character exists, move to that child.
    • Otherwise, create a new node for that character and add it as a child.
    • Finally, mark the last node as isWord = true.
  3. search(word): Iterate through the characters of the word. If at any point a child node for the current character doesn't exist, return false. If the iteration completes successfully, check the isWord flag of the final node.

  4. startsWith(prefix): Similar to search, iterate through the characters of the prefix. If at any point a child node for the current character doesn't exist, return false. If the iteration completes, return true (because a word starting with the prefix exists).

Explanation:

The Trie data structure excels at prefix-based operations because it organizes words based on their prefixes. Each path from the root to a node represents a prefix, and each path ending at a node marked isWord = true represents a complete word. This structure allows for efficient searching and prefix checking because we only need to traverse down the tree following the letters of the word or prefix.

Code:

#include <iostream>
#include <string>
#include <vector>

class TrieNode {
public:
    bool isWord;
    std::vector<TrieNode*> children;
    TrieNode() : isWord(false), children(26, nullptr) {}
};

class Trie {
public:
    Trie() {
        root = new TrieNode();
    }

    void insert(std::string word) {
        TrieNode* curr = root;
        for (char c : word) {
            int index = c - 'a';
            if (curr->children[index] == nullptr) {
                curr->children[index] = new TrieNode();
            }
            curr = curr->children[index];
        }
        curr->isWord = true;
    }

    bool search(std::string word) {
        TrieNode* curr = root;
        for (char c : word) {
            int index = c - 'a';
            if (curr->children[index] == nullptr) {
                return false;
            }
            curr = curr->children[index];
        }
        return curr->isWord;
    }

    bool startsWith(std::string prefix) {
        TrieNode* curr = root;
        for (char c : prefix) {
            int index = c - 'a';
            if (curr->children[index] == nullptr) {
                return false;
            }
            curr = curr->children[index];
        }
        return true;
    }

private:
    TrieNode* root;
};

Complexity:

  • Time Complexity:

    • insert: O(m), where m is the length of the word.
    • search: O(m), where m is the length of the word.
    • startsWith: O(m), where m is the length of the prefix.
  • Space Complexity: O(N * M), where N is the number of words inserted and M is the maximum length of a word. This is because in the worst case, we might need to store all the characters of all words in the Trie. Note that the space complexity can be significantly less than this if there is a lot of prefix sharing amongst words.