Alien Dictionary hard

Problem Statement

There is a list of words in an alien language which are lexicographically sorted. The task is to find the order of letters in this alien language. If there are multiple valid solutions, return any of them. If the given input is invalid, return "".

Example 1

Input: words = ["wrt","wrf","er","ett","rftt"] Output: "wertf"

Example 2

Input: words = ["z","x"] Output: "zx"

Steps

  1. Build the graph: Create a directed graph where each node represents a letter. An edge from letter 'a' to letter 'b' means 'a' comes before 'b' in the alien language. We build this graph by comparing adjacent words in the input list. If we find a mismatch between two words, we add a directed edge from the mismatched character in the first word to the mismatched character in the second word.

  2. Detect cycles: Check if there are any cycles in the graph. A cycle indicates a contradiction in the lexicographical order (e.g., a -> b and b -> a), rendering the input invalid. We can use topological sort to detect cycles.

  3. Topological Sort: If no cycles are found, perform a topological sort on the graph. Topological sort arranges nodes in such a way that for every directed edge from node A to node B, node A appears before node B in the ordering. The result of the topological sort gives us the order of letters.

  4. Handle Invalid Input: If a cycle is detected or the graph doesn't include all letters, return "".

Explanation

Let's trace Example 1: words = ["wrt","wrf","er","ett","rftt"]

  1. Graph Construction:

    • Comparing "wrt" and "wrf", we get 't' -> 'f'.
    • Comparing "wrf" and "er", we get 'w' -> 'e' and 'r' -> 'e'.
    • Comparing "er" and "ett", we get 'r' -> 't'.
    • Comparing "ett" and "rftt", we get 'e' -> 'r'.
  2. Cycle Detection: We check for cycles in the graph using topological sort. No cycles are found in this example.

  3. Topological Sort: Topological sort will process nodes in an order that respects the dependencies in the graph. A possible order could be 'w', 'e', 'r', 't', 'f'. This is because 'w' has no incoming edges, followed by 'e', and so on.

  4. Result: The topological sort results in "wertf", which is the order of letters in the alien language.

Code

function alienOrder(words: string[]): string {
    const adj: Map<string, Set<string>> = new Map();
    const inDegree: Map<string, number> = new Map();

    // Build the graph and in-degree map
    for (const word of words) {
        for (const char of word) {
            inDegree.set(char, 0);
        }
    }
    for (let i = 0; i < words.length - 1; i++) {
        const word1 = words[i];
        const word2 = words[i + 1];
        const minLen = Math.min(word1.length, word2.length);
        for (let j = 0; j < minLen; j++) {
            if (word1[j] !== word2[j]) {
                if (!adj.has(word1[j])) adj.set(word1[j], new Set());
                if (!adj.get(word1[j])!.has(word2[j])) {
                    adj.get(word1[j])!.add(word2[j]);
                    inDegree.set(word2[j], inDegree.get(word2[j])! + 1);
                }
                break; // Only consider the first difference
            }
            if (word1.length > word2.length && j === minLen - 1) return ""; // Invalid input: shorter word appearing after longer word
        }
    }


    const queue: string[] = [];
    for (const [char, degree] of inDegree.entries()) {
        if (degree === 0) queue.push(char);
    }

    const result: string[] = [];
    while (queue.length > 0) {
        const char = queue.shift()!;
        result.push(char);
        if (adj.has(char)) {
            for (const neighbor of adj.get(char)!) {
                inDegree.set(neighbor, inDegree.get(neighbor)! - 1);
                if (inDegree.get(neighbor)! === 0) {
                    queue.push(neighbor);
                }
            }
        }
    }

    if (result.length !== inDegree.size) return ""; // Cycle detected

    return result.join("");
}

Complexity

  • Time Complexity: O(V + E), where V is the number of unique characters and E is the number of edges in the graph. Building the graph, detecting cycles, and performing topological sort all take linear time in the size of the graph.

  • Space Complexity: O(V + E), to store the graph (adjacency list) and the in-degree map. The queue used in topological sort also takes linear space in the worst case.