Alien Dictionary hard

Problem Statement

There is a dictionary containing words from an alien language. Unlike English, the order of letters in this language is not known to us. We are given a list of strings words from the alien language's dictionary, where the strings in words are sorted lexicographically by the rules of this new language. Derive the order of letters in this alien language. If there is no solution, return "" (empty string). If there are multiple valid solutions, any of them is accepted.

Example 1

Input: words = ["wrt","wrf","er","ett","rftt"] Output: "wertf"

Example 2

Input: words = ["z","x"] Output: "zx"

Steps

  1. Build the Graph: Create a directed graph where each node represents a letter from the alien language. An edge from node 'a' to node 'b' indicates that 'a' comes before 'b' in the alien language's order. We iterate through pairs of consecutive words in the words list. For each pair, we compare the words character by character until we find a difference. If a difference exists (e.g., word1[i] != word2[i]), we add a directed edge from word1[i] to word2[i] in the graph, indicating that word1[i] comes before word2[i].

  2. Detect Cycles: A cycle in the graph indicates a contradiction (e.g., a < b and b < a). If a cycle exists, there's no valid ordering of the letters, so we return "". We use topological sort to detect cycles.

  3. Topological Sort: Perform a topological sort on the graph. Topological sort arranges the nodes (letters) in an order such that for every directed edge from node 'a' to node 'b', 'a' appears before 'b' in the sorted order. This sorted order represents the alien language's letter order. We use Kahn's algorithm for topological sort.

Explanation

The algorithm leverages graph theory to solve the problem. By representing the letter ordering constraints as a directed graph, we can efficiently determine if a valid ordering exists and, if so, find one using topological sorting. The use of a HashMap for efficient neighbor lookups during graph construction and the use of an in-degree array and a queue to implement Kahn's algorithm makes the solution highly efficient.

Code

import java.util.*;

class Solution {
    public String alienOrder(String[] words) {
        Map<Character, Set<Character>> graph = new HashMap<>();
        Map<Character, Integer> inDegree = new HashMap<>();

        // Build the graph and calculate in-degrees
        for (String word : words) {
            for (char c : word.toCharArray()) {
                inDegree.put(c, 0);
            }
        }

        for (int i = 0; i < words.length - 1; i++) {
            String word1 = words[i];
            String word2 = words[i + 1];
            int minLen = Math.min(word1.length(), word2.length());
            boolean diffFound = false;
            for (int j = 0; j < minLen; j++) {
                char c1 = word1.charAt(j);
                char c2 = word2.charAt(j);
                if (c1 != c2) {
                    if (!graph.containsKey(c1)) graph.put(c1, new HashSet<>());
                    if (!graph.get(c1).contains(c2)) {
                        graph.get(c1).add(c2);
                        inDegree.put(c2, inDegree.get(c2) + 1);
                    }
                    diffFound = true;
                    break;
                }
            }
            // Check for invalid lexicographical order (e.g., ["abc", "ab"])
            if (!diffFound && word1.length() > word2.length()) return "";
        }


        // Topological Sort using Kahn's algorithm
        Queue<Character> queue = new LinkedList<>();
        for (char c : inDegree.keySet()) {
            if (inDegree.get(c) == 0) {
                queue.offer(c);
            }
        }

        StringBuilder sb = new StringBuilder();
        while (!queue.isEmpty()) {
            char c = queue.poll();
            sb.append(c);
            if (graph.containsKey(c)) {
                for (char neighbor : graph.get(c)) {
                    inDegree.put(neighbor, inDegree.get(neighbor) - 1);
                    if (inDegree.get(neighbor) == 0) {
                        queue.offer(neighbor);
                    }
                }
            }
        }

        // Check for cycles (if the length of sb is less than the number of unique characters)
        return sb.length() == inDegree.size() ? sb.toString() : "";
    }
}

Complexity

  • Time Complexity: O(V + E), where V is the number of unique letters and E is the number of edges in the graph. Building the graph and performing topological sort both take linear time with respect to the number of vertices and edges.
  • Space Complexity: O(V + E), primarily due to the space used to store the graph and the in-degree array. The queue used in topological sort also contributes to the space complexity.