Alien Dictionary hard
Problem Statement
There is a dictionary containing words from an alien language. Unlike English, the order of letters in this language is not known to us. We are given a list of strings words
from the alien language's dictionary, where the strings in words
are sorted lexicographically by the rules of this new language. Derive the order of letters in this alien language. If there is no solution, return "" (empty string). If there are multiple valid solutions, any of them is accepted.
Example 1
Input: words = ["wrt","wrf","er","ett","rftt"] Output: "wertf"
Example 2
Input: words = ["z","x"] Output: "zx"
Steps
-
Build the Graph: Create a directed graph where each node represents a letter from the alien language. An edge from node 'a' to node 'b' indicates that 'a' comes before 'b' in the alien language's order. We iterate through pairs of consecutive words in the
words
list. For each pair, we compare the words character by character until we find a difference. If a difference exists (e.g.,word1[i] != word2[i]
), we add a directed edge fromword1[i]
toword2[i]
in the graph, indicating thatword1[i]
comes beforeword2[i]
. -
Detect Cycles: A cycle in the graph indicates a contradiction (e.g., a < b and b < a). If a cycle exists, there's no valid ordering of the letters, so we return "". We use topological sort to detect cycles.
-
Topological Sort: Perform a topological sort on the graph. Topological sort arranges the nodes (letters) in an order such that for every directed edge from node 'a' to node 'b', 'a' appears before 'b' in the sorted order. This sorted order represents the alien language's letter order. We use Kahn's algorithm for topological sort.
Explanation
The algorithm leverages graph theory to solve the problem. By representing the letter ordering constraints as a directed graph, we can efficiently determine if a valid ordering exists and, if so, find one using topological sorting. The use of a HashMap for efficient neighbor lookups during graph construction and the use of an in-degree array and a queue to implement Kahn's algorithm makes the solution highly efficient.
Code
import java.util.*;
class Solution {
public String alienOrder(String[] words) {
Map<Character, Set<Character>> graph = new HashMap<>();
Map<Character, Integer> inDegree = new HashMap<>();
// Build the graph and calculate in-degrees
for (String word : words) {
for (char c : word.toCharArray()) {
inDegree.put(c, 0);
}
}
for (int i = 0; i < words.length - 1; i++) {
String word1 = words[i];
String word2 = words[i + 1];
int minLen = Math.min(word1.length(), word2.length());
boolean diffFound = false;
for (int j = 0; j < minLen; j++) {
char c1 = word1.charAt(j);
char c2 = word2.charAt(j);
if (c1 != c2) {
if (!graph.containsKey(c1)) graph.put(c1, new HashSet<>());
if (!graph.get(c1).contains(c2)) {
graph.get(c1).add(c2);
inDegree.put(c2, inDegree.get(c2) + 1);
}
diffFound = true;
break;
}
}
// Check for invalid lexicographical order (e.g., ["abc", "ab"])
if (!diffFound && word1.length() > word2.length()) return "";
}
// Topological Sort using Kahn's algorithm
Queue<Character> queue = new LinkedList<>();
for (char c : inDegree.keySet()) {
if (inDegree.get(c) == 0) {
queue.offer(c);
}
}
StringBuilder sb = new StringBuilder();
while (!queue.isEmpty()) {
char c = queue.poll();
sb.append(c);
if (graph.containsKey(c)) {
for (char neighbor : graph.get(c)) {
inDegree.put(neighbor, inDegree.get(neighbor) - 1);
if (inDegree.get(neighbor) == 0) {
queue.offer(neighbor);
}
}
}
}
// Check for cycles (if the length of sb is less than the number of unique characters)
return sb.length() == inDegree.size() ? sb.toString() : "";
}
}
Complexity
- Time Complexity: O(V + E), where V is the number of unique letters and E is the number of edges in the graph. Building the graph and performing topological sort both take linear time with respect to the number of vertices and edges.
- Space Complexity: O(V + E), primarily due to the space used to store the graph and the in-degree array. The queue used in topological sort also contributes to the space complexity.