Accounts Merge - LeetCode 721 Solution in python

Accounts Merge medium

Problem Statement

Given a list of accounts where each element accounts[i] is a list of strings, where the first element accounts[i][0] is a name, and the rest are emails representing emails of the account. Now, we need to merge all accounts with the same emails. The output should be a list of accounts, where each account contains the name and each email address in a single list. Each email should be included in only one account.

Example 1

Input: accounts = [["John","johnsmith@mail.com","john00@mail.com"],["John","john00@mail.com","john11@mail.com"],["Mary","mary@mail.com"]] Output: [["John","john00@mail.com","john11@mail.com","johnsmith@mail.com"],["Mary","mary@mail.com"]] Explanation: The first two accounts are merged because they both contain "john00@mail.com".

Example 2

Input: accounts = [["Gabe","Gabe0@m.co","Gabe3@m.co","Gabe1@m.co"],["Jona","Jona0@m.co","Jona1@m.co"],["Gabe","Gabe3@m.co","Gabe0@m.co","Gabe5@m.co"]] Output: [["Gabe","Gabe0@m.co","Gabe1@m.co","Gabe3@m.co","Gabe5@m.co"],["Jona","Jona0@m.co","Jona1@m.co"]]

Steps

Create an Email to Account Mapping: We'll use a dictionary to map each email address to its corresponding account name. This helps us identify accounts linked by shared emails.
Perform Depth-First Search (DFS): We'll iterate through the accounts. For each account, if we encounter an email that's already in our email-to-account mapping, it means we've found a connection to another account. We'll use DFS to traverse and merge all connected accounts.
Merge Accounts: During the DFS traversal, we'll collect all emails associated with the connected accounts.
Construct Result: Finally, we'll construct the result list by combining the account name with the merged emails.

Explanation

The core idea is to build a graph where each email is a node, and an edge exists between two emails if they belong to the same account. We use the email-to-account mapping to discover these connections and then use DFS to efficiently merge connected components (accounts). The DFS ensures that we don't revisit already processed accounts, avoiding cycles and redundancy.

Code

def accountsMerge(accounts):
    email_to_name = {}
    email_to_account = {}

    for account in accounts:
        name = account[0]
        for email in account[1:]:
            email_to_name[email] = name
            email_to_account[email] = email

    def dfs(email, visited, component):
        visited.add(email)
        component.append(email)
        for neighbor_email in email_to_account:
            if email_to_account[neighbor_email] == email and neighbor_email not in visited:
                dfs(neighbor_email, visited, component)

    result = []
    visited = set()
    for account in accounts:
        for email in account[1:]:
            if email not in visited:
                component = []
                dfs(email, visited, component)
                component.sort()  # Sort for consistent output
                name = email_to_name[email]
                result.append([name] + component)
    return result

Complexity

Time Complexity: O(NMlogM), where N is the number of accounts and M is the maximum number of emails in an account. The DFS takes O(M) time in the worst case, and we perform it at most N times. Sorting emails takes O(M log M) time.
Space Complexity: O(N*M), mainly due to the email_to_name, email_to_account, and the visited set which store all the emails. The component list during DFS also contributes to this space complexity.