Find the Duplicate Number medium

Problem Statement

Given an array of integers nums containing n + 1 integers where each integer is in the range [1, n] (inclusive), prove that at least one duplicate number must exist. Find the duplicate number.

You must not modify the array (assume the array is read only). You must use only constant, O(1) extra space. Your runtime complexity should be less than O(n^2).

Example 1

Input: nums = [1,3,4,2,2]
Output: 2

Example 2

Input: nums = [3,1,3,4,2]
Output: 3

Steps and Explanation

This problem can be efficiently solved using Floyd's Tortoise and Hare algorithm, also known as the cycle detection algorithm. The idea is to treat the array as a linked list where nums[i] points to the next node at index nums[i]. Since there's a duplicate, there must be a cycle in this linked list.

  1. Cycle Detection: We use two pointers, slow and fast. slow moves one step at a time, and fast moves two steps at a time. If there's a cycle, they will eventually meet.

  2. Finding the Cycle Start: Once the slow and fast pointers meet, we reset slow to the beginning of the array (index 0). We move both slow and fast one step at a time. The point where they meet again is the start of the cycle, which represents the duplicate number.

Why does this work? Imagine the cycle starts at index x and has length y. When slow and fast meet, fast has traveled a distance that is a multiple of y more than slow. Resetting slow to the beginning and moving both pointers one step at a time ensures that they will meet at the start of the cycle after traversing a distance equal to the distance from the beginning of the array to the start of the cycle.

Code (TypeScript)

function findDuplicate(nums: number[]): number {
    let slow = nums[0];
    let fast = nums[0];

    // Cycle detection
    do {
        slow = nums[slow];
        fast = nums[nums[fast]];
    } while (slow !== fast);

    // Finding the cycle start
    slow = nums[0];
    while (slow !== fast) {
        slow = nums[slow];
        fast = nums[fast];
    }

    return slow;
};

Complexity

  • Time Complexity: O(n). We traverse the array at most a couple of times.
  • Space Complexity: O(1). We use only constant extra space for variables.

This solution effectively leverages Floyd's Tortoise and Hare algorithm to find the duplicate number within the given constraints of constant space and sub-quadratic time complexity. It's a clever and efficient approach to a challenging problem.