Find the Index of the First Occurrence in a String easy

Problem Statement

Given two strings, haystack and needle, return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

Example 1

Input: haystack = "hello", needle = "ll"
Output: 2

Example 2

Input: haystack = "aaaaa", needle = "bba"
Output: -1

Steps

  1. Handle Edge Cases: Check for empty needle. If needle is empty, the needle is always found at index 0. If haystack is shorter than needle, needle cannot be found.

  2. Iterate through haystack: Use a sliding window approach. The window size is equal to the length of needle.

  3. Compare substrings: In each iteration, compare the current window (substring of haystack) with needle.

  4. Return index: If a match is found, return the starting index of the window (which is the index of the first occurrence of needle).

  5. Return -1: If the loop completes without finding a match, return -1.

Explanation

The solution uses a straightforward string matching algorithm. It avoids more complex algorithms like Knuth-Morris-Pratt (KMP) or Rabin-Karp, as those are generally overkill for this specific problem unless dealing with extremely large inputs and performance is paramount. The sliding window approach provides a clear and easy-to-understand solution.

Code

def strStr(haystack: str, needle: str) -> int:
    """
    Finds the index of the first occurrence of needle in haystack.

    Args:
        haystack: The string to search in.
        needle: The string to search for.

    Returns:
        The index of the first occurrence of needle in haystack, or -1 if needle is not found.
    """
    len_needle = len(needle)
    len_haystack = len(haystack)

    # Handle edge cases
    if len_needle == 0:
        return 0
    if len_haystack < len_needle:
        return -1

    for i in range(len_haystack - len_needle + 1):
        if haystack[i:i + len_needle] == needle:
            return i

    return -1

Complexity

  • Time Complexity: O(M*N), where M is the length of haystack and N is the length of needle. In the worst case, we might compare the needle with almost all substrings of haystack.
  • Space Complexity: O(1). The algorithm uses a constant amount of extra space. We're not creating any new data structures whose size scales with the input.