Signa - Become a Better Engineer

When working with arrays and needing to check for the existence of elements frequently, hash sets (Python's set data structure) become incredibly powerful. The key insight is that hash sets provide O(1) average-case lookup time, compared to O(n) for lists.

The challenge with finding consecutive sequences efficiently lies in avoiding redundant work. A naive approach might sort the array first (O(n log n)) or use nested loops (O(n²)). However, we can leverage hash sets to achieve O(n) time complexity through a clever observation: we only need to start counting a sequence from its beginning.

Think about it this way: if we have numbers [4, 1, 3, 2] and we want to find consecutive sequences, we could start counting from any number. But if we start counting from 2, we'd get sequence length 3 (2,3,4). If we start from 3, we'd get length 2 (3,4). Starting from 1 gives us the full sequence length 4 (1,2,3,4). The key insight is that we should only start counting when we find the true beginning of a sequence.

How do we identify the beginning? A number n is the start of a consecutive sequence if n-1 is not present in our set. This ensures we only count each sequence once, from its natural starting point.

The algorithm becomes: convert the array to a set, then for each number that could be a sequence start, count how many consecutive numbers follow it. Since each number is visited at most twice (once as a potential start, once as a follower), the total time complexity remains O(n).

Longest Consecutive Sequence in O(n) Time

Lesson

Hash Sets for Efficient Sequence Detection

Key Takeaways