Top K Elements

Top K Elements

Maintain a heap of size k to track the k largest/smallest/most-frequent elements.

When to use

K largest/smallest, k closest points, k most frequent, kth-X in a stream.

Complexity
time O(n log k)space O(k)

Each of n elements does at most one push and one pop on a size-k heap, each costing log k. The heap itself never exceeds size k.

Related production techniques

The same idea applied in production systems — read these once you know the pattern to see where it lands in the real world.

stream →38295617min-heap of k987
Maintain a heap of size k. Each new element replaces the heap's min iff it's larger.
How it works

Maintain a heap of size k. For 'top k largest', use a min-heap: when its size exceeds k, pop the smallest, keeping only the k largest seen so far. Pushing all n elements gives O(n log k) — strictly better than sorting (O(n log n)) when k << n, and easily streamable since you only ever hold k items.

Quickselect is the alternative when you can mutate the input array and want average O(n): partition like quicksort but only recurse into the side containing the kth element. Worst case is O(n^2), mitigated by random pivot selection (or median-of-medians for guaranteed O(n), though that's rarely worth it in interviews).

For 'k most frequent' problems, combine with a hash map (count occurrences) then run top-k on the count map. Bucket sort is an O(n) alternative when counts are bounded by n: bucket[count] = list of values with that count, then walk buckets from high to low.

Key insight

A size-k heap turns 'top k of n elements' from O(n log n) into O(n log k), and crucially makes the problem streamable — you only ever hold k items in memory regardless of n.

Common pitfalls
  • Using a max-heap for 'top k largest' and popping after inserting all n elements — works but is O(n log n), defeating the purpose; the right choice is a min-heap of size k.
  • For 'k closest points to origin', forgetting that you can compare squared distances and avoid sqrt entirely.
  • In quickselect, recursing on both sides instead of only the one containing k, losing the O(n) average.
  • For 'k most frequent', sorting all counts when bucket sort gives O(n) and a heap of size k gives O(n log k).
  • Returning the heap directly without sorting when the problem requires the top k in order.
Variations
  • Top k largest/smallest: min-heap or max-heap of size k respectively.
  • K most frequent elements: hash map + heap of size k, or bucket sort.
  • K closest points to origin: heap of size k keyed by squared distance.
  • Kth largest in a stream: heap of size k, with O(log k) updates per insert.
  • Quickselect: partition-based, average O(n) but requires in-place mutation.
Template
// import java.util.*;
PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap of size k
for (int num : nums) {
    heap.offer(num);
    if (heap.size() > k) heap.poll();
}
// heap now holds the k largest elements; heap.peek() is the kth largest

Problems using this pattern

3

Easiest first · Blind 75 above NeetCode 150

Practice these problems
Solve mode — formulate the approach, then reveal. 3 problems.