My Favorite LeetCode Question
Data structures and Algorithm interviews can always be hit-or-miss. Sometimes the questions are very intuitive, and sometimes they seem impossible unless you have seen the problem before. To me, a good DS&A interview question is one that strikes a perfect balance between knowledge of data structures and clever thinking.
This is why my favorite question on LeetCode is #295 Find Median from Data Stream
Problem statement
Implement a class that supports the following operations:
void addNum(int num)
ingest an integer from the data streamint findMedian()
return the median of all the integers seen so far
Note that the stream of integers is unordered.
Initial Thoughts
When I approach a new problem, I first like to run through a few examples.
If we have processed [4,2,5]
, the median of this is 4
because these integers in sorted order is [2,4,5]
and the middle element is 4
.
If we have processed [8,2]
, the median of this is 5
because these integers in sorted order is [2,8]
, and because this list is even, we take (2 + 8) / 2 = 5
After looking at these examples, it is clear that we need to evaluate the integers seen so far in sorted order, and then find the middle element.
Approach 1
One idea would be to maintain a list of integers we have seen so far, and then sort it whenever we need to find the median.
- When
addNum(int num)
is called, we can appendnum
to the end of our list. - When
findMedian()
is called, we can sort the list and then return the middle.
Using this approach, addNum(int num)
can be done in O(1)
time because we are just appending to the end of the list. findMedian()
will take O(n log n)
because we have to sort the list.
Time complexity
addNum(int num)
isO(1)
findMedian()
isO(n log n)
Approach 2
Sorting the list each time findMedian()
is called is killing our runtime. What if we kept our list sorted and each time addNum(int num)
was called, we inserted it into its sorted position? Then we could easily implement findMedian()
in O(1)
time.
Inserting into the sorted list will take us O(n)
time.
O(log n)
to find insertion positionO(n)
to shift subsequent elements
Time complexity
addNum(int num)
isO(n)
findMedian()
isO(1)
Approach 3 (Optimal)
In approach 2, we maintain the sorted order of the entire list. However, we don’t really need to know the order of the whole list.
Looking at an example, [1, 2, 3, 4, 5]
, we can see that the answer is 3
, because it is in the middle. We can look at this as [1, 2]
[3]
[4,5]
. We have everything to the left of the middle in one list, and everything to the right of the middle in another list.
The solution here is to maintain two heaps. A max heap that stores everything to the left of the middle. And a min heap that stores everything to the right of the middle. We will keep the size of each equal, and alternate adding to them. We can find the median by looking at the root of the max heap and the root of the min heap.
Time complexity
addNum(int num)
isO(log n)
findMedian()
isO(1)