Longest Well Performing Interval
In this post, I share some classical ways to optimize Brute force solutions for array problems through Longest Well-Performing Interval.
The problem is the following:
We are given
hours
, a list of the number of hours worked per day for a given employee.A day is considered to be a tiring day if and only if the number of hours worked is (strictly) greater than
8
.A well-performing interval is an interval of days for which the number of tiring days is strictly larger than the number of non-tiring days.
Return the length of the longest well-performing interval.
Example:
Input: hours = [9,9,6,0,6,6,9]
Output: 3
Explanation: The longest well-performing interval is [9,9,6]
— Leetcode: 1124. Longest Well-Performing Interval
Brute force Solution
The brute force solution is pretty straightforward.
We consider all possible intervals (subarrays), and for each subarray, if it is well-performing, we update our max result.
We know if a subarray is well-performing or not by simply iterating over the array and computing the number of tiring and non tiring days. If the former is strictly higher than the latter, the subarray is a well-performing interval.
Time complexity = O(N³) : We have O(N²) subarrays and for each one we are doing O(N) time to see whether it is well performing or not.
Optimized Brute force
What we are going to optimize is the O(N) time taken to decide whether an interval is well-performing or not.
It can be easily optimized using a prefix array. Just Maintain two prefix arrays, one for tiring days and one for non tiring days.
input = [9,9,6,0,6,6,9]
prefixTiringDays = [1,2,2,2,2,2,3]
prefixNonTiring = [0,0,1,2,3,4,4]
To decide whether a subarray from i to j is well-performing, we compute:
nbTiringDays-nbNonTiring, where
- nbTiringDays = prefixTiringDays[j]-prefixTiringDays[i-1]
- And nbNonTiring = prefixNonTiring[j]-prefixNonTiring[i-1]
If the value is greater than 0 the answer is yes, otherwise it is not a well-performing interval.
We can optimize some space by noticing that
nbTiringDays-nbNonTiring = (prefixTiringDays[j]-prefixNonTiring[j])-(prefixTiringDays[i-1]-prefixNonTiring[i-1])
Because we are only interested in the difference we maintain only one array of differences, where we increment by 1 if the current day is a tiring day and we decrement if it is non tiring.
input = [9,9,6,0,6,6,9]
prefix = [1,2,1,0,-1,-2,-1]
and nbTiringDays-nbNonTiring = prefix[j]-prefix[i-1]
=> Time complexity = O(N²)
O(N) time Solution ?
When asked a O(N) time solution, it is likely that you start thinking about a sliding window/two pointers solution. You will quickly realize that this is not the case for this problem, because you won’t know how and when to move your pointers.
A linear solution requires to take O(1) time for each element of the input. Because we don’t know what to do for the moment, we need to take some examples and try to gain more understanding about what’s going on.
input = [6 , 5, 9, 6, 9, 9, 2]
prefix = [-1, -2, -1, -2, -1, 0, -2]
For each index i, inorder for Subarrays (start, i) to be well-performing, we need: prefix[i]-prefix[start-1] > 0 (Ignore edge case start=0 for the moment).
So the subarrays (start, i) that verify prefix[start-1]<prefix[i] are the well performing subarrays that end in i.
For the example above, when we are at the third 9 (prefix = 0) we consider indices that have prefix < 0, so typically here prefix=-1 and prefix=-2, and we take the most left index (which is in this case the index 0). We then update res = max(res, i-mostLeftIndex).
Because before -2 there is necessarily an occurence of -1(initially we have a difference of 0 and we always increment or decrement by 1), for prefix[i]=0, we just focus on starting cells having -1 (prefix[i]-1) as value. And because we want the smallest index we just focus on its first occurence.
To sum up, for each index i we want the index of the first occurence of prefix[i]-1. A hashtable storing prefixes as key and their first occurence as value will help us take O(1) time for each iteration.
Now for the edge case of subarray starting from 0:
- If prefix[i] > 0 then the subarray from 0 to i is well-performing so we will simply consider it as it is the longest possible.
- If prefix[i] < 0 then subarray from 0 to i is not well performing so we just don’t consider it.
Here is my python code:
def longestWPI(self, hours: List[int]) -> int:
prefix = 0
d = {}
res = 0
for i, h in enumerate(hours):
prefix += 1 if h>8 else -1
if prefix > 0:
res = i+1
elif prefix-1 in d:
res = max(res, i-d[prefix-1])
if not prefix in d:
d[prefix] = i
return res
For prefix, we use a variable instead of an array as we don’t need prefix of previous indices.
Takeaways
- For summation problems like this one, prefix array is helpful to have O(1) sum computation.
- When you have O(N²) time because of considering all possible subarrays, and you are asked a O(N) time there are some classical approaches to think about:
* Two pointers : In this case you should think about the conditions to move pointers.
* Iterate and consider each index the end of your subarray: Then you need to find some O(1) trick specific to the problem inorder to avoid considering all possible startings of the subarray. (like we did here)