The change you made in in column A is perfectly fine. Since we are counting (using whole numbers), there is no difference between >1 and >=2. In the column C verbose descriptions, I used the phrase "at least two consecutive weeks" so I made the column A short description follow that same idea and used ">=2", but ">1" is completely consistent, and in fact is the way I wrote the formula.
I've found the best approach for understanding more complicated formulas involves splitting them into manageable parts and examining what each does. Let's review how the FREQUENCY function works before we develop formulas for Options 5C and 5D. Taking Query 5A as an example, and changing the worksheet name for the source data to simply "Data" so that the formula is a little shorter and easier to follow, we have:
=SUM(--(FREQUENCY(IF((Data!$J$5:$J$65<0)*(Data!$N$5:$N$65>Data!$N$6:$N$66),ROW(Data!$N$5:$N$65)),IF(((Data!$J$5:$J$65>=0)+(Data!$N$5:$N$65<=Data!$N$6:$N$65))>0,ROW( Data!$N$5:$N$65)))>1))
Starting inside, we have a FREQUENCY function that has the general form of FREQUENCY(data_array, bins_array)
This returns an array of the count of values in the data_array that fall within ranges defined by the bins_array. Isolating just the FREQUENCY function we have:
FREQUENCY(IF( (Data!$J$5:$J$65<0) * (Data!$N$5:$N$65>Data!$N$6:$N$66) , ROW(Data!$N$5:$N$65) ) , IF( ( (Data!$J$5:$J$65>=0) + (Data!$N$5:$N$65<=Data!$N$6:$N$65) )>0, ROW( Data!$N$5:$N$65) )
The data_array is formed by the first IF function:
IF( (Data!$J$5:$J$65<0) * (Data!$N$5:$N$65>Data!$N$6:$N$66), ROW(Data!$N$5:$N$65) )
- The first argument in the IF function is the logical_test that establishes which rows in the data table satisfy our filtering criteria. Since we want to consider all rows where the closing was negative, we write (Data!$J$5:$J$65<0), which returns an array of TRUE/FALSE values.
- We also want to consider all rows where the Open Interest was higher than the previous week, so we write (Data!$N$5:$N$65>Data!$N$6:$N$66), which also returns an array of TRUE/FALSE values depending on whether a given week's OI is greater than the previous week's OI.
- Since we want both of these conditions to apply at the same time, we multiply these two arrays together. This effectively is similar to a logical "AND" statement. The multiplication of the TRUE/FALSE arrays coerces TRUE's to values of 1 and FALSE's to values of 0, so the resultant array consists of 1's and 0's. Since this filtering criteria array of 1's and 0's is the logical_test argument in the IF statement, a value of 1 means the particular row corresponding to this array element satisfies all of the desired filtering criteria, and because "1" is interpreted as TRUE, the IF statement will return the ROW number...that's what the second argument appearing in the IF statement does, where we see ROW(Data!$N$5:$N$65). Where the values in the filtering criteria array are 0, the IF statement will return FALSE since the IF statement gives no instruction in that regard.
- The final result of this IF statement for the sample data set is an array that looks like this:
{5;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;18;19;FALSE;21;22;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE}
This means that rows 5, 18, 19, 21, 22 are the only ones of interest to us.
The bins_array is formed by the second IF function:
IF( ( (Data!$J$5:$J$65>=0) + (Data!$N$5:$N$65<=Data!$N$6:$N$66) )>0, ROW(Data!$N$5:$N$65) )
- We want this second IF function to return all of the row numbers corresponding to data that do not meet our filtering criteria used in the first IF statement. Then the second IF statement will establish limits (upper and lower row numbers) that will isolate various groupings (or streaks) that are of interest to us. So in the first IF statement, where the first argument is (Data!$J$5:$J$65<0) for our filtering criteria for negative closings, we now use the expression (Data!$J$5:$J$65>=0) to identify the rows that will be part of the bins_array returned by the second IF statement. This expression returns an array of TRUE/FALSE values.
- Similarly, where we have (Data!$N$5:$N$65>Data!$N$6:$N$66) in the first IF statement for the filtering criterion for where the Open Interest was higher than the previous week, we change this to (Data!$N$5:$N$65<=Data!$N$6:$N$66) to identify rows that will be part of the bins_array returned by the second IF statement. This expression also returns an array of TRUE/FALSE values.
- Since either one of the conditions above is sufficient for a row to be included when forming the bins_array, we add the two arrays (+), which is effectively similar to a logical "OR" statement. This arithmetic operation again coerces TRUE's to 1's and FALSE's to 0's, and we obtain a resultant array containing elements of 0, 1, or 2 --- 0 where none of these criteria are met (these should coincide with the rows of interest determined from the first IF statement), 1 where one of the conditions is met, and 2 where both criteria are met. Since we want to identify any of the non-zero elements, we perform a logic check on the resultant array---that's what the >0 does (see red). The result is an array containing only TRUE or FALSE, with TRUE indicating that at least one condition is met. This TRUE/FALSE array is the logical_test array of the second IF function, and it returns ROW(Data!$N$5:$N$65) for any of the TRUE elements, meaning the row numbers corresponding to data that do not meet our filtering criteria used in the first IF statement will be returned. Since the IF statement includes no result to return for FALSE elements, those will remain as FALSE.
- The final array returned from this bins_array expression will look like this:
{FALSE;6;7;8;9;10;11;12;13;14;15;16;17;FALSE;FALSE;20;FALSE;FALSE;23;24;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40;41;42;43;44;45;46;47;48;49;50;51;52;53;54;55;56;57;58;59;60;61;62;63;64;65} Examining the bin limits in this array, we see that rows numbers before 6, between 17 & 20, between 20 & 23, and after 65 will be the groupings to be counted. All other elements in bin_array are consecutive numbers, so there can be no row numbers between them.
Recap of the FREQUENCY function and the resultant array it returns:
- To recap, the two arrays (data_array and bins_array) are used in the FREQUENCY function, where the bins_array establishes range limits for grouping elements in the data_array. Then the FREQUENCY function will count the number of elements in each grouping. Any FALSE array elements are ignored. The resultant array from the FREQUENCY function is:{1;0;0;0;0;0;0;0;0;0;0;0;2;2;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0}
- We see that there is one group of 1 and two groups of 2. These are the "streaks" that satisfy our filter criteria.
Operating on the resultant array from the FREQUENCY function:
- Next we subject the resultant array from the FREQUENCY function to a logic check to determine how many of these groupings (or streaks) satisfy our streak length criteria. For this query, we want to know how many streaks have a length of 2 or more. Using some shorthand to strip the entire FREQUENCY function out of the formula and replace it with its "resultant array," we are left with this expression for our formula:
=SUM( --({resultant array}>1) )
- Here we check whether the streaks in the resultant array have a length >1...that's what ({resultant array}>1) does. This will return another array consisting of elements that are either TRUE or FALSE.
- Then we apply the double unary operator (--) to the TRUE/FALSE array to coerce values of 1's and 0's from the TRUE's and FALSE's, and by doing so, we can sum the array elements to determine the total number of streaks satisfying the desired criteria.
Hopefully, this explanation will clarify how to make changes to introduce variations of the filter criteria and streak length. Formulas following this approach for Queries 5C and 5D are shown in the worksheet. As a side note, upon reviewing the arrays in detail, I discovered an error (probably inconsequential) in the earlier formulas where
$N$6:$N$65 was referenced (in multiple places). As that range refers to an offset of one week from the normal data table range of
$N$5:$N$65, to represent the "previous" weeks, the reference for previous weeks should be
$N$6:$N$66 (note the last digit). This correction is also reflected in the sample worksheet below.
MrExcel_20220310_Trie.xlsx |
---|
|
---|
| A | B | C |
---|
2 | | Results | |
---|
3 | Pips+ & OI higher | 6 | Query 1: count how many times the week closed positive AND the Open Interest was higher than the previous week |
---|
4 | Pips- & OI higher | 5 | Query 2: count how many times the week closed negative AND the Open Interest was higher than the previous week |
---|
5 | Pips+ & OI lower | 6 | Query 3: count how many times the week closed positive AND the Open Interest was lower than the previous week |
---|
6 | Pips- & OI lower | 5 | Query 4: count how many times the week closed negative AND the Open Interest was lower than the previous week |
---|
7 | Pips- & OI lower streak >=2 | 2 | Combo Query 5A: count how many times the week closed negative AND there was a streak of at least 2 consecutive weeks where Open Interest was higher (AND applies to all elements in streak) |
---|
8 | Pips+ & OI higher streak >=2 | 1 | Combo Query 5B: count how many times the week closed positive AND there was a streak of at least 2 consecutive weeks where OI was higher (AND applies to all elements in streak) |
---|
9 | Pips+ & OI lower streak >=2 | 2 | Combo Query 5C: count how many times the week closed positive AND there was a streak of at least 2 consecutive weeks where Open Interest was lower (AND applies to all elements in streak) |
---|
10 | Pips- & OI lower streak >=2 | 1 | Combo Query 5D: count how many times the week closed negative AND there was a streak of at least 2 consecutive weeks where OI was lower (AND applies to all elements in streak) |
---|
11 | | | |
---|
12 | Pips+ | 12 | Query 6: count how many times bars closed higher |
---|
13 | Pips- | 10 | Query 7: count how many times bars closed lower |
---|
14 | OI Higher Streaks >1 | 2 | Query 8: count how many streaks of at least 2 consecutive weeks occur where Open Interest was higher than the previous week |
---|
15 | OI Lower Streaks >1 | 4 | Query 9: count how many streaks of at least 2 consecutive weeks occur where Open Interest was lower than the previous week |
---|
16 | | 1 | Query 10: count how many streaks of exactly 2 consecutive weeks occur where Open Interest was higher than the previous week |
---|
17 | | 0 | Query 11: count how many streaks of exactly 3 consecutive weeks occur where Open Interest was higher than the previous week |
---|
18 | | 1 | Query 12: count how many streaks of at least 3 consecutive weeks occur where Open Interest was higher than the previous week |
---|
19 | | 0 | Query 12: count how many streaks of exactly 4 consecutive weeks occur where Open Interest was higher than the previous week |
---|
20 | | 0 | Query 12: count how many streaks of exactly 5 consecutive weeks occur where Open Interest was higher than the previous week |
---|
21 | | 1 | Query 12: count how many streaks of exactly 6 consecutive weeks occur where Open Interest was higher than the previous week |
---|
|
---|