Using FILTER/UNIQUE on text, numbers

danhendo888 · Mar 16, 2024

I download the below spreadsheet daily from our software. It shows details of invoices paid/booked.
*It doesn't show it below but the first three columns have the little green ribbon on the top left corner of the cell and is aligned to the left of the cell and has General formatting.



Company Number	Invoice Number	Order Number	Order Date	Order Tax	Order Subtotal	Currency	Payment Date
1111111111	ABC123 y	8190090120	3/12/2024	10.00	100.00	AUD
1111111111	ABC123 y	8190090120	3/12/2024	50.00	500.00	AUD
1111111111	ABC123 y	8190090120	3/12/2024	0.00	0.00	AUD	3/14/2024
1111111111	ABC123 y	8190090120	3/12/2024	0.00	0.00	AUD	3/14/2024

2222222222	111222000	8190090000	1/1/2024	0.00	10,000.00	AUD

Below is my own spreadsheet. I type in the invoice number and retrieve the Order Number, Order Total, and Payment Date from the downloaded file.
The first column has the little green ribbon at the top left corner also.
The last three columns is the output I wish to achieve



Company Number	Invoice Number	Order Number	Order Total	Payment Date
1111111111	ABC123	8190090120	660	3/14/2024
2222222222	111222000	8190090000	10,000.00

Order Number:
To retrieve the Order Number, I use FILTER and UNIQUE. It retrieves it successfully when the invoice number in my spreadsheet has text formatting e.g. ABC123
But it doesn't work when the invoice number in my spreadsheet has number formatting because the invoice numbers in the downloaded file does not have number formatting. How can I make the formula work for both text and numbers.

Also, how can I retrieve the correct order number when the correct invoice number is "ABC123" but it was incorrectly booked as "ABC123 y"

Order Total:
I use two SUMIFS to add the Order Tax and Order Subtotal columns together.
Is there a more efficient way to add these? Or is this okay to use.

Payment Date:
How can I retrieve the payment date for invoice ABC123? The date does not necessarily show in all rows in the extracted data

danhendo888 · Mar 20, 2024

KRice said:
Is your query data in A:B always complete?...that is, you always have a company number and an invoice number on each row? Or do you sometimes have blanks? I think that's why the (INDEX(d, , 1) = cn) * inside FILTER isn't working quite right.

Yes, correct. There will be a company number and an invoice number on each row. And there will be no blanks.

KRice said:
So are you saying that you would like to look for an exact match with inputted company number and a somewhat fuzzy match with inputted invoice number and do the same thing?

Yes, correct. So that if there are two companies (e.g. Tesla, Ford) that have both issued an identical invoice number (ABC123), and my input columns equals cn=Tesla, in=ABC123, the output should display only the invoice details for Tesla and not both Tesla and Ford.

KRice · Mar 20, 2024

Okay, thanks. So the customer "number" is not always a number? It should be considered as a text data type then?

KRice · Mar 21, 2024

I've revisited the TEXTJOIN formula and have a version that requires exact matching for Company and fuzzy matching for Invoice Number. I also stepped through each part to address some nagging error conditions, and in doing so, discovered a problem with the original "res" function that uses HSTACK. One of the arguments was "r" so that the "cn" and "in" values would be captured, but that's a problem when more than one order number satisfies the filter criteria. Recall that in "res", we are stacking vertical arrays of uon, son, and pdon (unique order number, sum of common unique order numbers, and latest payment date among the common unique order numbers)...so these arrays have the same length as the number of unique order numbers. Therefore, HSTACK(r... or HSTACK(cn,in... doesn't work. We need a vertical array of each of those values, so some new variables are found in the formula: We determine the count of unique order numbers for each by-row query of customer number/invoice number...that count is variable "con". Then four vertical arrays are formed, each having a length of "con":

an array of delimiters to be used at the end of each row in each order block, called "era", because an array of "//" delimiters is needed for the same reason just mentioned,
an array of blanks used for error trapping, just in case either "cn" or "in" is blank, called "blnks",
an array of "cn" values, called "cna", and
an array of "in" values, called "ina".

Then by changing the error trapping to deal with certain situations (nothing found by FILTER in "f" returns a blank array, and known values 0 or "none" are used by "son" and "pdon", respectively), the formula should be more robust. I also revised the text string manipulation in "sr" somewhat to sequentially tidy up the end of row delimiters that are concatenated with column delimiters during the TEXTJOIN operation.

Here is the data set I've been using for testing:

MrExcel_20240315.xlsx

A

B

C

D

E

F

G

H

1

Company Number

Invoice Number

Order Number

Order Date

Order Tax

Order Subtotal

Currency

Payment Date

2

Tesla

ABC123 y

8190090120

3/12/2024

10

100

AUD

3

Ford

ABC123 y

8190090144

3/12/2024

50

500

AUD

3/13/2024

4

Ford

ABC123

8190090120

3/12/2024

5

10

AUD

3/14/2024

5

1133Opel

ABC123 q

8190090222

3/12/2024

6

20

AUD

3/20/2024

6

1133Opel

ABC123 y

8190090999

3/12/2024

20

99

AUD

3/18/2024

7

Fiat

ABC123

8190090222

3/12/2024

7

100

AUD

3/15/2024

8

777

111222z

8190090000

1/1/2024

3

33

AUD

9

777

111222 x

8190092222

1/1/2024

4

44

AUD

3/16/2024

10

VW44

11144z

8190090111

1/1/2024

5

55

AUD

3/17/2024

11

VW44

11144

8190090888

1/1/2024

6

66

AUD

12

333

111552 g

8190090000

1/1/2024

7

77

AUD

3/8/2024

13

888

111222

14

888

111222

ord1

1.5

12.5

AUD

3/19/2024

15

Tesla

ABC123 zzz

8190090120

3/12/2024

10

100

AUD

16

Ford

ABC123 y

8190090144

3/13/2024

5

55

AUD

3/17/2024

Daily

...and the revised formula:

MrExcel_20240315.xlsx

A

B

X

Y

Z

AA

AB

AC

1

Company Number

Invoice Number

Company Number

Invoice Number

Order Number

Order Total

Payment Date

2

Ford

ABC123

Ford

ABC123

8190090120

15.00

3/14/2024

3

777

111222

Ford

ABC123

8190090144

610.00

3/17/2024

4

333

111552

777

111222

8190090000

36.00

none

5

VW44

11144

777

111222

8190092222

48.00

3/16/2024

6

Tesla

ABC123

333

111552

8190090000

84.00

3/8/2024

7

xyz

VW44

11144

8190090111

60.00

3/17/2024

8

1133Opel

ABc123

VW44

11144

8190090888

72.00

none

9

Fiat

ABc123

Tesla

ABC123

8190090120

220.00

none

10

888

111222

xyz

-

none

11

1133Opel

ABc123

8190090222

26.00

3/20/2024

12

1133Opel

ABc123

8190090999

119.00

3/18/2024

13

Fiat

ABc123

8190090222

107.00

3/15/2024

14

888

111222

ord1

14.00

3/19/2024

15

888

111222

-

none

MySummary2

Cell Formulas
Range	Formula
Y2:AC15	Y2	=LET(daily, Daily!A:H, lrowd, LOOKUP(2, 1 / (INDEX(daily, , 2) <> ""), ROW(INDEX(daily, , 2))), d, Daily!A2:INDEX(Daily!A:H, lrowd, ), mine, A:B, lrowm, LOOKUP(2, 1 / (INDEX(mine, , 2) <> ""), ROW(INDEX(mine, , 2))), m, A2:INDEX(A:B, lrowm, ), intres, BYROW(m, LAMBDA(r, LET( cn, INDEX(r, , 1), in, INDEX(r, , 2), f, FILTER(d, (INDEX(d, , 1) = cn) * ISNUMBER(SEARCH(in, INDEX(d, , 2))), EXPAND("",1,COLUMNS(d),"")), uon, SORT(UNIQUE(INDEX(f, , 3))), con, COUNTA(uon), era, EXPAND("//",con,,"//"), blnks, EXPAND("",con,,""), cna, IFERROR(EXPAND(cn,con,,cn),blnks), ina, IFERROR(EXPAND(in,con,,in),blnks), son, BYROW(uon, LAMBDA(r, IFERROR(SUM(CHOOSECOLS(f, 5, 6) * (CHOOSECOLS(f, 3) = r)), 0))), pdon, BYROW(uon, LAMBDA(r, IFERROR(1 / (1 / MAX(CHOOSECOLS(f, 8) * (CHOOSECOLS(f, 3) = r))), "none"))), res, HSTACK(cna,ina,uon, son, pdon,era), TEXTJOIN("\|",0, res) ))), sr, DROP(TEXTSPLIT(CONCAT(SUBSTITUTE(SUBSTITUTE(intres, "\|//\|", "//"),"\|//","//")), "\|", "//",0),-1), finres, HSTACK(CHOOSECOLS(sr, 1,2,3), --CHOOSECOLS(sr,4), IFERROR(--CHOOSECOLS(sr,5),"none")), finres )
Dynamic array formulas.

danhendo888 · Mar 22, 2024

1)
How is cna, IFERROR(EXPAND(cn,con,,cn),blnks) able to determine the correct number of times each cn is displayed in Column Y?
cn is the first column of m. each cn appears only once in m. Also, con > cn.
I'm unsure how these produce Ford to appear twice, 777 to appear twice etc in Column Y

2)
intres, BYROW(m, LAMBDA(r,
. . .
son, BYROW(uon, LAMBDA(r, IFERROR(SUM(CHOOSECOLS(f, 5, 6) * (CHOOSECOLS(f, 3) = r)), 0))),
. . .

Regarding the BYROW within a BYROW
son is a one column array produced by feeding uon into (f, 5, 6)
Is m then fed into son? The part unclear to me is that m is just the cn and in, how does it know what to do when it is fed into son?

KRice · Mar 22, 2024

Good questions. There is a lot happening in this formula. The outer BYROW is feeding row-by-row from m...so a single row of your inputs in columns A:B (e.g., {"Ford","ABC123"}). Ford is assigned to cn, and ABC123 is assigned to in. Then f is the FILTER function is extract from d (the main data on the Daily worksheet) all entries that satisfy the filter criteria (exact match for cn and fuzzy match for in). Now within this subset of data (f), we may have multiple order numbers associated with the combination of cn and in, but for the final output we need to aggregate by order number, so we begin a series of calculations that draw information from f.

Since the intermediate results need to list by order numbers that are unique to the cn/in combination, we first determine which unique order numbers are present in f...that's uon. So for {"Ford","ABC123"}, there are 3 rows of data in Daily, and among these three lines, there are two unique order numbers (two entries for 8190090144, and one entry for 8190090120). So the number of rows of uon determines the number of output rows for this cn/in combination. Therefore con = COUNTA(uon) is the count needed, and we use con to construct four arrays of the correct size (for era, blnks, cna, ina). In this example, then, con=2...there are two unique order numbers, and ultimately we want vertical column arrays that are two rows high so that we can assemble these vertical arrays for the intermediate results (intres). Recall that Excel interprets array structure based on commas as column separators and semicolons as row separators. So in the event that an array of blanks might be needed, blnks, EXPAND("",con,,"") produces an array {"";""} in this case, and this is treated as a one column array having two rows and all elements are blanks. If you're not familiar with EXPAND, the 1st argument represents the starting array (in this case a single value), and we expand the size of the starting array to a total size of so many rows (2nd argument) and so many columns (3rd argument), and these newly inserted array elements would show up as errors unless we pad them out with some value (4th argument)...so to repeat the same value, we use our starting value as the pad value.
Then cna, is formed in a similar manner... IFERROR(EXPAND(cn,con,,cn),blnks) produces {"Ford";"Ford"}, unless cn is blank, which will cause EXPAND to generate an error, so we trap the error by using the blnks array just mentioned. The same idea is applied to generate the ina and era column arrays. This is a rather long-winded answer to question 1, but it explains how we can take a single cn/in input and expand the "cn" and "in" values the correct number of times "con" so that they can be merged with the unique order numbers and other summary results.

Now within "intres", yes, we feed "m" into the outermost BYROW, and this row {"Ford","ABC123"} (note the comma indicating a two-column array having only one row) is assigned to "r"...and we use that "r" in the assignment of the "cn" and "in" variables. But then we come to "son" and "pdon" where we enter new BYROW functions. An important note here: the "r" that you see within a LAMBDA function is entirely self-contained within that LAMBDA. In other words, the "r" seen inside the "son" expression is a row-by-row assignment of "uon" and that "r" is recognized only out to its LAMBDA where the input row of uon is mapped to the variable r. This "r" is not the same r found in the outer intres (BYROW(m, LAMBDA(r,..." expression, and its not the same r found in the pdon expression. This is a convention I tend to use to distinguish rows (r) from columns (c), but the internal assignment for the mapping inside a LAMBDA is entirely arbitrary. For example, you could change "son" and "pdon" to
son, BYROW(uon, LAMBDA(x, IFERROR(SUM(CHOOSECOLS(f, 5, 6) * (CHOOSECOLS(f, 3) = x)), 0))),
pdon, BYROW(uon, LAMBDA(y, IFERROR(1 / (1 / MAX(CHOOSECOLS(f, 8) * (CHOOSECOLS(f, 3) = y))), "none"))),
without any loss of functionality, as they mean exactly the same thing as in the current formula.
Again, this is probably a long-winded answer to question 2, but it is uon that is fed row-by-row into the son and pdon calculators in order to perform the aggregation sum and identify the max date for each unique order number. Each of these (son and pdon) produce a single result for each row (indeed, BYROW will only deliver a single result), so son is a vertical array of sums that correspond to each unique order number fed to BYROW; and pdon is also a vertical array of dates that correspond to each unique order number fed to it.

Let me know if this clears up the confusion.

danhendo888 · May 12, 2024

Hi Kirk,
Thank you for the explanation and apologies for not asking these questions sooner when these formulas would have been fresh in our minds.

1) Is there a way/trick to calculate parts of the formula e.g. f, cna or ina? I'm fine with deleting parts of the formula just to calculate for these items and then CTRL+Z to restore the formula. I tried this but still cannot calculate for f, cna, ina.

2) Regarding the outer BYROW, I can see that it is being applied to cn and in. And that it is not applied in uon. But I believe it is being applied to f, cna, ina since these are using cn and in. Is this correct?
I'm trying to understand what the formula would look like for f or son or pdon, had you decided not to use the outer BYROW and instead put BYROW in each instance where it is required (like in son and pdon).

3) Suppose I wanted to recreate f in cell J14 below. How would you update the existing cell?

4) lrowd, LOOKUP(2, 1 / (INDEX(daily, , 2) <> "")
Can you please explain your formula for the lookup_vector and why we look for 2?

MySummary2.xlsx

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

1

Daily

MySummary

2

Company Number

Invoice Number

Order Number

Order Date

Order Tax

Order Subtotal

Currency

Payment Date

Company Number

Invoice Number

3

Tesla

ABC123 y

8190090120

3/12/2024

10

100

AUD

Ford

ABC123

4

Ford

ABC123 y

8190090144

3/12/2024

50

500

AUD

3/13/2024

777

111222

5

Ford

ABC123

8190090120

3/12/2024

5

10

AUD

3/14/2024

6

1133Opel

ABC123 q

8190090222

3/12/2024

6

20

AUD

3/20/2024

7

1133Opel

ABC123 y

8190090999

3/12/2024

20

99

AUD

3/18/2024

8

Fiat

ABC123

8190090222

3/12/2024

7

100

AUD

3/15/2024

9

777

111222z

8190090000

1/1/2024

3

33

AUD

Ford

ABC123 y

8190090144

3/12/2024

50

500

AUD

3/13/2024

10

777

111222 x

8190092222

1/1/2024

4

44

AUD

3/16/2024

Ford

ABC123

8190090120

3/12/2024

5

10

AUD

3/14/2024

11

VW44

11144z

8190090111

1/1/2024

5

55

AUD

3/17/2024

Ford

ABC123 y

8190090144

3/13/2024

5

55

AUD

3/17/2024

12

VW44

11144

8190090888

1/1/2024

6

66

AUD

13

333

111552 g

8190090000

1/1/2024

7

77

AUD

3/8/2024

14

888

111222

#N/A

15

888

111222

ord1

1.5

12.5

AUD

3/19/2024

16

Tesla

ABC123 zzz

8190090120

3/12/2024

10

100

AUD

17

Ford

ABC123 y

8190090144

3/13/2024

5

55

AUD

3/17/2024

Daily

Cell Formulas
Range	Formula
J9:Q11	J9	=FILTER(A3:H17, (INDEX(A3:H17, , 1) = J3) * ISNUMBER(SEARCH(K3, INDEX(A3:H17, , 2))), EXPAND("",1,COLUMNS(A3:H17),""))
J14	J14	=FILTER(A3:H17, (INDEX(A3:H17, , 1) = J3:J4) * ISNUMBER(SEARCH(K3:K4, INDEX(A3:H17, , 2))), EXPAND("",1,COLUMNS(A3:H17),""))
Dynamic array formulas.

KRice · May 12, 2024

Let's start with the last question...the easiest. The approach shown for finding the last row number containing data looks for 2 because 2 cannot exist in the expressions 1/(INDEX(daily, , 2)<>"" or 1/(INDEX(mine, , 2)<>"". The logical part of those expressions <>"" (is not blank) results in a TRUE or FALSE, and TRUEs/FALSEs are coerced to 1/0's when used with a mathematical operation. So 1/TRUE is 1/1 or 1, and 1/FALSE is 1/0, which produces a #DIV/0! error. LOOKUP is one of those functions that handles errors, and if it cannot find the searched for value of 2 (or 3 or 4, etc.), it will by default return the last non-error index. There are several other ways the last data row could be determined, so there is nothing crucial about using this particular approach.

Regarding the other questions, let's examine #2:

danhendo888 said:
2) Regarding the outer BYROW, I can see that it is being applied to cn and in. And that it is not applied in uon. But I believe it is being applied to f, cna, ina since these are using cn and in. Is this correct?
I'm trying to understand what the formula would look like for f or son or pdon, had you decided not to use the outer BYROW and instead put BYROW in each instance where it is required (like in son and pdon).

The outermost BYROW belongs to the intres variable, and it takes the "m" array as its argument, so it is m that is being operated on row-by-row. Each row of m, when it is to be operated on, is assigned to the r variable (a convention I tend to favor to remind myself that r is a (r)ow from some array). If we examine the main formula for "m", we see that m is simply your query inputs for Company Number and Invoice Number. Therefore, for convenience, every time a new row is operated on in m, we assign its first element (the Company Number) to variable cn, and its second element (the Invoice Number) to variable in. Then we execute the FILTER function to extract all entries in the main data where we match "cn" and also find the "in" text string somewhere in the recorded invoice numbers. So FILTER will potentially return a multi-row array, should there be more than one data table entry that satisfies these criteria. This potentially multirow f array is operated on by determining the unique combinations of customer number/invoice number present in the main data table that meet the search criteria, and then sorting the results based on the 3rd column (the Order Number) for convenience...that's what uon does. So it would not be correct to say that BYROW was being applied to f, cna, and ina. Rather the outer BYROW sequentially takes a row of m, and uses that particular combination of customer number and invoice number to extract from the main data table all desired matching/fuzzy matching entries to create the f array. Then to compile the order sums and payment dates, we need to operate on uon row by row.

Let me know if this helps to clear up two of the questions. I'm out of time at the moment, but will follow up about the other questions later.

danhendo888 · May 13, 2024

Feels much clearer now, thank you, except for one thing about f.

The formula is pointing to cn and in which are arrays. But when I copy/pasted just the FILTER formula (to cell J14) and hardcoded the cell ranges using the array range for cn/in, it doesn't seem to work. It only works when I pointed to one Company Number and one Invoice Number (cell J9). What am I overlooking here?
(Please ignore question 3 as it is related to the below)

Regarding question 1, I'm curious as to how you would check that your formula is working while working on it.
For instance, to calculate lrowd, I can type lrowd and delete the rest of the formula as per below. But suppose I wanted to calculate for f or ina, how would you do that?
=LET(daily, Sheet1!A:H, lrowd, LOOKUP(2, 1 / (INDEX(daily, , 2) <> ""), ROW(INDEX(daily, , 2))), lrowd)

KRice · May 13, 2024

danhendo888 said:
The formula is pointing to cn and in which are arrays.

They are not arrays. We have this construction in the formula:

Excel Formula:

 m, A2:INDEX(A:B, lrowm, ),
 intres, BYROW(m, LAMBDA(r,
                       LET( cn,   INDEX(r, , 1),
                            in,   INDEX(r, , 2),

This means that m is a two-column array consisting of your input query pairs (customer number in column A and invoice number in column B). And we pass m to the BYROW LAMBDA helper function where it is used to evaluate the expression(s) found inside the LAMBDA, but importantly, where "it" refers to each row of m taken one row at a time. And when a row is taken on each iteration, the row is treated as a horizontal array and assigned to the dummy variable r. On the first iteration, r would be {"Ford", "ABC123"}, the second iteration {777, 111222}, and so on. The INDEX functions then assign single values to cn and in variables: On the first iteration, cn is Ford, then on the second iteration cn is 777, etc., so cn and in each have only one value at any specific time during this row-by-row analysis.

This means that the f expression is being executed once for every row of m (your A:B query inputs). The f expression operates on the entire source data table d and extracts from it all entries where the customer number in d matches the current single value of cn, AND also where the invoice number in d contains the same text string as the current single value of in. This is important because each of the resulting filter criterion arrays --- (INDEX(d, , 1) = cn) is one, and ISNUMBER(SEARCH(in, INDEX(d, , 2))), EXPAND("",1,COLUMNS(d),"") is another --- must have the same size and dimension. Both are single column vertical arrays whose length matches the length of d, the main data table. If cn were a multi-element array, then FILTER would generate an error because the criterion array's dimension would expand to more than a single column.

I wanted to delay responding to questions 1 and 3 until I had better context, which I think you've provided. If the primary reason is a spot check to confirm the values generated for some of these harder-to-interrogate values, then I would make a few changes to the current formula, but realize that some concessions are being made. On this last point, the intres variable uses the BYROW function, which allows only a single cell output for each row that it operates on, as opposed to a results array that might spill over many cells, perhaps horizontally and vertically, for each row operated on. But to see values for the variables generated inside the intres LET function, you'll need to interrogate at that level and accept the single cell output per row, or change the formula in more significant ways. The implication is that the output may not be as well structured as desired, but the content should still be present for inspection.

The inner LET is shielded from direct interrogation, because an instruction to display f from the outer LET will fail to "see" f in the inner LET. As a workaround, I would instruct the outer LET to display results for intres, and then change the results delivered by intres (that's what the TEXTJOIN function does...it combines multiple, otherwise spilling results into a single output cell) to reflect the desired variable (so instead of applying TEXTJOIN to res, it would be applied to f, uon, con, son, pdon). For example, to see the FILTER results:

within intres... TEXTJOIN("|",0, f ) ))),
then at the end, after finres... finres, HSTACK(CHOOSECOLS(sr, 1,2,3), --CHOOSECOLS(sr,4), IFERROR(--CHOOSECOLS(sr,5),"none")), intres )

Change the f shown above to any of the other inner LET variables to see their value(s) for each row.

MrExcel_20240315.xlsx

A

B

X

Y

1

Company Number

Invoice Number

Company Number

2

Ford

ABC123

Ford|ABC123 y|8190090144|45363|50|500|AUD|45364|Ford|ABC123 |8190090120|45363|5|10|AUD|45365|Ford|ABC123 y|8190090144|45364|5|55|AUD|45368

3

777

111222

777|111222z|8190090000|45292|3|33|AUD||777|111222 x|8190092222|45292|4|44|AUD|45367

4

333

111552

333|111552 g|8190090000|45292|7|77|AUD|45359

5

VW44

11144

VW44|11144z|8190090111|45292|5|55|AUD|45368|VW44|11144|8190090888|45292|6|66|AUD|

6

Tesla

ABC123

Tesla|ABC123 y|8190090120|45363|10|100|AUD||Tesla|ABC123 zzz|8190090120|45363|10|100|AUD|

7

xyz

|||||||

8

1133Opel

ABc123

1133Opel|ABC123 q|8190090222|45363|6|20|AUD|45371|1133Opel|ABC123 y|8190090999|45363|20|99|AUD|45369

9

Fiat

ABc123

Fiat|ABC123 |8190090222|45363|7|100|AUD|45366

10

888

111222

888|111222|||||||888|111222|ord1||1.5|12.5|AUD|45370

MySummary3

Cell Formulas
Range	Formula
Y2:Y10	Y2	=LET(daily, Daily!A:H, lrowd, LOOKUP(2, 1 / (INDEX(daily, , 2) <> ""), ROW(INDEX(daily, , 2))), d, Daily!A2:INDEX(Daily!A:H, lrowd, ), mine, A:B, lrowm, LOOKUP(2, 1 / (INDEX(mine, , 2) <> ""), ROW(INDEX(mine, , 2))), m, A2:INDEX(A:B, lrowm, ), intres, BYROW(m, LAMBDA(r, LET( cn, INDEX(r, , 1), in, INDEX(r, , 2), f, FILTER(d, (INDEX(d, , 1) = cn) * ISNUMBER(SEARCH(in, INDEX(d, , 2))), EXPAND("",1,COLUMNS(d),"")), uon, SORT(UNIQUE(INDEX(f, , 3))), con, COUNTA(uon), era, EXPAND("//",con,,"//"), blnks, EXPAND("",con,,""), cna, IFERROR(EXPAND(cn,con,,cn),blnks), ina, IFERROR(EXPAND(in,con,,in),blnks), son, BYROW(uon, LAMBDA(r, IFERROR(SUM(CHOOSECOLS(f, 5, 6) * (CHOOSECOLS(f, 3) = r)), 0))), pdon, BYROW(uon, LAMBDA(r, IFERROR(1 / (1 / MAX(CHOOSECOLS(f, 8) * (CHOOSECOLS(f, 3) = r))), "none"))), res, HSTACK(cna,ina,uon, son, pdon,era), TEXTJOIN("\|",0, f ) ))), sr, DROP(TEXTSPLIT(CONCAT(SUBSTITUTE(SUBSTITUTE(intres, "\|//\|", "//"),"\|//","//")), "\|", "//",0),-1), finres, HSTACK(CHOOSECOLS(sr, 1,2,3), --CHOOSECOLS(sr,4), IFERROR(--CHOOSECOLS(sr,5),"none")), intres )
Dynamic array formulas.

For the first query inputs (Ford and ABC123), the FILTER function produces this result for f...
Ford|ABC123 y|8190090144|45363|50|500|AUD|45364|Ford|ABC123 |8190090120|45363|5|10|AUD|45365|Ford|ABC123 y|8190090144|45364|5|55|AUD|45368
...so three entries in the main data table d, are found, combined into a single output cell. Later in the formula, this single cell result is parsed using TEXTSPLIT to clean it up for final output, but for interrogation purposes, this might suffice? You'll see the following three entries in the above string:
Ford|ABC123 y|8190090144|45363|50|500|AUD|45364|
Ford|ABC123 |8190090120|45363|5|10|AUD|45365|
Ford|ABC123 y|8190090144|45364|5|55|AUD|45368

danhendo888 · Jul 6, 2024

KRice said:
On this last point, the intres variable uses the BYROW function, which allows only a single cell output for each row that it operates on, as opposed to a results array that might spill over many cells, perhaps horizontally and vertically, for each row operated on. But to see values for the variables generated inside the intres LET function, you'll need to interrogate at that level and accept the single cell output per row

Regarding the above point about the single cell output per row. I must be not understanding something here because I tried hard coding the formula for f in C22 (to view it in isolation)

In C24, it manages to search "ABC123" and filter the data. No issues there.

In C22, I tried to search "ABC123" and "111222" using the same principle using BYROW but it's not working.
What am I missing here?

f test.xlsx

A

B

C

D

E

F

G

H

I

J

1

Company Number

Invoice Number

Order Number

Order Date

Order Tax

Order Subtotal

Currency

Payment Date

2

Tesla

ABC123 y

8190090120

3/12/2024

10

100

AUD

3

Ford

ABC123 y

8190090144

3/12/2024

50

500

AUD

3/13/2024

4

Ford

ABC123

8190090120

3/12/2024

5

10

AUD

3/14/2024

5

1133Opel

ABC123 q

8190090222

3/12/2024

6

20

AUD

3/20/2024

6

1133Opel

ABC123 y

8190090999

3/12/2024

20

99

AUD

3/18/2024

7

Fiat

ABC123

8190090222

3/12/2024

7

100

AUD

3/15/2024

8

777

111222z

8190090000

1/1/2024

3

33

AUD

9

777

111222 x

8190092222

1/1/2024

4

44

AUD

3/16/2024

10

VW44

11144z

8190090111

1/1/2024

5

55

AUD

3/17/2024

11

VW44

11144

8190090888

1/1/2024

6

66

AUD

12

333

111552 g

8190090000

1/1/2024

7

77

AUD

3/8/2024

13

888

111222

14

888

111222

ord1

1.5

12.5

AUD

3/19/2024

15

Tesla

ABC123 zzz

8190090120

3/12/2024

10

100

AUD

16

Ford

ABC123 y

8190090144

3/13/2024

5

55

AUD

3/17/2024

17

18

Company Number

Invoice Number

19

Ford

ABC123

20

777

111222

21

22

Attempt 1 - in

#CALC!

23

24

Attempt 2 - ABC123

Tesla

ABC123 y

8.2E+09

45363

10

100

AUD

0

25

Ford

ABC123 y

8.2E+09

45363

50

500

AUD

45364

26

Ford

ABC123

8.2E+09

45363

5

10

AUD

45365

27

1133Opel

ABC123 q

8.2E+09

45363

6

20

AUD

45371

28

1133Opel

ABC123 y

8.2E+09

45363

20

99

AUD

45369

29

Fiat

ABC123

8.2E+09

45363

7

100

AUD

45366

30

Tesla

ABC123 zzz

8.2E+09

45363

10

100

AUD

0

31

Ford

ABC123 y

8.2E+09

45364

5

55

AUD

45368

Sheet1

Cell Formulas
Range	Formula
C22	C22	=BYROW(B19:B20,LAMBDA(in,FILTER(A2:H16, ISNUMBER(SEARCH(in, INDEX(A2:H16, , 2))), EXPAND("",1,COLUMNS(A2:H16),""))))
C24:J31	C24	=FILTER(A2:H16,ISNUMBER(SEARCH(B19,INDEX(A2:H16,,2))),EXPAND("",1,COLUMNS(A2:H16),""))
Dynamic array formulas.

Using FILTER/UNIQUE on text, numbers

danhendo888

Board Regular

danhendo888

Board Regular

KRice

Well-known Member

KRice

Well-known Member

danhendo888

Board Regular

KRice

Well-known Member

danhendo888

Board Regular

KRice

Well-known Member

danhendo888

Board Regular

KRice

Well-known Member

danhendo888

Board Regular

Similar threads

Share this page

Using FILTER/UNIQUE on text, numbers

Board Regular

Board Regular

Well-known Member

Well-known Member

Board Regular

Well-known Member

Board Regular

Well-known Member

Board Regular

Well-known Member

Board Regular

Similar threads

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock