Generate combinatorials in Power Query

macfuller

Active Member
Joined
Apr 30, 2014
Messages
319
Office Version
  1. 365
Platform
  1. Windows
From a list I would like to generate combinations of n members from a set of x possibilities in Power Query.

I have found some useful code here on creating combinations of members based on all possible values of n, and here on generating permutations for x, but I regret that I'm not good enough at PQ to figure out how to adapt them.

For background, here is the real-world scenario. We have 19 vendors who have bid on providing 1,728 surgical items. The size of their discount varies depending on what percentage of our business (represented by how many other vendors will supply the same item) they receive. They offer pricing based on single vendor, 3-vendor, 5-vendor, and 8-vendor participation. (Yes, single sourcing is clearly the cheapest route but none of the vendors offer all 1,728 items.) So at each level of vendor participation we would like to know the cheapest combination of vendors.
(These are combinatorials rather than permutations since ABC is equivalent to BAC, CBA, and so on. The order in which vendors A, B, and C appear does not affect pricing.)
By generating these combinations we can then merge the price file from each vendor and use Power Pivot to determine which combinations do not offer all 1,728 items, which ones might offer more than one option/price for a particular item, and rank the combinations based on pricing times our historical usage of each item - taking the MIN price for items offered by each vendor within the combination.

I don't have a need for the same concept for permutations (generating n permutations from a set of x) but I suppose it would come in handy for someone!
 

Excel Facts

Round to nearest half hour?
Use =MROUND(A2,"0:30") to round to nearest half hour. Use =CEILING(A2,"0:30") to round to next half hour.
As a workaround I can use the code from one of the links above. This generates all possible combinations - then within the query I can use Add Column -> Statistics -> Distinct Values to count the number of vendors in each row, then filter the ones that don't match the value of n I'm testing for.

Code:
// From https://social.technet.microsoft.com/Forums/windows/en-US/72bbaf95-9228-4474-a59f-d3c573a75d5b/power-query-to-generating-all-combinations-from-single-column?forum=powerquery

//    hardcoded column in List.Repeat  {[Vendors]} where Vendors is the column name

(t as table) as table =>

    let

        AddIndex = Table.AddIndexColumn(t, "Index", 0, 1),
        Max = Table.RowCount(t),
        ReverseIndeks = Table.AddIndexColumn(AddIndex, "RevIdx", Max, -1),
        Lists = Table.AddColumn(ReverseIndeks, "lists", each 
                          List.Repeat(
                                       List.Combine(
                                                     {
                                                        List.Repeat({[Vendors]}, Number.Power(2,[RevIdx]-1))
                                                       ,List.Repeat(   {null}, Number.Power(2,[RevIdx]-1))
                                                     }
                                                   )
                                       , Number.Power(2, [Index]))
                                     ),
        ResultTable = Table.FromColumns(Lists
[lists])
    in
        ResultTable

It's extra processing instead of generating only the set I want, but it works. However, I would like to parameterize a couple of things and can't figure out that next step.

First, in the query above, I'm not sure how to pass the column name as a parameter instead of hardcoding "Vendors".

The second area is to set the Distinct Values command in the master query to not have to hardcode the column names.

Code:
= Table.AddColumn(ComboList, "Distinct Values", each List.NonNullCount(List.Distinct({[Column5], [Column4], [Column3], [Column2], [Column1]})), Int64.Type)

Here is an example of code where I use the schema information to change all column types after the 3rd column without having to know the column names:
Code:
    Schema = Table.Schema(ChangeColTypes),

//    Columns are zero-based indexes
    FilterRowsForChanges = Table.SelectRows(Schema, each [Position] >= 3),
    PickFirstColumn = Table.SelectColumns(FilterRowsForChanges,{List.First(Table.ColumnNames(FilterRowsForChanges))}),
    MakeHeadersIntoList = Table.ToList(PickFirstColumn),
    ColumnCount = List.Count(MakeHeadersIntoList),
    TransformColumnsLocation = Table.TransformColumnTypes(
            ChangeColTypes, 
                Table.ToRows(
                    Table.FromColumns(
                        {MakeHeadersIntoList,
                             List.Repeat(
                                {type text},
                                 ColumnCount
                                    )
                            }
                 )
                )
            ),

But I'm having a tough time adapting it to the DISTINCT AddColumn that I want to do so I don't have to hardcode columns each time my source data changes. Ditto for the Unpivot portion when I move each vendor column to a single vendor row.
 
Upvote 0
Just some quick hints here:

Code:
= Table.AddColumn(ComboList, "Distinct Values", each List.NonNullCount(List.Distinct({[Column5], [Column4], [Column3], [Column2], [Column1]})), Int64.Type)

can be dynamically replaced by:

Code:
= Table.AddColumn(ComboList, "Distinct Values", each List.NonNullCount(List.Distinct({[B]Record.FieldValues(_)[/B]})), Int64.Type)


Make [Vendors] a variable here:

Code:
 List.Repeat({[Vendors]}, Number.Power(2,[RevIdx]-1)


would look so:


Code:
 List.Repeat({[B]Record.Field(_, MyColumnNameVariable)[/B]}, Number.Power(2,[RevIdx]-1)
 
Upvote 0
Solution
Thanks for the help ImkeF. On the AddColumn change
= Table.AddColumn(ComboList, "Distinct Values", each List.NonNullCount(List.Distinct({Record.FieldValues(_)})), Int64.Type)
I'm getting a value of 1 for every record. I hope I've transcribed it correctly!

Here's my test query code with the fnListCombinations function adapted to accept a parameter as you suggested.
Code:
let
    SourceList = {"Red",
                    "Blue",
                    "Yellow",
                    "Green",
                    "Blue"},

    StartList = Table.FromList(SourceList,Splitter.SplitByNothing(),{"Color"}),
    ChangeToText = Table.TransformColumnTypes(StartList,{{"Color", type text}}),
    ComboList = fnListCombinations(ChangeToText, "Color"),

//  Returns correct count
    DistinctCols = Table.AddColumn(ComboList, "Distinct Values", each List.NonNullCount(List.Distinct({[Column5], [Column4], [Column3], [Column2], [Column1]})), Int64.Type),
//  Returns 1 for each value
    DistinctCol2 = Table.AddColumn(ComboList, "Distinct Values", each List.NonNullCount(List.Distinct({Record.FieldValues(_)})), Int64.Type)

in
    DistinctCol2

I'm wondering where you pick up your intimate knowledge of M. Training course, book, website, or just the school of hard knocks?
 
Upvote 0
Do I understand correctly that you want to evaluate all combinations of allocating 1728 items among 19 numbers, and calculate the total cost for each combination?
 
Upvote 0
ImkeF, removing the curly brackets did the trick. What does adding curly brackets usually do?

shg, I only want to generate combinations with certain numbers of vendors. The =COMBIN(19,8) function in Excel shows 75,582 variations, so I would like to have all 75,582 combinations of any 8 vendors. The solution I've been discussing with ImkeF requires all combinations =COMBIN(19, x) for every value of x to be generated (524,287) and then to filter for only those combinations where x=8. If there is a solution to generate only the single value of x that would be even better.

My vendor file has prices quoted by each vendor for a 3-, 5-, and 8- vendor solution (88,179 possible combinations) for each of the 1,728 items they may carry. I have the item lookup file with all 1,728 items listed and our prior year usage and cost. So I expect to merge the item master file with the price file(s), then merge with the combination file so that I have a row for every vendor/combination/item/price/history combination. I can then create a measure that will take the MIN([Price]) across all vendors in that combination/item set, multiply by the usage, and sum for all items in that combination for a total price. Then sort low-to-high for the measure to find the cheapest vendor combination.

Since few if any combinations may provide all 1,728 items we are still debating internally how to "true up" the values - combinations with only 1,000 items would obviously appear far cheaper than ones with 1,700 items offered. We might set a floor for items depending where the mean or std dev clusters, or we might take the historical price and usage where the items are not offered to bring everyone to the 1,728 level.
 
Upvote 0
I know nothing of power BI, but can certainly provide (VBA) code to generate all combinations of N choose M if that helps.
 
Upvote 0
Curly brackets have basically 2 functions:
- positional index indicator: Select an item from a list or table by putting its position in it: MyTable{0} will return the first row from a table
- list creation: {1, 2} creates a list with 2 items: 1 and 2.

In my case, Record.FieldValues returned a list already, so putting that in another list created nested list which wasn't equal to what you had before so couldn't be processed further.
 
Upvote 0
Thanks to everyone who helped. We got the 3-vendor rankings up and running. The final merged pricing table was 5+ million rows for 969 vendor combinations and 1,728 products but PowerPivot handled it quickly. With 11,628 combinations for 5 vendors and 75,582 for 8 vendors those are going to be whoppers when we get to them... north of 400 million rows for the 8-vendor combos. We'll have to see if it makes sense to leave the tables separate and write some convoluted measures instead of merging everything for simplicity.

I've never tested if a slicer can handle 75,000+ separate values :eeek:. We may have to re-think how we access the combinations for drill-down.
 
Upvote 0

Forum statistics

Threads
1,223,227
Messages
6,170,849
Members
452,361
Latest member
d3ad3y3

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top