Find empty or constant columns

Redoute

New Member
Joined
Nov 4, 2018
Messages
12
I tried this snippet in Excel/Power Query to find empty and constant columns in my table:

Code:
delcols =
List.Select(
    Table.ColumnNames(previous_step),
    each List.Count(
            List.Distinct(
                Table.Column(previous_step, _))) <= 1),

However, when I use this with about 60 columns and 50.000 rows, Excel stalls and I have to kill it via task manager.

Any suggestions to do this more performant?

Thanks, Redoute
 

Excel Facts

Get help while writing formula
Click the italics "fx" icon to the left of the formula bar to open the Functions Arguments dialog. Help is displayed for each argument.
Hi
My data set is 60 columns and 60.000 rows with numbers. One is full empty and three are with single number.
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Таблица1"]}[Content],
    toRows = Table.Transpose(Source),
    signs = Table.AddColumn(toRows, "sign", each List.NonNullCount(List.Distinct(Record.FieldValues(_))) <= 1)[sign],
    tableInfo = Table.FromColumns({Table.ColumnNames(Source), signs}, {"ColName", "Sign"}),
    delColNames = Table.SelectRows(tableInfo, each [Sign])[ColName]
in
    delColNames
That code returns four column names as a list for 14 seconds on my computer.
Regards,
 
Upvote 0
Great! This takes eight minutes or so for my complete query. My data contains lots of highly varying strings, so distinct-lists will be long.

I thought Power Query organizes data column-wise, so Transpose would be counterproductive. Seems I deduced that wrongly from Pandas or Power Pivot/DAX.
 
Upvote 0
So instead of productive work I continued to think about the problem. First I got rid of full distinct-lists, which brought me to about five minutes:

Code:
delcols =
Table.SelectRows(
    Table.FromColumns({
        Table.ColumnNames(previous_step),
        Table.TransformRows(
            Table.Transpose(previous_step),
            each let
                    l = Record.ToList(_)
                 in
                    not List.Contains(
                        l,
                        List.First(l),
                        (x, y) => x <> y))},
        {"colname", "constant"}),
    each [constant])[colname],

Now I have another approach without Transpose and got down to two minutes:

Code:
delcols =
Record.FieldNames(
    List.Accumulate(
        Table.ToRecords(previous_step),
        previous_step{1},
        (state, current) =>
        Record.RemoveFields(
            state,
            List.Select(
                Record.FieldNames(state),
                each Record.Field(current, _) <> Record.Field(state, _))))),
 
Upvote 0

Forum statistics

Threads
1,223,810
Messages
6,174,763
Members
452,582
Latest member
ruby9c

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top