Power query : set a delete condition such as when column empty cell percentage exceeds 90%

yxz152830

Active Member
Joined
Oct 6, 2021
Messages
395
Office Version
  1. 365
Platform
  1. Windows
Gurus,
I have a table from external source. I know for sure that if a column has only 1 or 2 cells with strings, the column is for remarks or something else. I want to remove them using powery query not vba but I don't know how to set up the criteria. Is my title criterion viable in power query? Or should I just load the table and clean it with vba? Thank you!
 

Excel Facts

Bring active cell back into view
Start at A1 and select to A9999 while writing a formula, you can't see A1 anymore. Press Ctrl+Backspace to bring active cell into view.
This query will transform an excel table named Table1 by deleting all columns where 90% or more of that column's values are null.

Not sure whether that is exactly what you're looking for.

Power Query:
let
    /* UDF */ Countif = (lst as list, value as any) => List.Accumulate(lst,0, (s,c)=> s+ (if c = value then 1 else 0)),
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content], 
    lst = Table.ToColumns(Source),
    tcn = Table.ColumnNames(Source),
    lc = List.Count(lst),
    rc = Table.RowCount(Source),
    threshold = 0.90,
    KeepColumn = List.Generate(()=>0,(x)=> x< lc, (x)=> x+1, (x)=> if Countif(lst{x}, null)/rc < threshold then true else false),
    Cols = List.RemoveNulls(List.Generate(()=>0,(x)=> x< lc, (x)=> x+1, (x)=> if KeepColumn{x} then [Columns = lst{x}, ColNames = tcn{x}] else null)),
    tbl1 = Table.FromList(Cols,Record.FieldValues, {"Columns","ColNames"}),
    Result = Table.FromColumns(tbl1[Columns],tbl1[ColNames])
in
    Result
 
Upvote 0
Solution
Here's a better way:

Power Query:
[CODE=pq]
let
    /* UDF */ Countif = (lst as list, value as any) => List.Accumulate(lst,0, (s,c)=> s+ (if c = value then 1 else 0)),
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content], 
    lst = Table.ToColumns(Source),
    tcn = Table.ColumnNames(Source),
    lc = List.Count(lst),
    rc = Table.RowCount(Source),
    threshold = 0.90,
    KeepColumn = List.RemoveNulls(List.Generate(()=>0,(x)=> x< lc, (x)=> x+1, (x)=> if Countif(lst{x}, null)/rc < threshold then tcn{x} else null)),
    Result = Table.SelectColumns(Source, KeepColumn)
in
    Result
[/CODE]
 
Upvote 0

Forum statistics

Threads
1,223,690
Messages
6,173,847
Members
452,535
Latest member
berdex

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top