Conditional - remove blanks and shift left in Power Query

obiwilson

New Member
Joined
Jul 25, 2015
Messages
17
Hi,

I have a large (40K) address database to clean up. The initial cleanup involves removing all the blank cells and shifting the data left (columns J to M), based upon what the address starts with (letters or numbers) it needs to be moved back to the either the Business Name column or the Address 1 column. And just to make things complicated, if the address begins with "shop" or "PO Box" it needs to be moved to the Address 1 column.

I have manually achieved this using Ctrl + G and removing blanks, shifting left, sorting a-z and moving Shop/PO Box 1 column across. However, I would like to automate the process up to this point for future cleaning but was unable to achieve the same results within Power Query. I tried some similar solution on the forums but was unable to get the unpivot/pivot solution or M code solutions to work.

Here is a sample of what I'm working with:
Associate IDAssociate Client CodePrefixFirst NameMiddleLast NameSuffixGreetingSpecialtyBusiness NameAddress 1Address 2Address 3CityStateZip Code
70396157I00:26:21TestAccountDoctorEpping Plaza Medical & Dental CentreShop 216 Epping Plaza571-583 High StreetEPPINGVIC3076
91556921I00:26:27TestAccountDoctorChelsea Arcade MedicalShop 10 & 11426 Nepean HighwayCHELSEAVIC3196
90094825I00:26:37TestAccountDoctorBentleigh Medical Centre254 Centre RoadBENTLEIGHVIC3204
57666989I00:30:19TestAccountDoctorStrathdale Medical Centre5-7 Putnam AvenueSTRATHDALEVIC3550
80436127I00:23:56TestAccountDoctorVictorian Rehabilitation Centre499 Springvale RoadGLEN WAVERLEYVIC3150
91437137I00:32:45TestAccountDoctorBentleigh Medical Centre254 Centre RoadBENTLEIGHVIC3204
309085570I00:33:49TestAccountDoctorVictoria Harbour Medical Centre800 Bourke StreetDOCKLANDSVIC3008
63523884I00:10:42TestAccountDoctorShop 49Lonsdale StreetMELBOURNEVIC3000
103394111I00:11:36TestAccountDoctorHighlands Medical CentreShop 22, Stockland Highlands300-332 Grand BoulevardCRAIGIEBURNVIC3064
82192344I00:36:28TestAccountDoctorPO Box 123379-381 Whitehorse RoadBALWYNVIC3103


And this is what I need it to be:
Associate IDAssociate Client CodePrefixFirst NameMiddleLast NameSuffixGreetingSpecialtyBusiness NameAddress 1Address 2Address 3CityStateZip Code
70396157I00:26:21TestAccountDoctorEpping Plaza Medical & Dental CentreShop 216 Epping Plaza571-583 High StreetEPPINGVIC3076
91556921I00:26:27TestAccountDoctorChelsea Arcade MedicalShop 10 & 11426 Nepean HighwayCHELSEAVIC3196
90094825I00:26:37TestAccountDoctorBentleigh Medical Centre254 Centre RoadBENTLEIGHVIC3204
57666989I00:30:19TestAccountDoctorStrathdale Medical Centre5-7 Putnam AvenueSTRATHDALEVIC3550
80436127I00:23:56TestAccountDoctorVictorian Rehabilitation Centre499 Springvale RoadGLEN WAVERLEYVIC3150
91437137I00:32:45TestAccountDoctorBentleigh Medical Centre254 Centre RoadBENTLEIGHVIC3204
309085570I00:33:49TestAccountDoctorVictoria Harbour Medical Centre800 Bourke StreetDOCKLANDSVIC3008
63523884I00:10:42TestAccountDoctorShop 491 Lonsdale StreetMELBOURNEVIC3000
103394111I00:11:36TestAccountDoctorHighlands Medical CentreShop 22, Stockland Highlands300-332 Grand BoulevardCRAIGIEBURNVIC3064
82192344I00:36:28TestAccountDoctorPO Box 123379-381 Whitehorse RoadBALWYNVIC3103


Any help is greatly appreciated!
 

Excel Facts

Highlight Duplicates
Home, Conditional Formatting, Highlight Cells, Duplicate records, OK to add pink formatting to any duplicates in selected range.
Power Query:
let
    Source = Table.Buffer(Excel.CurrentWorkbook(){[Name="Table1"]}[Content]),
    cols = List.Buffer({"Business Name","Address 1","Address 2","Address 3"}),
    tbl = Table.AddColumn(Source, "Custom", each 
        let 
            lst = List.RemoveNulls(Record.ToList(Record.SelectFields(_,cols))),
            lst1 =  if Text.Contains("0123456789", Text.Start(lst{0}, 1)) or 
                        Text.Lower(Text.Start(lst{0}, 5)) = "shop " or 
                        Text.Lower(Text.Start(lst{0}, 6)) = "po box"
                    then 
                        {null} & List.FirstN(lst,3)
                    else 
                        lst,
            lst2 = lst1 & List.Repeat({null}, 4 - List.Count(lst1))
        in Record.FromList(lst2, cols)),
    tbl1 = Table.RemoveColumns(tbl,cols),
    tbl2 = Table.ExpandRecordColumn(tbl1, "Custom", cols),
    Result = Table.ReorderColumns(tbl2,Table.ColumnNames(Source))
in
    Result
 
Upvote 0
Solution
Power Query:
let
    Source = Table.Buffer(Excel.CurrentWorkbook(){[Name="Table1"]}[Content]),
    cols = List.Buffer({"Business Name","Address 1","Address 2","Address 3"}),
    tbl = Table.AddColumn(Source, "Custom", each
        let
            lst = List.RemoveNulls(Record.ToList(Record.SelectFields(_,cols))),
            lst1 =  if Text.Contains("0123456789", Text.Start(lst{0}, 1)) or
                        Text.Lower(Text.Start(lst{0}, 5)) = "shop " or
                        Text.Lower(Text.Start(lst{0}, 6)) = "po box"
                    then
                        {null} & List.FirstN(lst,3)
                    else
                        lst,
            lst2 = lst1 & List.Repeat({null}, 4 - List.Count(lst1))
        in Record.FromList(lst2, cols)),
    tbl1 = Table.RemoveColumns(tbl,cols),
    tbl2 = Table.ExpandRecordColumn(tbl1, "Custom", cols),
    Result = Table.ReorderColumns(tbl2,Table.ColumnNames(Source))
in
    Result
Thanks for your help.

When I run the above on my sample table it works 100% but when run on the actual DB there are many rows which contain blanks:
Hollie Drive Medical Centre5 Hollie DriveMORWELLVIC3840
St Kilda Road Medical Centre1/391 St Kilda RoadMELBOURNEVIC3004
259 St Georges RoadNORTHCOTEVIC3070
Lotus Medical CentreSuite 3/200 Sydney RoadBRUNSWICKVIC3056
Peter MacCallum Cancer Centre305 Grattan StreetMELBOURNEVIC3000
Heaths Road Family Clinic241 Heaths RoadWERRIBEEVIC3030
Neurology Network MelbourneSuite 3 Level 1517 St Kilda RoadMELBOURNEVIC3004
Royal Victorian Eye and Ear Hospital32 Gisborne StreetEAST MELBOURNEVIC3002
Masada Medical Centre26 Balaclava RoadST KILDA EASTVIC3183


It did delete and shift left some which makes me think these blank cells are not actually blank.

Any suggestion to strip out any hidden data prior to applying the above?
 
Upvote 0
Copy one of the "blank" cells that isn't null and paste it into the Value To Find box of the Replace Value dialog box in the Power Query user interface. Type null in the Replace With box.

Once the "blank" cells show as null in PQ user interface then the code above should work
 
Upvote 1
Copy one of the "blank" cells that isn't null and paste it into the Value To Find box of the Replace Value dialog box in the Power Query user interface. Type null in the Replace With box.

Once the "blank" cells show as null in PQ user interface then the code above should work
Thank you!
And may I say, your profile pic is spot on.
 
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,181
Members
453,022
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top