Removing Complex Duplicates

IML_56

New Member
Joined
Jan 16, 2020
Messages
11
Office Version
  1. 2019
Platform
  1. Windows
Hello all im not sure if this can be done or not, but i have some data with a lot of rows some times there are duplicates. I want to delete a specific value in that duplicate column while retaining the rest of the information. Can this be done in power query?



My example: I want the duplicate row to have the item in boxes removed but keep everything.
ProjectFundingBoxes
AB$1010
AB$10010
CD$205
EF$1520

Result:
ProjectFundingBoxes
AB$1010
AB$100
CD$205
DE$1520


Thanks,
Ian
 
with Power Query
Rich (BB code):
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Index = Table.AddIndexColumn(Source, "Index", 1, 1),
    Index2 = Table.AddIndexColumn(Source, "Index", 1, 1),
    SortD = Table.Sort(Index2,{{"Boxes", Order.Descending}}),
    RD = Table.Distinct(SortD, {"Boxes"}),
    SortA = Table.Sort(RD,{{"Index", Order.Ascending}}),
    TSC = Table.SelectColumns(SortA,{"Boxes", "Index"}),
    Mrg = Table.NestedJoin(Index,{"Index"},TSC,{"Index"},"Exp",JoinKind.LeftOuter),
    Expand = Table.ExpandTableColumn(Mrg, "Exp", {"Boxes"}, {"Boxes.1"}),
    TSC2 = Table.SelectColumns(Expand,{"Project", "Funding", "Boxes.1"})
in
    TSC2
ProjectFundingBoxesProjectFundingBoxes.1
AB1010AB1010
AB10010AB100
CD205CD205
EF1520EF1520
xx4530xx4530
xx8930xx89
xx34530xx345
dd543125dd543125
www12909www12909


if you need you can change Boxes.1 to Boxes in last step
 
Upvote 0

Excel Facts

Copy formula down without changing references
If you have =SUM(F2:F49) in F50; type Alt+' in F51 to copy =SUM(F2:F49) to F51, leaving the formula in edit mode. Change SUM to COUNT.
a bit shorter
Rich (BB code):
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Index = Table.AddIndexColumn(Source, "Index", 1, 1),
    SortD = Table.Sort(Index,{{"Boxes", Order.Descending}}),
    RD = Table.Distinct(SortD, {"Boxes"}),
    TSC = Table.SelectColumns(RD,{"Boxes", "Index"}),
    Mrg = Table.NestedJoin(Index,{"Index"},TSC,{"Index"},"Exp",JoinKind.LeftOuter),
    Expand = Table.ExpandTableColumn(Mrg, "Exp", {"Boxes"}, {"Boxes.1"}),
    TSC2 = Table.SelectColumns(Expand,{"Project", "Funding", "Boxes.1"})
in
    TSC2
 
Upvote 0
Try this macro assuming your data is in columns A, B and C:
VBA Code:
Sub RemoveDups()
    Application.ScreenUpdating = False
    Dim LastRow As Long, RngList As Object, rng As Range, key As Variant, fVisRow As Long
    LastRow = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
    Set RngList = CreateObject("Scripting.Dictionary")
    For Each rng In Range("A2:A" & LastRow)
        If Not RngList.Exists(rng.Value) Then
            RngList.Add rng.Value, Nothing
        End If
    Next
    For Each key In RngList
        If WorksheetFunction.CountIf(Range("A:A"), key) > 1 Then
            With ActiveSheet
                .Cells(1, 1).CurrentRegion.AutoFilter 1, key
                fVisRow = .Range("A2:A" & LastRow).SpecialCells(xlCellTypeVisible).Cells(1, 1).Row
                .Range("C" & fVisRow + 1 & ":C" & LastRow).SpecialCells(xlCellTypeVisible).ClearContents
            End With
        End If
    Next key
    Range("A1").AutoFilter
    Application.ScreenUpdating = True
End Sub
This worked great. Thank you so much.
 
Upvote 0
a bit shorter
Rich (BB code):
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Index = Table.AddIndexColumn(Source, "Index", 1, 1),
    SortD = Table.Sort(Index,{{"Boxes", Order.Descending}}),
    RD = Table.Distinct(SortD, {"Boxes"}),
    TSC = Table.SelectColumns(RD,{"Boxes", "Index"}),
    Mrg = Table.NestedJoin(Index,{"Index"},TSC,{"Index"},"Exp",JoinKind.LeftOuter),
    Expand = Table.ExpandTableColumn(Mrg, "Exp", {"Boxes"}, {"Boxes.1"}),
    TSC2 = Table.SelectColumns(Expand,{"Project", "Funding", "Boxes.1"})
in
    TSC2
Thank you!
 
Upvote 0
You are welcome

here is more flexible
Rich (BB code):
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Index = Table.AddIndexColumn(Source, "Index", 1, 1),
    SortD = Table.Sort(Index,{{"Project", Order.Ascending}}),
    RD = Table.Distinct(SortD, {"Project"}),
    TSC = Table.SelectColumns(RD,{"Project", "Index"}),
    Mrg = Table.NestedJoin(Index,{"Index"},TSC,{"Index"},"Exp",JoinKind.LeftOuter),
    Expand = Table.ExpandTableColumn(Mrg, "Exp", {"Index"}, {"Index.1"}),
    IF = Table.AddColumn(Expand, "Custom", each if [Index.1] = null then null else [Boxes]),
    TSC2 = Table.SelectColumns(IF,{"Project", "Funding", "Custom"})
in
    TSC2
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,789
Messages
6,174,579
Members
452,573
Latest member
Cpiet

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top