VBA Help - Duplicate & Unwanted Data Removal

MoonDancer

New Member
Joined
May 4, 2024
Messages
5
Office Version
  1. 365
Platform
  1. Windows
Hi all,

Ok, so I feel a little guilty coming here to ask for help. I'm an absolute novice with anything Macro or VBA related so I'm pretty much going to be need leading by the hand (so sorry). That is what seems to have come from having big ideas with little to no idea of how to execute what I want.

Some background info on what data I need to cleanup:

Using Power BI, I export a report in Excel format. This data set always has Columns A - Z but the number of rows can vary.

At the very bottom of the sheet, when I use Ctrl + Down, is a Totals Row, followed by a blank row and another row with data in. All 3 of these rows need to be deleted.

Column A usually contains multiple duplicates. All rows with a duplicate number in column A need to be removed.

Next, filters are inserted across the header row (1). In Column J, there will be the following entries:

Approved
Awaiting Service Approval
Awaiting Value Approval
Cancelled
Created
Rejected

I need to remove all rows on the spreadsheet except the ones with "Approved" in column J.

I have tried using Record Macro, but it is only recording the rows I need to execute the Macro on from this data set. If I run it on another import, I am guessing the range will stay the same, whereas, I need it to be variable depending on the total number of rows on any export.

I've tried and tried to attach an image I took (on phone camera) of the steps done after recording the Macro, but it will not attach no matter what I do (file too big all the time).

I've had to try copy and pasting the coding below instead and putting in gaps between each bit (I did keep returning to A1 to ensure I could see where my active cell was at any point).

Sub Test Cleanup Approvals()
Test_Cleanup_Approvals
' Test Cleanup Approvals Macro ' Clean up the data -

Selection.End(xlDown) .Select

Rows ("5423:5425") .Select

Selection.Delete Shift:=xlUp

Rows ("l:1").Select

Selection.AutoFilter

Cells.Select

ActiveSheet.Range ("$A$l:$Z$5425").RemoveDuplicates Columns:=1, Header:= xlYes

Range ("Al") .Select

ActiveWorkbook.Worksheets ("Export") .AutoFilter.Sort.SortFields.Clear

ActiveWorkbook.Worksheets("Export").AutoFilter.Sort.SortFields.Add2 Key:= Range ("J2:J4040"), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption :=xlSortNormal

With ActiveWorkbook.Worksheets("Export").AutoFilter.Sort .Header = xlYes .MatchCase = False . Orientation - xlTopToBottom . SortMethod = xlPinYin . Apply

End With

ActiveSheet.Range ("$A$l:$Z$4040").AutoFilter Field:=10, Criterial:=Arrayc "Awaiting Service Approval", "Awaiting Value Approval", "Cancelled", "Created" "Rejected"), Operator:mad:lFiltervalues

Rows ("3856:3856").Select

Range (Selection, Selection.End(xlDown)).Select

Selection.Delete Shift:mad:lUp

Range ("Al") .Select

ActiveSheet.Range ("$A$l:$Z$3855") .AutoFilter Field:=10

End Sub


Can anyone please help me with amending the code to allow for variable rows of data? I'm guessing I can go in and manually change the recorded Macro, or would it be better to type it all in the VBA as a new module?

There is another column that I need to remove lots of unwanted data from but I think perhaps there are too many variables? In this one, the field is populates in the original program with a name. When an action has been taken, an "R" is then usually inserted in the front. Unfortunately the format this is put in can vary wildly from colleague to colleague who happens to be doing it. Some input R - (name), others do R (name), R- (name) and so on. Removing duplicates won't work because entries without that "R" would also be affected if there are multiple with the same name, I.e. "Joe Bloggs". Filtering is an absolute pain as not only are all the "R's" put in with variable formats, but also there could be any one of hundreds of different names in the field too, making it an absolute nightmare and takes ages to delete all of them. There are some others to remove too, but the "R" prefix ones are by far the worst and can be in their 1000s line wise. Not sure if anything can be done or if this has to be done manually?


So sorry for the huge message! Posting in the hope someone will save me! I've tried to code in the VBA but I've not a clue what I'm doing.

Faye


EDIT: Sorry, random angry faces in there. Think it should have been : = x (without spaces)
 
Last edited by a moderator:

Excel Facts

How to show all formulas in Excel?
Press Ctrl+` to show all formulas. Press it again to toggle back to numbers. The grave accent is often under the tilde on US keyboards.
Ah - I've just found out how to post the code in those special images. Hopefully this is easier to read than my first attempt in all bold, italics! Unfortunately I can't see the option to actually edit my original post. Have to excuse me, I've been trying to do this via mobile phone with a neurodiverse almost 5 year old buzzing round me all day. Have now tried to do it on the PC to make my life a little easier, though I still have a child attempting to smack at my keyboard and run off with my mouse..... the joys eh?

VBA Code:
Sub Test Cleanup Approvals()
Test_Cleanup_Approvals
' Test Cleanup Approvals Macro ' Clean up the data -

Selection.End(xlDown) .Select
Rows ("5423:5425") .Select
Selection.Delete Shift:=xlUp
Rows ("l:1").Select
Selection.AutoFilter
Cells.Select
ActiveSheet.Range ("$A$l:$Z$5425").RemoveDuplicates Columns:=1, Header:= xlYes
Range ("Al") .Select
ActiveWorkbook.Worksheets ("Export") .AutoFilter.Sort.SortFields.Clear
ActiveWorkbook.Worksheets("Export").AutoFilter.Sort.SortFields.Add2 Key:= Range ("J2:J4040"), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption :=xlSortNormal
With ActiveWorkbook.Worksheets("Export").AutoFilter.Sort .Header = xlYes .MatchCase = False . Orientation - xlTopToBottom . SortMethod = xlPinYin . Apply
End With
ActiveSheet.Range ("$A$l:$Z$4040").AutoFilter Field:=10, Criterial:=Arrayc "Awaiting Service Approval", "Awaiting Value Approval", "Cancelled", "Created" "Rejected"), Operator:mad:lFiltervalues
Rows ("3856:3856").Select
Range (Selection, Selection.End(xlDown)).Select
Selection.Delete Shift:mad:lUp
Range ("Al") .Select
ActiveSheet.Range ("$A$l:$Z$3855") .AutoFilter Field:=10

End Sub
 
Upvote 0
Please try the following on a copy of your workbook.

VBA Code:
Option Explicit
Sub MoonDancer()
    Dim ws As Worksheet, LRow As Long
    Set ws = Worksheets("Export")
    LRow = ws.Cells.Find("*", , xlFormulas, , 1, 2).Row
    
    ws.Cells(LRow, 1).Offset(-2).Resize(3).EntireRow.Delete
    If ws.AutoFilterMode Then ws.AutoFilter.ShowAllData
    With ws.Range("A1").CurrentRegion
        .RemoveDuplicates Columns:=1, Header:=xlYes
        .AutoFilter 10, "<>Approved"
        If .SpecialCells(xlCellTypeVisible).Address <> .Rows(1).Address Then
            .Offset(1).EntireRow.Delete
        End If
        .AutoFilter
    End With
    
End Sub
 
Upvote 1
Solution
Thank you so much Kevin :love:

With a quick change to the worksheet name in the macro, that ran on test (I've created a bogus data range but in the column where the data had to be exact, just pasted a set number of those entries in so I had a count of what would be left). I've changed the worksheet name to Sheet1 but might we be able to replace that with the "Active Sheet"? That way the Macro could potentially be copied and saved under different names to carry out the same actions but I can change the "Approved" criteria from the filter to any of the others, on any sheet (in any active workbook?). Once I've tested on my home PC, I'll be inputting the code on my work laptop and probably make it available to a colleague if the report needs to be ran in my absence?

Just popping a copy of what you wrote below with the change to worksheet name I made:

VBA Code:
Option Explicit
Sub MoonDancer()
    Dim ws As Worksheet, LRow As Long
    Set ws = Worksheets("Sheet1")
    LRow = ws.Cells.Find("*", , xlFormulas, , 1, 2).Row
    
    ws.Cells(LRow, 1).Offset(-2).Resize(3).EntireRow.Delete
    If ws.AutoFilterMode Then ws.AutoFilter.ShowAllData
    With ws.Range("A1").CurrentRegion
        .RemoveDuplicates Columns:=1, Header:=xlYes
        .AutoFilter 10, "<>Approved"
        If .SpecialCells(xlCellTypeVisible).Address <> .Rows(1).Address Then
            .Offset(1).EntireRow.Delete
        End If
        .AutoFilter
    End With
    
End Sub

Just so I can understand, that top section is essentially setting the data range (or objects) to look at before we then tell the macro what to do? The LRow as Long bit is saying essentially "however many rows long the data is", would LColumn as Long then allow it to look at the data across as many columns as are on the worksheet if the data to check went across, rather than down?


The bit in my original message about removing anything with an R (or other specified prefix, i.e. XX for PO numbers that need to be cancelled), does that sound possible or are there too many variables with the entries? Obviously I would want it to delete things like R - Rebecca Bloggs / R- Ralph Bloggs / R Rachael Bloggs etc, and not an entry just saying Rebecca Bloggs without one of the prefixes. We are checking if we can pull a narrower data range on date from Power BI to reduce the amount to go through but until then, there are thousands of lines with a prefix. I think we are going to check and ask colleagues to all follow the same format to make it easier going forward soon (hopefully).



Kind Regards,


Faye
 
Upvote 0
The LRow line is finding the last row on the sheet with something in it - irrespective of what column it appears in. If you wanted to delete the R rows (but keep those like "Rebecca... etc.) then you could use the fact (if so) that the R is followed by a space or a hyphen. Don't know what column that is but if, by example, it's column M (= column 13) then the line of code could look like this:

VBA Code:
.AutoFilter 13, "R *", 2, "R-*"

You could also set to the active sheet like this:

VBA Code:
Set ws = ActiveSheet
 
Upvote 1
Ok, thought I would have a try at it myself, based on what you put above for the "R" filter but I've definitely gone wrong with some of the principals around what needs to be filtered and deleted. It ran but it deleted the data I mostly wanted instead. Oops.

For test purposes I did a randomised selection of the following prefixes I have noticed running in the actual data at work ("Real Data" was input here as a test on what I expected to be returned after the others had been deleted, deliberately using an R to see if any proper name starting with "R", i.e as mentioned before like Rebecca, would be left behind):

R - Test Text
R Test Text
R-Test Text
RTest Text
Real Data
R-
R Test Text
R- Test Text
R_Test Text
XX Test Text
*Test Text

This is the coding I tried (I also tried replacing the .Address <> with .Address = : to see if that would make any difference. The column is H (or 8 in numerical):

VBA Code:
Option Explicit
Sub RsRemovalTest()
    Dim ws As Worksheet, LRow As Long
    Set ws = ActiveSheet
    LRow = ws.Cells.Find("*", , xlFormulas, , 1, 2).Row
    
    ws.Cells(LRow, 1).Offset(-2).Resize(3).EntireRow.Delete
    If ws.AutoFilterMode Then ws.AutoFilter.ShowAllData
    With ws.Range("A1").CurrentRegion
        .RemoveDuplicates Columns:=1, Header:=xlYes
        .AutoFilter 10, "<>Approved"
        If .SpecialCells(xlCellTypeVisible).Address <> .Rows(1).Address Then
            .Offset(1).EntireRow.Delete
        End If
        .AutoFilter 8, "R *", 2, "R-*", 3, "XX*"
        If .SpecialCells(xlCellTypeVisible).Address <> .Rows(1).Address Then
            .Offset(1).EntireRow.Delete
        End If
        .AutoFilter
    End With
    
End Sub


Me and my big ideas :rolleyes: I should stick to bog-standard admin.
 
Upvote 0
I also tried running a different filter other than "<>Approved". Worked fine if I just swapped it for "<>Awaiting Service Approval", but when I thought I would try writing it similar to the R prefixed entries removal format (i.e .AutoFilter 10, "<>Awaiting Service Approval", 2, "<>Awaiting Value Approval", I hit a sticky wicket and it literally deleted everything 😱. The Approved PO report is by far the biggest one with data to sort through, so it's a priority to try to cut through all the rubbish. I know I got something fundamental wrong but just don't know what to do instead.

The other main final reports pulled from the Power BI export would be to:

Remove Duplicates in Column 1 (as standard).

Then by filtering and deleting unwanted data in Column J / 10, return the just following results to work on:
  1. Awaiting Service Approval & Awaiting Value Approval
  2. Created & Rejected
  3. Cancelled (this one I could just replace the "<>Approved" text in the original VBA code you gave me
Probably I'd not want the R's removing on these but if it did come in handy, I could just copy the code from the Approved one and put it into the right place after the filter and delete actions were run on Column J / 10.


After this is done, I'm definitely going to watch some proper tutorials on you-tube, even if I still don't grasp it properly, I'm interested in at least trying to learn something new to apply it to work / home. I might still be better at doing a Trial Balance sheet than ever coding anything though (and I haven't done any of THAT since I took the basic AAT course about 9 years ago, such a waste...)
 
Upvote 0

Forum statistics

Threads
1,225,619
Messages
6,186,045
Members
453,335
Latest member
sfd039

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top