Macro to deduplicate based on A and delete?

cloobless

Board Regular
Joined
Jul 15, 2014
Messages
84
Office Version
  1. 2010
Platform
  1. Windows
Hello -- I've been working on this since this morning (starting in a batch file, then resorting to AutoHotkey). I know it's simple, but...so am I.

I have a .csv file that is automaticallly generated throughout the day in the following format. It is a concatenation of a bunch of other files. When the file is created, I need to very quickly scrub it, and I think the fastest way to do this is using a macro, but I'm open to any suggestion that works.

What I would like to do is the following:
  1. Delete all instances of the header (Music,Description,%Length (1 Track),Category Style: Code) -- there will be multiple instances in the data.
  2. Delete all instances of the footer (Sorted by MusicMaker) -- will be multiple
  3. Delete all duplicates using the first value (Music) as the comparison. So even if the other values are different, delete all but one instance of "BACH".
  4. Collape the list so that empty rows are deleted.
Any help offered is deeply appreciated. Thank you.

The initial data looks like this:
A
Music,Description,%Length (1 Track),Category Style: Code
BACH,"Classical Hits",70.679012,47.7678
LEDZEP,"Rock",14.666667,5912241.4000
RUSH,"Canandian Stuff",197.0538,224130.0920
Sorted by MusicMaker


I would like it to end up looking like this:

BACH
LEDZEP
RUSH
 

Excel Facts

Which Excel functions can ignore hidden rows?
The SUBTOTAL and AGGREGATE functions ignore hidden rows. AGGREGATE can also exclude error cells and more.
I think one problem you will have is how Excel will know that BACH, LEDZEP and RUSH all refer to music. You may need a list on another sheet, for example, of all the strings that refer to music so Excel can refer to that list.
 
Upvote 0
I think one problem you will have is how Excel will know that BACH, LEDZEP and RUSH all refer to music. You may need a list on another sheet, for example, of all the strings that refer to music so Excel can refer to that list.

"Music" is simply the first delimited value in any row. It's value1 in the delimited list.

Value1,Value2,Value3

Etc.
 
Upvote 0
I’m not sure what you mean. Please clarify in detail using a few examples from your data.
 
Upvote 0
Power Query is the way to go. NO VBA Code Required.
it will literally take 2 minutes to set up.

Filter the rows to remove "Music" in header, Blank Cells, and "Sorted by MusicMaker" from the footer.
then click on the column header and remove duplicates.

going forward you just refresh and you will have the new data.
 
Upvote 0
I’m not sure what you mean. Please clarify in detail using a few examples from your data.

The original post contains everything I know how to outline, so I'm not certain what is missing.

1. My .csv file has delimited data. I want to delete duplicate rows based on matching the first value. Only the first value matters for determining a duplicate.
BACH,value2,value3
BACH,value2,value3
----

2. I want to delete all instances of "Music,Description,%Length (1 Track),Category Style: Code" in the data. These are just repeated header rows from the previous files that were concatenated into this working file.

3. I want to delete all instances of "Sorted by MusicMaker". These are just a footer row that was similarly inserted during a prior process.

4. I want to delete blank rows.

What I want to be left with is a list, in column A, consisting of the deduplicated value1 data:
BACH
RUSH
FIONAAPPLE
BEETHOVEN
HENDRIX
 
Upvote 0
Power Query is the way to go. NO VBA Code Required.
it will literally take 2 minutes to set up.

Filter the rows to remove "Music" in header, Blank Cells, and "Sorted by MusicMaker" from the footer.
then click on the column header and remove duplicates.

going forward you just refresh and you will have the new data.
Thank you. Unfortunately, this is an older machine with Excel 2010 running on it, so I don't think that's an option.
 
Upvote 0
Got it. I have to got out now. I’ll have a look at it tomorrow.
 
Upvote 0
This macro assumes that you have a header in cell A1 and your data starts in row 2.
VBA Code:
Sub DeleteDups()
    Application.ScreenUpdating = False
    Dim lRow As Long, v As Variant, i As Long, dic As Object
    On Error Resume Next
    Range("A:A").SpecialCells(xlBlanks).EntireRow.Delete
    On Error GoTo 0
    lRow = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
    Range("A1:A" & lRow).AutoFilter Field:=1, Criteria1:= _
        "=Music,Description,%Length (1 Track),Category Style: Code", Operator:=xlOr, Criteria2:="=Sorted by MusicMaker"
    ActiveSheet.AutoFilter.Range.Offset(1).EntireRow.Delete
    Range("A1").AutoFilter
    lRow = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
    v = Range("A2:A" & lRow).Value
    Set dic = CreateObject("Scripting.Dictionary")
    For i = UBound(v) To LBound(v) Step -1
        If Not dic.exists(Split(v(i, 1), ",")(0)) Then
            dic.Add Split(v(i, 1), ",")(0), Nothing
        Else
            Rows(i + 1).Delete
        End If
    Next i
    Range("A2").Resize(dic.Count) = Application.Transpose(dic.keys)
    Application.ScreenUpdating = True
End Sub
 
Upvote 1
Solution

Forum statistics

Threads
1,223,884
Messages
6,175,171
Members
452,615
Latest member
bogeys2birdies

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top