De-duplication across multiple columns using VBA

SHaun687

New Member
Joined
Apr 18, 2017
Messages
7
Hi,

I am new to VBA and have been trying to right some code that will automate the formatting of some semi-structured data. The big issue I have is with de-duplication. I am trying de duplicate data based on one column and merge additional columns onto one row. For example (simplified data)


-- removed inline image ---


Becomes


-- removed inline image ---


I know I am probably going to have to assign values to the blanks in column A to stop blanks being seen as duplicates. The main problem I have with any code to date is that when the first row in column B, C or D is blank, no data is returned. For example in the above, "Name 1" would show as Name1, X, Blank, Blank.

Any help would be much appreciated! Thanks in advance.
 

Excel Facts

Test for Multiple Conditions in IF?
Use AND(test, test, test, test) or OR(test, test, test, ...) as the logical_test argument of IF.
Sorry - image did not attach!

Hi,

I am new to <acronym title="visual basic for applications">VBA</acronym> and have been trying to right some code that will automate the formatting of some semi-structured data. The big issue I have is with de-duplication. I am trying de duplicate data based on one column and merge additional columns onto one row. For example (simplified data)

[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]A[/TD]
[TD]B[/TD]
[TD]C[/TD]
[TD]D[/TD]
[/TR]
[TR]
[TD]Name1[/TD]
[TD]X[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]Name1[/TD]
[TD][/TD]
[TD]Y[/TD]
[TD]3[/TD]
[/TR]
[TR]
[TD]Name1[/TD]
[TD]X[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD]1[/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD]4[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]Name2[/TD]
[TD]Z[/TD]
[TD][/TD]
[TD]2[/TD]
[/TR]
[TR]
[TD]Name3[/TD]
[TD][/TD]
[TD]5[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]Name2[/TD]
[TD]Z[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]

Becomes

[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]A[/TD]
[TD]B[/TD]
[TD]C[/TD]
[TD]D[/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD]1[/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD]4[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]Name1[/TD]
[TD]X[/TD]
[TD]Y[/TD]
[TD]3[/TD]
[/TR]
[TR]
[TD]Name2[/TD]
[TD]Z[/TD]
[TD][/TD]
[TD]2[/TD]
[/TR]
[TR]
[TD]Name3[/TD]
[TD][/TD]
[TD]5[/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]


I know I am probably going to have to assign values to the blanks in column A to stop blanks being seen as duplicates. The main problem I have with any code to date is that when the first row in column B, C or D is blank, no data is returned. For example in the above, "Name 1" would show as Name1, X, Blank, Blank.

Any help would be much appreciated! Thanks in advance.
 
Last edited by a moderator:
Upvote 0
Dear Shaun687.

ref the link
[h=3]https://www.mrexcel.com/forum/excel-questions/999447-unique-values-1-column.html[/h]
I hope this will be help to you.
 
Upvote 0
Hi vmjan02,

Thanks for your help - I've just realised my images were not included in the original post for some reason. What I am trying to do is as follows

Name1 A 5
Name1 A Z
Name1 5

Name2
Name2 B

Becomes
Name1 A 5 Z
Name2 B

So it is slightly different to just returning unique values as the values must still be assigned to a variable, in this case the "Name".

Kind regards
Shaun687
 
Upvote 0
Try this:-
Your data in columns "A to D"
Results start "F1".
Code:
[COLOR="Navy"]Sub[/COLOR] MG18Apr55
[COLOR="Navy"]Dim[/COLOR] Rng [COLOR="Navy"]As[/COLOR] Range, Dn [COLOR="Navy"]As[/COLOR] Range, n [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long,[/COLOR] K [COLOR="Navy"]As[/COLOR] Variant, R [COLOR="Navy"]As[/COLOR] Range, Ac [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] Dic  [COLOR="Navy"]As[/COLOR] Object, nDic [COLOR="Navy"]As[/COLOR] Object, nAc [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long,[/COLOR] c [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Set[/COLOR] Rng = Range(Range("A1"), Range("A" & Rows.Count).End(xlUp))
[COLOR="Navy"]Set[/COLOR] Dic = CreateObject("scripting.dictionary")
Dic.CompareMode = vbTextCompare
[COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] Dn [COLOR="Navy"]In[/COLOR] Rng
    [COLOR="Navy"]If[/COLOR] Not IsEmpty(Dn.Value) [COLOR="Navy"]Then[/COLOR]
        [COLOR="Navy"]If[/COLOR] Not Dic.exists(Dn.Value) [COLOR="Navy"]Then[/COLOR]
            Dic.Add Dn.Value, Dn
        [COLOR="Navy"]Else[/COLOR]
            [COLOR="Navy"]Set[/COLOR] Dic(Dn.Value) = Union(Dic(Dn.Value), Dn)
        [COLOR="Navy"]End[/COLOR] If
    [COLOR="Navy"]End[/COLOR] If
[COLOR="Navy"]Next[/COLOR]
ReDim ray(1 To Rng.Count, 1 To 2)
[COLOR="Navy"]Set[/COLOR] nDic = CreateObject("scripting.dictionary")
nDic.CompareMode = vbTextCompare
[COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] K [COLOR="Navy"]In[/COLOR] Dic.keys
    c = c + 1: nAc = 1
    ray(c, 1) = K
    [COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] R [COLOR="Navy"]In[/COLOR] Dic(K)
        [COLOR="Navy"]For[/COLOR] Ac = 1 To 3
           [COLOR="Navy"]If[/COLOR] Not nDic.exists(R.Offset(, Ac).Value) And Not R.Offset(, Ac).Value = "" [COLOR="Navy"]Then[/COLOR]
               nAc = nAc + 1
               nDic.Add (R.Offset(, Ac).Value), Nothing
               [COLOR="Navy"]If[/COLOR] nAc > UBound(ray, 2) [COLOR="Navy"]Then[/COLOR] ReDim Preserve ray(1 To Rng.Count, 1 To UBound(ray, 2) + 1)
               ray(c, nAc) = R.Offset(, Ac).Value
           [COLOR="Navy"]End[/COLOR] If
       [COLOR="Navy"]Next[/COLOR] Ac
    [COLOR="Navy"]Next[/COLOR] R
[COLOR="Navy"]Next[/COLOR] K
Range("F1").Resize(c, UBound(ray, 2)).Value = ray
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]
Regards Mick
 
Last edited by a moderator:
Upvote 0
Hi Mick,

That's brilliant - thank you so much! Just one more question, how could I alter the above so that data remains in the column it was found. For example

Name1 X
Name1 Z

is returned as Name1 X Z as opposed to
Name1 X Z

No worries if this could end up being more hassle than it's worth as I can apply formatting at this stage rather than before the de-duplication. Your code has been of huge help thanks!
 
Upvote 0
Do you mean this:-
[TABLE="width: 144"]
<colgroup><col width="64" style="width: 48pt;" span="3"> <tbody>[TR]
[TD="class: xl63, width: 64, bgcolor: transparent"] This:-[/TD]
[TD="class: xl63, width: 64, bgcolor: transparent"] [/TD]
[TD="class: xl63, width: 64, bgcolor: transparent"] [/TD]
[/TR]
[TR]
[TD="class: xl64, bgcolor: transparent"]Name1[/TD]
[TD="class: xl63, bgcolor: transparent"]A[/TD]
[TD="class: xl63, bgcolor: transparent, align: right"]5[/TD]
[/TR]
[TR]
[TD="class: xl64, bgcolor: transparent"]Name1[/TD]
[TD="class: xl63, bgcolor: transparent"]A[/TD]
[TD="class: xl63, bgcolor: transparent"]Z[/TD]
[/TR]
[TR]
[TD="class: xl64, bgcolor: transparent"]Name1[/TD]
[TD="class: xl63, bgcolor: transparent, align: right"]5[/TD]
[TD="class: xl63, bgcolor: transparent"]k[/TD]
[/TR]
[TR]
[TD="class: xl63, bgcolor: transparent"] [/TD]
[TD="class: xl63, bgcolor: transparent"] [/TD]
[TD="class: xl63, bgcolor: transparent"] [/TD]
[/TR]
[TR]
[TD="class: xl65, bgcolor: transparent"]Returns this:-[/TD]
[TD="class: xl63, bgcolor: transparent"] [/TD]
[TD="class: xl63, bgcolor: transparent"] [/TD]
[/TR]
[TR]
[TD="class: xl64, bgcolor: transparent"]Name1[/TD]
[TD="class: xl63, bgcolor: transparent"]A, 5[/TD]
[TD="class: xl63, bgcolor: transparent"]Z, K[/TD]
[/TR]
</tbody>[/TABLE]

Or something else ???
 
Upvote 0
Yes pretty much that, although the data unfortunately contains a lot of blanks so likely to be more like the following (blank column deliberate)

[TABLE="class: grid, width: 200"]
<tbody>[TR]
[TD]This:-[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]Name1[/TD]
[TD]A[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]Name1[/TD]
[TD]A[/TD]
[TD]5[/TD]
[TD][/TD]
[TD]Z[/TD]
[/TR]
[TR]
[TD]Name1[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]K[/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]Returns:-[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]Name1[/TD]
[TD]A[/TD]
[TD]5[/TD]
[TD][/TD]
[TD]Z,K[/TD]
[/TR]
</tbody>[/TABLE]

Hope that makes sense?
 
Upvote 0
Try this:-
Results now start column "K", alter wher shown shown.
As your example now shows 5 columns, I've changed the possible columns to 6, Change this where shown in code to suit.
Code:
[COLOR="Navy"]Sub[/COLOR] MG18Apr06
[COLOR="Navy"]Dim[/COLOR] Rng [COLOR="Navy"]As[/COLOR] Range, Dn [COLOR="Navy"]As[/COLOR] Range, n [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long,[/COLOR] K [COLOR="Navy"]As[/COLOR] Variant, R [COLOR="Navy"]As[/COLOR] Range, Ac [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] Dic  [COLOR="Navy"]As[/COLOR] Object, nDic [COLOR="Navy"]As[/COLOR] Object, nAc [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long,[/COLOR] c [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Set[/COLOR] Rng = Range(Range("A1"), Range("A" & Rows.Count).End(xlUp))
[COLOR="Navy"]Set[/COLOR] Dic = CreateObject("scripting.dictionary")
Dic.CompareMode = vbTextCompare
[COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] Dn [COLOR="Navy"]In[/COLOR] Rng
    [COLOR="Navy"]If[/COLOR] Not IsEmpty(Dn.Value) [COLOR="Navy"]Then[/COLOR]
        [COLOR="Navy"]If[/COLOR] Not Dic.exists(Dn.Value) [COLOR="Navy"]Then[/COLOR]
            Dic.Add Dn.Value, Dn
        [COLOR="Navy"]Else[/COLOR]
            [COLOR="Navy"]Set[/COLOR] Dic(Dn.Value) = Union(Dic(Dn.Value), Dn)
        [COLOR="Navy"]End[/COLOR] If
    [COLOR="Navy"]End[/COLOR] If
[COLOR="Navy"]Next[/COLOR]
ReDim ray(1 To Rng.Count, 1 To 6) '[COLOR="Green"][B] Change this "6" (number of columns in Data) to suit[/B][/COLOR]
[COLOR="Navy"]Set[/COLOR] nDic = CreateObject("scripting.dictionary")
nDic.CompareMode = vbTextCompare
[COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] K [COLOR="Navy"]In[/COLOR] Dic.keys
    c = c + 1
    ray(c, 1) = K
    [COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] R [COLOR="Navy"]In[/COLOR] Dic(K)
        [COLOR="Navy"]For[/COLOR] Ac = 2 To UBound(ray, 2)
           [COLOR="Navy"]If[/COLOR] Not nDic.exists(R(, Ac).Value) And Not R(, Ac).Value = "" [COLOR="Navy"]Then[/COLOR]
               nDic.Add (R(, Ac).Value), Nothing
               ray(c, Ac) = ray(c, Ac) & IIf(ray(c, Ac) = _
               "", R(, Ac).Value, " ," & R(, Ac).Value)
           [COLOR="Navy"]End[/COLOR] If
       [COLOR="Navy"]Next[/COLOR] Ac
    [COLOR="Navy"]Next[/COLOR] R
[COLOR="Navy"]Next[/COLOR] K
'[COLOR="Green"][B]Change range "K" to suit[/B][/COLOR]
[COLOR="Navy"]With[/COLOR] Range("K1").Resize(c, UBound(ray, 2))
    .Value = ray
    .Columns.AutoFit
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]With[/COLOR]
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]
Regards Mick
 
Last edited by a moderator:
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,181
Members
453,022
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top