Collapse duplicates and only keep those values for which the values align

noname91

New Member
Joined
Jul 5, 2018
Messages
12
Office Version
  1. 2016
Platform
  1. Windows
Hello all,

I'm trying to achieve something fairly straightforward, but might be a bit tricky to explain. I've been using multiple table copies and vlookup to do this, but I'm hoping there is a faster way to go about it using a macro.

I will explain with a simple example:

I have a group of duplicate numbers in column A. In column B, I have names of fruits (three only for this simple example). Columns C to E correspond to whether a row has a specific fruit (called Has_Apple, Has_Mango and Has_Cherry respectively), and contains a value of 1 corresponding to the column that refers to the row having that fruit. So if a row in column B contains the string Apple, Has_Apple for that row will be 1, while the other two columns will be 0.

What is needed is to determine if a group/collection of duplicate numbers has 1s occurring for all three columns, and if so, collapse the group of duplicates. The value of the fruit in column B isn't important, but you can imagine it containing the names of all three fruits. The columns C to E will predictably contain all 1s only for that collapsed row representing the entire collection. If, however, 1s are NOT occurring for ALL THREE columns, then the entire group of duplicates should be discarded.

As I mentioned, I was using multiple tables and vlookups for this before, but I'm not sure how this is going to be doable by macro. I've attached the screenshots with the original dataset, and a results set which shows what the result of such a macro would look like. Version of Excel is 2016, from Microsoft Office Professional Plus 2016.

Thanks for any help!
 

Attachments

  • Sheet_1.png
    Sheet_1.png
    34 KB · Views: 21
  • Result.png
    Result.png
    28.4 KB · Views: 23
It is beautiful indeed lol.

Works great and does the job without issue.

Was scanning the code and playing around with it a bit, and just wondering if you could highlight a few things. One, which part of the code is making the connection between the fruit name in column B and the value (1 or 0) in the correct corresponding column? Two, where it says "For each fruit in fruits", where is "fruit" (singular) being declared or does it not have to be declared because of an action already done earlier?

In any case, thanks for your help with this!
 
Upvote 0

Excel Facts

How to calculate loan payments in Excel?
Use the PMT function: =PMT(5%/12,60,-25000) is for a $25,000 loan, 5% annual interest, 60 month loan.
I hope I can explain with my limited English :)

Here code loops through all the rows(i) of source table, for each row it checks starting from 3rd to last column(j). If any of the columns returns 1, adds the fruit key to the "fruits" dictionary and skips to next row.
VBA Code:
  For i = 2 To UBound(myArray, 1)
    'Discard the group part for now
    For j = 3 To UBound(myArray, 2)
      If myArray(i, j) = 1 Then
        If Not fruits.Exists(myArray(i, 2)) Then
          fruits.Add myArray(i, 2), 0
        End If
        Exit For
      End If
    Next
  Next

Here it loops through fruit keys. Because the keys are previously defined, VBA can pull them automatically.
VBA Code:
For each fruit In Fruits
This means each key (fruit name) in Fruits dictionary.

I hope all makes sense.
 
Last edited by a moderator:
Upvote 0
VBA Code:
For i = 2 To UBound(myArray, 1) 'For each row of source table excluding headers. That's why starts from 2
    'Discard the group part for now
    For j = 3 To UBound(myArray, 2) 'For each column of source table starting from 3rd
      If myArray(i, j) = 1 Then 'If the value in that column of that row is 1
        If Not fruits.Exists(myArray(i, 2)) Then 'If 2nd column(fruit) of row(i) does not exist in the dictionary
          fruits.Add myArray(i, 2), 0 'Add to dictionary. By this way I am always adding unique names to dictionary
        End If
        Exit For
      End If
    Next
  Next
I hope this step by step explaination is more useful.
 
Upvote 0
^Thanks, got the first one!

Though still not clear on defining the variable "fruit". How does VBA know what "fruit" is? My expectation is that the code would say something like "Dim fruit As Variant" or something similar like how you've defined "fruits", but this is not anywhere in the code far as I can tell. I'm assuming that the singular variable name "fruit" simply refers to the column B name which is also "Fruit" although with different capitalization on the letter F. Sorry, just trying to see how VBA knows what this variable is without it being declared like the other variables.
 
Last edited:
Upvote 0
If you use For Each ... In something VBA will know its type automatically. You can even write your own name instead of ...
Exercise:
VBA Code:
For Each rng In Range("A1:A3")
  MsgBox rng.Address
Next
In this example we say "In a Range". So, VBA knows that we are looking a Range type. "rng" must be a Range, no Variant, nothing else. Now try:
VBA Code:
For Each flashbond In Range("A1:A3")
  MsgBox flashbond.Address
Next
See? It doesn't matter what you name it. It is some kind of declaring a variable on the fly.

In coding there are naming convensions. One of them is, if you use some kind of collection, you must name it in plural. If you are reffering to a single item, you must name it singular. This will help to next user to understand the code easier.

That's why I used "fruit". Because it is looping through a dictionary, VBA automatically knows it will be a dictionary key. There is no difference than saying For Each columnBItem In Fruits

One extra and irrelavant information, one of the other naming convensions is, if you are going to use a decompound word for your variable, you must always start the first word with a lower case and all others must be caps. Eg. columnBItem

This is best I can do. I hope you can find more explainatory sources on the internet.
 
Last edited by a moderator:
Upvote 0
Ah, that's what's going on. Great explanation, I get it now.

Thanks again for all your help on this!
 
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,181
Members
453,022
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top