Is it possible to delete all BUT duplicate rows?

monsefoster

New Member
Joined
Oct 29, 2013
Messages
40
I'm trying to find a way to delete all BUT the repeated rows in a file, I know it's possible to delete the duplicates on a file in PHPExcel that would be done something like this:

Code:
<code style="margin: 0px; padding: 0px; border: 0px; vertical-align: baseline; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; white-space: inherit;">
<code style="margin: 0px; padding: 0px; border: 0px; vertical-align: baseline; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; white-space: inherit;">$worksheet = $objPHPExcel->getActiveSheet();        foreach ($worksheet->getRowIterator() as $row) {            $rowIndex = $row->getRowIndex();            foreach ($worksheet->getRowIterator($rowIndex + 1) as $testRow) {                if ($testRow == $row) {                    $worksheet->removeRow($rowIndex);                }            }</code></pre></code>

However I need to do pretty much the opposite, delete all the rows that doesn't have any duplicates in the file.
I'd like to know how to accomplish this in VBA in order to give me an idea on what needs to be done and then trying to do it using this "library" called PHPExcel. I have some VBA knowledge however I can't seem the find the right approach.

**Why do I need this*: The duplicate rows are the intersection between M x M sets with data. It's way easier to do it directly on the file that having to considerate the M x M possible results.*
I'd greatly appreciate any guidance you might offer me.
 

Excel Facts

What is the last column in Excel?
Excel columns run from A to Z, AA to AZ, AAA to XFD. The last column is XFD.
Hi,
I'm having a bit of trouble digesting the code you posted (doesn't look familiar to me) so this may well be something you've already tried, but what if you simply amend this: if ($testRow == $row) to this: if Not ($testRow == $row)

Wouldn't that, in effect, reverse the logic of what it's doing now?
 
Upvote 0
The code is on PHP to work with the Library Called 'PhpExcel', basically what it does is check the current row with the row below in order to see if they match and if they do, then deletes one.

I tried what you said and it only leaves the last row (and is not even duplicated).

Thank you for your response.
 
Upvote 0
I'm not familiar with PHP coding. Are you just looking for how it would be written in vba?

(In either case, you'll want to go through your rows from the bottom up.
That way if you have more than 2 consecutive rows meeting your criteria, they'll all get included.)
 
Upvote 0
I'm not familiar with PHP coding. Are you just looking for how it would be written in vba?

(In either case, you'll want to go through your rows from the bottom up.
That way if you have more than 2 consecutive rows meeting your criteria, they'll all get included.)

Yes, I was looking for a solution in VBA to later translate it to PHP code.

I'm able to delete duplicate rows but how can I delete the others that aren't duplicated? leaving only the duplicated. If I try the approach (current and current + 1) comparison, that would work if the row is below the other one but what if it isn't?
 
Upvote 0
OK, as I understand it, you want to delete all rows that are NOT duplicates, and retain only those that are.
This assumes by your mentioning "check the current row with the row below" that all duplicate rows will be consecutive. (Yes?)
If those assumptions are correct then something like this should work.
There are slicker ways to do this but I tried to keep it relatively intuitive (and well commented) to try and make it a bit easier to translate to what you want to end up using. (I saw references in your PHP code to Active Sheet and this will only work on whichever sheet is active at the time of execution.)
And it will leave row 1 whether or not it's a duplicate row, because it's looking at the current row AND the row above that, so it will error out on row 1 if we don't stop at row 2 as there's no row above row 1 to look at.

So here goes. Let us know how close (or not) it gets you to where you want to be. (It goes into a standard module in the vb editor.)
Code:
Sub RetainOnlyDuplicateRows_Click()

'''Dimension the variables. _
   (LstRw will be the variable used for "Last Row" and _
    LstCol will be the variable used for "Last Column" and _
    Rng will be used as the variable for each cell in the (later) specified range.
Dim LstRw As Long, LstCol As Long, ThsRw As Long
''' ConCatThsRw and ConcatNxtRw will be variables used to represent entire rows worth of data as a single string.
Dim ConCatThsRw As String, ConcatNxtRw As String


'''Define LstRw as the last row on the sheet that contains data
LstRw = Cells.Find(What:="*", After:=[A1], SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
''''Define LstCol as the last column on the sheet that contains data
LstCol = Cells.Find(What:="*", After:=[A1], SearchOrder:=xlByColumns, SearchDirection:=xlPrevious).Column


'''Start looping up column A from the last row to the second row.
For ThsRw = LstRw To 2 Step -1
  ConCatThsRw = ""
  ConcatNxtRw = ""
  '''Test to see that the current row is not blank first
  If Application.WorksheetFunction.Subtotal(3, Range(Cells(ThsRw, 1), Cells(ThsRw, LstCol))) <> 0 Then
    '''Loop through each row of data, one cell at a time and compare it to the row above.
    For Each cel In Range(Cells(ThsRw, 1), Cells(ThsRw, LstCol))
      ConCatThsRw = ConCatThsRw & cel.Value
      ConcatNxtRw = ConcatNxtRw & cel.Offset(-1).Value
    Next cel
  End If
  '''Test to see if the rows were equal. (If so, move on. If not then delete the current row and then move on.)
  If ConCatThsRw = ConcatNxtRw Then
    ThsRw = ThsRw - 1
  Else
    Rows(ThsRw).EntireRow.Delete
  End If
Next ThsRw

End Sub

Hope it helps.
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,236
Messages
6,170,917
Members
452,366
Latest member
TePunaBloke

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top