Match Text string and Remove Duplicates

Pete2020

Board Regular
Joined
Apr 25, 2020
Messages
68
Office Version
  1. 2016
Platform
  1. Windows
I thank each and every Mr. Excel forum members who are helping hundreds of Productivity Issues daily.

I am looking for a solution to remove similar duplicates by identifying the Pattern which every text ends with a Number.

Based on Column A (primary key) - the macro should run on column B data, by matching the similar text name ends with a Number.It should trim the extra spaces and delete the numbers and special characters around it.

Later macro should delete the Column B duplicates based on column A and results should be populated in column D and E.

Sample data

Book7
ABCDE
1IDRAW DATA PrimaryRemove duplicate and Keep Unique
2CD-001Toggle Method - 1CD-001Toggle Method
3CD-001Toggle Method - 2CD-002Update Action -part
4CD-002Update Action - Part 1CD-003Add Form - Part
5CD-002Update Action - Part 2CD-004Code Implementation
6CD-003Add Form - Part (1)CD-005Create and run a while loop, part
7CD-003Add Form : Part (2)CD-006VBA Framework Lecture
8CD-004Code Implementation ICD-007VBA Project Video
9CD-004Code Implementation IICD-008VBA Command Video
10CD-005Create and run a while loop, part 1CD-009Player Setup P
11CD-005Create and run a while loop, part 2CD-010The registerNewUser method part
12CD-006VBA Framework Lecture:-1 CD-011HTML Form @ Part
13CD-006VBA Framework Lecture:-2 CD-012VBA Function
14CD-007VBA Project Video - iCD-013Let's create Macro Pt
15CD-007VBA Project Video- iiCD-014Source Code Lesson
16CD-008VBA Command Video1CD-015VBA Functions - Chapter
17CD-008VBA Command Video 2CD-016Inbuilt Time Function
18CD-009Player Setup P.1CD-016Inbuilt Date Functions
19CD-009Player Setup P.2CD-016Inbuilt Text Functions
20CD-010The registerNewUser method part1
21CD-010The registerNewUser method part2
22CD-011HTML Form @ part2
23CD-011HTML Form @ part3
24CD-012VBA Functions 01
25CD-012VBA Functions02
26CD-013Let's create Macro Pt.1
27CD-013Let's create Macro Pt.2
28CD-014Source Code Lesson 28
29CD-014Source Code Lesson 29
30CD-015VBA Functions - Chapter-1
31CD-015VBA Functions - Chapter-2
32CD-016Inbuilt Time Function
33CD-016Inbuilt Date Functions
34CD-016Inbuilt Text Functions
Sheet1
 
I encountered errors when running your code. Particularly this line y = Trim(Mid(c.Value, InStrRev(c.Value, " "), 5)) as some rows of the provided sample data (eg row 27 from post #12) do not contain a space character.
Thanks Peter_sss, excellent remarks, the code was for post #1, it is only to correct the code problem in post #3.
Now I see that there are more patterns that the OP had not mentioned in post #1. Glad to know you are here with a great macro to help the OP.

Just a little detail in your code, and I don't know if it's relevant, after the questions you asked the OP.
But in the first example of the OP, he has this data and the expected result:

Libro1
ABCDE
1IDRAW DATA
2CD-003Add Form - Part (1)CD-003Add Form Part
3CD-003Add Form : Part (2)
Hoja1


But your code results in this:

Libro1
ABCDE
1IDRAW DATA
2CD-003Add Form - Part (1)CD-003Add Form Part (1)
3CD-003Add Form : Part (2)CD-003Add Form Part (2)
Hoja1


But if it's an example that no longer exists in the OP data, then forget my comment.

Again I thank you for your time in reviewing my code and for sharing your knowledge.
 
Upvote 0

Excel Facts

Will the fill handle fill 1, 2, 3?
Yes! Type 1 in a cell. Hold down Ctrl while you drag the fill handle.
I personally thank @Peter_SSs and @DanteAmor and Mr.Excel forum administrators for giving your time and effort in helping with your codes.

I regret myself in pulling all the relevant patterns from my records. But I appreciate your commitment in resolving the issues through code modification and effective engagement through out this thread.

@Peter_SSs I am learning a lot from you day by day.

Thank You
 
Upvote 0
You're welcome. I assume that you are at or close to what you wanted in relation to this thread? Or maybe still testing?

Just a little detail in your code,...

But your code results in this:
Hi Dante
Thanks for picking that up. (y)
At some point I have lost the correct dealing with numbers in parentheses at the end. :oops:

I'm hoping that this covers all the formats provided so far.

VBA Code:
Sub AggregateParts_v2()
  Dim RX As Object, d As Object
  Dim a As Variant
  Dim i As Long
  
  Const Patt1 As String = "[\,:;\.\-@_#]"
  Const Patt2 As String = "([^a-z0-9][a-z])?(\((\d+|[civxl]+)\))|(\d+|[civxl]+)( *\([a-z]+\)){0,1}$"

  Set d = CreateObject("Scripting.Dictionary")
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.IgnoreCase = True
  a = Range("A2", Range("B" & Rows.Count).End(xlUp)).Value
  For i = 1 To UBound(a)
    RX.Pattern = Patt1
    a(i, 2) = Application.Trim(RX.Replace(a(i, 2), " "))
    RX.Pattern = Patt2
    If RX.Test(a(i, 2)) Then a(i, 2) = RTrim(RX.Replace(a(i, 2), ""))
    If a(i, 2) Like "* [a-zA-Z]" Then a(i, 2) = Left(a(i, 2), Len(a(i, 2)) - 2)
    d(a(i, 1) & ";" & a(i, 2)) = Empty
  Next i
  Application.ScreenUpdating = False
  With Range("D2").Resize(d.Count)
    .Value = Application.Transpose(d.Keys)
    .TextToColumns DataType:=xlDelimited, Semicolon:=True, Comma:=False, Space:=False, Other:=False
  End With
  Application.ScreenUpdating = True
End Sub
 
Upvote 0
You're welcome. I assume that you are at or close to what you wanted in relation to this thread? Or maybe still testing?

Hi Dante
Thanks for picking that up. (y)
At some point I have lost the correct dealing with numbers in parentheses at the end. :oops:

I'm hoping that this covers all the formats provided so far.

VBA Code:
Sub AggregateParts_v2()
  Dim RX As Object, d As Object
  Dim a As Variant
  Dim i As Long
 
  Const Patt1 As String = "[\,:;\.\-@_#]"
  Const Patt2 As String = "([^a-z0-9][a-z])?(\((\d+|[civxl]+)\))|(\d+|[civxl]+)( *\([a-z]+\)){0,1}$"

  Set d = CreateObject("Scripting.Dictionary")
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.IgnoreCase = True
  a = Range("A2", Range("B" & Rows.Count).End(xlUp)).Value
  For i = 1 To UBound(a)
    RX.Pattern = Patt1
    a(i, 2) = Application.Trim(RX.Replace(a(i, 2), " "))
    RX.Pattern = Patt2
    If RX.Test(a(i, 2)) Then a(i, 2) = RTrim(RX.Replace(a(i, 2), ""))
    If a(i, 2) Like "* [a-zA-Z]" Then a(i, 2) = Left(a(i, 2), Len(a(i, 2)) - 2)
    d(a(i, 1) & ";" & a(i, 2)) = Empty
  Next i
  Application.ScreenUpdating = False
  With Range("D2").Resize(d.Count)
    .Value = Application.Transpose(d.Keys)
    .TextToColumns DataType:=xlDelimited, Semicolon:=True, Comma:=False, Space:=False, Other:=False
  End With
  Application.ScreenUpdating = True
End Sub

Almost Closed. Your help is more than expected.

Macro,is working fine on my 60k rows
 
Upvote 0

Forum statistics

Threads
1,223,895
Messages
6,175,257
Members
452,625
Latest member
saadat28

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top