Editing Data and Comparing lists

mesolomo

New Member
Joined
Jun 15, 2009
Messages
38
I'm currently working on sorting a column of data from a PDF file into separate columns to later compare to another sheet. I'm having a few problems getting the data into a workable format, and also don't know how to compare to the other file. These are the specific issues, if someone can help me!

Ultimately, I want the data in columns that look like this:
Name, City, State, Zip, $123.00, mm/dd/yy

Currently it looks like this:

A
1 Name
2 City State Zip
3 $123.00 mm/dd/yy

The sheets I'm working with are thousands of rows long, with these three rows of data repeating for every person. I have two problems with them, however:

1. I ultimately need each of those pieces of information into a seperate cell, in separate columns. I've been transposing the data then doing "Text to Columns" to get it from three separate rows in Column A to one row in Columns A, B, C etc.

The problem with this is the names aren't always uniform. Sometimes there's Mr. or Mrs. in front of them, sometimes they have a middle initial, or an "esq." behind them, which completely messes up the "Text to Columns." Is there a better way to break them up into separate cells?


2. Secondly, not all of the names are individuals- some are businesses or charities. I want to remove all of these (and the two lines of data that go with each of them), so I only have individual people. Do you have any idea of how to make excel know the difference between, for example, "Mr. John Smith" and "Smith and Sons Realty"?

3. Lastly Once I have all of this data spread out into the different columns
(Last Name, First Name, City, State, Zip, Date, Amount), I need to compare it to another list. Basically, I've got one list of data configured in a similar way, and I want to identify any repeats between the two lists. I know that excel has a way to delete repeats, but what I need is a way to delete everything BUT repeats. Any ideas?

Thank you for your help!
 
mesolomo,

I like to build things in pieces.

Is the following screenshot correct for Sheet "2007-8"; your data in column A, to the moved data in columns C, D, E, and F?


Excel Workbook
ABCDEF
1Dr. Nancy Petersmeyer, PhysicianPetersmeyerNancyPhiladelphia, PA 19118-4030$500.00 5/2/2007
2Philadelphia, PA 19118-4030RickardJenniferBryn Mawr, PA 19010-2859$500.00 5/2/2007
3$500.00 5/2/2007BowserAlanBedford, NY 10506-1521$2,500.00 5/2/2007
4Jennifer Jean Rickard, EducatorSandmanAlanPhiladelphia, PA 19103-6607$1,000.00 5/2/2007
5Bryn Mawr, PA 19010-2859FoulkeBrucePhiladelphia, PA 19115-1603$500.00 5/2/2007
6$500.00 5/2/2007KochBrucePhiladelphia, PA 19119-1846$500.00 5/2/2007
7Mr. Alan Bowser, Financial ServicesMacTavishCameronPhiladelphia, PA 19144-3929$500.00 5/2/2007
8Bedford, NY 10506-1521HayneDavidPhiladelphia, PA 19103-6522$1,500.00 5/2/2007
9$2,500.00 5/2/2007LipsonDavidBryn Mawr, PA 19010-1648$2,500.00 5/2/2007
10Mr. Alan Sandman, PhysicianPincusDavidPhiladelphia, PA 19106$500.00 5/2/2007
11Philadelphia, PA 19103-6607StoneDavidNew York, NY 10025-2051$500.00 5/2/2007
12$1,000.00 5/2/2007
13Mr. Bruce Foulke, CEO
14Philadelphia, PA 19115-1603
15$500.00 5/2/2007
16Mr. Bruce Koch, CPA
17Philadelphia, PA 19119-1846
18$500.00 5/2/2007
19Mr. Cameron MacTavish, Architect
20Philadelphia, PA 19144-3929
21$500.00 5/2/2007
22Mr. David Hayne, Manager
23Philadelphia, PA 19103-6522
24$1,500.00 5/2/2007
25Mr. David Lipson, President & Publisher
26Bryn Mawr, PA 19010-1648
27$2,500.00 5/2/2007
28Mr. David Pincus, CEO
29Philadelphia, PA 19106
30$500.00 5/2/2007
31Mr. David Stone, Higher Educatoin
32New York, NY 10025-2051
33$500.00 5/2/2007
2007-8




Can you supply more examples of different data for column A, for testing?
 
Upvote 0

Excel Facts

Show numbers in thousands?
Use a custom number format of #,##0,K. Each comma after the final 0 will divide the displayed number by another thousand
You must keep hitting F8....
OK, run this and see what is happening in the result in Sheet2
Rich (BB code):
Sub test()
Dim a, b(), i As Long, n As Long
a = Range("a1").CurrentRegion.Value
ReDim b(1 To UBound(a, 1) \ 3 + 1, 1 To 7)
On Error Resume Next
For i = 1 To UBound(a, 1) Step 3
    n = n + 1
    b(n, 1) = Split(Split(a(i, 1), " ",2)(1), ",")(0)
    b(n, 2) = Split(a(i + 1, 1), ",")(0)
    b(n, 3) = Split(a(i + 2, 1))(1)
    b(n, 4) = Split(a(i + 2, 1))(0)
    b(n, 5) = Year(DateValue(b(n, 3))) & "-" & _
          Year(DateValue(b(n, 3))) + 1
    b(n, 6) = "Sestak"
Next
Sheets("sheet2").Cells(1).Resize(n, 7).Value = b
End Sub
 
Upvote 0
Hiker, that is EXACTLY what I need! How do I get it?!!??


Seiyer, theres no error this time on the macro, but it didn't change anything.
 
Upvote 0
Hiker, just saw your note about more data. Here is a similar cluster of unsorted column A data (this one starts off with a PAC and several other organizations and businesses, instead of a name, which is a perfect example of the companies and organizations that I want to eliminate completely from the list). Let me know if there's any other information that I can give you and thank you SO much for your help!

Excel Workbook
A
11515 PAC
2Philadelphia, PA 191021918
3$100.00 6/22/2007
4Ace Ina Foundation
5Philadelphia, PA 19106
6$1,000.00 9/14/2007
7AGC Partners
8Philadelphia, PA 191032208
9$15,000.00 6/21/2007
10Ahmeenah Young
11Philadelphia, PA 191193108
12$500.00 9/17/2007
13Akubure Ahuchaogu
14Harleysville, PA 194383072
15$250.00 8/11/2007
16Albert Taus & Associates
17Philadelphia, PA 191152525
18$500.00 8/30/2007
19Alexandra Golaski
20Philadelphia, PA 191512405
21$125.00 9/17/2007
22Alfred Farlino
23Philadelphia, PA 191304015
24$200.00 9/25/2007
25Alfred McClymont
26Pennsauken, NJ 081102312
27$1,000.00 10/1/2007
28Amoroso's Brothers Partnership
29Lansdowne, PA 190508226
30$2,500.00 9/24/2007
31Amy Williams
32Philadelphia, PA 191433529
33$100.00 10/7/2007
34Anand Mittal
35Lower Gwynedd, PA 190022034
36$1,000.00 7/25/2007
37Aramark Political Action Committee
38Philadelphia, PA 191072934
39$5,000.00 9/27/2007
40Archie Epperson
41Philadelphia, PA 191194003
42$250.00 8/11/2007
43Arthur Haywood
44Philadelphia, PA 191503519
45$1,000.00 10/16/2007
46B And B PAC
47Harrisburg, PA 171109484
48$500.00 8/15/2007
49Ballard Spahr Andrews & Ingersoll LLP
50Philadelphia, PA 191037507
51$20,000.00 10/12/2007
52Bank of America Corp PAC
53Atlanta, GA 303083615
54$5,000.00 8/10/2007
55Bao Zhu Zhang
56Philadelphia, PA 191301502
57$100.00 9/6/2007
58Barbara Sheehan
59West Chester, PA 193806441
60$100.00 10/2/2007
61Barrack Rodos and Bacine State PAC
62Philadelphia, PA 191192935
63$10,000.00 8/27/2007
64Beatty Lincke
65West Chester, PA 193823021
66$100.00 6/27/2007
67Beatty Lincke
68West Chester, PA 193823021
69$1,000.00 7/10/2007
70BenePac
71Philadelphia, PA 191063619
72$500.00 10/12/2007
73Benjamin Rosin
74Margate City, NJ 084021066
75$500.00 10/20/2007
76Bennett Bricklin and Saltzburg LLP
77Philadelphia, PA 191032301
78$1,000.00 9/17/2007
79Beverly Lomax
80Hilltown, PA 189270145
81$2,500.00 10/1/2007
82Beverly Lomax
83Hilltown, PA 189270145
84$2,500.00 10/1/2007
85Bipac/MFC
86Pittsburgh, PA 15258
87$9,000.00 6/15/2007
88Biyi Oyefule M.D.
89Philadelphia, PA 191531912
90$500.00 8/11/2007
91Blanka Zizka
92Philadelphia, PA 191075915
93$100.00 10/8/2007
94Blue PAC
95Philadelphia, PA 19103
96$500.00 9/6/2007
97Brian Shovlin
98Philadelphia, PA 191291440
99$150.00 6/20/2007
100Bruce Bachtle
101Perkasie, PA 189442457
102$250.00 9/26/2007
103Bunleng Taing
104Elkins Park, PA 19027
105$100.00 9/7/2007
106C. Gary Shepherd
107Philadelphia, PA 191444506
108$1,000.00 8/11/2007
109C. Richard Horrow Esq.
110Philadelphia, PA 19103
111$250.00 7/18/2007
112C.G.F. Associates
113Palmyra, NJ 08065
114$5,000.00 7/24/2007
115Campaign Pennsylvania
116Media, PA 190630244
117$5,000.00 6/21/2007
118Carmen Lineberger
119Philadelphia, PA 191441323
120$250.00 9/20/2007
121Carol Defries
122Philadelphia, PA 191453912
123$100.00 7/25/2007
124Carpenters' Legislative Improvement Committee
125Washington, DC 200012153
126$5,000.00 9/24/2007
127Carpenters Local Union 2012 PAC Fund
128Seaford, DE 199730044
129$2,000.00 9/5/2007
2007-5
 
Upvote 0
Seiya,

I just tried it again, and it worked in Sheet 2. All the data looks great, except the names are still weirdly formatted. Some are sort of chopped up, and the ones that aren't still have all of the "Mr."s and "Esq."s and such that I need to get rid of so I can sort it against the other list by the last names. Is there something more we can do to the names?
 
Upvote 0
Then I need to see the results of the weirdly formatted data with its original data.
 
Upvote 0
Code:
Sub test()
Dim a, b(), c(), i As Long, n As Long, x
a = Range("a1").CurrentRegion.Value
ReDim b(1 To UBound(a, 1) \ 3 + 1, 1 To 7)
ReDim c(1 To UBound(a, 1) \ 3 + 1, 1 To 4)
On Error Resume Next
For i = 1 To UBound(a, 1) Step 3
    n = n + 1
    If a(i, 1) Like "*.*,*" Then
        b(n, 1) =  Split(Trim(Split(a(i, 1), ".",2)(1)), ",")(0)
    Else
        b(n, 1) = a(i, 1)
    End If
    b(n, 1) = Split(Split(a(i, 1), " ",2)(1), ",")(0)
    b(n, 2) = Split(a(i + 1, 1), ",")(0)
    b(n, 3) = Split(a(i + 2, 1))(1)
    b(n, 4) = Split(a(i + 2, 1))(0)
    b(n, 5) = Year(DateValue(b(n, 3))) & "-" & _
          Year(DateValue(b(n, 3))) + 1
    b(n, 6) = "Sestak"
    c(n, 1) = Split(b(n, 1))(0) : c(n, 2) = Split(b(b, 1))(1)
    c(n, 3) = a(i + 1, 1) : c(n, 4) = a(i + 2, 1)
Next
Range("c1").Resize(n, 4).Value = c
Sheets("sheet2").Cells(1).Resize(n, 7).Value = b
End Sub
 
Last edited by a moderator:
Upvote 0
Seiya, first off, thank you so much for doing all this work on this!

I put in the most recent macro, and it says "Type Mismatch". When I went through line by line, it gave the same error as soon as I hit F8, regardless of where I clicked at first.

Here is the screenshot for the macro before that. Aside from the first few rows, it's definitely a big improvement. Ultimately, though I need to have the names in two columns, last name and first name. Is there a way to make the macro do that? Thanks!

Here's the screenshot:
Excel Workbook
ABCDEF
1Limited PartnershipPhiladelphia5/2/2007$10,000.002007-2008Sestak
2S. 19th Street PartnershipPhiladelphia5/9/2007$2,000.002007-2008Sestak
3& Ingersoll AssociatesPhiladelphia5/9/2007$5,000.002007-2008Sestak
4Good Government CommitteeDallas5/9/2007$2,000.002007-2008Sestak
5ChenBellmawr5/4/2007$500.002007-2008Sestak
6Taus & AssociatesPhiladelphia5/21/2007$2,500.002007-2008Sestak
7Sanford for 4th District CouncilPhiladelphia5/11/2007$500.002007-2008Sestak
8GrewalPhiladelphia5/10/2007$5,000.002007-2008Sestak
9WilliamsPhiladelphia5/1/2007$100.002007-2008Sestak
10DiSalvoPhiladelphia5/2/2007$250.002007-2008Sestak
11MarionAtco5/6/2007$100.002007-2008Sestak
12MarionAtco5/10/2007$100.002007-2008Sestak
13of America Corp PACAtlanta5/10/2007$2,000.002007-2008Sestak
14LPClifton5/1/2007$5,000.002007-2008Sestak
15Fang LiuPhiladelphia5/3/2007$400.002007-2008Sestak
16Rome LLPPhiladelphia5/8/2007$8,000.002007-2008Sestak
17WillmesPhiladelphia5/9/2007$50.002007-2008Sestak
18WillmesPhiladelphia5/10/2007$50.002007-2008Sestak
19LuckmanPhiladelphia5/9/2007$150.002007-2008Sestak
20Lars BeckPhiladelphia5/4/2007$100.002007-2008Sestak
21City Owners AssociationPhiladelphia5/21/2007$750.002007-2008Sestak
22Pennsylvanians for Effective GovernmentLafayette Hill5/11/2007$2,500.002007-2008Sestak
23& TikellisHaverford5/1/2007$5,000.002007-2008Sestak
24Fang PuBroomall5/4/2007$1,000.002007-2008Sestak
25PappajohnPhiladelphia5/9/2007$100.002007-2008Sestak
26ButlerWayne5/3/2007$500.002007-2008Sestak
27AssociatesPhiladelphia5/10/2007$2,500.002007-2008Sestak
28Painting and FinishingPhiladelphia5/5/2007$200.002007-2008Sestak
29To Elect Alice Dubow For JudgePhiladelphia5/9/2007$2,000.002007-2008Sestak
30To Elect Alice Dubow For JudgePhiladelphia5/7/2007$1,000.002007-2008Sestak
31To Elect C. Darnell Jones II for PennsylvaniaPhiladelphia5/9/2007$1,000.002007-2008Sestak
32To Elect Ellen Green-CeislerPhiladelphia5/7/2007$1,500.002007-2008Sestak
33College PAC - Faculty FederationPhiladelphia5/2/2007$1,000.002007-2008Sestak
Sheet2
 
Upvote 0
So, I just finished a list I was doing manually (which took many, many, many hours, so given that it's approximately 1% of the total I need to do, I'm really hoping a macro might be able to do this for me).

Anyway, I thought I'd post part of the list I did so you could see what I'm going for. Ideally, this is how it would look:

Excel Workbook
ABCDEFG
1SoleboAbayomiPhiladelphiaPA19144$100.0010/28/2006
2Okoli-UmeweniAdaoraPhiladelphiaPA19115$250.0010/28/2006
3AdesanyaAdenekanAmblerPA19002$250.0010/29/2006
4PlumerAlvinPhiladelphiaPA19147$250.009/26/2006
5FerberAndreBala CynwydPA19004$200.006/6/2006
6BrownCarmhielPhiladelphiaPA19107$100.0011/12/2006
7TollCharlesPhiladelphiaPA19103$100.005/17/2006
8OrjiChristopherCollingdalePA19023$100.0010/28/2006
9OrjiChristopherPhiladelphiaPA19151$100.0010/28/2006
10McGillCorneliaBlue BellPA19422$100.0010/21/2006
11BorrelliDarrylTrappePA19426$150.0011/5/2006
12FinemanDavidMerion StationPA19066$500.005/30/2006
13FordhamDodiPhiladelphiaPA19103$100.0011/11/2006
14KnauerDouglasBristolPA19007$100.005/27/2006
15BannettAaronPhiladelphiaPA19103$300.008/27/2006
16BannettAaronPhiladelphiaPA19103$150.006/20/2006
17FeinAdamPhiladelphiaPA19103$500.007/26/2006
18FeinAdamPhiladelphiaPA19103$250.0012/29/2006
19JacobyAdriennePhiladelphiaPA19106$125.0012/4/2006
20AdesanyaAfolusoAmblerPA19002$250.0010/29/2006
21TerzianAllenHaverfordPA19041$500.004/27/2006
22IkemeAntyCoatesvillePA19320$150.0010/28/2006
23BennettArlenePhiladelphiaPA19119$100.0010/11/2006
24StoreyBayardPhiladelphiaPA19130$1,000.009/2/2006
25StoreyBayardPhiladelphiaPA19130$2,500.0012/17/2006
26StoreyBayardPhiladelphiaPA19130$500.006/8/2006
27StoreyBayardPhiladelphiaPA19130$1,000.0010/26/2006
28BorislowBernardPhiladelphiaPA19106$500.008/28/2006
29OkechukwuChikeGibbsboroNJ8026$250.0010/28/2006
30OkechukwuChikeGibbsboroNJ8026$500.0012/18/2006
31BarronDanaPhiladelphiaPA19119$250.006/18/2006
32IvillDennisPhiladelphiaPA19147$500.0012/28/2006
33IvillDennisPhiladelphiaPA19147$500.0011/9/2006
34OnwugbuforEmekaPennsaukenNJ8109$250.0010/28/2006
35HodgesEricVillanovaPA19085$100.005/22/2006
36HodgesEricVillanovaPA19085$2,400.0011/29/2006
37AbroEugeneArdmorePA19003$100.0010/17/2006
38YaffeHaroldPhiladelphiaPA19102$250.0012/28/2006
39YaffeHaroldPhiladelphiaPA19102$250.0011/15/2006
40YaffeHaroldPhiladelphiaPA19102$300.005/9/2006
41EnwerejiJamesPhiladelphiaPA19126$100.0010/27/2006
42BrownLancePhiladelphiaPA19131$100.005/4/2006
43BrunoLeonardPhiladelphiaPA19118$1,000.005/5/2006
44BrunoLeonardPhiladelphiaPA19118$1,000.009/5/2006
45WilkofMarciaPhiladelphiaPA19146$108.0012/17/2006
46VictorMarkBryn MawrPA19010$500.004/28/2006
47ChisumMelvinPhiladelphiaPA19129$250.0010/26/2006
48PelliniMichaelMalvernPA19355$500.009/12/2006
2006-7
 
Upvote 0

Forum statistics

Threads
1,223,228
Messages
6,170,871
Members
452,363
Latest member
merico17

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top