Editing Data and Comparing lists

mesolomo

New Member
Joined
Jun 15, 2009
Messages
38
I'm currently working on sorting a column of data from a PDF file into separate columns to later compare to another sheet. I'm having a few problems getting the data into a workable format, and also don't know how to compare to the other file. These are the specific issues, if someone can help me!

Ultimately, I want the data in columns that look like this:
Name, City, State, Zip, $123.00, mm/dd/yy

Currently it looks like this:

A
1 Name
2 City State Zip
3 $123.00 mm/dd/yy

The sheets I'm working with are thousands of rows long, with these three rows of data repeating for every person. I have two problems with them, however:

1. I ultimately need each of those pieces of information into a seperate cell, in separate columns. I've been transposing the data then doing "Text to Columns" to get it from three separate rows in Column A to one row in Columns A, B, C etc.

The problem with this is the names aren't always uniform. Sometimes there's Mr. or Mrs. in front of them, sometimes they have a middle initial, or an "esq." behind them, which completely messes up the "Text to Columns." Is there a better way to break them up into separate cells?


2. Secondly, not all of the names are individuals- some are businesses or charities. I want to remove all of these (and the two lines of data that go with each of them), so I only have individual people. Do you have any idea of how to make excel know the difference between, for example, "Mr. John Smith" and "Smith and Sons Realty"?

3. Lastly Once I have all of this data spread out into the different columns
(Last Name, First Name, City, State, Zip, Date, Amount), I need to compare it to another list. Basically, I've got one list of data configured in a similar way, and I want to identify any repeats between the two lists. I know that excel has a way to delete repeats, but what I need is a way to delete everything BUT repeats. Any ideas?

Thank you for your help!
 

Excel Facts

Which came first: VisiCalc or Lotus 1-2-3?
Dan Bricklin and Bob Frankston debuted VisiCalc in 1979 as a Visible Calculator. Lotus 1-2-3 debuted in the early 1980's, from Mitch Kapor.
Here is what I've got now:

Excel Workbook
ABCDEFGH
1Dr. Nancy Petersmeyer, Physician
2Philadelphia, PA 19118-4030
3$500.00 5/2/2007
4Jennifer Jean Rickard, Educator
5Bryn Mawr, PA 19010-2859
6$500.00 5/2/2007
7Mr. Alan Bowser, Financial Services
8Bedford, NY 10506-1521
9$2,500.00 5/2/2007
10Mr. Alan Sandman, Physician
11Philadelphia, PA 19103-6607
12$1,000.00 5/2/2007
13Mr. Bruce Foulke, CEO
14Philadelphia, PA 19115-1603
15$500.00 5/2/2007
16Mr. Bruce Koch, CPA
17Philadelphia, PA 19119-1846
18$500.00 5/2/2007
19Mr. Cameron MacTavish, Architect
20Philadelphia, PA 19144-3929
21$500.00 5/2/2007
22Mr. David Hayne, Manager
23Philadelphia, PA 19103-6522
24$1,500.00 5/2/2007
25Mr. David Lipson, President & Publisher
26Bryn Mawr, PA 19010-1648
27$2,500.00 5/2/2007
28Mr. David Pincus, CEO
29Philadelphia, PA 19106
30$500.00 5/2/2007
31Mr. David Stone, Higher Educatoin
32New York, NY 10025-2051
33$500.00 5/2/2007
2007-8


What I want it to look like is:
Excel Workbook
ABCD
1MittalAnandLower Gwynedd, PA 190022034$2,600.00 6/6/2008
2QuackenbosBarbaraMaplewood,J 070402708$300.00 6/6/2008
3MessermanRobertPhiladelphia, PA 191073609$1,000.00 6/10/2008
4ZwillmanBlairWoodbridge,J 070951155$100.00 6/6/2008
5TinkelmanCarlPhiladelphia, PA 191036218$100.00 6/6/2008
6ButlerChristopherWayne, PA 190871957$500.00 6/6/2008
7CashmanChristopherLafayette Hill, PA 194441747$500.00 6/6/2008
8ChimiclesChristopherPhiladelphia, PA 191076063$1,000.00 6/6/2008
9RyanDennisWayne, PA 19087$500.00 6/10/2008
10MramorAndrewHavertown, PA 190835517$100.00 6/6/2008
11StoreyBayardPhiladelphia, PA 191303202$2,600.00 6/5/2008
12WolfBryanArdmore, PA 190032509$500.00 6/10/2008
13AdamsCarolynPhiladelphia, PA 191442721$1,000.00 6/6/2008
14TecoskyDavidPhiladelphia, PA 191521815$100.00 6/6/2008
15StefanowiczElainePhiladelphia, PA 191283208$100.00 6/6/2008
16EmmettGaryPhiladelphia, PA 191036516$100.00 6/5/2008
17FernandezHappyPhiladelphia, PA 191042066$1,000.00 6/6/2008
18BrunoLeonardHuntingdon Valley, PA 190066357$2,000.00 6/5/2008
19RossmanLouisBala Cynwyd, PA 190042335$100.00 6/6/2008
20BellMadelinePenn Valley, PA 190721420$500.00 6/10/2008
21FinkPaulMerion Station, PA 190661717$200.00 5/20/2008
22CarrollPeterWillow Grove, PA 190904210$100.00 6/6/2008
23SquillaroRichardPhiladelphia, PA 191454638$100.00 6/6/2008
24MarkiewiczStanleyBlackwood,J 080125541$100.00 6/6/2008
25AltschulerStevenPhiladelphia, PA 191035777$1,000.00 6/12/2008
26GambaThomasPhiladelphia, PA 191036228$200.00 6/6/2008
27KohlerThomasPhiladelphia, PA 19103$100.00 6/6/2008
2008-4



Then, I need to compare it to a list that looks like this:
Excel Workbook
ABCDEFG
2ADELBERG KENNETHDEVON*3/31/200924002009-2010SestakThrough Committee
3ADELBERG KENNETHDEVON*3/25/200924002009-2010SestakThrough Committee
4ALCHIN, JOHN R.PHILADELPHIA*3/14/200910002009-2010SestakIndividual
5ALDERFER R. BRENTWAYNE*3/30/20095002009-2010SestakThrough Committee
6ANDERSON, HARRIET V.NEWTOWN SQUARE*3/25/20092502009-2010SestakIndividual
7ANDERSON, THOMAS J.ARLINGTON*3/30/200910002009-2010SestakIndividual
8ARMAO, CHRISTOPHER S.RESTON*3/23/20095002009-2010SestakIndividual
9ATTERBURY ELIZABETH G.MALVERN*3/24/20095002009-2010SestakThrough Committee
10BABIN, JOHN H.MEDIA*3/27/200924002009-2010SestakIndividual
11BABIN, MARYANN C.MEDIA*3/27/200924002009-2010SestakIndividual
12BARUCH, ANN R.HAVERFORD*3/31/200924002009-2010SestakIndividual
13BENOLIEL, PETER AWAYNE*3/25/20095002009-2010SestakIndividual
14BENTER WILLIAMPITTSBURGH*3/31/200910002009-2010SestakThrough Committee
15BERGER, DANIELPHILADELPHIA*3/31/200924002009-2010SestakIndividual
16BERGER, DANIELPHILADELPHIA*3/31/200924002009-2010SestakIndividual
17BERGER, HARRIETPHILADELPHIA*3/31/200924002009-2010SestakIndividual
18BERGER, HARRIETPHILADELPHIA*3/31/200924002009-2010SestakIndividual
19BERNSTEIN CLAUDE D.F.NEW CANAAN*3/30/20095002009-2010SestakThrough Committee
20BEST WILLIAM J.HUNTINGDON VALLEY*3/28/200924002009-2010SestakThrough Committee
21BEVAN JOHN SVILLANOVA*3/31/20095002009-2010SestakThrough Committee
22BEVAN, JOHN SVILLANOVA*3/10/20095002009-2010SestakIndividual
23BILLMERS RICHARD IRALANGHORNE*3/29/20092502009-2010SestakThrough Committee
24BLUM, HOWARD A.ELKINS PARK*3/10/20095002009-2010SestakIndividual
25BOGLE, JOHN C.VALLEY FORGE*3/31/200924002009-2010SestakIndividual
26BOULTER, BEAUARLINGTON*3/23/200910002009-2010SestakIndividual
09-10



What i'm ultimately looking for is a list of the names (and the data in the columns next to each name) of all the people that appear on at least 3 lists like this.

Any ideas?
 
Upvote 0
Don't understand about Col.F & G
try this anyway.
The result will be on Sheet2
Code:
Sub test()
Dim a, b(), i As Long, n As Long
a = Range("a1").CurrentRegion.Value
ReDim b(1 To UBound(a, 1) \ 3 + 1, 1 To 7)
For i = 1 To UBound(a, 1) Step 3
    n = n + 1
    b(n, 1) = Split(Split(a(i, 1), " ",2)(1), ",")(0)
    b(n, 2) = Split(a(i + 1, 1), ",")(0)
    b(n, 3) = Split(a(i + 2, 1))(1)
    b(n, 4) = Split(a(i + 2, 1))(0)
    b(n, 5) = Year(DateValue(b(n, 3))) & "-" & _
          Year(DateValue(b(n, 3))) + 1
    b(n, 6) = "Sestak"
Next
Sheets("sheet2").Cells(1).Resize(n, 7).Value = b
End Sub
 
Upvote 0
If you only have to do it a few times I would suggest, instead of just Paste, trying Edit/Paste Special and select Text when you paste it into the workbook. This will format it it into columns as you require.
 
Upvote 0
When I run that macro on the list like the first one, it tells me "subscript out of range."

It gives me the option to debug, but I have no idea what to do.

Let me know if you have any ideas!

Thanks!
 
Upvote 0
Can you step through the code ?

When you are in VBE
1) Click somewhere on the code.
2) As you hit F8, the code will execute one line.

Can you find the line that gives you an error ?
 
Upvote 0
Do I highlight a line in yellow (using f8) then go to the excel doc and run the macro?

I opened VBE and pressed F8 all the way down and nothing looked strange (although I don't know what to look for)
 
Upvote 0
1) Click somewhere on the code.
2) As you hit F8, the code will execute one line.

Then hit F8 again and again until the code gives you an error.
 
Upvote 0
It goes all the way down, then when it reaches the bottom it moves up to the first value that starts with "b" under the "For" heading, and just travels up and down that set. No error comes up.
 
Upvote 0

Forum statistics

Threads
1,223,227
Messages
6,170,849
Members
452,361
Latest member
d3ad3y3

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top