Separating Names from Single to Multiple Columns

ProvPC

Board Regular
Joined
Nov 2, 2014
Messages
62
I have an Excel spreadsheet containing roughly 3500 rows. The data from these rows needs to be copied into various tables in an Access database - but first, I need to manipulate the data a wee bit to fit the table structure in the database. There are a couple of things I'll need to get sorted out, but here's the first:

Column B in the spreadsheet contains a list of complete names of people, along with spouse names, in a format similar to Mouse, Mickey & Minnie. The good news is that the format is pretty consistent with the placement of commas and ampersands. The bad news is that the names themselves are formatted in a variety of ways. Some examples include:

Mouse, Mickey & Minnie
Dog, Pluto
Duck, Donald Douglas & Daisy
Dawg, Goofy & Clarabelle Mae Cow
Cat, Sylvester & Tweety Bird (Tricky, in the sense that "Bird" could be Tweety's last name, or Mrs. Cat's middle name. I'll have to check these manually)

The trouble is that the database table requires that the names be separated out into Head of House First Name, HoH Middle Initial, HoH Last Name, Spouse First Name, Spouse Middle Initial, and Spouse Last Name.

Though the vast majority of the data is represented by the first two examples above, there are enough variances to make manually fixing them somewhat of a chore. What I'm hoping for is a set of formulas that can look at column B, and extract from it the appropriate information required to fill in other columns that I'll create for that purpose. Here's my (very, very rough) idea of what should go where:

Column B (Complete Name) - the existing column with the data as it currently exists
Column C (Head of House Last Name) - the word before the comma
Column D (Head of House First Name) - the word after the comma (Bonus points if any Jr or Sr present can be placed in this column as well!)
Column E (Head of House Middle Initial) - the second word after the comma, but before the ampersand. Empty if none exists.
Column F (Spouse First Name) - the first word after the ampersand. Empty if none exists.
Column G (Spouse Middle Initial) - the second word after the ampersand. Empty if none exists.
Column H (Spouses Last Name) - the third word after the ampersand. Empty if column F is empty. Otherwise, the contents of Column C.

I tried doing this myself as a series a steps using the information found in another thread as a guide (thanks lancerj017). but I was only marginally successful. I was able to accomplish Column C, but column D didn't work if there wasn't a spouse, and column F would sometimes show the "&" and sometimes not, though I could not determine any difference in either the original data or the formula I was using (which was simply copied down the column).

Any help would be greatly appreciated!
 

Excel Facts

Enter current date or time
Ctrl+: enters current time. Ctrl+; enters current date. Use Ctrl+: Ctrl+; Enter for current date & time.
Looking at the formatting of the code on this message board... it might be an error without the line return after Sub SeparateNames()

Could you try hitting enter to put "Dim Names as Variant" on a separate line?

Okay - I did as you suggested, and I get an out of range error. The debug opens with this line highlighted:
.Range("C" & i).Value = Names(0)
I have no idea what that means, or whether I'm applying it correctly.
 
Upvote 0
It might help, but I have uploaded a file with my code above and your examples to help you see how it works. You can add more names below the example ones, or even new names instead of the ones I have there.

Thank you so much! This works beautifully, and if you decide to stop here and wash your hands of things, I'll still be indebted to you for having taken the time to create a solution that works for nearly all of the entries.
If you've a mind to take it one or two steps further, there are a couple of things that aren't working out quite right yet.

1) Although the column is labeled "middle initial", if you could change (or tell me what to change) to put the whole name in there for now, it would be helpful. I just need that until I get the "is this a different last name, or a middle name" thing figured out for specific individuals. For example, with "Cat, Sylvester & Tweety Bird", I'll have to determine whether "Bird" is her last name or middle name. That will be easier if the whole name appears there until it's sorted out. Then (I think) I can beg or borrow a formula to eliminate all but the first letter.

2) From what I've seen in the data on hand, in all (or very nearly all) of the spouses having three names after the ampersand, the third name is actually the spouses last name.
For example, with "Dawg, Goofy & Clarabelle Mae Cow", although partnered with Mr. Dawg, Clarabelle Mae has chosen to retain her last name, so "Cow" should actually appear in the last name field, rather than Dawg. Mae should appear as her middle name.

3) Can you think of any way to overcome the obstacles afforded by Jr and Sr titles? In a couple of the names I copied to your spreadsheet, Jr was interpreted as the Middle Initial, and so "Doe, John Jr" becomes "Doe, John J." after the macro runs. In these cases, the first name should actually read "John Jr". If there's not really any convenient any way around this, it's not a big issue to handle manually - but here again, having the "initial" field contain the whole name would be helpful for now - just until I get the data all cleaned up.

I think those are actually the only issues I can see at this point! There are some structure problems with the names in the original data (sometimes last names with prefixes are listed as "de Montoya" and sometimes as "Montoya de", for example) but there's not enough of those to worry about - they can easily be cleaned up when I spot them.

Thank you again! It's a great start!
 
Upvote 0
Give this a try:
Excel Workbook
ABCDEFGH
1*Mouse, Mickey & MinnieMouseMickey*Minnie*Mouse
2*Dog, PlutoDogPluto****
3*Duck, Donald Douglas & DaisyDuckDonaldDouglasDaisy*Duck
4*Dawg, Goofy & Clarabelle Mae CowDawgGoofy*ClarabelleMaeCow
5*Cat, Sylvester & Tweety BirdCatSylvester*TweetyBird*
Sheet
 
Upvote 0
Ok I managed to get some success with the following code. It will not catch everything, but I tried to account for as many of your conditions as I can. (they will still need to be checked, like you said, vis a vis middle names which should be last names and 2 or more word last names... The resulting columns would be:
Head Of House First name, Middle name, Last name, Suffix, Spouse First name, Middle Name, Last name, Suffix

Code:
Sub SeparateNames()
Dim Names As Variant
Dim HOHNameParts As Variant, SpouseNameParts As Variant
Dim UnSorted As String, LastNames As String
Dim HOHFullName As String, SpouseFullName As String
Dim lastRow As Integer, i As Integer, j As Integer
Dim R As Range




With ActiveSheet
    lastRow = .Cells(.Rows.Count, "B").End(xlUp).Row
    .Range("C2:J" & lastRow).ClearContents
    
    For i = 2 To lastRow
        HOHFullName = vbNullString
        SpouseFullName = vbNullString
        UnSorted = vbNullString
    
        Set R = .Range("B" & i)
        UnSorted = R.Value
        UnSorted = Replace(UnSorted, ", ", " & ")
        Names = Split(UnSorted & sAnd & sAnd & sAnd, " & ")


        HOHFullName = Names(1) & " " & Names(0)
        If UBound(Names) > 1 Then SpouseFullName = Names(2) & " " & Names(0)
        
        If Len(HOHFullName) <> Len(Replace(HOHFullName, "Jr ", "")) Then HOHFullName = Replace(HOHFullName, "Jr ", "") & " Jr"
        If Len(HOHFullName) <> Len(Replace(HOHFullName, "Sr ", "")) Then HOHFullName = Replace(HOHFullName, "Sr ", "") & " Sr"
        If Len(SpouseFullName) <> Len(Replace(SpouseFullName, "Jr ", "")) Then SpouseFullName = Replace(SpouseFullName, "Jr ", "") & " Jr"
        If Len(SpouseFullName) <> Len(Replace(SpouseFullName, "Sr ", "")) Then SpouseFullName = Replace(SpouseFullName, "Sr ", "") & " Sr"
        
        HOHNameParts = Split(HOHFullName, " ")
        SpouseNameParts = Split(SpouseFullName, " ")
        
        R.Offset(, 1).Value = HOHNameParts(0)
        Select Case UBound(HOHNameParts)
        Case 1
            R.Offset(, 3).Value = HOHNameParts(1)
        Case 2
            If Len(HOHNameParts(2)) = 2 Then
                R.Offset(, 3).Value = HOHNameParts(1)
                R.Offset(, 4).Value = HOHNameParts(2)
            Else
                R.Offset(, 2).Value = HOHNameParts(1)
                'If you want to use just the initial for middle names, use this instead of what's above.  R.Offset(,2).Value = Left(HOHNameParts(1),1) & "."
                R.Offset(, 3).Value = HOHNameParts(2)
            End If
        Case Is >= 3
            R.Offset(, 2).Value = HOHNameParts(1)
            'If you want to use just the initial for middle names, use this instead of what's above.  R.Offset(,2).Value = Left(HOHNameParts(1),1) & "."
            If Len(HOHNameParts(UBound(HOHNameParts))) = 2 Then
                For j = 2 To UBound(HOHNameParts) - 1
                    LastNames = LastNames & HOHNameParts(j) & " "
                Next j
                R.Offset(, 3).Value = Trim(LastNames)
                R.Offset(, 4).Value = HOHNameParts(UBound(HOHNameParts))
            Else
                For j = 2 To UBound(HOHNameParts)
                    LastNames = LastNames & HOHNameParts(j) & " "
                Next j
                R.Offset(, 3).Value = Trim(LastNames)
            End If
        End Select
        LastNames = vbNullString
        
        If Len(Join(SpouseNameParts)) <> 0 Then
        If SpouseNameParts(0) <> "" Then
            R.Offset(, 5).Value = SpouseNameParts(0)
            Select Case UBound(SpouseNameParts)
            Case 1
                R.Offset(, 7).Value = SpouseNameParts(1)
            Case 2
                If Len(SpouseNameParts(2)) = 2 Then
                    R.Offset(, 7).Value = SpouseNameParts(1)
                    R.Offset(, 8).Value = SpouseNameParts(2)
                Else
                    R.Offset(, 6).Value = SpouseNameParts(1)
                    'If you want to use just the initial for middle names, use this instead of what's above.  R.Offset(,6).Value = Left(SpouseNameParts(1),1) & "."
                    R.Offset(, 7).Value = SpouseNameParts(2)
                End If
            Case Is >= 3
                R.Offset(, 6).Value = SpouseNameParts(1)
                'If you want to use just the initial for middle names, use this instead of what's above.  R.Offset(,6).Value = Left(SpouseNameParts(1),1) & "."
                If Len(SpouseNameParts(UBound(SpouseNameParts))) = 2 Then
                    For j = 2 To UBound(SpouseNameParts) - 1
                        LastNames = LastNames & SpouseNameParts(j) & " "
                    Next j
                    If LastNames <> Names(0) Then LastNames = Replace(LastNames, Names(0), "")
                    R.Offset(, 7).Value = Trim(LastNames)
                    R.Offset(, 8).Value = SpouseNameParts(UBound(SpouseNameParts))
                Else
                    For j = 2 To UBound(SpouseNameParts)
                        LastNames = LastNames & SpouseNameParts(j) & " "
                    Next j
                    If LastNames <> Names(0) Then LastNames = Replace(LastNames, Names(0), "")
                    R.Offset(, 7).Value = Trim(LastNames)
                End If
            End Select
            LastNames = vbNullString
        End If
        End If
    Next i
End With


End Sub
 
Upvote 0
Give this a try:

This works very nearly flawlessly - thank you for your contribution! There is one little sniggle that confuses me, but maybe it will make sense to you. I copied the formulas in the positions you indicated, then simply used the fill handle to copy the formula down throughout the worksheet.

Nearly everything functioned exactly as expected (thank you!), but in several places throughout the worksheet, the name in column H is missing the first few letters. "Beazer" became "azer", "Villegas" became "egas", etc. The odd thing is that it's actually pretty rare - most names copy over exactly correctly. I carefully compared the formula for the few that didn't work with the ones above and below that worked perfectly (in case something got copied incorrectly), but didn't spot anything odd there, or with the names in original column. Odd, no?

Anyway, thanks for helping out! Between you and tygrrboi, I think I've got this stage pretty well worked out!
 
Upvote 0
Ok I managed to get some success with the following code. It will not catch everything, but I tried to account for as many of your conditions as I can. (they will still need to be checked, like you said, vis a vis middle names which should be last names and 2 or more word last names... The resulting columns would be:
Head Of House First name, Middle name, Last name, Suffix, Spouse First name, Middle Name, Last name, Suffix

Thanks, tygrrboi! This gives me the full names again, and that helps quite a bit. The only (very minor) thing that didn't quite work out is the Jr / Sr designation. It doesn't need to be fixed, by the way - I'm just mentioning it here because I thought you might like to know how it came out.

The thing is, I'm not sure how to clearly report what happened, so I'll just explain it this way. In every case in the spreadsheet where there was a "Jr" or "Sr" suffix, and there was no spouse involved, the macro correctly added the Jr or Sr designation to the suffix column. By coincidence, in each of these cases in the spreadsheet where there was no spouse involved, the Jr or Sr designation was not followed by a period. I don't know if that's relevant or not, but I found it interesting.

In each places in the spreadsheet where there was a "Jr" or "Sr" suffix, and a spouse was involved, the macro treated the suffix as a middle name and moved it to that column. By coincidence, in each of these cases in the spreadsheet, the Jr or Sr designation was followed by a period.

So there you go! I don't know if it was the period or the spouse that affected things (I'd bet on the spouse, but what do I know?). There's no need to address it - in looking over the results of the macro, I came to realize there's less than a dozen such designations involved! Super easy to fix manually, now that I know.

Thank you again for all your help! I think I'm well on my way for this part, but I'll doubtless be back for part 2!
 
Upvote 0
This works very nearly flawlessly - thank you for your contribution! There is one little sniggle that confuses me, but maybe it will make sense to you. I copied the formulas in the positions you indicated, then simply used the fill handle to copy the formula down throughout the worksheet.

Nearly everything functioned exactly as expected (thank you!), but in several places throughout the worksheet, the name in column H is missing the first few letters. "Beazer" became "azer", "Villegas" became "egas", etc. The odd thing is that it's actually pretty rare - most names copy over exactly correctly. I carefully compared the formula for the few that didn't work with the ones above and below that worked perfectly (in case something got copied incorrectly), but didn't spot anything odd there, or with the names in original column. Odd, no?

Anyway, thanks for helping out! Between you and tygrrboi, I think I've got this stage pretty well worked out!

In H1 use this modified formula:

=IFERROR(TRIM(CHOOSE(LEN(MID(B1,FIND("&",B1),200))-LEN(SUBSTITUTE(MID(B1,FIND("&",B1),200)," ",""))+1,"",C1,"",TRIM(RIGHT(SUBSTITUTE(B1," ",REPT(" ",50)),50)))),"")
 
Upvote 0
In H1 use this modified formula:

=IFERROR(TRIM(CHOOSE(LEN(MID(B1,FIND("&",B1),200))-LEN(SUBSTITUTE(MID(B1,FIND("&",B1),200)," ",""))+1,"",C1,"",TRIM(RIGHT(SUBSTITUTE(B1," ",REPT(" ",50)),50)))),"")

That did the trick! It works perfectly now. Thank you!
 
Upvote 0

Forum statistics

Threads
1,224,878
Messages
6,181,529
Members
453,053
Latest member
DavidKele

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top