extrapolate full social security based on last 4 digits

y3tter

Board Regular
Joined
Nov 11, 2012
Messages
147
I have an employee worksheet that I am using to get open enrollment processed. Our insurance provider provided us with a a sheet with employee names and their redacted social security number(ex xxx-xx-1234). I am needing to have the full SS number displayed. I have a another sheet that lists employee names and their full SS number. How can I do a lookup to match the last four digits of the redacted number to and return the full SS number?
 

Excel Facts

When they said...
When they said you are going to "Excel at life", they meant you "will be doing Excel your whole life".
If the names are the same in both files you can do a lookup on the name otherwise you would need to lookup the last 4. Note that the last for of a SSN is not unique so it is possible that you could have an employee with 123-45-6789 and another employee with 231-13-6789. What is in the file that you get from the insurance provider is it only 4 digits or xxx-xx-1234?
 
Upvote 0
Right, but we do not have enough employees for the 1/10,000 chance of occurrence. The issue with employee name lookup is that there are several relatives that work together and their beneficiaries.

The sheet I got from insurance is xxx-xx-1234, but I did text to columns to remove the xxx-xx- and am just left with the last four digits.
 
Upvote 0
Excel 2010[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TH][/TH]
[TH]A
[/TH]
[TH]B
[/TH]
[TH]C
[/TH]
[/TR]
[TR]
[TD="align: center"]1
[/TD]
[TD]File form insurance
[/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[/TR]
[TR]
[TD="align: center"]2
[/TD]
[TD]Beth Green
[/TD]
[TD="align: right"]1234
[/TD]
[TD]123-13-1234
[/TD]
[/TR]
[TR]
[TD="align: center"]3
[/TD]
[TD]Rick Grimes
[/TD]
[TD="align: right"]3214
[/TD]
[TD]312-74-3214
[/TD]
[/TR]
[TR]
[TD="align: center"]4
[/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[/TR]
[TR]
[TD="align: center"]5
[/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[/TR]
[TR]
[TD="align: center"]6
[/TD]
[TD]your file
[/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[/TR]
[TR]
[TD="align: center"]7
[/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD]helper column
[/TD]
[/TR]
[TR]
[TD="align: center"]8
[/TD]
[TD]Beth Green
[/TD]
[TD]123-13-1234
[/TD]
[TD="align: right"]1234
[/TD]
[/TR]
[TR]
[TD="align: center"]9
[/TD]
[TD]Rick Grimes
[/TD]
[TD]312-74-3214
[/TD]
[TD="align: right"]3214
[/TD]
[/TR]
</tbody>[/TABLE]
Sheet4

[TABLE="width: 85%"]
<tbody>[TR]
[TD]
Worksheet Formulas[TABLE="class: grid, width: 100%"]
<tbody>[TR]
[TH="width: 10"]Cell
[/TH]
[TH="align: left"]Formula
[/TH]
[/TR]
[TR]
[TH="width: 10"]C8
[/TH]
[TD="align: left"]=RIGHT(B8,4)
[/TD]
[/TR]
[TR]
[TH="width: 10"]C9
[/TH]
[TD="align: left"]=RIGHT(B9,4)
[/TD]
[/TR]
</tbody>[/TABLE]
[/TD]
[/TR]
</tbody>[/TABLE]

[TABLE="width: 85%"]
<tbody>[TR]
[TD]
Array Formulas[TABLE="class: grid, width: 100%"]
<tbody>[TR]
[TH="width: 10"]Cell
[/TH]
[TH="align: left"]Formula
[/TH]
[/TR]
[TR]
[TH="width: 10"]C2
[/TH]
[TD="align: left"]{=INDEX($B$8:$B$9,MATCH(B2,RIGHT($C$8:$C$9,4)+0,0))}
[/TD]
[/TR]
[TR]
[TH="width: 10"]C3
[/TH]
[TD="align: left"]{=INDEX($B$8:$B$9,MATCH(B3,RIGHT($C$8:$C$9,4)+0,0))}
[/TD]
[/TR]
</tbody>[/TABLE]
Entered with Ctrl+Shift+Enter. If entered correctly, Excel will surround with curly braces {}.
Note: Do not try and enter the {} manually yourself[/TD]
[/TR]
</tbody>[/TABLE]
 
Last edited:
Upvote 0
Excel 2010[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TH][/TH]
[TH]A[/TH]
[TH]B[/TH]
[TH]C[/TH]
[/TR]
[TR]
[TD="align: center"]1[/TD]
[TD]File form insurance[/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[/TR]
[TR]
[TD="align: center"]2[/TD]
[TD]Beth Green[/TD]
[TD="align: right"]1234[/TD]
[TD]123-13-1234[/TD]
[/TR]
[TR]
[TD="align: center"]3[/TD]
[TD]Rick Grimes[/TD]
[TD="align: right"]3214[/TD]
[TD]312-74-3214[/TD]
[/TR]
[TR]
[TD="align: center"]4[/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[/TR]
[TR]
[TD="align: center"]5[/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[/TR]
[TR]
[TD="align: center"]6[/TD]
[TD]your file[/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[/TR]
[TR]
[TD="align: center"]7[/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD]helper column[/TD]
[/TR]
[TR]
[TD="align: center"]8[/TD]
[TD]Beth Green[/TD]
[TD]123-13-1234[/TD]
[TD="align: right"]1234[/TD]
[/TR]
[TR]
[TD="align: center"]9[/TD]
[TD]Rick Grimes[/TD]
[TD]312-74-3214[/TD]
[TD="align: right"]3214[/TD]
[/TR]
</tbody>[/TABLE]
Sheet4

[TABLE="width: 85%"]
<tbody>[TR]
[TD]Worksheet Formulas[TABLE="class: grid, width: 100%"]
<tbody>[TR]
[TH="width: 10"]Cell[/TH]
[TH="align: left"]Formula[/TH]
[/TR]
[TR]
[TH="width: 10"]C8[/TH]
[TD="align: left"]=RIGHT(B8,4)[/TD]
[/TR]
[TR]
[TH="width: 10"]C9[/TH]
[TD="align: left"]=RIGHT(B9,4)[/TD]
[/TR]
</tbody>[/TABLE]
[/TD]
[/TR]
</tbody>[/TABLE]

[TABLE="width: 85%"]
<tbody>[TR]
[TD]Array Formulas[TABLE="class: grid, width: 100%"]
<tbody>[TR]
[TH="width: 10"]Cell[/TH]
[TH="align: left"]Formula[/TH]
[/TR]
[TR]
[TH="width: 10"]C2[/TH]
[TD="align: left"]{=INDEX($B$8:$B$9,MATCH(B2,RIGHT($C$8:$C$9,4)+0,0))}[/TD]
[/TR]
[TR]
[TH="width: 10"]C3[/TH]
[TD="align: left"]{=INDEX($B$8:$B$9,MATCH(B3,RIGHT($C$8:$C$9,4)+0,0))}[/TD]
[/TR]
</tbody>[/TABLE]
Entered with Ctrl+Shift+Enter. If entered correctly, Excel will surround with curly braces {}.
Note: Do not try and enter the {} manually yourself[/TD]
[/TR]
</tbody>[/TABLE]


Perfect, thank you SO MUCH!!
 
Upvote 0
Right, but we do not have enough employees for the 1/10,000 chance of occurrence.
You have fallen prey to some bad math! That is not how statistics work.
Actually, if you have 118 people, you have a 50% chance that two will have the same last four digits.
So, if you have over 118 people, you have more than a 50% chance that you will duplicates!
At 214 people, the percentage goes up to 90%!

So, I would recommend by matching on more than just the last four digits. Maybe add last name to your match too.
 
  • Like
Reactions: shg
Upvote 0
If you want to avoid the helper column you can used this instead.
Code:
=INDEX($B$8:$B$9,MATCH("*-"&B3,$B$8:$B$9,0))
 
Upvote 0
You have fallen prey to some bad math! That is not how statistics work.
Actually, if you have 118 people, you have a 50% chance that two will have the same last four digits.
So, if you have over 118 people, you have more than a 50% chance that you will duplicates!
At 214 people, the percentage goes up to 90%!

So, I would recommend by matching on more than just the last four digits. Maybe add last name to your match too.


Cool, thanks for the info! Out of 655, there were no duplicates. Could I have just won the proverbial statistical lottery? Statistical analysis vs probability...
 
Upvote 0
Out of 655, there were no duplicates. Could I have just won the proverbial statistical lottery? Statistical analysis vs probability...

Are you sure?

Scott T's formula relies on there being no duplicates of the last 4 digits. If there are duplicates, the formula will return the first match, which may not be the correct one.

If the last 4 digits are totally random (which is not exactly true), then extrapolating out Joe4's analysis you have of the order of 1 in a billion chance of having 655 different 4 digit sequences.

Based on those odds, my bet is that you do have duplicates.
 
Upvote 0
Are you sure?

Scott T's formula relies on there being no duplicates of the last 4 digits. If there are duplicates, the formula will return the first match, which may not be the correct one.

If the last 4 digits are totally random (which is not exactly true), then extrapolating out Joe4's analysis you have of the order of 1 in a billion chance of having 655 different 4 digit sequences.

Based on those odds, my bet is that you do have duplicates.

Yes, data was sorted by last 4 digits and then reviewed for duplicates. Scott T's formula was the best option due to not every family falling into the traditional model(stepchildren, unmarried, children with different address). Joe4 scared me with his reminder of the birthday paradox. Apparently social security numbers aren't completely random and rely on dob, place of birth, etc. Still, I was probably lucky to not have witnessed any duplications.
 
Upvote 0

Forum statistics

Threads
1,223,903
Messages
6,175,284
Members
452,630
Latest member
OdubiYouth

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top