I'm still trying to figure out an easy solution to this one (previous thread there: http://www.mrexcel.com/forum/power-...ookup-something-like-vlookup.html#post4088238). That one gave me a lot of good ideas and got me looking into Power Query, which is an awesome tool. I was able to do a search lookup to get what I wanted there, but am now wondering if there's a better way still.
So, here's the issue I'm having:
I have many rows of transaction data from a corporate credit card. The data includes some things that help identify who the vendor is, including:
Supplier - Name
Supplier - Alternate Supplier DBA Name
Supplier - Tax Identification Numbe (TIN)
Supplier - Dun & Bradstreet Number
Supplier - ID
The issue is there is poor consistency in the data. So if I want to look up a vendor and see all it's transaction data, it's near impossible. E.g. Dell has Supplier - Name as: Dmi* Dell Bus Online, Dmi* Dell Higher Educ and Dell Sales & Service. Sometimes vendors all have the same TIN or other identifier, but that's not consistent either. And other than supplier name, none of the fields are filled in 100%, so there are blanks.
What I want to do is create an easy way to add a "normalized name". So for the above Dell examples, it would know that each of those names/TINs/etc. is for Dell.
I'm not sure the best way to go about this and looking for ideas. What I've been trying so far is to have a separate table that contains the same information above plus a column for normal name. I'm running into a fair amount of duplication though as some vendors have the same TIN but different D&B numbers or some other field. I'm not sure if those should go on here or not. Trying to keep maintenance eon this table as simple as possible.
My thinking then is to have /something/ do a lookup of the source data to find the associated normal name. E.g.:
1) Compare the Source Data TIN to the Normalization Table TIN. If it finds a match, associate it with the normal name
2) No match on #1, compare the Source Data D&B to the Normalization Table D&B. If it finds a match, associate it with the normal name.
3) If no match on #2, do it again for the ID#, alternate name, source name, etc.
4) If it still doesn't find a match either leave the name blank or put something generic in (e.g. MISSING).
I would like to do this as a relationship in Power Pivot with the RELATED formula. But that doesn't work since there are blanks and duplicate values. But having the normal names in a seperate table and be linked back to the transaction data is idle - since I'll have other transaction sources in the future and would rather merge them together based on the normal name.
I'm familiar with both Power Pivot and Power Query (though not an expert). I'm open to using other tools, but if they involve scripting/coding, I'd need help writing them as that's outside my knowledge.
I have some sample data available (transaction data + normalization table) at: https://dl.dropboxusercontent.com/u/7340513/Sample Data.xlsx
Thoughts? Thanks!
So, here's the issue I'm having:
I have many rows of transaction data from a corporate credit card. The data includes some things that help identify who the vendor is, including:
Supplier - Name
Supplier - Alternate Supplier DBA Name
Supplier - Tax Identification Numbe (TIN)
Supplier - Dun & Bradstreet Number
Supplier - ID
The issue is there is poor consistency in the data. So if I want to look up a vendor and see all it's transaction data, it's near impossible. E.g. Dell has Supplier - Name as: Dmi* Dell Bus Online, Dmi* Dell Higher Educ and Dell Sales & Service. Sometimes vendors all have the same TIN or other identifier, but that's not consistent either. And other than supplier name, none of the fields are filled in 100%, so there are blanks.
What I want to do is create an easy way to add a "normalized name". So for the above Dell examples, it would know that each of those names/TINs/etc. is for Dell.
I'm not sure the best way to go about this and looking for ideas. What I've been trying so far is to have a separate table that contains the same information above plus a column for normal name. I'm running into a fair amount of duplication though as some vendors have the same TIN but different D&B numbers or some other field. I'm not sure if those should go on here or not. Trying to keep maintenance eon this table as simple as possible.
My thinking then is to have /something/ do a lookup of the source data to find the associated normal name. E.g.:
1) Compare the Source Data TIN to the Normalization Table TIN. If it finds a match, associate it with the normal name
2) No match on #1, compare the Source Data D&B to the Normalization Table D&B. If it finds a match, associate it with the normal name.
3) If no match on #2, do it again for the ID#, alternate name, source name, etc.
4) If it still doesn't find a match either leave the name blank or put something generic in (e.g. MISSING).
I would like to do this as a relationship in Power Pivot with the RELATED formula. But that doesn't work since there are blanks and duplicate values. But having the normal names in a seperate table and be linked back to the transaction data is idle - since I'll have other transaction sources in the future and would rather merge them together based on the normal name.
I'm familiar with both Power Pivot and Power Query (though not an expert). I'm open to using other tools, but if they involve scripting/coding, I'd need help writing them as that's outside my knowledge.
I have some sample data available (transaction data + normalization table) at: https://dl.dropboxusercontent.com/u/7340513/Sample Data.xlsx
Thoughts? Thanks!