Power Query/Pivot - Complex Lookup Table

jives00

New Member
Joined
Feb 27, 2015
Messages
7
I'm still trying to figure out an easy solution to this one (previous thread there: http://www.mrexcel.com/forum/power-...ookup-something-like-vlookup.html#post4088238). That one gave me a lot of good ideas and got me looking into Power Query, which is an awesome tool. I was able to do a search lookup to get what I wanted there, but am now wondering if there's a better way still.

So, here's the issue I'm having:

I have many rows of transaction data from a corporate credit card. The data includes some things that help identify who the vendor is, including:

Supplier - Name
Supplier - Alternate Supplier DBA Name
Supplier - Tax Identification Numbe (TIN)
Supplier - Dun & Bradstreet Number
Supplier - ID

The issue is there is poor consistency in the data. So if I want to look up a vendor and see all it's transaction data, it's near impossible. E.g. Dell has Supplier - Name as: Dmi* Dell Bus Online, Dmi* Dell Higher Educ and Dell Sales & Service. Sometimes vendors all have the same TIN or other identifier, but that's not consistent either. And other than supplier name, none of the fields are filled in 100%, so there are blanks.

What I want to do is create an easy way to add a "normalized name". So for the above Dell examples, it would know that each of those names/TINs/etc. is for Dell.

I'm not sure the best way to go about this and looking for ideas. What I've been trying so far is to have a separate table that contains the same information above plus a column for normal name. I'm running into a fair amount of duplication though as some vendors have the same TIN but different D&B numbers or some other field. I'm not sure if those should go on here or not. Trying to keep maintenance eon this table as simple as possible.

My thinking then is to have /something/ do a lookup of the source data to find the associated normal name. E.g.:
1) Compare the Source Data TIN to the Normalization Table TIN. If it finds a match, associate it with the normal name
2) No match on #1, compare the Source Data D&B to the Normalization Table D&B. If it finds a match, associate it with the normal name.
3) If no match on #2, do it again for the ID#, alternate name, source name, etc.
4) If it still doesn't find a match either leave the name blank or put something generic in (e.g. MISSING).

I would like to do this as a relationship in Power Pivot with the RELATED formula. But that doesn't work since there are blanks and duplicate values. But having the normal names in a seperate table and be linked back to the transaction data is idle - since I'll have other transaction sources in the future and would rather merge them together based on the normal name.

I'm familiar with both Power Pivot and Power Query (though not an expert). I'm open to using other tools, but if they involve scripting/coding, I'd need help writing them as that's outside my knowledge.

I have some sample data available (transaction data + normalization table) at: https://dl.dropboxusercontent.com/u/7340513/Sample Data.xlsx

Thoughts? Thanks!
 

Excel Facts

Spell Check in Excel
Press F7 to start spell check in Excel. Be careful, by default, Excel does not check Capitalized Werds (whoops)
Well that's... ugly :)

We should just assume that Power Pivot is NOT the place for this. Power Query is much better... but it is still going to be somewhere between hard... and a disaster :)

I don't think EITHER of these techs support regular expressions yet which isn't going to help your cause. And you are going to match Dell in Jimmys' Mortadella Shop.

How many suppliers are we dealing with, and how often are they added? I want to say I would do this as a linked table... and basically manually map SupplierId to a Name I specify... grind through it once, then expect folks to update the linked table as necessary.
 
Upvote 0
Grind through once and update as needed is what we planned/expected to do, so that's alright.

There's about 40k vendors right now - although that will drop down as many are duplicates (same TIN, or source name, etc). Updates will be daily, but only a small number added each time. Our business logic will probably be to use the same as the source name by default unless it's something we care about.

How would the linked table work? That (just from the name) sounds like what I was thinking, but didn't know how to do it correctly.
 
Upvote 0

Forum statistics

Threads
1,224,078
Messages
6,176,244
Members
452,716
Latest member
Elo

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top