# Power Query - Alternative to hard-coding a translation table?



## macfuller (Jan 18, 2018)

I frequently use a handy function (from Chris Webb's site, as I recall) to translate data codes from source data into human-readable values. Like many people, I get data from ERP and OLTP sources with inscrutable identifiers.
The translation values are hard-coded in the function, so if my data has "N", I want the output column to return "North":



```
(Lookup as text) =>
let
    // Translation pairs; the final {Lookup, "Other"} row acts as the default
    values = {
        {"N", "North"},
        {"E", "East"},
        {"S", "South"},
        {"W", "West"},
        {Lookup, "Other"}
    },
    // Return the second element of the first pair whose first element matches
    Result = List.First(List.Select(values, each _{0} = Lookup)){1}
in
    Result
```

And it's called within the primary query by:

```
AddNewDirectionCol = Table.AddColumn(RenameOldDirection, "Direction", each fnTranslateCompass([CompassPoint]), type text)
```


I can have 30+ rows for the translation values, with separate translation tables for separate queries. These values change more often than I'd like. I thought that if I maintained the data in a local table within the spreadsheet, I could modify the spreadsheet data once and the queries would pick it up without my having to adjust each query:


```
(SourceTable as text, LookupCode as text) =>
let
    // Get the translation table
    Source = Excel.CurrentWorkbook(){[Name=SourceTable]}[Content],
    // Rename the columns in case they aren't standard
    RenameCols = Table.RenameColumns(Source, {{Table.ColumnNames(Source){0}, "Index"}, {Table.ColumnNames(Source){1}, "Value"}}),
    Result = List.First(Table.Column(Table.SelectRows(RenameCols, each [Index] = LookupCode), "Value"))
in
    Result
```
This works fine as a generic function, but of course it's going to load the translation table for every line of the source data where the translation happens. That will push the query time into next week for sources with a million-plus rows.

Is there a way to call the translation table just once so that the function within the query can re-use the table for every line of the primary query?  Or am I stuck with hard-coding each separate translation function?
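[Editor's note: one possible sketch of loading the table once. It buffers the translation table a single time and returns a function that closes over it, so each row lookup reuses the buffered copy. The table name `CompassTable` and the `Index`/`Value` column names are illustrative assumptions, not from the original post.]

```
let
    // Load and buffer the translation table once per query evaluation.
    // Columns are assumed to be [Index] (the code) and [Value] (the label).
    Source = Table.Buffer(Excel.CurrentWorkbook(){[Name="CompassTable"]}[Content]),
    // The returned function closes over the buffered table
    fnLookup = (LookupCode as text) as text =>
        let
            Matches = Table.SelectRows(Source, each [Index] = LookupCode),
            Result = if Table.IsEmpty(Matches) then "Other" else Matches{0}[Value]
        in
            Result
in
    fnLookup
```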


----------



## macfuller (Jan 19, 2018)

Yes, I can load the table in the primary query and have the function parameter as a table instead of a string.  But I was kind of hoping I could have a self-contained function to keep the logic entirely in one place.


----------



## Ozeroth (Jan 25, 2018)

Hi macfuller,

Regardless of how you use functions, I would think a Merge Queries operation (Table.NestedJoin) would be the most efficient way of looking up values for every row of your original table.

You would have to do this for every translation you have to perform, so you would need a set of translation tables.

If you wanted to you could write a function that works something like:


```
TableWithTranslatedColumn = fn_TranslateColumn( OriginalTable, OriginalColumnName, TranslationTable)
```
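[Editor's note: a minimal sketch of what such a function might look like, using Table.NestedJoin as described above. It assumes the translation table has columns `Index` (code) and `Value` (label); those names, and the output column name, are illustrative assumptions.]

```
(OriginalTable as table, OriginalColumnName as text, TranslationTable as table) as table =>
let
    // Left-outer join so original rows are kept even when no translation matches
    Joined = Table.NestedJoin(OriginalTable, {OriginalColumnName},
                              TranslationTable, {"Index"},
                              "Translation", JoinKind.LeftOuter),
    // Expand just the [Value] column from the nested translation table
    Expanded = Table.ExpandTableColumn(Joined, "Translation",
                              {"Value"}, {OriginalColumnName & " Translated"})
in
    Expanded
```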
Please post back if needed - you may well have a good solution already! 

Regards,
Owen


----------



## macfuller (Jan 28, 2018)

Merges seem to take a very long time on large record sets, though I haven't done a test to compare.  

As part of that question, I would be curious to know whether a query processes all operations on a row before moving to the next, so that it iterates through the record set only once instead of once per alteration. For example, if I'm creating a calendar table with added columns for month name (Sep) and workday (Mon-Fri, Y/N), the query steps would suggest it iterates once for each column, but it would be far faster if it performed both operations in the same pass. In that case, I'd think a function to add a lookup value would be faster than a merge.


----------



## anvg (Jan 30, 2018)

Hi


macfuller said:


> Merges seem to take a very long time on large record sets, though I haven't done a test to compare.


How large is a large record count? I tested with 600,000 records: https://drive.google.com/open?id=1VZqaDdrYXLuQKKGuVJ3S2BGHqKDeKBXM
Regards,


----------



## macfuller (Jan 30, 2018)

5.6 million rows as of this week. There are quite a few other operations performed on the table as well, so I haven't split out just the initial merge operation. And with 3 or 4 lookups for the various coded columns coming from the ERP system, that would be 3-4 merges, I believe.

Another question on the merge is how it would handle the occasional missing value. With the lookup function I can supply a default such as "Other", but with a merge I believe unmatched rows come back blank? On some columns a blank could imply an unknown code, but on other columns blanks indicate something else, so I would need a default for when a code value is unknown rather than missing.
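[Editor's note: a left-outer merge does return null for unmatched codes, so one hedged sketch is to replace those nulls after expanding the merged column. The step names `Expanded` and the `Direction` column name are illustrative assumptions.]

```
// After expanding the merged [Value] column into, say, [Direction],
// replace nulls (codes with no match in the translation table) with a default:
WithDefault = Table.ReplaceValue(Expanded, null, "Other", Replacer.ReplaceValue, {"Direction"})
```

Note this only catches unmatched codes; genuinely blank source values would need to be distinguished (e.g. filtered or tagged) before the merge if they mean something different.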


----------

