Hi there,
I am trying to find a way to harmonise the way a product is coded from various ways people may have entered it (i.e., the cut of fish - in parentheses here but note that this is not always the case).
For example, in the sample data below, "Gutted" and "whole gutted" mean the same thing, but "headed & gutted" is a different thing, and "whole" alone is yet a different thing.
TAB "Raw data", column A:
I am trying to automate the translation of these multiple possible terms people may use into a standard state code, by means of a formula in column B.
I have created a list of possible ways to name various cuts of fish (i.e. StateName) in a different tab "States translation" (column A), and the corresponding standard code (StateCode) next to it (column B).
TAB "States translation", column C and D:
I have made these two lists into arrays that I have named "StateName" and "StateCode" and that I would like to refer to in the tab where my raw data will be.
As you see in the two columns above, different states may include the same terms (e.g., gutted is found in GGT, GHCO, HGT and HGU) and so I have organised my list of possible terms in a hierarchical order, so that the search should end when the first matching StateName is found withing the string in column D.
So for example, for the fifth term in column A, "Broadbill Swordfish (Headed & Gutted)", I'd like the formula to return HGU (rather than any of the codes above).
I have tried various methods (lookup, index, match) unsuccessfully as they either find multiple matches or result in lists of the same length as my arrays.
I have a very manual and inelegant solution to this problem, using IFS, ISNUMBER and SEARCH, but I am sure there must be a much more straightforward way than this:
=IF(D2<>"",IFS(ISNUMBER(SEARCH('States translation'!$C$2,D2)),'States translation'!$D$2,ISNUMBER(SEARCH('States translation'!$C$3,D2)),'States translation'!$D$3,ISNUMBER(SEARCH('States translation'!$C$4,D2)),'States translation'!$D$4,ISNUMBER(SEARCH('States translation'!$C$5,D2)),'States translation'!$D$5, [repeat same arguments with the entire list of 180+ items ] , TRUE,"Check"),"")
Many thanks for your help!
PS: Note that more state names could always be added to my arrays later on if it turn out that people have used names that were not covered in my current lists.
I am trying to find a way to harmonise the way a product is coded from various ways people may have entered it (i.e., the cut of fish - in parentheses here but note that this is not always the case).
For example, in the sample data below, "Gutted" and "whole gutted" mean the same thing, but "headed & gutted" is a different thing, and "whole" alone is yet a different thing.
TAB "Raw data", column A:
I am trying to automate the translation of these multiple possible terms people may use into a standard state code, by means of a formula in column B.
I have created a list of possible ways to name various cuts of fish (i.e. StateName) in a different tab "States translation" (column A), and the corresponding standard code (StateCode) next to it (column B).
TAB "States translation", column C and D:
I have made these two lists into arrays that I have named "StateName" and "StateCode" and that I would like to refer to in the tab where my raw data will be.
As you see in the two columns above, different states may include the same terms (e.g., gutted is found in GGT, GHCO, HGT and HGU) and so I have organised my list of possible terms in a hierarchical order, so that the search should end when the first matching StateName is found withing the string in column D.
So for example, for the fifth term in column A, "Broadbill Swordfish (Headed & Gutted)", I'd like the formula to return HGU (rather than any of the codes above).
I have tried various methods (lookup, index, match) unsuccessfully as they either find multiple matches or result in lists of the same length as my arrays.
I have a very manual and inelegant solution to this problem, using IFS, ISNUMBER and SEARCH, but I am sure there must be a much more straightforward way than this:
=IF(D2<>"",IFS(ISNUMBER(SEARCH('States translation'!$C$2,D2)),'States translation'!$D$2,ISNUMBER(SEARCH('States translation'!$C$3,D2)),'States translation'!$D$3,ISNUMBER(SEARCH('States translation'!$C$4,D2)),'States translation'!$D$4,ISNUMBER(SEARCH('States translation'!$C$5,D2)),'States translation'!$D$5, [repeat same arguments with the entire list of 180+ items ] , TRUE,"Check"),"")
Many thanks for your help!
PS: Note that more state names could always be added to my arrays later on if it turn out that people have used names that were not covered in my current lists.