I need to parse complex, variable length strings (from a csv export from a purchasing system when coding is split) where each string may contact one record or multiple records. This example contains three records. The fields may contain 2 to 99 individual records.
Some of the data has been replaced with fake data to preserve business confidentiality:
Records start with number of accounts ("3 Accounts") not repeated in the string, then a numeric amount. Then each significant field (Proj ID, BU, Account, Team, Sub-Team, Project Code) in each record is repeated three to six times - once in parenthesis, once w/ Field Name: 00000 format, and one to four times in "open strings". The end of each record is a semicolon.
I need to extract each significant field ONCE and remove all redundant or unnecessary data. And, of course, multiple records need to be separated into multiple lines. To make it more complicated, these strings have cells (fields) both left and right of them, mostly in good shape from the export. So I have to count the number of actual records in the string and store that to use for loops that copy the data to the left and right down the appropriate number of times.
Non-split records in this field look like this:
And are easily parsed using hyphens as a delimiter. So the result of parsing the "split" fields should be 6 fields separated by hyphens, even if the fields are empty, by however many lines are necessary for the number of records.
The solution could be VBA, formulas, or a combination of both. I've been racking my brain but can't really wrap my head around a solution. Thank you in advance.
Some of the data has been replaced with fake data to preserve business confidentiality:
3 Accounts: 650.22 USD (8.46%) New Store/Facility (180000) US, 6666 - Fake City-99999 - XYZ - Fake City-180000 - CIP-STORES-200 - Whole Body-21020 - Refrigerated Cases, Proj ID: 6666 - Fake City (6666), BU: 99999 - XYZ - Fake City (99999), Account: 180000 - CIP-STORES (180000), Team: 200 - Whole Body (200), Sub-Team: , ProjCode: 21020 - Refrigerated Cases (21020); 3,196.56 USD (41.59%) New Store/Facility (180000) US, 6666 - Fake City-99999 - XYZ - Fake City-180000 - CIP-STORES-160 - Meat-21020 - Refrigerated Cases, Proj ID: 6666 - Fake City (6666), BU: 99999 - XYZ - Fake City (99999), Account: 180000 - CIP-STORES (180000), Team: 160 - Meat (160), Sub-Team: , ProjCode: 21020 - Refrigerated Cases (21020); 3,839.09 USD (49.95%) New Store/Facility (180000) US, 8888 - Fake City-99999 - XYZ - Fake City-180000 - CIP-STORES-180 - Seafood-21020 - Refrigerated Cases, Proj ID: 6666 - Fake City (6666), BU: 99999 - XYZ - Fake City (99999), Account: 180000 - CIP-STORES (180000), Team: 180 - Seafood (180), Sub-Team: , ProjCode: 21020 - Refrigerated Cases (21020)
Records start with number of accounts ("3 Accounts") not repeated in the string, then a numeric amount. Then each significant field (Proj ID, BU, Account, Team, Sub-Team, Project Code) in each record is repeated three to six times - once in parenthesis, once w/ Field Name: 00000 format, and one to four times in "open strings". The end of each record is a semicolon.
I need to extract each significant field ONCE and remove all redundant or unnecessary data. And, of course, multiple records need to be separated into multiple lines. To make it more complicated, these strings have cells (fields) both left and right of them, mostly in good shape from the export. So I have to count the number of actual records in the string and store that to use for loops that copy the data to the left and right down the appropriate number of times.
Non-split records in this field look like this:
9999-66666-180000-120--21020
And are easily parsed using hyphens as a delimiter. So the result of parsing the "split" fields should be 6 fields separated by hyphens, even if the fields are empty, by however many lines are necessary for the number of records.
The solution could be VBA, formulas, or a combination of both. I've been racking my brain but can't really wrap my head around a solution. Thank you in advance.