hello,
I've got an interesting problem to solve. I've got a worksheet that will be re-used repeatedly to take raw automotive data and "convert" that data into a specifically formatted text output on another sheet.
Here is a sample source table of the data I have to process:
<tbody>[TR]
[TD]SKU[/TD]
[TD]YEAR[/TD]
[TD]MAKE[/TD]
[TD]MODEL[/TD]
[TD]SUB-MODEL[/TD]
[/TR]
[TR]
[TD]ProductA[/TD]
[TD]2006-2013|2005-2009[/TD]
[TD]Audi|Volkswagen[/TD]
[TD]A3|GTI[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2002-2007;2008|2006-2010[/TD]
[TD]Subaru|Mitsubishi[/TD]
[TD]Impreza;Legacy|Lancer[/TD]
[TD]2.5i,WRX,STi;GT,SpecB|GS[/TD]
[/TR]
</tbody>[/TABLE]
Here is what the output should be like:
<tbody>[TR]
[TD]SKU[/TD]
[TD]YEAR[/TD]
[TD]MAKE[/TD]
[TD]Model[/TD]
[TD]Sub-Model[/TD]
[/TR]
[TR]
[TD]ProductA[/TD]
[TD]2006-2013[/TD]
[TD]Audi[/TD]
[TD]A3[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]ProductA[/TD]
[TD]2005-2009[/TD]
[TD]Volkswagen[/TD]
[TD]GTI[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2002-2007[/TD]
[TD]Subaru[/TD]
[TD]Impreza[/TD]
[TD]2.5i[/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2002-2007[/TD]
[TD]Subaru[/TD]
[TD]Impreza[/TD]
[TD]WRX[/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2002-2007[/TD]
[TD]Subaru[/TD]
[TD]Impreza[/TD]
[TD]STi[/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2008[/TD]
[TD]Subaru[/TD]
[TD]Legacy[/TD]
[TD]GT[/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2008[/TD]
[TD]Subaru[/TD]
[TD]Legacy[/TD]
[TD]SpecB[/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2006-2010[/TD]
[TD]Mitsubishi[/TD]
[TD]Lancer[/TD]
[TD]GS[/TD]
[/TR]
</tbody>[/TABLE]
So as you can see, this is a lot of text processing with several different rules.
1) The sheet can have anywhere from 1 to 100,000 rows of data like this. The columns themselves WILL NOT be in this order, but will be spread out, and will have Named Ranges applied to each column for easy "grabbing" of that chunk of data.
2) The Pipe symbol will always designate a major separation of Year/Makes/Model/Sub-Models (YMMS). Meaning if a pipe symbol is encountered, it should be expected to find a matching pipe symbol in the rest of the YMMS columns.
3) semi-colons are used as symbol to split data within the same piped sequence.
4) commas are used to indicate that each piece of data between the commas will require a new line of data.
5) If there is commas and/or semi-colons encountered and there is no matching option in the next column over, leave the value blank, and start the next destination row.
6) If there is NO comma and/or semi-colons specified in the previous column, then re-use the same value within the same pipe segment.
7) I'm sure I'm missing a rule atm (its 1AM and I'm wiped).
I'd love to hear from people on what they think might be the best/most efficient way to handle this problem, hopefully with some sample code ideas.
Thanks in advance,
Spyrule.
I've got an interesting problem to solve. I've got a worksheet that will be re-used repeatedly to take raw automotive data and "convert" that data into a specifically formatted text output on another sheet.
Here is a sample source table of the data I have to process:
Source Sheet:
[TABLE="class: grid, width: 500, align: center"]<tbody>[TR]
[TD]SKU[/TD]
[TD]YEAR[/TD]
[TD]MAKE[/TD]
[TD]MODEL[/TD]
[TD]SUB-MODEL[/TD]
[/TR]
[TR]
[TD]ProductA[/TD]
[TD]2006-2013|2005-2009[/TD]
[TD]Audi|Volkswagen[/TD]
[TD]A3|GTI[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2002-2007;2008|2006-2010[/TD]
[TD]Subaru|Mitsubishi[/TD]
[TD]Impreza;Legacy|Lancer[/TD]
[TD]2.5i,WRX,STi;GT,SpecB|GS[/TD]
[/TR]
</tbody>[/TABLE]
Here is what the output should be like:
Destination Sheet:
[TABLE="class: grid, width: 500, align: center"]<tbody>[TR]
[TD]SKU[/TD]
[TD]YEAR[/TD]
[TD]MAKE[/TD]
[TD]Model[/TD]
[TD]Sub-Model[/TD]
[/TR]
[TR]
[TD]ProductA[/TD]
[TD]2006-2013[/TD]
[TD]Audi[/TD]
[TD]A3[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]ProductA[/TD]
[TD]2005-2009[/TD]
[TD]Volkswagen[/TD]
[TD]GTI[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2002-2007[/TD]
[TD]Subaru[/TD]
[TD]Impreza[/TD]
[TD]2.5i[/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2002-2007[/TD]
[TD]Subaru[/TD]
[TD]Impreza[/TD]
[TD]WRX[/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2002-2007[/TD]
[TD]Subaru[/TD]
[TD]Impreza[/TD]
[TD]STi[/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2008[/TD]
[TD]Subaru[/TD]
[TD]Legacy[/TD]
[TD]GT[/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2008[/TD]
[TD]Subaru[/TD]
[TD]Legacy[/TD]
[TD]SpecB[/TD]
[/TR]
[TR]
[TD]ProductB[/TD]
[TD]2006-2010[/TD]
[TD]Mitsubishi[/TD]
[TD]Lancer[/TD]
[TD]GS[/TD]
[/TR]
</tbody>[/TABLE]
So as you can see, this is a lot of text processing with several different rules.
1) The sheet can have anywhere from 1 to 100,000 rows of data like this. The columns themselves WILL NOT be in this order, but will be spread out, and will have Named Ranges applied to each column for easy "grabbing" of that chunk of data.
2) The Pipe symbol will always designate a major separation of Year/Makes/Model/Sub-Models (YMMS). Meaning if a pipe symbol is encountered, it should be expected to find a matching pipe symbol in the rest of the YMMS columns.
3) semi-colons are used as symbol to split data within the same piped sequence.
4) commas are used to indicate that each piece of data between the commas will require a new line of data.
5) If there is commas and/or semi-colons encountered and there is no matching option in the next column over, leave the value blank, and start the next destination row.
6) If there is NO comma and/or semi-colons specified in the previous column, then re-use the same value within the same pipe segment.
7) I'm sure I'm missing a rule atm (its 1AM and I'm wiped).
I'd love to hear from people on what they think might be the best/most efficient way to handle this problem, hopefully with some sample code ideas.
Thanks in advance,
Spyrule.