Hi Guys, Long time user but first time poster. I have always been able to find solutions and model them to my usage from this gem of a website. Kudos to the content here.
I have a deadline on Monday and need to somehow gather a lot of info from a thousands of rows of product description/import-export trade data. the data is of this format:
[TABLE="width: 867"]
<colgroup><col></colgroup><tbody>[TR]
[TD="align: left"]STAINLESS STEEL HOT ROLLED SHEET GRADE 310 S WIDTH 1524 MM THK 5 MM[/TD]
[/TR]
[TR]
[TD="align: left"](H.R.)S.S COIL(GRADE:316L)SIZE:6.28MMX1600MMXCOIL(EXCESS PRIME,SLIT EDGES,HR,ANNEALED(MECH.DESCALRD)BRAND;1811ML(ASTM A[/TD]
[/TR]
[TR]
[TD="align: left"](STAINLESS STEEL COIL) COLD ROLLED (GRADE 304) MDK PPSS SIZE 07 X 580 FF[/TD]
[/TR]
[TR]
[TD="align: left"]0.29*335MM COLD ROLLED STAINLESS STEEL COIL 201,2B[/TD]
[/TR]
[TR]
[TD="align: left"]S.S. SHEET 316LVM 3.00MM DIA (C.EX. FILE NO. VGN(30)26/MISC.CORR/ADLER/RTN/T/12 DT.15/10/2012)[/TD]
[/TR]
[TR]
[TD="align: left"]STAINLESS STEEL SHEETS GRADE 1.4003, H.R. SIZE - 4.00 MM X 1092 MM X 3454 MM[/TD]
[/TR]
[TR]
[TD="align: left"]0.37 MM X 1240MM X COIL NON MAGENTIC STAINLESS STEEL COIL AISI 201[/TD]
[/TR]
[TR]
[TD="align: left"]STAINLESS STEEL HOT ROLLED SHEET GRADE 310 S WIDTH 1500 MM THK 5 MM[/TD]
[/TR]
[TR]
[TD="align: left"]0.76 MM X 1240MM X COIL NON MAGENTIC STAINLESS STEEL COIL AISI 201[/TD]
[/TR]
</tbody>[/TABLE]
As you can see there is no discernible pattern in string types. below is the stuff I need and how I am going about it:
[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]Parameter Needed [/TD]
[TD]How I am doing it [/TD]
[/TR]
[TR]
[TD]Stainless steel or not (basically from keywords Stainless or S.S/S. S and variants)[/TD]
[TD]Advanced Filter[/TD]
[/TR]
[TR]
[TD]Product type (Sheet, Coil, plate etc)[/TD]
[TD]Advanced filter[/TD]
[/TR]
[TR]
[TD]Hot Roll or cold roll (with keywords as HOT RO/H.R/H R and variants with wild card chars)[/TD]
[TD]Advanced filter [/TD]
[/TR]
[TR]
[TD]Steel Grade ( this is a 3 digit number or a decimal'd 5 digit number can be anywhere preceeded by aisi grade or not, essentially there are fixed values for this for eg, 201, 301, 304, 1.4003 etc which are there)[/TD]
[TD]I am stumped by this[/TD]
[/TR]
[TR]
[TD]Width (a 3 digit or 4 digit number will be in the middle in formats such as either b/w X1240 MM X or b/w *1240 MM*[/TD]
[TD]I am stumped by this[/TD]
[/TR]
</tbody>[/TABLE]
So to give an example :
[TABLE="width: 867"]
<tbody>[TR]
[TD="align: left"]STAINLESS STEEL SHEETS GRADE 1.4003, H.R. SIZE - 4.00 MM X 1092 MM X 3454 MM[/TD]
[/TR]
[TR]
[/TR]
</tbody>[/TABLE]
I shud be able to split this/filter this as SS(in a new column : Category ), Sheet ( in a new column : Product), 1.4003(Grade), HOTROLLED ( because it has H.R), 1092(Width)
The problem is no discernible pattern in the text fields anything can be at first second or last place... I was thinking of reg ex to find all occurrences within some patterns might solve this... but I can't understand Greek that regex has (eg:\-[]**d+ 'wtf' )
Excel Gods please help me out on this.....
I have a deadline on Monday and need to somehow gather a lot of info from a thousands of rows of product description/import-export trade data. the data is of this format:
[TABLE="width: 867"]
<colgroup><col></colgroup><tbody>[TR]
[TD="align: left"]STAINLESS STEEL HOT ROLLED SHEET GRADE 310 S WIDTH 1524 MM THK 5 MM[/TD]
[/TR]
[TR]
[TD="align: left"](H.R.)S.S COIL(GRADE:316L)SIZE:6.28MMX1600MMXCOIL(EXCESS PRIME,SLIT EDGES,HR,ANNEALED(MECH.DESCALRD)BRAND;1811ML(ASTM A[/TD]
[/TR]
[TR]
[TD="align: left"](STAINLESS STEEL COIL) COLD ROLLED (GRADE 304) MDK PPSS SIZE 07 X 580 FF[/TD]
[/TR]
[TR]
[TD="align: left"]0.29*335MM COLD ROLLED STAINLESS STEEL COIL 201,2B[/TD]
[/TR]
[TR]
[TD="align: left"]S.S. SHEET 316LVM 3.00MM DIA (C.EX. FILE NO. VGN(30)26/MISC.CORR/ADLER/RTN/T/12 DT.15/10/2012)[/TD]
[/TR]
[TR]
[TD="align: left"]STAINLESS STEEL SHEETS GRADE 1.4003, H.R. SIZE - 4.00 MM X 1092 MM X 3454 MM[/TD]
[/TR]
[TR]
[TD="align: left"]0.37 MM X 1240MM X COIL NON MAGENTIC STAINLESS STEEL COIL AISI 201[/TD]
[/TR]
[TR]
[TD="align: left"]STAINLESS STEEL HOT ROLLED SHEET GRADE 310 S WIDTH 1500 MM THK 5 MM[/TD]
[/TR]
[TR]
[TD="align: left"]0.76 MM X 1240MM X COIL NON MAGENTIC STAINLESS STEEL COIL AISI 201[/TD]
[/TR]
</tbody>[/TABLE]
As you can see there is no discernible pattern in string types. below is the stuff I need and how I am going about it:
[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]Parameter Needed [/TD]
[TD]How I am doing it [/TD]
[/TR]
[TR]
[TD]Stainless steel or not (basically from keywords Stainless or S.S/S. S and variants)[/TD]
[TD]Advanced Filter[/TD]
[/TR]
[TR]
[TD]Product type (Sheet, Coil, plate etc)[/TD]
[TD]Advanced filter[/TD]
[/TR]
[TR]
[TD]Hot Roll or cold roll (with keywords as HOT RO/H.R/H R and variants with wild card chars)[/TD]
[TD]Advanced filter [/TD]
[/TR]
[TR]
[TD]Steel Grade ( this is a 3 digit number or a decimal'd 5 digit number can be anywhere preceeded by aisi grade or not, essentially there are fixed values for this for eg, 201, 301, 304, 1.4003 etc which are there)[/TD]
[TD]I am stumped by this[/TD]
[/TR]
[TR]
[TD]Width (a 3 digit or 4 digit number will be in the middle in formats such as either b/w X1240 MM X or b/w *1240 MM*[/TD]
[TD]I am stumped by this[/TD]
[/TR]
</tbody>[/TABLE]
So to give an example :
[TABLE="width: 867"]
<tbody>[TR]
[TD="align: left"]STAINLESS STEEL SHEETS GRADE 1.4003, H.R. SIZE - 4.00 MM X 1092 MM X 3454 MM[/TD]
[/TR]
[TR]
[/TR]
</tbody>[/TABLE]
I shud be able to split this/filter this as SS(in a new column : Category ), Sheet ( in a new column : Product), 1.4003(Grade), HOTROLLED ( because it has H.R), 1092(Width)
The problem is no discernible pattern in the text fields anything can be at first second or last place... I was thinking of reg ex to find all occurrences within some patterns might solve this... but I can't understand Greek that regex has (eg:\-[]**d+ 'wtf' )
Excel Gods please help me out on this.....