Hi,
I am trying to figure out how to implement a regex solution to a data field and was looking for some help.
I have got a fair bit done using mid and search/find but I am still doing a lot of clean up to the data after extraction to compensate for the different way people record the same info. The regular expression I want to extract from is:
## hours @ £##.##
I have been using the @ sign to get the hours and the £ sign to get the cost. However....some people write hrs, others write hour or hours and some just hr. The hours may be part values as well, so 10.5 or 9.25 etc. For the price some will include the decimal place and others won't. The expression sits within a larger string and is never in the same place. I had tried to use find and replace to standardise the hours data but it starts to change existing text that includes some of the combinations - for example gilchrist becomes gilchourist, not ideal!!
The £ value also has issues, some people put a space after the £ sign whilst others incude no decimal point so £15 instead of £15.00 I have used Search within MID to get the starting point for extraction but because of the variation I often have values coming out with other aspects of the string that I don't want to see.
By using the @ sign I can get at the data either side of it. The main issue is extracting the hours. I thought I could use a regex to pick up on the variation and still pull through the numerical value for the hours irrespective of whether someone has input hr/hrs/hour/hours.
I was contemplating writing a series of formulas using mid and then choosing the one that returned the cleanest value However, when tweaking the formula manually this changes from returning 3 characters to returning eight. Whilst it works I get the feeling there is a better way and when I looked I saw regex and thought that might work. I am looking to get the hours in one cell in the row and the £value in another.
An example of the output would be that the expression 10 hrs @ £50.50 results in 10 in one cell and £50.50 in another.
Thanks for your help.
I am trying to figure out how to implement a regex solution to a data field and was looking for some help.
I have got a fair bit done using mid and search/find but I am still doing a lot of clean up to the data after extraction to compensate for the different way people record the same info. The regular expression I want to extract from is:
## hours @ £##.##
I have been using the @ sign to get the hours and the £ sign to get the cost. However....some people write hrs, others write hour or hours and some just hr. The hours may be part values as well, so 10.5 or 9.25 etc. For the price some will include the decimal place and others won't. The expression sits within a larger string and is never in the same place. I had tried to use find and replace to standardise the hours data but it starts to change existing text that includes some of the combinations - for example gilchrist becomes gilchourist, not ideal!!
The £ value also has issues, some people put a space after the £ sign whilst others incude no decimal point so £15 instead of £15.00 I have used Search within MID to get the starting point for extraction but because of the variation I often have values coming out with other aspects of the string that I don't want to see.
By using the @ sign I can get at the data either side of it. The main issue is extracting the hours. I thought I could use a regex to pick up on the variation and still pull through the numerical value for the hours irrespective of whether someone has input hr/hrs/hour/hours.
I was contemplating writing a series of formulas using mid and then choosing the one that returned the cleanest value However, when tweaking the formula manually this changes from returning 3 characters to returning eight. Whilst it works I get the feeling there is a better way and when I looked I saw regex and thought that might work. I am looking to get the hours in one cell in the row and the £value in another.
An example of the output would be that the expression 10 hrs @ £50.50 results in 10 in one cell and £50.50 in another.
Thanks for your help.