Text to Number Conversion - Which functions should be used?

aaalviny

Board Regular
Joined
Apr 25, 2013
Messages
128
Disclaimer:
I am not turning a deaf ear to others' comment, but I just had enough evidence to turn them down. I don't turn them down due to no reasons.


p.s. I do admire the use of converting 8-digit number-formatted dates to a date format by use of TEXT("0000-00-00")
I think it's the best solution given the Excel's behaviour on dates.
I didn't thoroughly use it although I heard it once before, and this time it reminds me again.

just an example which I learnt here.


and p.s. I can't find anywhere suitable to put this article, so I probably put it in this board






Text to Number Conversion - Which functions should be used?


sometimes we encounter situation that we may need to convert texts to numbers,
by using either VALUE(A1), --A1, A1*1, A1^1, A1+0, INT(A1)
which of them should be used? Below are my analysis


in the context, --A1, A1*1, A1^1, A1+0 and INT are of the same type, and represented by --A1 in the discussion below, unless otherwise specified.


1. -- or INT() or VALUE() matches the aim of the formula?


the aim of these are to convert texts to numbers
the underlying meaning of --, is to add twice the negative signs to A1, which does not show the meaning of conversion.
the aim of INT() is to extract the integer portion of a number
on the other hand, VALUE(), the function itself, means to convert the texts to number, which matches the true aim




2. -- is simpler than VALUE()?


By simple, it doesn't mean the length, but it means the ease of understanding, and edit/update
VALUE() is easier to be understood than --, because
as mentioned before, VALUE() matches the aim to convert texts to numbers
If one writes VALUE(A1), I think every excel users, even Newbie, can understand the meaning of the formula very well
On the other hand, there are always users asking what -- is.




3. Convention is to use --?


Convention doesn't necessarily mean the correct method.
And everywhere is using different conventions for this conversion (My environment uses *1)
There are so many kinds of conventions to -- because it does not represent the original aim
e.g. If one wants to use a cell to store A1 + 2
=A1+2*1 won't be used, =A1+0+2 won't be used, =(A1+2)*1 won't be used.
because the correct representation is =A1 + 2
although all gives the correct result of the value.


and "convention" is to use VLOOKUP but not INDEX+MATCH
in fact INDEX+MATCH is better in various aspects, except the length being longer
(this is another issue I can talk about if you wish)






4. VALUE() requires one more parenthesis than --


Yes it's true, you may argue excel 2003 (that's what I am using) only allows 7 in one cell.
The fact is, under a good excel design, 3 parenthesis in the formula of any cells are already too much.
too many parenthesis also mean harder to read and maintain
There may be violation of DRY principle or the user is putting too much things in one cell.
some examples of wrong usage includes:
Code:
=IF(Curr = "USD", a, IF(Curr = "GBP", b, IF(Curr = "JPY", c, IF(Curr = "CAD", d))))
Code:
=IF(ISNA(MATCH(1, $A$1:$A$3, 0)), 0, MATCH(1, $A$1:$A$3, 0))
Code:
=VLOOKUP($A1, $B$2:$F$10, 2, 0)*VLOOKUP($A1, $B$2:$F$10, 3, 0)+VLOOKUP($A1, $B$2:$F$10, 4, 0)-VLOOKUP($A1, $B$2:$F$10, 5, 0)+VLOOKUP($A1, $B$2:$F$10, 4, 0)/VLOOKUP($A1, $B$2:$F$10, 5, 0)


and just a random search on the threads:
Code:
=LOOKUP(9.99999999999999E+307,MID(SUBSTITUTE(A2," ","x"),SMALL(IF(ISNUMBER(MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)+0),ROW(INDIRECT("1:"&LEN(A2)))),ROW(INDIRECT("1:"&LEN(A2)))),5)+0)
by immediatelly looking the formula, do you find it trival to understand the formula?
personally I don't
If you find it trival, then you may use it, and I admit that you are really really very smart!!
(if you can be certain that only you will read the file in the future -- unless you delete it, how can you sure the file won't be exposed to others?)






5. -- is faster than VALUE?


Yes, -- is faster than VALUE
by my machine, converting 3,000,000 texts to numbers requires
2.44s by --, and
3.08s by VALUE
with a difference of 0.64s


but there are several points I'd like to make on this speed
a.
it's very seldom to have such large amounts of numbers stored wrongly as texts
I use "wrongly" because wrong data structure was used, e.g. in database, numbers should not be stored in the field type text
If there is really such a large amount of mis-formated texts, there is a problem with the raw data and it ought to be solved.


b.
even 3,000,000 texts give 0.64s difference.
In reality there won't be such large amount of texts.
Moreover, given there is no volatile funtions in the workbook (this is again another issue I can talk about if you wish)
the difference only happens once, and unless the data changes.
yet by proportion, 10,000 texts only give 0.002s difference


This is very micro-optimisation
In view of the small calculation time difference, I think this part is not a matter in comparing -- and VALUE


To supplement, the main calculation speed for excel depends on the number of lookups used and the amount of volatile functions used
which very much relies on the proper data structure, and formula designing e.g. DRY (Don't Repeat Yourself)
It can gives seconds to minutes of difference.
This is the real things about calculation speed that excel designers should care about.
And to have a fast calculation speed, a good workbook design with correct choice of algorithms will mostly do the job.


And volatile functions should be avoided (If it's really required due to poor data structures, copy and paste as value after adding a remark stating the original formula used)
(this is again another issue I can talk about if you wish)


c.
If SUMPRODUCT is used, the speed is slower because of SUMPRODUCT itself, not about -- / VALUE
because SUMPRODUCT should also be avoided and there are other alternatives
(this is again another issue I can talk about if you wish)






6. -- is shorter in terms of length than VALUE()?


Yes, 2 characters comparing to 7 charaters.
Length is the matter of syntax pre-defined by Excel, which is independent of the logic.
Other matters above are more important than the syntax pre-defined.




Conclusion
As a conclusion, please use VALUE() instead of --, *1, ^1, +0 or INT() for Excel best practice






Alvin - 2013 05 18
 

Excel Facts

Copy formula down without changing references
If you have =SUM(F2:F49) in F50; type Alt+' in F51 to copy =SUM(F2:F49) to F51, leaving the formula in edit mode. Change SUM to COUNT.
in that case do you mean one should never use any $ in the design
I don't consider it as "design". I consider the use of absolute referencing to be there to serve a purpose i.e. solely to make a reference either wholly or partly absolute.
Sometimes the whole reference needs to be absolute, sometimes part of the reference needs to be absolute.
Sometimes it specifically mustn't be absolute.

When it doesn't matter if it is absolute or not then it is personal choice whether you use it or not but the one it doesn't do is make the formula easier to read.
Why do you think that it is suggested that you put symbols in passwords?

if he/she knows $, will use that for a higher efficiency, right?
No. The use of absolute references is there when you need the references to be absolute and solely that. If you need to drag the formula in future you need to remove them which makes spurious use of them inefficient.

I don't think we should address on whether a non-excel user should understand it
No we should address readability (by the the way anyone who has opened an Excel spreadsheet is an Excel user at whatever level, a non-excel user won't see the formula in the first place)

If one starts using Excel as a general excel user, he/she should learn $
Of course they should but that has nothing to do with the readability of the formula.

and another purpose is for readability
Nonsense, adding extra characters to any text makes it harder to read, the fewer characters the easier it is to read.

Where I come from we would consider it clutter. If it doesn't serve a purpose why be there?

We all regularly use the word "Function" that word is a bit of a giveaway.

Like I said I have no objection to anyone using absolute referencing when not necessary as that is personal choice but I do object to you giving reasons that are just there to fill white space.
 
Last edited:
Upvote 0
This was the point where you lost credibility.
41 simple formulas compared with 1 complex formulas
which is better?

let me demonstrate with an example,
it was taken from a thread I replied earlier, no offence to Aladin, given the layout restriction, there are no means to have simple formulas there
but the point I am going to make is:
which approach is the easiest to be understood!
and it's how we should use helper columns to shorten formulas and provide a better readability.
60 formulas vs 30 formulas

thread:
http://www.mrexcel.com/forum/excel-...rds-where-adjacent-cells-meet-1-criteria.html

the different approach file:
http://aaalviny.magix.net/public/alvin_excel_helper_col_ease_readability_and_simpler_20130519.xls
I added the best solution I suggested at the end columns, which is different from what I replied in the post, due to the layout restriction given by OP
due to laziness I don't give the column title :rofl:
let see which way you can understand better and faster, and find it easier to maintain and update
I don't consider it as "design". I consider the use of absolute referencing to be there to serve a purpose i.e. solely to make a reference either wholly or partly absolute.
Sometimes the whole reference needs to be absolute, sometimes part of the reference needs to be absolute.
Sometimes it specifically mustn't be absolute.

When it doesn't matter if it is absolute or not then it is personal choice whether you use it or not but the one it doesn't do is make the formula easier to read.
Why do you think that it is suggested that you put symbols in passwords?


No. The use of absolute references is there when you need the references to be absolute and solely that. If you need to drag the formula in future you need to remove them which makes spurious use of them inefficient.


No we should address readability (by the the way anyone who has opened an Excel spreadsheet is an Excel user at whatever level, a non-excel user won't see the formula in the first place)


Of course they should but that has nothing to do with the readability of the formula.


Nonsense, adding extra characters to any text makes it harder to read, the fewer characters the easier it is to read.

Where I come from we would consider it clutter. If it doesn't serve a purpose why be there?

We all regularly use the word "Function" that word is a bit of a giveaway.

Like I said I have no objection to anyone using absolute referencing when not necessary as that is personal choice but I do object to you giving reasons that are just there to fill white space.
I will start a thread about this if you wish
 
Last edited:
Upvote 0
No, I stopped reading your responses in any kind of detail some time ago.
I would like to have a fruitful discussion, please don't just feel dissatisfied with my comments and then give negative response due to no reasons.
I am sad to say it's a kind of disrespect in my sense.

In case you don't like to read the attachment, here is a short recap:

30 cells of formulas contains: (less number of formulas)
there are two kinds

1. =(COUNT(B2:C2,E2:F2,H2)>0)+0
2. (array formulas)
2007 version:
=IFERROR(INDEX($A$2:$A$16,SMALL(IF($I$2:$I$16=1, ROW($A$2:$A$16)-ROW($A$2)+1),ROWS($J$2:J2))),"")
2003 version:
=IF(ROWS($J$2:J2)<=SUM($I$2:$I$16),INDEX($A$2:$A$16,SMALL(IF($I$2:$I$16=1, ROW($A$2:$A$16)-ROW($A$2)+1),ROWS($J$2:J2))),"")

and for my more formulas: (there are 4 kinds)
=IF(COUNT($B5:$C5, $E5:$F5, $H5)>0, W4+1, W4)
=X4+1
=INDEX($A$4:$A$18, MATCH($X5, $W$4:$W$18, 0))
and
=IF(ISNA($Y5), "", $Y5)
 
Last edited:
Upvote 0
I will start a thread about this if you wish
You can start a new thread if you wish, but I will not participate as I would much rather spend my time on the forum helping people rather than getting involved in a never ending discussion in which you obviously will not change your opinion and I will always differ.

If you choose to take away Excels built in functionality to drag a formula by using needless absolute references I am not going to try and stop you.

I will still advice people how to use them as intended.

I wish you well on your future endeavours.
 
Last edited:
Upvote 0
You can start a new thread if you wish, but I will not participate as I would much rather spend my time helping people rather than getting involved in a never ending discussion in which you obviously will not change your opinion and I will always differ.

If you choose to take away Excels built in functionality to drag a formula by using needless absolute references I am not going to try and stop you.

I will still advice people how to use them as intended.

I wish you well on your future endeavours.
As similar to situation of Jon von der Heyden
He also said he would not participate, but at the end he does! which I think it's a good thing. I have to thank him for his comments.
and as you see, we have come to a concensus (if not all, at least most) on the good and bad of each of the option -- and VALUE() in the end
Then, how we weigh the importance of those good/bad side is our own decision

but at least, we just can have a discussion on the pros and cons

These kinds of discussion are not only simply talking, they are useful in terms of future excel design of our own practice.
to achieve the aim of better maintenance ability and readability and to avoid errors,
which in the end it will benefit all of us from the discussion

I sincerely hope you can, as well, enjoy the discussion.
or do you feel better if you treat it as a "debate competition", with for side and against side?
 
Upvote 0

Forum statistics

Threads
1,223,237
Messages
6,170,928
Members
452,366
Latest member
TePunaBloke

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top