Text to Number Conversion - Which functions should be used?

aaalviny · May 17, 2013

Disclaimer:
I am not turning a deaf ear to others' comment, but I just had enough evidence to turn them down. I don't turn them down due to no reasons.

p.s. I do admire the use of converting 8-digit number-formatted dates to a date format by use of TEXT("0000-00-00")
I think it's the best solution given the Excel's behaviour on dates.
I didn't thoroughly use it although I heard it once before, and this time it reminds me again.

just an example which I learnt here.

and p.s. I can't find anywhere suitable to put this article, so I probably put it in this board

Text to Number Conversion - Which functions should be used?

sometimes we encounter situation that we may need to convert texts to numbers,
by using either VALUE(A1), --A1, A1*1, A1^1, A1+0, INT(A1)
which of them should be used? Below are my analysis

in the context, --A1, A1*1, A1^1, A1+0 and INT are of the same type, and represented by --A1 in the discussion below, unless otherwise specified.

1. -- or INT() or VALUE() matches the aim of the formula?

the aim of these are to convert texts to numbers
the underlying meaning of --, is to add twice the negative signs to A1, which does not show the meaning of conversion.
the aim of INT() is to extract the integer portion of a number
on the other hand, VALUE(), the function itself, means to convert the texts to number, which matches the true aim

2. -- is simpler than VALUE()?

By simple, it doesn't mean the length, but it means the ease of understanding, and edit/update
VALUE() is easier to be understood than --, because
as mentioned before, VALUE() matches the aim to convert texts to numbers
If one writes VALUE(A1), I think every excel users, even Newbie, can understand the meaning of the formula very well
On the other hand, there are always users asking what -- is.

3. Convention is to use --?

Convention doesn't necessarily mean the correct method.
And everywhere is using different conventions for this conversion (My environment uses *1)
There are so many kinds of conventions to -- because it does not represent the original aim
e.g. If one wants to use a cell to store A1 + 2
=A1+2*1 won't be used, =A1+0+2 won't be used, =(A1+2)*1 won't be used.
because the correct representation is =A1 + 2
although all gives the correct result of the value.

and "convention" is to use VLOOKUP but not INDEX+MATCH
in fact INDEX+MATCH is better in various aspects, except the length being longer
(this is another issue I can talk about if you wish)

4. VALUE() requires one more parenthesis than --

Yes it's true, you may argue excel 2003 (that's what I am using) only allows 7 in one cell.
The fact is, under a good excel design, 3 parenthesis in the formula of any cells are already too much.
too many parenthesis also mean harder to read and maintain
There may be violation of DRY principle or the user is putting too much things in one cell.
some examples of wrong usage includes:

Code:

=IF(Curr = "USD", a, IF(Curr = "GBP", b, IF(Curr = "JPY", c, IF(Curr = "CAD", d))))

Code:

=IF(ISNA(MATCH(1, $A$1:$A$3, 0)), 0, MATCH(1, $A$1:$A$3, 0))

Code:

=VLOOKUP($A1, $B$2:$F$10, 2, 0)*VLOOKUP($A1, $B$2:$F$10, 3, 0)+VLOOKUP($A1, $B$2:$F$10, 4, 0)-VLOOKUP($A1, $B$2:$F$10, 5, 0)+VLOOKUP($A1, $B$2:$F$10, 4, 0)/VLOOKUP($A1, $B$2:$F$10, 5, 0)

and just a random search on the threads:

Code:

=LOOKUP(9.99999999999999E+307,MID(SUBSTITUTE(A2," ","x"),SMALL(IF(ISNUMBER(MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)+0),ROW(INDIRECT("1:"&LEN(A2)))),ROW(INDIRECT("1:"&LEN(A2)))),5)+0)

by immediatelly looking the formula, do you find it trival to understand the formula?
personally I don't
If you find it trival, then you may use it, and I admit that you are really really very smart!!
(if you can be certain that only you will read the file in the future -- unless you delete it, how can you sure the file won't be exposed to others?)

5. -- is faster than VALUE?

Yes, -- is faster than VALUE
by my machine, converting 3,000,000 texts to numbers requires
2.44s by --, and
3.08s by VALUE
with a difference of 0.64s

but there are several points I'd like to make on this speed
a.
it's very seldom to have such large amounts of numbers stored wrongly as texts
I use "wrongly" because wrong data structure was used, e.g. in database, numbers should not be stored in the field type text
If there is really such a large amount of mis-formated texts, there is a problem with the raw data and it ought to be solved.

b.
even 3,000,000 texts give 0.64s difference.
In reality there won't be such large amount of texts.
Moreover, given there is no volatile funtions in the workbook (this is again another issue I can talk about if you wish)
the difference only happens once, and unless the data changes.
yet by proportion, 10,000 texts only give 0.002s difference

This is very micro-optimisation
In view of the small calculation time difference, I think this part is not a matter in comparing -- and VALUE

To supplement, the main calculation speed for excel depends on the number of lookups used and the amount of volatile functions used
which very much relies on the proper data structure, and formula designing e.g. DRY (Don't Repeat Yourself)
It can gives seconds to minutes of difference.
This is the real things about calculation speed that excel designers should care about.
And to have a fast calculation speed, a good workbook design with correct choice of algorithms will mostly do the job.

And volatile functions should be avoided (If it's really required due to poor data structures, copy and paste as value after adding a remark stating the original formula used)
(this is again another issue I can talk about if you wish)

c.
If SUMPRODUCT is used, the speed is slower because of SUMPRODUCT itself, not about -- / VALUE
because SUMPRODUCT should also be avoided and there are other alternatives
(this is again another issue I can talk about if you wish)

6. -- is shorter in terms of length than VALUE()?

Yes, 2 characters comparing to 7 charaters.
Length is the matter of syntax pre-defined by Excel, which is independent of the logic.
Other matters above are more important than the syntax pre-defined.

Conclusion
As a conclusion, please use VALUE() instead of --, *1, ^1, +0 or INT() for Excel best practice

Alvin - 2013 05 18

aaalviny · May 18, 2013

To supplement, during my work, I have redesigned worksheets to combine original 6 worksheets of different products each about 10MB to one with 5MB of clearly clean and readable formulas, in case you have question about my redesigning.

Peter_SSs · May 18, 2013

I'm going a fair way back in the thread:

aaalviny said:
that 41 formulas don't matter...
.. and that's why it's (SUMPRODUCT) slow

I'd say that very much depends on what you are measuring and how often and how much of the data is likely to change.

For my testing I repeated the data you showed in that screen shot 84 times so the data extended from row 6 to row 1097 - so that there was bit more to do for my timer to measure.

Using your setup on Sheet1 and the single formula =SUMPRODUCT(--(E6:E1097="Yes"),--(H6:H1097="No"),J6:J1097) on Sheet2, a Sheet2 recalculation took 30% of the time for a Sheet1 recalculation. By that measure I wouldn't call SUMPRODUCT slow.

aaalviny said:
If now, the data changes, "Yes" is no longer "Yes" and becomes "Y"
if you are using sumproduct, you have to change five times in the formula (one in each sumproduct)

Do you? I would have thought you could set it up so that the formula(s) wouldn't need changing at all.
If you are worried about ongoing maintenance, why not use cell references - surely that is simpler and easier to maintain than changing a formula AND copying it down a whole column!

Excel Workbook

B

C

D

E

H

J

1

Value of interest:

Y

N

2

3

1260

4

5

6

Y

2

7

N

3

8

4

9

N

5

10

Y

N

6

11

7

12

N

8

13

Y

N

9

14

10

15

N

11

16

N

12

17

13

18

Y

14

19

Y

2

20

N

3

1092

Y

N

9

1093

10

1094

N

11

1095

N

12

1096

13

1097

Y

14

test 3

aaalviny said:
that 41 formulas don't matter...

No, I'm sure the user won't mind an extra 3 columns of data cluttering their sheet when they want to print it!!

aaalviny said:
that 41 formulas don't matter, as long as they are easy to read and maintain

Let's talk more about ease of maintenance. Remember "future is always unpredictable" so perhaps some more data will be added over time (I currently have data down to row 1097). To cover for this possible extra data, I set up my SUMPRODUCT formula:

=SUMPRODUCT(--(E6:E2000=E1),--(H6:H2000=H1),J6:J2000)

What maintenance do I have to do when new data is added below row 1097? None
What maintenance do you have to do when new data is added below row 1097? Copy 3 columns of formulas down to cover the new data (and possibly adjust your B2 formula, although this could be set up with a larger range at the start)

What maintenance do I have to do when new rows of data are inserted in the middle of the existing data? None
What maintenance do you have to do when new rows of data are inserted in the middle of the existing data? You have to copy 3 columns of formulas down to fill the newly inserted rows.

Hmmm, which set-up is easiest to maintain?

Talk about repeating yourself? With a sheet set up as you have shown, but with data to row 1097 as I have, there are 3 columns of formulas repeated over 1,000 times!
With my sheet I have a single formula - nothing repeated at all.

Hmmm, who's repeating themselves?

aaalviny said:
... those formulae are easy to read and take instant time to understand

If you want your formulas to be easy to read and understand, why clutter your simple 13 row sample file formulas with 60 $ signs, not one of which is required?

Jon von der Heyden · May 18, 2013

aaalviny said:
And if the data structure consists of all the things you listed above, it is clearly not a database of qualified data, isn't it?

But Alvin Excel is NOT a database.

That's one of the beauties of it. I can mix up up the data types. If I want a database then I will use a database. Excel offers us countless abilities to tap into databases so if it is important then we can always use both.

aaalviny said:
well Boolean is Boolean, never a number, don't be trapped by excel's logic. Only few support the convention false = 0 and true = 1 (correct me if wrong, excel, Vba, c support this only)

This is an Excel forum. We are discussing best practise Excel methods. -TRUE equals -1, which if treated as a logical is TRUE yet again. -FALSE equals 0, which if treated as a logical is FALSE yet again. To invert, using arithmetic, one would use 1-Logical_Value, i.e:
1-TRUE = 1-1 = 0 = FALSE
1-FALSE = 1-0 = 1 = TRUE

You have also made suggestion that using arithmetic should be used to complete arithmetic operation and that coercion is not by design. Had that been true then all of the following statements would yield #VALUE! error:

=TEXT(1,0)+0 = 1
="01-Jan-2013" + 1 = 41276 (which is 02-Jan-2013)
=1-TRUE = 0

So clearly the fact that Excel coerces each operand to it's numeric equivalent, it IS by design. So both VALUE() and arithmetic are by design; except that artithmetic works on all data types whilst VALUE() does not. I wouldn't either choose VALUE() coercion in order to discriminate a single data type (since VALUE() isn't particular to one data type anyway). No no, one must be more explicit. So had I needed to test the type typically using one of the IS functions (or whatever!).

I agree that there are instances where it is good to discriminate. A fine example would be to test whether or not a value is #N/A, specifically missing from a lookup table or array. In such an instance IF(ISNA() construct is a better candidate that IFERROR(), even though it involves an extra function call... For this context IFERROR is too indiscriminate.

So far Alvin you have described one reasonable advantage of using VALUE() and that is the fact that a read would be inclined to better understand it. My feeling is that this reason though is of very little significance because (a) the topic is easy to understand and (b) it shan't take a user long to figure it out either using the formula evaluation tool or forum. Besides, I am paid to build solutions using the skills I have acquired over the years. I am not paid to build a solution using methods only known to novices - otherwise why would you hire me? That would surely p1ss-off my employers.

So of all your other reasons none stand true so far. We have many more better reasons to use -- than VALUE(), described now exhaustively in this thread:
-- is by design
-- is more efficient
-- works on all data types (use IS functions to test data types, it is what the are intended for)
-- adheres to an existing convention (best practise)
-- is neater (shorter) and doesn't increase parenthesis

Weighing it all up -- makes a lot more sense.

aaalviny · May 18, 2013

Peter_SSs said:
I'm going a fair way back in the thread:

I'd say that very much depends on what you are measuring and how often and how much of the data is likely to change.

as I mentioned before, no one knows what will happen in the future
no one knows the amount of conditions to be added and no one knows the number of values of interest
we cannot decide, based on existing condition, that how likely the data is to be changed
that's why maintenance is important

Peter_SSs said:
For my testing I repeated the data you showed in that screen shot 84 times so the data extended from row 6 to row 1097 - so that there was bit more to do for my timer to measure.

Using your setup on Sheet1 and the single formula =SUMPRODUCT(--(E6:E1097="Yes"),--(H6:H1097="No"),J6:J1097) on Sheet2, a Sheet2 recalculation took 30% of the time for a Sheet1 recalculation. By that measure I wouldn't call SUMPRODUCT slow.

if only 1 value of interests, i.e. only 1 sumproduct, is used, it clearly doesn't make a difference.
this is the same of using VLOOKUP vs INDEX+MATCH for one time only, but the fact is, if more are used, it makes a difference.

and again, no one knows about the future , <= this is one very important I am making, so that maintenance will become easier if we think about the future.

Peter_SSs said:
Do you? I would have thought you could set it up so that the formula(s) wouldn't need changing at all.
If you are worried about ongoing maintenance, why not use cell references - surely that is simpler and easier to maintain than changing a formula AND copying it down a whole column!

being easy maintenance doesn't necessary mean no change, but the change should be easy and obvious, and also not troublesome
your suggestion of using cell reference is good, and it qualifies the "No magic strings" principle in anti-pattern
In any models this should actually be used, good point.
This eases one maintenance effort, but there are still others.

Peter_SSs said:
No, I'm sure the user won't mind an extra 3 columns of data cluttering their sheet when they want to print it!!

this is another issue
that worksheet is "PROCESS" in the cycle of "INPUT" -> "PROCESS" -> "OUTPUT"
if one is going to print it out, that should be an "OUTPUT" which contains no calculation logic in the process (of course if the logic is about the layout of the output, it is ok)
and that "output" worksheet should be a new one, taking values from this "process" worksheet without calculation, and only shows what the user wants to show in the printout (If you are going to print that worksheet, you don't expect to print the whole set of data, right?

)

Peter_SSs said:
Let's talk more about ease of maintenance. Remember "future is always unpredictable" so perhaps some more data will be added over time (I currently have data down to row 1097). To cover for this possible extra data, I set up my SUMPRODUCT formula:

=SUMPRODUCT(--(E6:E2000=E1),--(H6:H2000=H1),J6:J2000)

What maintenance do I have to do when new data is added below row 1097? None
What maintenance do you have to do when new data is added below row 1097? Copy 3 columns of formulas down to cover the new data (and possibly adjust your B2 formula, although this could be set up with a larger range at the start)

What maintenance do I have to do when new rows of data are inserted in the middle of the existing data? None
What maintenance do you have to do when new rows of data are inserted in the middle of the existing data? You have to copy 3 columns of formulas down to fill the newly inserted rows.

Hmmm, which set-up is easiest to maintain?

Talk about repeating yourself? With a sheet set up as you have shown, but with data to row 1097 as I have, there are 3 columns of formulas repeated over 1,000 times!
With my sheet I have a single formula - nothing repeated at all.

Hmmm, who's repeating themselves?

Talking about the ease of maintenance, think when you are working on it.
If you are adding records there, you clearly know you have to copy down those helper columns, right? (as helper columns are put in front of the raw data, it is clearly visible)
and it doesn't cost a second.
this is easy-to-notice and fast update

then it comes to the point of other maintenance:
as I mentioned before, when there are more value of interests,
clearly the SUMPRODUCT formula cannot be copied down directly (it needs amendment to each SUMPRODUCT)
e.g. there are two values of interest, as shown below (note: there can be more)
=SUMPRODUCT(--(E6:E2000=E1),--(H6:H2000=H1), J6:J2000)
and
=SUMPRODUCT(--(E6:E2000=E1),--(H6:H2000=H1), --(F6:F2000=F1), J6:J2000)

If now, H6:H2000=H1 is no longer the desired condition, but H6:H2000<>H1 is
two separate modifications are needed, and no copy down can be done directly
imagine if there are more values of interest and different kinds of conditions
changing 10 cells of H6:H2000=H1 to H6:H2000<>H1 is just not easy (well you can use replace, but as another matter, there's always a risk when using "REPLACE" as every cells have to be checked
and the point is, for my design, I only need to modify one column, which is:
=IF(H6=H$1, 1, 0) becomes =IF(H6<>H$1, 1, 0) and drag down, then it's done.

the repeating yourself doesn't mean the repeating for each record (you have to do once for each record, and this can't be avoided)
and if the no. of checking for each record is more than once on the same issue, it's the real "Repeat Yourself"
e.g. there are two values of interest
=SUMPRODUCT(--(E6:E2000=E1),--(H6:H2000=H1),J6:J2000)
and
=SUMPRODUCT(--(E6:E2000=E1),--(H6:H2000=H1), --(F6:F2000=F1), J6:J2000)
these are double checking whether H6:H2000 = H1 in the two cells, and that's why it is slower
i.e. H6=H1 is checked twice, as there are two these kinds of cells

Peter_SSs said:
If you want your formulas to be easy to read and understand, why clutter your simple 13 row sample file formulas with 60 $ signs, not one of which is required?

easy to understand means proper using of $ signs
if $ signs are not added, it is harder to read, and easier to have errors

when cell copying is needed, think about whether that reference should be fixed or movable
it can also help users to understand the formula better
e.g. for =IF($H6="No", 1, 0) what happens if I am copying it to other columns? do I expect cell H6 to change?
In my demonstration no, as I don't expect two data fields will have the same validation.
maybe column G is checking on date, I would use =IF($G6=cutoff_date, 1, 0)
and I don't expect other columns to share this logic

on the other hand, for
=SUMIF(B$6:B$18, 1, $J$6:$J$18)
why I don't use $ for B, it is because when I copy this formula to column C, I expect it to refer to the 1 and 0 of column C and make the SUMIF based on criteria of column C
for 6 and 18, because the no. of record doesn't change, I don't expect it to be moving when cells are copied upward and downward (which gives less change of error)
for J, I use $ because column J is the value to sum in the data, which is not likely to be changed

by using $, users can have a better feeling on what the reference refers to
e.g. using fixed reference $A$1 means the reference is a stand-alone value, while by using $A3 it probably refers to one column of a database

shg · May 18, 2013

This thread makes a great case for youth in Asia.

aaalviny · May 18, 2013

Jon von der Heyden said:
But Alvin Excel is NOT a database. That's one of the beauties of it. I can mix up up the data types. If I want a database then I will use a database. Excel offers us countless abilities to tap into databases so if it is important then we can always use both.

Well Excel is not a database, but for efficient working I bet you will agree to have consistent data structure, as well as neat tables, right?
you don't want to see empty rows within a set of 10000 records, you don't want to see a string in the column of dates. You don't want to see a missing data, etc.
So it's exactly what a database means: to have consistent data format
Do you agree?

Actually you can mix up the data type because of the "freedom" Excel offers, but by giving too much freedom, it yields improper use of it.

Jon von der Heyden said:
This is an Excel forum. We are discussing best practise Excel methods. -TRUE equals -1, which if treated as a logical is TRUE yet again. -FALSE equals 0, which if treated as a logical is FALSE yet again. To invert, using arithmetic, one would use 1-Logical_Value, i.e:
1-TRUE = 1-1 = 0 = FALSE
1-FALSE = 1-0 = 1 = TRUE

You have also made suggestion that using arithmetic should be used to complete arithmetic operation and that coercion is not by design. Had that been true then all of the following statements would yield #VALUE! error:

=TEXT(1,0)+0 = 1
="01-Jan-2013" + 1 = 41276 (which is 02-Jan-2013)
=1-TRUE = 0

So clearly the fact that Excel coerces each operand to it's numeric equivalent, it IS by design. So both VALUE() and arithmetic are by design; except that artithmetic works on all data types whilst VALUE() does not. I wouldn't either choose VALUE() coercion in order to discriminate a single data type (since VALUE() isn't particular to one data type anyway). No no, one must be more explicit. So had I needed to test the type typically using one of the IS functions (or whatever!).

I think, here, the concern is again on the database data. Yes, data validation is always important.
We normally don't do that (ISNUMBER, etc.) because we are confident that the data is of good quality, isn't it?
However the good quality of data may not be always true, as mentioned before, if there are flaws in the raw data, it is very important to spot them out.
I am not sure what kind of data validations you use, at least if I am working on some database, when the cutoff date is today, then I will at least use auto-filter to check whether the dates are on or before the cutoff date. This is one kind of data validation I will carry out before I move on the work.
And there are other kinds of data validation, including checking the data types.
(there may be a system error that a date field of one record is missing, it's important to spot it out)

Jon von der Heyden said:
I agree that there are instances where it is good to discriminate. A fine example would be to test whether or not a value is #N/A, specifically missing from a lookup table or array. In such an instance IF(ISNA() construct is a better candidate that IFERROR(), even though it involves an extra function call... For this context IFERROR is too indiscriminate.

and I have to say, using ISNA() doesn't involve an extra function call compared to IFERROR,
did you violate "DON'T REPEAT YOURSELF" principle by using =IF(ISNA(VLOOKUP()), 0, VLOOKUP())?
the correct way is:
A1: =VLOOKUP(), A2: =IF(ISNA(A1), 0, A1)

clearly it is one evidence why you should agree of using VALUE(), as -- is too indiscriminate

Jon von der Heyden said:
So far Alvin you have described one reasonable advantage of using VALUE() and that is the fact that a read would be inclined to better understand it. My feeling is that this reason though is of very little significance because (a) the topic is easy to understand and (b) it shan't take a user long to figure it out either using the formula evaluation tool or forum. Besides, I am paid to build solutions using the skills I have acquired over the years. I am not paid to build a solution using methods only known to novices - otherwise why would you hire me? That would surely p1ss-off my employers.

to my side, I believe your reasonable advantage is the more efficient. And in fact the less use of parenthesis can be one advantage to me. But in contrast, the problem of being indiscriminate and aim-matching are more important to me.

Jon von der Heyden said:
So of all your other reasons none stand true so far. We have many more better reasons to use -- than VALUE(), described now exhaustively in this thread:
-- is by design
-- is more efficient
-- works on all data types (use IS functions to test data types, it is what the are intended for)
-- adheres to an existing convention (best practise)
-- is neater (shorter) and doesn't increase parenthesis

Weighing it all up -- makes a lot more sense.

1. -- is by design, well I don't think it's a reason of using it? the existence of this feature does not mean we have to use it?
2. yes, it's more efficient, I agree
3. If we always use IS functions to check each data, it's kind of messy. given we can continue our job given data checking is embedded, I would go for this way.
4. I still doubt, convention is just a culture, yet the majority is used to the convention/culture but doesn't ask the reason why we should follow it. If reason can be given, then yes we can follow it. But being simply convention doesn't make sense in my thought.
5. yes, I agree

so in my summary, the only advantage is: more efficient, shorter, neater, and doesn't increase parenthesis, which only relates to the "hard" side but are not related to the "soft" side of the choice of functions, I would say

MARK858 · May 18, 2013

if $ signs are not added, it is harder to read, and easier to have errors

WHAT! how is it harder to read? if you have a complete novice they nearly always recognize a cell reference like A1 (and know what it is referring to) because they can easily relate it to column headers and row numbers that they see on the sheet but ask they what the $ does (there have been plenty of posts where we have had to explain it).

This makes it the other way round to "it is harder to read".

The $ is used to serve a purpose, it is not there for added readability by any stretch of the imagination.

I have no problem with it being used by anyone but please don't state readability as a justification for using it.

Jon von der Heyden · May 18, 2013

aaalviny said:
Well Excel is not a database, but for efficient working I bet you will agree to have consistent data structure, as well as neat tables, right?
you don't want to see empty rows within a set of 10000 records, you don't want to see a string in the column of dates. You don't want to see a missing data, etc.
So it's exactly what a database means: to have consistent data format
Do you agree?

Yes and no. Sometimes I like to mix it up; say if I'm building a dashboard. I may have several summaries on one sheet. It's more succinct that way. And this was my point. There are advantages to Excel not being a database; and I fully aim to take advantage of that fact.

aaalviny said:
and I have to say, using ISNA() doesn't involve an extra function call compared to IFERROR,

I said IF(ISNA() does. IF is the 1st function call, ISNA is the 2nd, compared with 1 function call when using IFERROR

aaalviny said:
did you violate "DON'T REPEAT YOURSELF" principle by using =IF(ISNA(VLOOKUP()), 0, VLOOKUP())?

No I didn't. Have a look!

aaalviny said:
the correct way is:
A1: =VLOOKUP(), A2: =IF(ISNA(A1), 0, A1)

Or even more succinct yet would be to use: =LOOKUP(10^308,CHOOSE({1;2},0,VLOOKUP(...)))

aaalviny said:
clearly it is one evidence why you should agree of using VALUE(), as -- is too indiscriminate

No! If you recall I said that I would rather use more explicit means to test data type if it is necessary to do so. VALUE() does not provide sufficient data type test.

aaalviny said:
But in contrast, the problem of being indiscriminate and aim-matching are more important to me.

Well I believe I have suitably dismissed your issue of type indiscrimination. As for aim-matching; I suggest that -- is aim-matching since -- is by design, as is the case with VALUE().

aaalviny said:
1. -- is by design, well I don't think it's a reason of using it? the existence of this feature does not mean we have to use it?

See above...

aaalviny said:
2. yes, it's more efficient, I agree

Yippeeee!

aaalviny said:
3. If we always use IS functions to check each data, it's kind of messy. given we can continue our job given data checking is embedded, I would go for this way.

VALUE() does not provide sufficient data type test. How can you claim this suggestion "messy" after that monstrous sample you posted earlier? Sheesh!

aaalviny said:
4. I still doubt, convention is just a culture, yet the majority is used to the convention/culture but doesn't ask the reason why we should follow it. If reason can be given, then yes we can follow it. But being simply convention doesn't make sense in my thought.

How many more reasons do you need? Your entire case for VALUE() has been ripped apart! It's in tatters! Ruined!

aaalviny said:
5. yes, I agree

Great!

Ok I've presented my case for -- over VALUE() for coercion. You're not bringing up any new facts - you're just repeating yourself and completely missing/avoiding the case against VALUE(). Actually you're opening new topics; topics again that should probably belong in their own threads.

aaalviny · May 18, 2013

Jon von der Heyden said:
This is an Excel forum. We are discussing best practise Excel methods. -TRUE equals -1, which if treated as a logical is TRUE yet again. -FALSE equals 0, which if treated as a logical is FALSE yet again. To invert, using arithmetic, one would use 1-Logical_Value, i.e:
1-TRUE = 1-1 = 0 = FALSE
1-FALSE = 1-0 = 1 = TRUE

You have also made suggestion that using arithmetic should be used to complete arithmetic operation and that coercion is not by design. Had that been true then all of the following statements would yield #VALUE! error:

=TEXT(1,0)+0 = 1
="01-Jan-2013" + 1 = 41276 (which is 02-Jan-2013)
=1-TRUE = 0

So clearly the fact that Excel coerces each operand to it's numeric equivalent, it IS by design. So both VALUE() and arithmetic are by design; except that artithmetic works on all data types whilst VALUE() does not. I wouldn't either choose VALUE() coercion in order to discriminate a single data type (since VALUE() isn't particular to one data type anyway). No no, one must be more explicit. So had I needed to test the type typically using one of the IS functions (or whatever!).

Sorry I think I have missed this part in the previous reply

I think by Excel best practice, some basic aims have to be kept e.g. the aim matching

=TEXT(1,0)+0 = 1
="01-Jan-2013" + 1 = 41276 (which is 02-Jan-2013)
=1-TRUE = 0

Yes, they are false in my sense, or I would say in the general logic it is false
(Do you think they are true before learning excel? I don't believe you thought that 1-TRUE makes sense, nor a text + number = number makes sense, right?)
it's because of excel that makes you think it's true, but it is in fact a logic violation! although excel allows this false logic, should we continue to follow it?

MARK858 said:
WHAT! how is it harder to read? if you have a complete novice they nearly always recognize a cell reference like A1 (and know what it is referring to) because they can easily relate it to column headers and row numbers that they see on the sheet but ask they what the $ does (there have been plenty of posts where we have had to explain it).

This makes it the other way round to "it is harder to read".

The $ is used to serve a purpose, it is not there for added readability by any stretch of the imagination.

I have no problem with it being used by anyone but please don't state readability as a justification for using it.

in that case do you mean one should never use any $ in the design? I think everyone, if he/she knows $, will use that for a higher efficiency, right?
and $ is a syntax designed by Excel, to denote the fixed reference, which is important to the design of excel which relies much on the layout and address.
If one starts using Excel as a general excel user, he/she should learn $. It is similar to learning +, -, *, /, and SUM() for them.
therefore I don't think we should address on whether a non-excel user should understand it, by novice I don't mean about knowing nothing about Excel, but of general excel users who have basic knowledge.

About why I say it's of better readability, I already mentioned it above.
and one purpose is for the correct logic, and another purpose is for readability

aaalviny · May 18, 2013

Jon von der Heyden said:
Yes and no. Sometimes I like to mix it up; say if I'm building a dashboard. I may have several summaries on one sheet. It's more succinct that way. And this was my point. There are advantages to Excel not being a database; and I fully aim to take advantage of that fact.

well, several summaries on one sheet is about the "OUTPUT" I mentioned, no matter how the OUTPUT layout is, it is independent of the logic. and It can be of any formatting as the users want to present
i.e. in the "PROCESS" worksheet, I suppose you will have the data in a good format like a database, right?

Jon von der Heyden said:
I said IF(ISNA() does. IF is the 1st function call, ISNA is the 2nd, compared with 1 function call when using IFERROR
No I didn't. Have a look!

Oh well, I didn't count the IF

Although there is one more parenthesis, I think it doesn't affect the readability as it's very clear?
IF(ISNA()), one is clearly a IF, and the other clearly a "is N/A", easy to understand.

Jon von der Heyden said:
Or even more succinct yet would be to use: =LOOKUP(10^308,CHOOSE({1;2},0,VLOOKUP(...)))

oh no, it's an example of difficult-to-read formula! I absolutely don't think it's succinct!
what is 10^308? and you use CHOOSE {1, 2}, what is this?
I have to think for a few seconds before knowing what it means!
and again, it does not match the aim! that's why it's hard to read!
the aim is: check if it's N/A, if yes then 0, no then the value
Going for an indirect approach just adds more complexity!

Jon von der Heyden said:
No! If you recall I said that I would rather use more explicit means to test data type if it is necessary to do so. VALUE() does not provide sufficient data type test.

At least not as generalized as --. I would say, so it is best to narrow down the possibility of overlooked errors.

Jon von der Heyden said:
Well I believe I have suitably dismissed your issue of type indiscrimination. As for aim-matching; I suggest that -- is aim-matching since -- is by design, as is the case with VALUE().

Well, the aim of -- is do twice negative, but not conversion from text to values
this -- being able to convert text to values is by design, i.e. actual implementation
in fact the aim of -- deviates from the actual result in the case of data format conversion.
or alternatively, the actual implementation result does not match with the aim for Excel's -- design. yet it accidentally comes up the correct solution.

Jon von der Heyden said:
VALUE() does not provide sufficient data type test. How can you claim this suggestion "messy" after that monstrous sample you posted earlier? Sheesh!

as mentioned before, not sufficient but at least more sufficient than --.

Jon von der Heyden said:
How many more reasons do you need? Your entire case for VALUE() has been ripped apart! It's in tatters! Ruined!
Great!

Ok I've presented my case for -- over VALUE() for coercion. You're not bringing up any new facts - you're just repeating yourself and completely missing/avoiding the case against VALUE(). Actually you're opening new topics; topics again that should probably belong in their own threads.

well I am still having two solid reasons there: aim-matching (and therefore easier to understand) and the indiscriminate
which I think they are soft advantages
and I think the soft advantages are more important to your hard advantages of simply efficient / less number of parenthesis

Yes I do think there are many threads to be able to be started. and thanks for the reminder I am just not going to reply those off-topic issues.

Text to Number Conversion - Which functions should be used?

aaalviny

Board Regular

aaalviny

Board Regular

Peter_SSs

MrExcel MVP, Moderator

Jon von der Heyden

MrExcel MVP, Moderator

aaalviny

Board Regular

shg

MrExcel MVP

aaalviny

Board Regular

MARK858

MrExcel MVP

Jon von der Heyden

MrExcel MVP, Moderator

aaalviny

Board Regular

aaalviny

Board Regular

Similar threads

Share this page

Text to Number Conversion - Which functions should be used?

Board Regular

Board Regular

MrExcel MVP, Moderator

MrExcel MVP, Moderator

Board Regular

MrExcel MVP

Board Regular

MrExcel MVP

MrExcel MVP, Moderator

Board Regular

Board Regular

Similar threads

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock