The Array Type

Colin Legg

MrExcel MVP
Joined
Feb 28, 2008
Messages
3,507
Office Version
  1. 365
Platform
  1. Windows
Hi Everyone,

I'm going to start this thread with a pretty broad question...

Does anyone know where I can find some documentation or information (exluding the Excel helpfile) about Excel's Array Type? There seems to be an inexplicable lack of information about it!

To be clear, when I say "array type", I mean the data type that the TYPE() worksheet function returns a value of 64 on.

Thanks
Colin
 
Last edited:
I don't know why this matters. It is so completely irrelevant. Among the several "high reliability, high performance" projects I've worked on, one was a large (multi-mainframe) project where nothing -- and I mean nothing short of a simultaneous multi-mainframe failure) could compromise my security kernel code while adding no more than a few milliseconds of overhead to overall transaction durations. And, given my *practical* experience in high performance projects, I can assure you I neither know nor do I care about stuff like where Excel stores a constant. Nor do I care about what excites so many others -- though, not the topic of this discussion {grin} -- as to which data type is faster than another.

These kinds of what I call micro-optimizations will *at the most* gain you a few percentage points. It won't bring a magnitude improvement in performance. Nor will they help you develop robust, reliable, understandable, and easy to maintain systems.
Hi,

Thanks for your replies. Yeah, xld, that's a fair enought point - I'll start being a little bit more specific, but rather than jumping in the deep end I'll try to build up.

In the introduction paragraph in this primer on array constants, J-Walk states that array constants are stored within the memory rather than within the workbook. This surprises me. Do we know if this is fact (and how do we know this!?), do we know the type of structure that is used in the memory to store them, and do we know if named array constants are stored differently to [non-named] array constants?

Cheers,
Colin
 
Upvote 0

Excel Facts

Format cells as currency
Select range and press Ctrl+Shift+4 to format cells as currency. (Shift 4 is the $ sign).
I'll give my two cents here on what J-Walk's getting at, and this is my unofficial interpretation.

Excel stores variables like this in memory, and it travels with the Workbook, i.e., open the Workbook and it's in memory, close the Workbook, it's out of memory. But it's committed at the Application level.

In a certain sense, this used to be less trivial than it is today, with memory-caps, and whatnot. If you open a 2nd instance of Excel, your memory-slate is clear, regardless of what's happening with the initial instance.

The same would be true where you use a VLookUp() call against a Range. You might think you have a Range burned into memory, but Excel's actually housing your Range reference as an Array, in memory.
 
Upvote 0
Hi everyone,
Thanks for your replies.
Tushar - I assure you I'm not heading down the efficiency road with this! ;)
Nate - Fully recoverd from your birthday hangover, at last?!


Array constants and named array constants seem to be interpretted differently when they are passed to and from the worksheet. I think this is to do with how they are structured within the memory. Excel is a black box and I am guessing, so I could most certainly be wrong.

If we set up a named array constant, which is really just a named formula, in the workbook:
Named array constant: My_Array ={3,5}

Firstly, it seems that a named array constant is NOT an array type.
We can see this with a couple of formulas, entered normally:
=TYPE({3,5}) returns 64 --> array type
=TYPE(My_Array) returns 1 --> not an array type. Returns 1 because the first element of My_Array is a double.
Simple as this is, as far as I am aware - and I have looked around - this is not actually documented anywhere? Maybe I'm missing something here?

Does this matter? Yes, it does - to me, at least. It means that calculations on array constants and named array constants work differently - they need to be treated as two separate animals.
=SUM({3,5}+1) returns 10
=SUM(My_Array+1) returns 4
=SUM({3,5})+1 returns 9
=SUM(My_Array) +1 returns 9
Evaluating the formulas 1 and 2 with F9 shows exactly the same thing, but the results are different? It seems that this is because only the first element of the named array is passed back for SUM to aggregate?

This is further supported by what happens if we deliberately convert My_Array into an array type. It seems there are three ways we can do this.
1. Use CTRL+SHIFT+ENTER:
=SUM(My_Array+1) returns 10 when CTRL+SHIFT+ENTERED
2. Use Sumproduct, which convertes its arguments into array types:
=SUMPRODUCT(My_Array+1) returns 10
3. Use the N() worksheet function:
=SUM(N(My_Array)+1) returns 10

Now the third one, using the N() worksheet function, is bizarre - to me, at least. Again, I can't find it documented anywhere that the N() worksheet function can do this? Perhaps it has a use after all!? :eeek:


So, getting back to array constants and the memory. Looking at these results, it would make sense to me that a named array constant has to pass through a particular layer of interpretation that an array constant doesn't have to; or, at least to say that they are dimensioned somewhat differently in the memory?

The three ways we can induce Excel to calculate on array constants and named array constants in the same way suggests to me that when we CTRL+SHIFT+ENTER a formula, we are just converting the relevant parts, such as range reference values, of that formula into array types. So, in turn, an array formula should simply be defined as a formula that contains an array type. This means that array formulas are not just CSE formulas, and any formula that uses Sumproduct, an array constant, or another array type construct, is in fact an array formula. A formula that contains a named array constant is not an array formula unless there is an array type conversion in there. Again, looking around the internet, there are plenty of array formula tutorials that describe array formulas and what they do, but none of them seem to specifically define what an array formula is.

All thoughts, possibilities and suggestions are welcome?

Cheers,
Colin
 
Upvote 0
What you seem to be missing to me is that names are evaluated whenever they are used. This means that array constants that are use in a non-array context, such as TYPE and name+1 picks the first item.

But you can force it to be properly evaluated if you array enter it, so {TYPE(My_Array)} returns 64 just as TYPE({3,5}) returns 64, that is you are telling TYPE To process an array as an array. Similalrly, SUM(My_Type+1) resolves to SUM(3+1), but SUM({4,6}) if array-entered.

So, as I keep re-iterating, array constants are array constants, they are very simple, and you are over-thinking it IMO.
 
Upvote 0
What you seem to be missing to me is that names are evaluated whenever they are used. This means that array constants that are use in a non-array context, such as TYPE and name+1 picks the first item.

But you can force it to be properly evaluated if you array enter it, so {TYPE(My_Array)} returns 64 just as TYPE({3,5}) returns 64, that is you are telling TYPE To process an array as an array. Similalrly, SUM(My_Type+1) resolves to SUM(3+1), but SUM({4,6}) if array-entered.

Hi xld,

I'm not missing that - honest. And I actually think we're kind of agreeing here - with an array constant we can put it in a cell and it is processed as an array type. But with a named array constant, we have to tell Excel to process it like an array type. So clearly there is a difference there and I cannot find any documentation which actually says this. And I think it is interesting that using N() does this successfully for a named array constant - again I can find no information on this.

So, as I keep re-iterating, array constants are array constants, they are very simple, and you are over-thinking it IMO.
I could well be over-thinking it. I've been banging my head against a brick wall about this ever since I wrote an array formula tutorial a little while back. Well, I never answered it myself and I was pretty sure that if there is an answer then the members on here would be able to come up with it - or at least some seriously good ideas! I do think it is a valid observation and I can't see any harm in asking the questions, especially since these observations do not appear to have been discussed on the forums before... at the very least I am going to learn something, right?

And thank you for taking the time to share your opinion and thoughts with me - I appreciate it and I have taken it on board.

Cheers,
Colin
 
Upvote 0
I think it is not that we have with a named array constant, we have to tell Excel to process it as an array. When it is an in-line array constant, such as SUM({3,5}) Excel can work that out itself. When it is a named array constant, because the name is evaluated on use, Excel doesn't know that SUM(My_Array) is an array, so it has to be informed by array entering the formula.
 
Upvote 0
Colin,

Up to now I hadn't looked in this thread...

The points you rise and observations you make are interesting and valid.

...

Array constants and named array constants seem to be interpretted differently when they are passed to and from the worksheet. I think this is to do with how they are structured within the memory. Excel is a black box and I am guessing, so I could most certainly be wrong.

Everything is in memory, even the objects that are anchored to "cells" which define spreadsheets.

If we set up a named array constant, which is really just a named formula, in the workbook:

Named array constant: My_Array ={3,5}

Firstly, it seems that a named array constant is NOT an array type.
We can see this with a couple of formulas, entered normally:

=TYPE({3,5}) returns 64 --> array type
=TYPE(My_Array) returns 1 --> not an array type. Returns 1 because the first element of My_Array is a double.


Simple as this is, as far as I am aware - and I have looked around - this is not actually documented anywhere? Maybe I'm missing something here?

While =TYPE({3,5}) is anchored to the cell where this bit is entered, {3,5} itself is not. TYPE recognizes thus the {3,5} bit as array object. Hence 64.

But when you enter that bit into a cell as ={3,5} or as =My_Array, TYPE sees just the first item in both cases. Hence, both evaluate to 1.

The following are equivalent:

=TYPE({3,5})

{=TYPE(My_Array)}

Does this matter? Yes, it does - to me, at least. It means that calculations on array constants and named array constants work differently - they need to be treated as two separate animals.

=SUM({3,5}+1) returns 10
=SUM(My_Array+1) returns 4
=SUM({3,5})+1 returns 9
=SUM(My_Array) +1 returns 9


Evaluating the formulas 1 and 2 with F9 shows exactly the same thing, but the results are different? It seems that this is because only the first element of the named array is passed back for SUM to aggregate?

The 2nd formula result shows that SUM() is somewhat special (see: http://tinyurl.<WBR>com/5wc4ho, for some take ups by Harlan Grove and others). If an object must be treated as an array object by SUM, this needs to be signalled to Excel explicitly.

This is further supported by what happens if we deliberately convert My_Array into an array type.

Right.

It seems there are three ways we can do this.
1. Use CTRL+SHIFT+ENTER:

=SUM(My_Array+1) returns 10 when CTRL+SHIFT+ENTERED

2. Use Sumproduct, which convertes its arguments into array types:

=SUMPRODUCT(My_Array+1) returns 10

3. Use the N() worksheet function:

=SUM(N(My_Array)+1) returns 10

Now the third one, using the N() worksheet function, is bizarre - to me, at least. Again, I can't find it documented anywhere that the N() worksheet function can do this? Perhaps it has a use after all!? :eeek:

Actually, there is a single method involving functions that admit array objects: That is Control+Shift+Enter.

A few functions like SumProduct, Frequency, and Lookup admit range objects as well as array objects.

N together with + in the 3rd formula yields an array object (first round of evaluation)...

=N(My_Array)+1

==>

{4,6}

which SUM can grind (second round evaluation).

So, getting back to array constants and the memory. Looking at these results, it would make sense to me that a named array constant has to pass through a particular layer of interpretation that an array constant doesn't have to; or, at least to say that they are dimensioned somewhat differently in the memory?

I tend to agree with the former line of thought rather than the "memory" line...

The three ways we can induce Excel to calculate on array constants and named array constants in the same way suggests to me that when we CTRL+SHIFT+ENTER a formula, we are just converting the relevant parts, such as range reference values, of that formula into array types. So, in turn, an array formula should simply be defined as a formula that contains an array type. This means that array formulas are not just CSE formulas, and any formula that uses Sumproduct, an array constant, or another array type construct, is in fact an array formula. A formula that contains a named array constant is not an array formula unless there is an array type conversion in there. Again, looking around the internet, there are plenty of array formula tutorials that describe array formulas and what they do, but none of them seem to specifically define what an array formula is...

I also characterize the formulas requiring CSE and the formulas with SumProduct (always), Frequency, and Lookup as range and array grinding formulas.
 
Upvote 0
Hi xld,
I think it is not that we have with a named array constant, we have to tell Excel to process it as an array. When it is an in-line array constant, such as SUM({3,5}) Excel can work that out itself. When it is a named array constant, because the name is evaluated on use, Excel doesn't know that SUM(My_Array) is an array, so it has to be informed by array entering the formula.
Okay, thanks for that input. I'll have a careful think about it!

Hi Aladin
Colin,

Up to now I hadn't looked in this thread...
The points you rise and observations you make are interesting and valid....
....I also characterize the formulas requiring CSE and the formulas with SumProduct (always), Frequency, and Lookup as range and array grinding formulas.
Well, I'm glad you stopped by and that you found it interesting! :)
And I'm pleased that my definition of an array formula is not altogether different from yours.

I'll have a careful read through your post and the link: I like to research the answers I'm given. Once I've let the information sink in and I've played around with it I'll post back with a well considered reply (this may take a little while depending on how much playing around I have to do).

Cheers,
Colin
 
Upvote 0
Okay, I've done a bit of work on this.

Aladin - I agree that N() is able to de-reference name objects; hence, in our case, it effectively returns a literal array onto the stack.


In terms of different results between named array constants and literal arrays, I was not able to determine at the deepest level exactly how this is done. Whilst some technical documentation exists for how array constants in formulas are parsed, there's only vague information on how names are stored in the names table and there is even less information about the various contexts that will cause it to be de-referenced when it is pushed onto the stack. Still, I'm happy that I have a general understanding on how the tokens are working here, so this exercise has been productive.

As a passing point of interest, N() can also have a similar effect (although I think the inner working is slightly different) for VBA UDFs that return arrays. For example, in this thread here it was agreed that the Sumproduct version in post #4 had to be CSE entered to get the correct result (or to use a CSE formula with the SUM worksheet function), because of TRANSPOSE:
CSE:
=SUMPRODUCT(TRANSPOSE(IsNotFormula(G1:I1))*(G2:I2))
or
CSE:
=SUM(TRANSPOSE(IsNotFormula(G1:I1))*(G2:I2))

Well, we could work around it using N(); this does not need to be CSE entered:
=SUMPRODUCT(TRANSPOSE(N(IsNotFormula(G1:I1)))*(G2:I2))

So I think that it has some practical applications.

Pity it can't do it for Range references!

Thanks again for your help.
Colin
 
Upvote 0

Forum statistics

Threads
1,225,155
Messages
6,183,208
Members
453,151
Latest member
Lizamaison

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top