Carl Colijn
New Member
- Joined
- Jun 23, 2011
- Messages
- 32
Hi all,
(bit of a cross-post here with social.technet.microsoft.com and answers.microsoft.com since I'm getting desperate)
Using Excel 2010 32-bit here: does anyone know whether trouble is to be expected when filling Excel tables (the ListObject kind) with a large number of calculated columns and a large number of rows? Say, about 70000 rows and 136 columns, of which 44 contain data and the other 92 contain somewhat complex formulas? The resulting .xlsb is 72MB in size; not exactly too large for Excel.
This leads to Excel 2010 no longer being able to open these files once you save and close them. It tries to open them, but then (after a long while) reports that the workbook contains unreadable content, tries to repair it (up to attempt #3), and often finally crashes in the process.
I also ran into this problem about a year ago (see my StackOverflow post) and managed it then by reducing my formula complexity; the workbook then worked OK up to 40000-ish rows, which was still unsatisfying but adequate. This time around, however, the table needs to house more calculated columns while the row count also needs to go up.
Some observations:
- I've already ruled out that the workbook I start with is corrupt (I've already re-created it from scratch a few times).
- If you reduce the number of (calculated) columns or rows, the problem disappears.
- If you first completely close Excel and then open these big files directly from Explorer, Excel manages to open files with a somewhat larger number of rows.
- If I create a 50000-row .xlsb file and make a copy of it in Explorer (so the files are identical), opening the first one goes fine, but subsequently opening the copy in the same Excel instance gives the same problem. Opening the second one in a separate, second Excel process works OK (as expected).
- Files saved as .xlsb can hold about 2x more rows than .xlsx files before Excel croaks.
- When saving a working .xlsb version that is near its row limit as an .xlsx, the generated .xlsx will be considered broken by Excel.
My conclusion is that Excel reaches an internal memory limit. It seems Excel's loading routines simply run out of memory while converting the file into its internal in-memory structure, give up partway through, and leave Excel with the impression that the file is truncated. And .xlsx files obviously need more memory to decode than .xlsb files.
Doing some memory stress tests on my 8GB RAM PC, I see that while loading these files directly from Explorer with no Excel open, Excel's RAM usage steadily climbs, plateaus at a maximum, and then drops about 20% once loading is done. Some stats:
- 30000 rows: max 530MB, finally 420MB
- 50000 rows: max 890MB, finally 680MB
- 70000 rows: max 1200MB, finally 950MB
When loading a second 50000-row file while the first one is already open, Excel's RAM usage goes up from 680MB to 970MB before Excel gives up. That is, well below the 1200MB maximum reached while loading a 70000-row file, which does work on my PC. And since I have the full 4GB limit for 32-bit processes available, you'd expect Excel not to simply give up at about 1GB of RAM usage...?
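To put a rough number on the out-of-memory theory, here is a back-of-envelope sketch. The per-cell byte costs are pure assumptions for illustration (Excel's real internal structures are undocumented); the table shape (44 data columns, 92 formula columns) is from the workbook described above:

```python
# Back-of-envelope estimate of Excel's in-memory footprint for this table.
# BYTES_PER_* values are assumed for illustration only, not documented
# Excel figures.

BYTES_PER_DATA_CELL = 30       # assumed: stored value plus cell metadata
BYTES_PER_FORMULA_CELL = 130   # assumed: parsed formula plus cached result

def estimate_mb(rows, data_cols=44, formula_cols=92):
    """Rough in-memory size estimate (in MB) for a table of this shape."""
    data_bytes = rows * data_cols * BYTES_PER_DATA_CELL
    formula_bytes = rows * formula_cols * BYTES_PER_FORMULA_CELL
    return (data_bytes + formula_bytes) / 1024 ** 2

for rows in (30000, 50000, 70000):
    print(f"{rows} rows: ~{estimate_mb(rows):.0f} MB")
```

With these assumed constants the estimates land in the same ballpark as the measured figures above, which at least makes the internal-memory-limit theory plausible: the bulk of the footprint would come from the 92 formula columns, not the raw data.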
The problem is that this particular file needs to be processed by another Excel workbook, which splits its data into several smaller files based on certain criteria. Opening the file directly from Explorer may work, but that is irrelevant once the file has to be loaded by Excel from another workbook. I'm hesitant to rewrite the processing code to use a second Excel instance for the loading: a lot of data inspection is needed along the way, and since the process is already painfully slow I don't want to add inter-process communication on top of it.
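For what it's worth, the splitting step itself doesn't strictly require Excel: if the raw data could be exported once (e.g. to CSV), a streaming split outside Excel sidesteps the load problem entirely. A minimal sketch in Python, assuming the split criterion is a single key column (the column name `Region` in the usage example below is made up for illustration):

```python
import csv
from pathlib import Path

def split_by_column(src, key_column, out_dir):
    """Stream a large CSV and write its rows into one output file per
    distinct value of key_column, never holding the full table in memory."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    writers = {}  # key value -> (file handle, csv.DictWriter)
    with open(src, newline="") as f:
        reader = csv.DictReader(f)
        for row in reader:
            key = row[key_column]
            if key not in writers:
                handle = open(out_dir / f"{key}.csv", "w", newline="")
                writer = csv.DictWriter(handle, fieldnames=reader.fieldnames)
                writer.writeheader()  # repeat the header in each split file
                writers[key] = (handle, writer)
            writers[key][1].writerow(row)
    for handle, _ in writers.values():
        handle.close()

# Hypothetical usage: split big.csv into one file per value of "Region".
# split_by_column("big.csv", "Region", "split_output")
```

This obviously loses the calculated columns' formulas, so it only helps if the downstream files need values rather than live formulas; but it keeps the 70000-row monster out of Excel's address space altogether.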
The nasty thing is also that I'm stuck with 32-bit Excel 2010 on (I think) 4GB of RAM, since that is the hardware and software this solution needs to run on; my client can't upgrade all his relevant PCs and Office installs for this one process alone...
Any insight is really welcome, as would be a bug report to the Excel dev team!
