If anyone is having Out of Memory errors running procedures that use InStr on large docs, the following may help.
For the past few months, I've been fighting what seems like a memory leak and periodic Out of Memory errors when running a procedure that iteratively searches XML docs. I've posted questions about aspects of this and members of this board have been very helpful.
What seems be a key part of the code is posted below.
Last night I ran an updated procedure that saved the Excel WorkingSet memory use at each iteration and I noticed today that (some) of the XML docs that are not searched correspond with Excel having a high WorkingSet allocation (600+ mb).
Part of the procedure I ran last night uses InStr to search the XML docs. I've also been working on learning parsing so I can use this method to search the docs, instead of using InStr, thinking it might be faster. So the following improved memory usage result was unexpected.
The procedure below has two .Load commands; the first is for a small-ish doc and the second is for a large doc. There are two sections between '++++, to make them easier to identify, that I alternated between to test memory use. The first section (obviously) uses parsing and the second (currently commented out) uses InStr.
The results below are for that second large doc I referred to above.
Note that I'm a rookie at most of this (parsing in particular) so my syntax may be the problem. If anyone can suggest an improvement that will help memory use, I would be grateful.
'Step Through Max Mem' means I was stepping through the code, watching Task Manager and noting the memory use.
Obviously this isn't a scientific test. There are things running in background that will impact results, I was watching Task Manager rather than using some memory management software etc etc, but I ran this several times commenting out one of the two methods, saving, exiting excel, starting again and stepping through the procedure. The results were nearly identical so it seems to me the large increase in memory use is tied to InStr.
If anyone has another idea as to what is causing the memory use peak, I'm really interested to hear about it.
For the past few months, I've been fighting what seems like a memory leak and periodic Out of Memory errors when running a procedure that iteratively searches XML docs. I've posted questions about aspects of this and members of this board have been very helpful.
What seems be a key part of the code is posted below.
Last night I ran an updated procedure that saved the Excel WorkingSet memory use at each iteration and I noticed today that (some) of the XML docs that are not searched correspond with Excel having a high WorkingSet allocation (600+ mb).
Part of the procedure I ran last night uses InStr to search the XML docs. I've also been working on learning parsing so I can use this method to search the docs, instead of using InStr, thinking it might be faster. So the following improved memory usage result was unexpected.
The procedure below has two .Load commands; the first is for a small-ish doc and the second is for a large doc. There are two sections between '++++, to make them easier to identify, that I alternated between to test memory use. The first section (obviously) uses parsing and the second (currently commented out) uses InStr.
The results below are for that second large doc I referred to above.
Note that I'm a rookie at most of this (parsing in particular) so my syntax may be the problem. If anyone can suggest an improvement that will help memory use, I would be grateful.
Code:
WorkingSet Parsing WorkingSet InStr
Start Excel 52 mb 52 mb
Open .xlsm 149 mb 149 mb
Open VBA window 156 mb 156 mb
.Load 380 mb 379 mb
Step Through Max Mem 846 mb ** 385 mb **
Exit Sub 165 mb 168 mb
'Step Through Max Mem' means I was stepping through the code, watching Task Manager and noting the memory use.
Obviously this isn't a scientific test. There are things running in background that will impact results, I was watching Task Manager rather than using some memory management software etc etc, but I ran this several times commenting out one of the two methods, saving, exiting excel, starting again and stepping through the procedure. The results were nearly identical so it seems to me the large increase in memory use is tied to InStr.
If anyone has another idea as to what is causing the memory use peak, I'm really interested to hear about it.
Code:
Sub tester3()
Dim oInstance As MSXML2.DOMDocument60
Dim n
Dim o
Dim oNodelist As MSXML2.IXMLDOMNodeList
Dim m_strUSGAAP_TaxonomyVersion As String
Dim m_strInvest_TaxonomyVersion As String
Dim m_strDEI_TaxonomyVersion As String
Set oInstance = New MSXML2.DOMDocument60
oInstance.setProperty "SelectionLanguage", "XPath"
oInstance.setProperty "SelectionNamespaces", "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
oInstance.async = False
oInstance.validateOnParse = False
'oInstance.Load "http://www.sec.gov/Archives/edgar/data/1143908/000114390813000035/ck0001143908-20130831.xml"
oInstance.Load "http://www.sec.gov/Archives/edgar/data/7323/000006598413000050/etr-20121231.xml"
m_strContextForInstants = "ERROR"
m_strContextForDurations = "ERROR"
'++++++++++++++++++++*
Set n = oInstance.namespaces
Do
For Each o In n
If o = "http://fasb.org/us-gaap/2013-01-31" Then
m_strUSGAAP_TaxonomyVersion = "http://fasb.org/us-gaap/2013-01-31"
m_strInvest_TaxonomyVersion = "http://xbrl.sec.gov/invest/2013-01-31"
NO_USGAAP_TAXONOMY_INFO = False
Exit Do
ElseIf o = "http://fasb.org/us-gaap/2012-01-31" Then
m_strUSGAAP_TaxonomyVersion = "http://fasb.org/us-gaap/2012-01-31"
m_strInvest_TaxonomyVersion = "http://xbrl.sec.gov/invest/2012-01-31"
NO_USGAAP_TAXONOMY_INFO = False
Exit Do
ElseIf o = "http://fasb.org/us-gaap/2011-01-31" Then
m_strUSGAAP_TaxonomyVersion = "http://fasb.org/us-gaap/2011-01-31"
m_strInvest_TaxonomyVersion = "http://xbrl.sec.gov/invest/2011-01-31"
NO_USGAAP_TAXONOMY_INFO = False
Exit Do
ElseIf o = "http://xbrl.us/us-gaap/2009-01-31" Then
m_strUSGAAP_TaxonomyVersion = "http://xbrl.us/us-gaap/2009-01-31"
m_strInvest_TaxonomyVersion = "http://xbrl.us/invest/2019-01-31"
NO_USGAAP_TAXONOMY_INFO = False
Exit Do
Else: NO_USGAAP_TAXONOMY_INFO = True
End If
Next o
Loop While False
'DEI Taxonomy
Do
For Each o In n
If o = "http://xbrl.sec.gov/dei/2013-01-31" Then
m_strDEI_TaxonomyVersion = "http://xbrl.sec.gov/dei/2013-01-31"
NO_DEI_TAXONOMY_INFO = False
Exit Do
ElseIf o = "http://xbrl.sec.gov/dei/2012-01-31" Then
m_strDEI_TaxonomyVersion = "http://xbrl.sec.gov/dei/2012-01-31"
NO_DEI_TAXONOMY_INFO = False
Exit Do
ElseIf o = "http://xbrl.sec.gov/dei/2011-01-31" Then
m_strDEI_TaxonomyVersion = "http://xbrl.sec.gov/dei/2011-01-31"
NO_DEI_TAXONOMY_INFO = False
Exit Do
ElseIf o = "http://xbrl.us/dei/2009-01-31" Then
m_strDEI_TaxonomyVersion = "http://xbrl.us/dei/2009-01-31"
NO_DEI_TAXONOMY_INFO = False
Exit Do
Else: NO_DEI_TAXONOMY_INFO = True
End If
Next o
Loop While False
'+++++++++++++++++++++
'If InStr(1, oInstance.XML, "http://fasb.org/us-gaap/2013-01-31") <> 0 Then
'This IS the 2012 US GAAP taxonomy
' m_strUSGAAP_TaxonomyVersion = "http://fasb.org/us-gaap/2013-01-31"
' m_strInvest_TaxonomyVersion = "http://xbrl.sec.gov/invest/2013-01-31" 'NEED TO CONFIRM THE DATE HERE
'ElseIf InStr(1, oInstance.XML, "http://fasb.org/us-gaap/2012-01-31") <> 0 Then
'This IS the 2012 US GAAP taxonomy
' m_strUSGAAP_TaxonomyVersion = "http://fasb.org/us-gaap/2012-01-31"
' m_strInvest_TaxonomyVersion = "http://xbrl.sec.gov/invest/2012-01-31"
'ElseIf InStr(1, oInstance.XML, "http://fasb.org/us-gaap/2011-01-31") <> 0 Then
'This IS the 2011 US GAAP taxonomy
' m_strUSGAAP_TaxonomyVersion = "http://fasb.org/us-gaap/2011-01-31"
' m_strInvest_TaxonomyVersion = "http://xbrl.sec.gov/invest/2011-01-31"
'ElseIf InStr(1, oInstance.XML, "http://xbrl.us/us-gaap/2009-01-31") <> 0 Then
'This IS the 2009 US GAAP taxonomy
' m_strUSGAAP_TaxonomyVersion = "http://xbrl.us/us-gaap/2009-01-31"
' m_strInvest_TaxonomyVersion = "http://xbrl.us/invest/2019-01-31"
'Else: NO_USGAAP_TAXONOMY_INFO = True
'End If
'DEI Taxonomy
'If InStr(1, oInstance.XML, "http://xbrl.sec.gov/dei/2013-01-31") <> 0 Then
'This IS the 2012 DEI taxonomy
' m_strDEI_TaxonomyVersion = "http://xbrl.sec.gov/dei/2013-01-31"
'ElseIf InStr(1, oInstance.XML, "http://xbrl.sec.gov/dei/2012-01-31") <> 0 Then
'This IS the 2012 DEI taxonomy
' m_strDEI_TaxonomyVersion = "http://xbrl.sec.gov/dei/2012-01-31"
'ElseIf InStr(1, oInstance.XML, "http://xbrl.sec.gov/dei/2011-01-31") <> 0 Then
'This IS the 2011 DEI taxonomy
' m_strDEI_TaxonomyVersion = "http://xbrl.sec.gov/dei/2011-01-31"
'ElseIf InStr(1, oInstance.XML, "http://xbrl.us/dei/2009-01-31") <> 0 Then
'This IS the 2009 DEI taxonomy
' m_strDEI_TaxonomyVersion = "http://xbrl.us/dei/2009-01-31"
'Else: NO_DEI_TAXONOMY_INFO = True
'End If
'+++++++++++++++++++++
If NO_USGAAP_TAXONOMY_INFO = True Or NO_DEI_TAXONOMY_INFO = True Then
Exit Sub
End If
oInstance.setProperty "SelectionNamespaces", "xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:xbrli='http://www.xbrl.org/2003/instance' xmlns='http://www.xbrl.org/2003/instance' xmlns:xbrldi='http://xbrl.org/2006/xbrldi' xmlns:us-gaap='" & m_strUSGAAP_TaxonomyVersion & "' xmlns:dei='" & m_strDEI_TaxonomyVersion & "' xmlns:currency='http://xbrl.sec.gov/currency/2012-01-31' xmlns:invest='" & m_strInvest_TaxonomyVersion & "'"
Set oNodelist = oInstance.SelectNodes("//dei:EntityRegistrantName")
If oNodelist.Length > 0 Then
Debug.Print "found it: " & oInstance.SelectSingleNode("//dei:EntityRegistrantName").Text
Else
Debug.Print "didn't find it"
End If
End Sub
Last edited: