Extract text from XML file (multiple lines)

erikgrave1

New Member
Joined
Jul 22, 2015
Messages
9
Hi,
I´m struggling with extracting data from an XML file. All I need is the "Status" of each SID listed in XML file, where can be up to 50 different SIDs. This is the example of XML file.
I have solved extracting 1 status (function is down below), but it is really very slow (1 SID = 1 second, and I need to check around 50.000 in total). So my function will assign correct Status according the SID (S12345678 will be "in operation", S23456789 will be "out of operation" etc.) I think there should be used "For each ..." somehow.

Thank you very much. Your help is really appreciated. And even if there will be no solution for me, the code below will probable help someone :)

This is the XML file (I have added space after the <) :

Code:
  [TABLE]
<tbody>[TR]
[TD]< ?xml   version="1.0" encoding="UTF-8"?>[/TD]
[/TR]
[TR]
[TD]-< root>[/TD]
[/TR]
[TR]
[TD]-< record   type="tsacinv/system">[/TD]
[/TR]
[TR]
[TD]< status>in operation<   /status>[/TD]
[/TR]
[TR]
[TD]< systemid>S12345678<   /systemid>[/TD]
[/TR]
[TR]
[TD]< /record>[/TD]
[/TR]
[TR]
[TD]-< record   type="tsacinv/system">[/TD]
[/TR]
[TR]
[TD]< status>out of operation<   /status>[/TD]
[/TR]
[TR]
[TD]< systemid>S23456789<   /systemid>[/TD]
[/TR]
[TR]
[TD]< /record>[/TD]
[/TR]
[TR]
[TD]-< record   type="tsacinv/system">[/TD]
[/TR]
[TR]
[TD]< status>in operation<   /status>[/TD]
[/TR]
[TR]
[TD]< systemid>S34567890<   /systemid>[/TD]
[/TR]
[TR]
[TD]< /record>[/TD]
[/TR]
[TR]
[TD]< /root>[/TD]
[/TR]
</tbody>[/TABLE]

This is my function (working for 1 SID) :

Code:
Public Function statusAM(S As Range) As Variant

myURL = "" '<- here is the link for getting the XML file, using variable "S" from a cell for SID added to the link

Set xmlDoc = CreateObject("Microsoft.XMLDOM")
xmlDoc.SetProperty "SelectionLanguage", "XPath"
xmlDoc.async = False

With CreateObject("MSXML2.XMLHTTP")
    .Open "GET", myURL, False
    .Send
    xmlDoc.LoadXML .responseText
End With

Set NodeXML = xmlDoc.getElementsByTagName("status")
statusAM = NodeXML(0).text

End Function
 

Excel Facts

When did Power Query debut in Excel?
Although it was an add-in in Excel 2010 & Excel 2013, Power Query became a part of Excel in 2016, in Data, Get & Transform Data.
Here are 3 methods:

Code:
    Dim recordNodes As Object 'IXMLDOMNodeList
    Dim recordNode As Object 'IXMLDOMNode
    
    Set recordNodes = XMLdoc.SelectNodes("root/record")
    For Each recordNode In recordNodes
        Debug.Print recordNode.SelectSingleNode("systemid").Text; " ";
        Debug.Print recordNode.SelectSingleNode("status").Text
    Next
    
    Dim statusNodes As Object 'IXMLDOMNodeList
    Dim statusNode As Object 'IXMLDOMNode
    
    Set statusNodes = XMLdoc.getElementsByTagName("status")
    For Each statusNode In statusNodes
        Debug.Print statusNode.Text
    Next

    Set recordNodes = XMLdoc.SelectNodes("root/record")
    For Each recordNode In recordNodes
        Debug.Print recordNode.ChildNodes(1).nodeTypedValue; " ";
        Debug.Print recordNode.ChildNodes(0).nodeTypedValue
    Next
 
Upvote 0
Hello,

thank you John_w
all of them are perfect and work like a charm.
I used 3rd method and added to my script.
Is it also possible to get the ID of childNode somehow ? (in reverse direction)
Code:
getElementsByTagName("status")

-> output will be 1 (ChildNodes(1))

Many thanks
 
Upvote 0
I don't know what you're asking for. By 'ID' do you mean array index? The "status" child node in the example XML is always the 0th array index in the ChildNodes array, not 1. To find the array index of each "status" try this:

Code:
    Dim i As Long
    For Each recordNode In recordNodes
        For i = 0 To recordNode.ChildNodes.Length - 1
            If recordNode.ChildNodes(i).nodeName = "status" Then
                Debug.Print i
                Exit For
            End If
        Next
    Next
The above prints 0 for all 3 records.
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,907
Messages
6,175,300
Members
452,633
Latest member
DougMo

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top