Excel's XML SOURCE window showing heirarchy - Need to transfer or extract it's contents ASAP

ChrisOK

Well-known Member
Joined
Mar 26, 2003
Messages
601
As a long time member of MrExcel, I'm hoping all the Aladin Akyurek genius-types of folks can provide some desperately needed help -- I've been searching for weeks and not finding a *single soul* that knows anything on this topic or it's Excel functionality which seems shocking!?!

With client deadline fast approaching and knowing that it's sometimes hard to visualize; I recently posted on 2 sites that have the ability to upload a sample pic file - to better explain..
Cross-post link disclaimer is available if seeing the uploaded visual is helpful:
https://www.excelforum.com/excel-pr...336-extract-heirarchy-from-source-window.html
http://www.msofficeforums.com/excel-programming/37789-extract-heirarchy-source-window.html

The image shows the several tiers of the Parent Child relationships within the EXCEL XML SOURCE window. (there's at least 6 tiers)

I simply need everything that's on this list to be layed into a spreadsheet or document.
Ideally, if it could mirror the architecture that'd be great to see visually at a glance how many layers deep an element lies...

Heck, I'll even take it in a hard to read layout like:
Level.........Name........
1..............Abc (which is obviously the parent as a '1')
2..............Bbc (level 2 is a child to the parent listed above it)
3..............Ccc (level 3 here is a child to the child)
1..............Ddd
2..............Eee
1..............
2..............
3..............
4..............
5..............
6..............

Ultimately the client needs to see (with as much ease as possible) a full list of everything in that map... to gauge magnitude... next to see how many parents there are -- and determine how many levels inward they want us to go as a dev team -- to tackle issues within a system.

When I select all, copy & paste everything in that XML Source window, (paste it long ways along row one of a sheet) -- it holds over 8,000 items
I'm finding that it is leaving out the PARENTS which is a critical piece...
So, I don't have any idea HOW MANY items are REALLY within that XML Source window because what got pasted is NOT all inclusive! Frustrating!

The main 4 things I need are the PARENT, REQUIRED PARENT, CHILD and REQUIRED CHILD (based on the icons shown in the XML Source window, those are the 4 items I need a list of)

Need help with some VBA or a method for grabbing everything in that window and pasting it either into an Excel sheet - -or even into Word if that's easier --

Any ideas how to pull out what's being displayed visually within the Excel files XML Source window?
Thanks in advance -- praying SOMEONE -- ANYONE knows something about this window and how to get what's in it --- OUT!
Chris
 

Excel Facts

When did Power Query debut in Excel?
Although it was an add-in in Excel 2010 & Excel 2013, Power Query became a part of Excel in 2016, in Data, Get & Transform Data.
AWESOME! There's at least ONE person in this world that knows something about this -- or is willing to take a stab at the challenge!! I'm so excited!!
t appears these links are public -- let me know if this first one is accessible:
I had to download the .xsd file to pc. Open Excel, Developer tab > Source > XML Maps.. (button at bottom), ADD, navigate to .xsd file to display it within Excel's XML Source window. That beautiful hierarchy is what I need to extract and lay into the xls/xlsm file or into a word doc.
Thought SURELY there's a way to grab it using VBA!?!?!

https://www.acq.osd.mil/dpap/pdi/eb/procurement_data_standard.html
PDS V2.5.1 Schema (.xsd)
 
Upvote 0
Hi;

Actually the URL you have given is not a well designed XML file so I couldn't get the node values with classical approach.

Instead of it, I have done a small loop to see what the document is housing.

You can construct more nested loops to achieve the childnodes if you want.

Code:
Sub TestXML_3()
    'Haluk
    
    Dim xDoc As Object
    ActiveSheet.Cells.Clear
    
    Set xDoc = CreateObject("MSXML2.DOMDocument")
    xDoc.async = False
    xDoc.validateOnParse = False
    
    strURL = "https://www.acq.osd.mil/dpap/pdi/eb/docs/PDS_Schema_v2.5.1.xsd"
    xDoc.Load strURL
     
    Set MyList = xDoc.DocumentElement
    docLength = MyList.ChildNodes.Length
    
    j = 1
    For i = 0 To docLength - 1
        Range("A" & j) = MyList.ChildNodes(i).BaseName
        Range("B" & j) = MyList.ChildNodes(i).Text
        If MyList.ChildNodes(i).HasChildNodes Then
            j = j + 1
            Range("C" & j) = MyList.ChildNodes(0).Text
        End If
        j = j + 1
    Next
End Sub
 
Upvote 0
Well-- (results attached) -- I started w/ a new XLSM file, added the XML to the Source window, saved.
Opened VBA window, created a new Mod1 in that file and added the code, ran and this is what I came up with - but it's not holding the items shown in the hierarchy?
Is there something else I needed to do - in order to see the hierarchy extracted and layed in -- in some way?
Thank you!

Here's a link to view the results image once I ran the code:
https://app.box.com/s/gk4hmk6uqyu3hg4s8hxnjupn7eirsf0p

Actually, here's the excel file itself:
https://app.box.com/s/us7y4xaui9qyvnyxq9lo91uasvgqs84f
 
Upvote 0
Try this macro, which outputs the structure of the .xsd file in a hierarchical layout to cells in the first sheet of the macro workbook. The macro expects the PDS_Schema_v2.5.1.xsd file to be in the same folder as the macro workbook, because reading the file locally is faster than reading it via the URL "https://www.acq.osd.mil/dpap/pdi/eb/docs/PDS_Schema_v2.5.1.xsd".

The output produced by the macro seems to match the XML Source shown by Excel, though I haven't checked everything. I don't know how Excel parses and interprets the .xsd file to get its XML Source layout, so the 'rules' I have implemented in the macro are based on trying to make the macro's output match the XML Source, rather than my very limited knowledge of XML/XSD files. For example, the element name "AlternateIdentifier" has the type "AlternateIdentifierType" and the code finds the complexType with name "AlternateIdentifierType" and outputs the constituent elements "Description" and "Value". See the comments for more details.

Note: the code uses early binding of the MSXML2 object library, so that variables can be declared with the appropriate data types. Therefore you must set the reference Microsoft XML v6.0, via Tools -> References in the VBA editor.

Code:
Option Explicit

Public Sub Extract_XML_Document_Structure()
    
    'Requires reference: Microsoft XML v6.0
    
    Dim XMLdoc As DOMDocument60
    Dim XMLmainNode As IXMLDOMNode
    Dim destCell As Range
    Dim rowOffset As Long
    
    Set XMLdoc = New DOMDocument60
    'Set XMLdoc = CreateObject("MSXML2.DOMDocument.6.0") 'for late binding
    
    With XMLdoc
        'The .xsd file contains the following schema:
        '< xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified" version="07112016">
                
        'How To Specify Namespace when Querying the DOM with XPath - https://support.microsoft.com/en-us/help/294797
        
        .async = False
        .validateOnParse = False
        .SetProperty "SelectionLanguage", "XPath"
        .SetProperty "SelectionNamespaces", "xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        .Load ThisWorkbook.Path & "\PDS_Schema_v2.5.1.xsd"
    End With
    
    With ThisWorkbook.Worksheets(1)
        .Cells.ClearContents
        Set destCell = .Range("A1")
    End With
    
    'Get the main document node, < xs:element name="ProcurementDocument">.  All data elements are contained in this node
    
    Set XMLmainNode = XMLdoc.SelectSingleNode("//xs:element [@name='ProcurementDocument']")
    
    Scan_ElementsToCells XMLdoc, XMLmainNode, destCell, rowOffset, 0
    
    MsgBox "Done"

End Sub


Private Sub Scan_ElementsToCells(XMLdoc As DOMDocument60, node As IXMLDOMNode, destCell As Range, ByRef rowOffset As Long, ByVal colOffset As Long)
    
    Dim child As IXMLDOMNode
    Dim element As IXMLDOMElement
    Dim findNode As IXMLDOMNode
    Dim nameAttribute As Variant, lookupAttribute As Variant
    
    If node.NodeType = NODE_ELEMENT Then
        Set element = node
        
        'Debug.Print element.nodeName
        
        If element.nodeName = "xs:element" Then
            nameAttribute = element.getAttribute("name")
            'Debug.Print destCell.offset(rowOffset, colOffset).Address, element.nodeName & " " & nameAttribute
            destCell.offset(rowOffset, colOffset).Value = nameAttribute
            rowOffset = rowOffset + 1
            colOffset = colOffset + 1
        End If
            
        If element.nodeName = "xs:element" Or element.nodeName = "xs:complexType" Or element.nodeName = "xs:extension" Or element.nodeName = "xs:group" Then
            
            'If element has a 'type' or 'base' attribute then find and parse its complexType element
            'Example 1: Element < xs:element name="AlternateIdentifier" type="AlternateIdentifierType"
            '           Find    < xs:complexType name="AlternateIdentifierType">
            'Example 2: Element < xs:extension base="ProcurementInstrumentType"/>
            '           Find    < xs:complexType name="ProcurementInstrumentType"
            
            'Or if element has a 'ref' attribute then find and parse its group element
            'Example 1: Element < xs:group ref="ProcurementInstrumentLocationGroup">
            '           Find    < xs:group name="ProcurementInstrumentLocationGroup">
            
            lookupAttribute = element.getAttribute("type")
            If IsNull(lookupAttribute) Then lookupAttribute = element.getAttribute("base")
            If IsNull(lookupAttribute) Then lookupAttribute = element.getAttribute("ref")
            
            If Not IsNull(lookupAttribute) Then
                'Debug.Print lookupAttribute
                Set findNode = XMLdoc.SelectSingleNode("//xs:complexType [@name='" & lookupAttribute & "']")
                If findNode Is Nothing Then Set findNode = XMLdoc.SelectSingleNode("//xs:group [@name='" & lookupAttribute & "']")
                If Not findNode Is Nothing Then
                    If findNode.HasChildNodes Then
                        For Each child In findNode.ChildNodes
                            Scan_ElementsToCells XMLdoc, child, destCell, rowOffset, colOffset
                        Next
                    End If
                End If
            End If
        
        End If
        
    End If
    
    'Traverse child nodes
    
    If node.HasChildNodes Then
        For Each child In node.ChildNodes
            Scan_ElementsToCells XMLdoc, child, destCell, rowOffset, colOffset
        Next
    End If
    
End Sub<xs:group ref="ProcurementInstrumentLocationGroup"><xs:group name="ProcurementInstrumentLocationGroup"><xs:element name="AlternateIdentifier" type="AlternateIdentifierType" 
 <xs:complextype name="AlternateIdentifierType"><xs:extension base="ProcurementInstrumentType"><xs:complextype name="ProcurementInstrumentType" 
 <xs:group ref="ProcurementInstrumentLocationGroup"><xs:group name="ProcurementInstrumentLocationGroup">
</xs:group></xs:group></xs:complextype></xs:extension></xs:complextype></xs:element></xs:group></xs:group>
 
Upvote 0
OMG John -- this feels like a big piece of candy sitting in front of me and can't take a lick!
I created a new xlsm file, save in the same location as where the imported xml file sits... copy/pasted code into a NEW module 1 -- ran it and get an error...
Went back and re-read detail -- saw that I needed to mk sure an Object Library item needed to be checkboxed --- went there and NOT seeing that item!?!
Any ideas what I can do -- to get it into the list to ensure it's checkmarked?
I'm SOOO excited to see there's something that might save the day -- because I told I boss - that if a solution using all the diff tools I've attempted could not be found -- then I'd start HAND TYPING out this stupid architecture from the source window... which is a painful option -- but ----- well, hoping you can help w/ what's causing the error --"Compile Error-User-Defined Type Not Defined"
and here's the line that's highlighted:
Code:
Private Sub Scan_ElementsToCells(XMLdoc As DOMDocument60, node As IXMLDOMNode, destCell As Range, ByRef rowOffset As Long, ByVal colOffset As Long
If it's the Object Library -- then how to get the item needed INTO that library?

Here's a snap shot of where the error occurred within the code and images of what I'm seeing in the Object Library...
https://app.box.com/s/jasqcvzrk1uqu0q87nr5ae1b8ek0osmk
Hope this helps someone -- help me----

Thank you greatly --- salivating to see what this code can do!?!?
Chris
 
Upvote 0
Went back and re-read detail -- saw that I needed to mk sure an Object Library item needed to be checkboxed --- went there and NOT seeing that item!?!
Any ideas what I can do -- to get it into the list to ensure it's checkmarked?
I can see the library you need (Microsoft XML, v6.0) at the top left of the image you posted, listed first under 'Available References'. Just click the box next to Microsoft XML, v6.0 to select it and click OK, and then the code should compile (Debug menu -> Compile) and run without error.
 
Upvote 0
When exactly does that error occur when you run the macro?

Since Scan_ElementsToCells calls itself, the error could occur when Scan_ElementsToCells is called for the first time from the main procedure, Extract_XML_Document_Structure, or it could occur on one of the numerous times it calls itself. If no data has been output to the first sheet then the former case applies, otherwise it indicates the latter.

A quick check to see if the error occurs when Scan_ElementsToCells is called from Extract_XML_Document_Structure is to add a new line, as shown here:
Code:
    Set XMLmainNode = XMLdoc.SelectSingleNode("//xs:element [@name='ProcurementDocument']")
    If XMLmainNode Is Nothing Then Stop  'New line
If the code stops in the VBA debugger at the new line it means the preceding line did not find the element "< xs:element name="ProcurementDocument">" in the PDS_Schema_v2.5.1.xsd file.
 
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,169
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top