Is it possible to build a macro to import similar, but different xml/text files?

alparsons672

New Member
Joined
Feb 10, 2023
Messages
1
Office Version
  1. 365
  2. 2021
Platform
  1. Windows
Hello. I am trying to build a macro to import data from an external file that is generated with another program. I have two options for my source file; an .xml file and a .txt file. Both contain essentially the same data that I want to manipulate into my end product. My plan is to be able to copy a workbook with this macro into the directory of an individual project and then have it pull the data pertaining only to that project. Each project will have its own workbook and source files; I am not trying to make a single workbook to collect all of the data from each individual project.

The problem I am running into is that each source file (pertaining to each individual project) is slightly different. Attached is my attempt at showing the problem I am working with. The columns represent the raw data if imported from the .txt file. The table is what I want my end product to be.

There are some rules certain data follows that I figured I could exploit. In the blue dataset each "Group" has an entry for "AnimalsMice" even though only one group has a value for the amount of mice in that group. For data pertaining to passengers ("People" and "Animals") this will always be the case. I believe I am fortunate because this is the data that I want.

There is some data present that I know I do not need and would like to simply delete. The "Small Bus", "Plane", and "Candy" entries are part of a much more finite set of data that I think I could hard code removal into.
For instance, I know the transportation, if present, will only ever be "Small Bus", "Plane", or "Large Bus". The snacks will only ever be "Candy" or "Chips". Hopefully this makes sense.

Each project will have very similar data categories, but are not guaranteed to be exactly the same. For instance, the yellow dataset also has an additional "Group". With my real data, I could have anywhere from a few dozen groups to a few hundred.
My largest problem has been the variation in categories of data. In the yellow dataset, each "Group" has an entry for the amount of "AnimalsBirds" while the blue dataset does not. In my real data, there might be about 30-50 total categories/columns that could be present. This is what has caused problems with importing the data in the .xml format. If I record a macro and import my xml file the end code looks something like this:

VBA Code:
Sub Import_XML()
' Import_XML Macro
    ActiveWorkbook.Queries.Add Name:="SECTION", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Xml.Tables(File.Contents(""file location here""))," & Chr(13) & "" & Chr(10) & "    Table0 = Source{0}[Table]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Table0,{{""Group"", type text}, {""Id"", Int64.Type}, {""Snack"", type text}, {""PeopleBoys"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    'I removed most of the categories here in order to make it simpler to read, I assume these are generated based on what is read from the source file during the query
    'The actual amount of column types gets quite high
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=SECTION;Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [SECTION]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "SECTION"
        .Refresh BackgroundQuery:=False
    End With
End Sub

I hope this makes sense. I am having a difficult time because I am not super experienced with making macros or working in VBA. I know what my data looks like and what I want it to look like in the end, but I am having a hard time figuring out the middle part. Most of the time I have been able to get by using the macro recorder and then editing the parts I understand to accomplish my goals. This is a little bit more of a beast than I am used to.
 

Attachments

  • Sample Data Transportation.png
    Sample Data Transportation.png
    65.3 KB · Views: 32

Excel Facts

Whats the difference between CONCAT and CONCATENATE?
The newer CONCAT function can reference a range of cells. =CONCATENATE(A1,A2,A3,A4,A5) becomes =CONCAT(A1:A5)

Forum statistics

Threads
1,223,888
Messages
6,175,207
Members
452,618
Latest member
Tam84

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top