Parsing a text file

Jimmy509

New Member
Joined
Apr 18, 2019
Messages
29
I have an xml file that is about 80MB. I constantly have to import it in excel and do filtering in it. Sadly, it takes too long to open and sometimes crash. So I have decided to manually break it down into pieces.
My steps are usually:
1-Use a software that split the files into multiple pieces. the
2-Track the starting and ending tags. The schema looks like this:
Rich (BB code):
    <satellite_system = Intelsat>
        <pfd_mask = integer_value>
            <by_a a = 50>
                 <by_b b = 0>
                       <pfd c="0">-177.1175</pfd>
                        <pfd c="10">-177.3175</pfd>
                  </by_b>
                 <by_b b = 1>
                     <pfd c="0">-174.617</pfd>
                      <pfd c="10">-174.817</pfd>
                  </by_b>
    </by_a>
  </pfd_mask>
</satellite_system>

3- Since each new file will be imported independently into excel, I have to make sure the opening and closing tags are included as well as the value for each tag. Because the files are split randomly at any lines.

4- Import the files into excel and filter each of them to get the highest pfd value.

Manually I have done that but it is time consuming. I have decided to automate that process. but all I am missing at this point to is to split the xml file into smaller pieces. I tried but failed. Now I am thinking If I turn the xml file into a text file, then would that be easier to achieve my goal. At this point what I am trying to achieve is:

1- Open the original text file and copy the first 5000 lines or so to a text file
2- add the closing tags at the end of the file :
</by_a>
</pfd_mask>
</satellite_system>
3- Save it as xxx_part1.txt
4- Somehow keep in memory the value for by_a a and by_b b
3-Copy the next 5000 lines.
5- add the opening tags
<satellite_system = Intelsat>
<pfd_mask = integer_value>
then add the last known value for by_a a and by_ b b
paste the 5000 lines
then add the closing tags
</by_a>
</pfd_mask>
</satellite_system>

6-Do that until the end of the original file.

This is the code I have started with:

VBA Code:
Sub Text_file_parser()
Dim FSO As FileSystemObject
Dim txtStream As TextStream
Dim DetFile As String
Dim sFile As String
Dim New_file As Object
Dim file_path, New_filename, New_filepath As String
Dim i, j, k As Integer
Dim header, body, footer As Variant
Dim File_size As Long


DetFile = "C:\\Documents\ RDTE\test"
sFile = Dir(DetFile & "\*.txt")
file_path = DetFile & "\" & sFile
'MsgBox file_path
New_filename = "New_" & sFile
New_filepath = DetFile & "\" & New_filename
'MsgBox New_filepath
'MsgBox file_path
Set FSO = New FileSystemObject
Set txtStream = FSO.OpenTextFile(file_path, ForReading, False)


'Set the number of File
File_size = FileLen(file_path) ' remove in function
Debug.Print File_size

If Not FSO.FileExists(New_filepath) Then

Set New_file = FSO.CreateTextFile(New_filepath, False, True) 
'Do Until txtStream.AtEndOfStream
i = 0
j = 0
k = 0

For i = 1 To 3
   ThisLine = txtStream.ReadLine
   'i = i + 1
   'Debug.Print "Line " & i, ThisLine
  New_file.WriteLine (ThisLine)

Next
Else
'Set txtStream = FSO.OpenTextFile(New_filepath, ForWriting, False)
Set New_file = FSO.OpenTextFile(New_filepath, ForWriting, False)
For i = 1 To 3
   ThisLine = txtStream.ReadLine
   'i = i + 1
   'Debug.Print "Line " & i, ThisLine
   
   New_file.WriteLine (ThisLine)

Next

  ' Loop
End If
Do While Not txtStream.AtEndOfStream
  
   'FooterLine = txtStream.ReadLine
  ' Debug.Print FooterLine
Loop



txtStream.Close
End Sub
 

Excel Facts

Control Word Wrap
Press Alt+Enter to move to a new row in a cell. Lets you control where the words wrap.
HI. I just stumbled across your post by accident, but I'm wondering if you've managed to solve this or if you still need help. If the former, how did you manage to resolve it? If not, is it possible to convert the XML file into another (less-verbose) format?
 
Upvote 0

Forum statistics

Threads
1,223,236
Messages
6,170,915
Members
452,366
Latest member
TePunaBloke

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top