QuietRiot
Office Version: 365, 2021
Platform: Windows, MacOS
Below is the code I'm using. I've tried both innerHTML and innerText, but either way the line breaks are stripped, so each cell ends up as a wall of text instead of nicely formatted paragraphs. Any ideas on how I can keep the text between the tags in its original format? When I open the file in Notepad++, I can see that the text is nicely formatted.
Code:
'Requires a reference to "Microsoft HTML Object Library" (early binding)
Sub ScrapeData()
    Dim hDoc As MSHTML.HTMLDocument
    Dim hElem As MSHTML.HTMLGenericElement
    Dim sFile As String, lFile As Long
    Dim sHtml As String
    Dim x As Long

    x = 1   'first output row

    'read in the file
    lFile = FreeFile
    sFile = "C:\Users\test\Desktop\test\htmltest.html"
    Open sFile For Input As lFile
    sHtml = Input$(LOF(lFile), lFile)
    Close lFile   'release the file handle once the contents are in memory

    'put into an htmldocument object
    Set hDoc = New MSHTML.HTMLDocument
    hDoc.body.innerHTML = sHtml

    'loop through tags, writing one element per row in column A
    For Each hElem In hDoc.getElementsByTagName("FONT")
        Cells(x, 1).Formula = hElem.innerText
        x = x + 1
    Next hElem
End Sub
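One possible workaround, offered only as a rough sketch: HTML treats the line breaks that Notepad++ shows in the source as ordinary whitespace, so innerText collapses them, whereas `<br>` tags do come back as line breaks. The two helpers below, `SourceBreaksToBr` and `ToCellText`, are hypothetical names that are not part of the original code. The sketch assumes the line breaks that matter sit in the text between the tags; a line break inside a tag (for example between attributes) would be damaged by the blanket replace.

```vba
'Hypothetical helper (assumption, not from the original code):
'turns every source line break into a <br> tag so that innerText reports it.
Function SourceBreaksToBr(ByVal sHtml As String) As String
    Dim s As String
    s = Replace(sHtml, vbCrLf, vbLf)    'normalise Windows line endings
    s = Replace(s, vbCr, vbLf)          'normalise any lone carriage returns
    SourceBreaksToBr = Replace(s, vbLf, "<br>")
End Function

'Hypothetical helper (assumption): Excel uses a lone line feed for
'line breaks inside a cell, so carriage returns are dropped.
Function ToCellText(ByVal sText As String) As String
    ToCellText = Replace(sText, vbCrLf, vbLf)
End Function
```

To try it, the assignment in ScrapeData would become `hDoc.body.innerHTML = SourceBreaksToBr(sHtml)`, and the loop would write `Cells(x, 1).Value = ToCellText(hElem.innerText)` followed by `Cells(x, 1).WrapText = True` so the breaks are visible in the cell. Blank lines in the source become consecutive `<br>` tags, so paragraph gaps should survive as empty lines.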