Using Notepad to strip HTML formatting, then paste to Excel

fizzy6

New Member
Joined
Jul 13, 2013
Messages
3
This question is similar to many "Notepad posts," but I've not been able to adapt the solutions to my use. Specifically, I'm not trying to write to a text file. I'm trying to use Notepad to strip out HTML formatting before pasting the resulting HTML-free text into a one-column Excel range. (PasteSpecial method is not working reliably in Excel, perhaps due to timing issues or because of control codes in the source.)

I want to do the following from inside a VBA subroutine that runs as a result of a button click in Excel 2007:

1) In the initialization section of the subroutine, open a notepad window. Currently I'm using
Code:
Shell "notepad.exe"

2) During the main loop of the subroutine, I get to a point where an entire web page (i.e., the HTML source) is on the windows clipboard. At that point, the code should:

Code:
' Move focus to Notepad window if necessary
' Clear the previous contents of the Notepad window
' Paste Windows clipboard contents to the Notepad window
' Wait for paste to complete, which takes 1 to 60 seconds
' Copy contents of Notepad window to Windows clipboard
' Move focus back to Excel spreadsheet
' Paste Clipboard into a specific cell

Parameters:
1) The window titles (Notepad and Excel) are fixed and never change.
2) The Excel cells are static and do not change
3) I'm looking for a simple solution that's (hopefully) easy to understand.

Thanks in advance for your assistance.
 
Last edited:

Excel Facts

What do {} around a formula in the formula bar mean?
{Formula} means the formula was entered using Ctrl+Shift+Enter signifying an old-style array formula.
2 sample files , source and xls desired result would be helpful

Unfortunately, web site is proprietary. But any web site would do. I want to copy the entire page (e.g., using ^a ^c) , paste the entire page to Notepad (e.g., ^v), then copy the entire Notepad page (e.g., ^a ^c) and paste it into Excel. But in case it's helpful, I show below some awful code that works 90% of the time. I rely on popping up system modal message boxes, which let the user monitor the process and click "OK" when each critical operation seems to have succeeded.

The main problem is that the main paste operation into Notepad sometimes fails to occur at all. The SendKeys command that places "Hello, world" into Notepad window always succeeds, but the subsequent paste operation fails. The data is on the clipboard, since a manual paste succeeds.

Second problem is that it's really slow. I have to wait between SendKey operations, since system response is erratic (due to Internet connection and other unknown reasons). Since so far I've been unable to programmatically monitor the success of each SendKeys operation, I just wait several hundred milliseconds, hoping the operation will complete. I'd much rather do something like
Code:
x = hWndNotepad.paste
x = hWndNotepad.selectall
x = hWndNotepad.copy

Here's some code I've written since my original post that works, but not very well. I've left out the error processing, dimension statements and similar items in the interest of brevity.

Thanks again for advice.

Code:
Private Declare Function MessageBox Lib "user32" Alias "MessageBoxA" (ByVal HWnd As Long, ByVal lpText As String, ByVal lpCaption As String, ByVal wType As Long) As Long
        
Declare Function FindWindow Lib "user32" Alias "FindWindowA" (ByVal lpClassName As String, ByVal lpWindowName As String) As Long

Declare Function SetForegroundWindow Lib "user32" (ByVal HWnd As Long) As Long

Sub Send1(ByVal xSend As String, xReps As Long, SleepMilliseconds As Long)
    Dim xCount As Long
    For xCount = 1 To xReps
        Sleep SleepMilliseconds
        SendKeys xSend
    Next
End Sub

Sub ProcessSSN()

Worksheets("Sheet1").Activate
        With ActiveSheet
        
'===============================
'  Get the data from the web page
'===============================
                ' SendKeys mode
                ' ---------------------------------
                ' 1) Make sure IE is active
                ' 2) Paste Index
                ' 3) Navigate to "Search" button
                ' 4) Click the search button
                ' 5) Put up modal window = "Wait until results come back, then press OK"
            
                AppActivate IESearchScreen
                ' Code assumes that the user  has put the cursor in the data entry box
                Send1 CurrentIndex, 1, xTinyDelay            ' Punch in the Index
                Send1 "{TAB}", 3, xTinyDelay             ' 3 TAB characters to get to [Search] link
                Send1 "~", 1, xTinyDelay                ' Send Enter] to activate the search
                ' Note: If this code seems to be broken, try increasing the value of xDelay

                M1 = "Wait until the search results appear, then click [OK]." & CRLF
                msgboxAnswer = MessageBox(&H0, M1, "", vbSystemModal + vbOK)

                ' At this point the user has clicked "OK" and the GotoData link is in view.
                ' 1) Tab to GotoData link
                ' 2) "Click" the GotoData link
                ' 3) Put up modal window = "Wait until GotoData screen appears"
                
                AppActivate IESearchScreen
                Send1 "{TAB}", 8, xTinyDelay            ' TAB characters to get to [GotoData] link
                Send1 "~", 1, xTinyDelay                ' Send Enter] to click on the GotoData link

                M1 = "Wait for the normal Data screen to appear, and then click OK."
                msgboxAnswer = MessageBox(&H0, M1, "", vbSystemModal + vbOK)

                ' Still in SendKeys mode
                AppActivate IESearchScreen
                Send1 "^a", 1, xDelay
                Send1 "^c", 1, xDelay
                DoEvents    ' Not sure if we need this or not. It does not seem to hurt.
                
                ' At this point, the contents of the GotoData screen should be
                ' on the Windows Clipboard
        End With 'ActiveSheet
                
'===============================
'  Paste the desired data into worksheet
'===============================
        ' Find the Notepad window.
        Do
                DoEvents
                HWnd = FindWindow(vbNullString, "Untitled - Notepad")
        Loop Until HWnd > 0
        
        ' Clear out Notepad window, then paste in the desired data
        ' The delay occurs before the indicated characters are sent
        SetForegroundWindow HWnd
        Send1 "^a", 1, xDelay                 ' Clear out previous data
        Send1 "{DEL}", 1, xDelay
        Send1 "Hello, world", 1, xDelay
        Send1 "^v", 1, xDelay
        Send1 "^a", 1, xDelay
        Send1 "^c", 1, xDelay

        M1 = "Click [OK] if processing is proceeding normally."
        msgboxAnswer = MessageBox(&H0, M1, "", vbSystemModal + vbOK)

        ' Set PasteRange = Worksheets("Sheet1").Range("a24:a1000")
        ' Clear the previous contents of the paste range
        ' PASTE  data into cell A24 and following. Data will go in column order (i.e., A24, A25,  A26 etc.)
        Worksheets("Sheet1").Activate
        With Worksheets("Sheet1")
                Set PasteRange = .Range("A24:A1000")
                PasteRange.ClearContents
                ' Moving the visual focus here helps user know data is processed
                Range("A24").Select
                
                ' This code seems to fail randomly. Using Notepad approach because of this inconsistency.
                ' ActiveSheet.PasteSpecial Format:="Text", Link:=False, DisplayAsIcon:=False
                
                ActiveSheet.Paste
                
        End With

End Sub
 
Upvote 0
did you try a web query like this ?
Code:
Sub Web_QueryBooking2()
Dim wsWQ As Worksheet
Dim rCell As Range
Set wsWQ = Sheets(1)
date1 = Cells(1, 1)
day1 = Day(date1)
month1 = Month(date1)
year1 = Year(date1)
day2 = Day(date2)
month2 = Month(date2)
year2 = Year(date2)

myURL = "http://www.booking.com/searchresults.html?src=index&error_url=http%3A%2F%2Fwww.booking.com%2Findex.it.html%3Flabel%3Dgog235jc-index-XX-XX-XX-unspec-it-com%3Bsid%3Ded836f22ccab0716e509dfad443e468c%3Bdcid%3D1%3B&dcid=1&label=gog235jc-index-XX-XX-XX-unspec-it-com&sid=ed836f22ccab0716e509dfad443e468c&si=ai%2Cco%2Cci%2Cre%2Cdi&ss=Roma%2C+Italia&checkin_monthday=" & day1 & "&checkin_year_month=" & year1 & "-" & month1 & "&checkout_monthday=" & day2 & "&checkout_year_month=" & year2 & "-" & month2 & "&org_nr_rooms=1&org_nr_adults=2&org_nr_children=0&group_adults=2&group_children=0&dest_type=city&dest_id=-126693&ac_pageview_id=114c445b512f002d"
With wsWQ.QueryTables.Add(Connection:="URL;" & myURL, Destination:=wsWQ.Cells(3, 1))
        .Name = "search?hl=en&ie=UTF-8&oe=UTF-8&q=" & strSearch & ""
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = False
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        '.RefreshStyle = xlInsertDeleteCells
        .RefreshStyle = xlOverwriteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlAllTables
        .WebFormatting = xlWebFormattingAll
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = False
        .Refresh BackgroundQuery:=False

End With

End Sub
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,228
Messages
6,170,871
Members
452,363
Latest member
merico17

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top