EXCEL 2010: VBA: Download .csv file with dynamic name from a secure website

mrsklb

New Member
Joined
Feb 23, 2015
Messages
8
Program: EXCEL 2010
OS: Windows 8.1

I am wanting to login to a site (I have a login script), navigate to a page, then download the `.CSV` file that will always end with a dynamic string due to it being 'custom'.

I have tried to access the site by recording a macro. However, as the data is not in a table the macro recorder is not able to pick up the actual address of the .csv file.

The display text is always:
Code:
Results [link]Click to Download[/link]

The html values are always shown as:
Code:
[COLOR=#800000][FONT=Consolas]<td[/FONT][/COLOR][COLOR=#000000][FONT=Consolas] [/FONT][/COLOR][COLOR=#FF0000][FONT=Consolas]class[/FONT][/COLOR][COLOR=#000000][FONT=Consolas]=[/FONT][/COLOR][COLOR=#0000FF][FONT=Consolas]"smallText"[/FONT][/COLOR][COLOR=#800000][FONT=Consolas]><b>[/FONT][/COLOR][COLOR=#000000][FONT=Consolas]Results[/FONT][/COLOR][COLOR=#800000][FONT=Consolas]</b>[/FONT][/COLOR][COLOR=#000000][FONT=Consolas] [/FONT][/COLOR][COLOR=#800000][FONT=Consolas]<a[/FONT][/COLOR][COLOR=#000000][FONT=Consolas] [/FONT][/COLOR][COLOR=#FF0000][FONT=Consolas]href[/FONT][/COLOR][COLOR=#000000][FONT=Consolas]=[/FONT][/COLOR][COLOR=#0000FF][FONT=Consolas]"vendor_report.php?report=custom [insert extremely long string here] >[/FONT][/COLOR][COLOR=#800000][FONT=Consolas]<u>[/FONT][/COLOR][COLOR=#000000][FONT=Consolas]Click to Download[/FONT][/COLOR][COLOR=#800000][FONT=Consolas]</u></a></td>[/FONT][/COLOR][COLOR=#000000][FONT=Consolas] [/FONT][/COLOR]

Without using a table, is there a way to get to this .csv & save it to my PC?
I am aware that the use of
Code:
<td>
denotes it is part of a table, but it is definitely not picking it up, I've gone through the site using the macro recorder and it's not picking up the inner contents from the page.

As a secondary solution (and not my preferred):
I had also thought to navigate to the site page, highlight the text, copy & paste to a spare sheet in my book, then use some code L42 previously wrote here (below) however I can't even get the copy & paste to work correctly.

Code:
    For Each hlink In ThisWorkbook.Sheets("NameOfYourSheet").Hyperlinks
        Set wb = Workbooks.Open(hlink.Address)
        wb.SaveAs saveloc & hlink.Range.Offset(0,1).Value & ".xlsx"
        wb.Close True
        Set wb = Nothing
    Next

Please advise. Thank you in advance.

11021207_10152843666235547_161674967640821083_n.jpg





[1]: Cross posted: stackoverflow
 
Last edited:

Excel Facts

Why are there 1,048,576 rows in Excel?
The Excel team increased the size of the grid in 2007. There are 2^20 rows and 2^14 columns for a total of 17 billion cells.
Hi there,
Upon rereading my post I realised that some of the formatting was incorrect, as such the code wasn't showing correctly.

Due to my issues with how the code looks I have substituted the following:
[ or < as {
] or > as }

1) Here is how the Download links looks:
Code:
Results [link]Click to Download[/link]

2) Here is how the download link looks in html:
Code:
{td class="smallText"}{b}Results{/b} {a href="vendor_report.php?report=custom [insert extremely long string here]}{u}Click to Download{/u}{/a}{/td}

3) Here is how I know this belongs in another table, however I can't figure out which one (I have looped through all tables):
Code:
{td}

Thank you
 
Upvote 0
Bump

I have the link, however there is no file name, it looks like it may be a php mask of some sort. ends with: xxxxxxxx101a1cc0

Once clicked, the link then generates a ' Payments.csv ' file name & prompts the ' save as ' dialog.

I've tried a number of different ways to get through to the 'save as' dialog, however either IE crashes or the whole process stalls.

Please help.

Cheers.


IE = Ver: 11.09
Win = 8.1 x64
Excel = 2010
 
Upvote 0
Hi John,

Thank you for your help, I've furthered my ability to get a file, and i've had some success.
On the 2nd link you posted I was able to get the code to work and it downloaded a .csv, however it contained the page details in HTML form, it didn't capture the actual .csv hiding under the string.

Code:
{!doctype html public "-//W3C//DTD HTML 4.01 Transitional//EN"}{html dir="LTR" lang="en"}
{head}
{meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"}

I've had to replace the < with { and > with } to keep the code intact.


Many thanks.
 
Upvote 0
You have to send the same form data that your browser would send. It might be a GET request or a POST request, or both in sequence. It all depends on the site.

Code:
Dim formData As String
formData = "?????"
<code>WHTTP.Open "POST", fileUrl, False
WHTTP.SetRequestHeader "Content-Type", "application/x-www-form-urlencoded"
WHTTP.Send formData
</code>Use the console tool in your browser (or Fiddler) to examine the request and response bodies to determine the form data.
 
Upvote 0
Hi John,

Thanks for the direction, I've had a play with Fiddler, I don't know exactly what I'm looking for. I think this is far beyond what I can currently understand, however I have found these items :

Code:
GET /vendors.php?vf=vieworders HTTP/1.1
Code:
(under webforms) vf  (value) vieworders

When I click on the "Click to Download" button this pops up:
Code:
GET /vendor_report.php?report=custom&q={exceptionally long strong}

In the Chrome Console, this is the response I get:
Code:
Resource interpreted as Script but transferred with MIME type text/html: "https://seal.godaddy.com/getSeal?sealID={long id string}".
Resource interpreted as Document but transferred with MIME type text/csv: "https://www.{noname}store.com/vendor_report.php?report=custom&q={long}==&k={morelong}".

Could the above be causing the issue?

I've had success with the following, it allows me to pop open the download link, enter my username & password, but then I can't move further without manual intervention, tell IE that I want to save the file manually.
When I attempt to get the file without IE, I can't move forward at all, the Username & Password isn't ever accepted.

Code:
Sub GetDets_1XXXX3()
    'sales records are in table 25
  
    'Begin sales history retrieval
    'source: http://dailydoseofexcel.com/archives/2011/03/08/get-data-from-website-that-requires-a-login/
    'ensure tools/references/microsoft internet controls = check
    'ensure tools/references/microsoft forms 2.0 = check
    
    Dim ieApp As InternetExplorer
    Dim ieDoc As Object
    Dim ieTable As Object
    Dim clip As DataObject
     
     
     'create a new instance of ie
        On Error Resume Next
        Set ieApp = New InternetExplorer
    
     'you don’t need this, but it’s good for debugging
        ieApp.Visible = True
    
    'assume we’re not logged in and just go directly to the login page
        ieApp.Navigate "{{Jumbo String}}"
            'Application.Wait Now + TimeSerial(0, 0, 10)
        Application.Wait Now + TimeSerial(0, 0, 10)
        Do While ieApp.Busy: DoEvents: Loop
        Do Until ieApp.ReadyState = READYSTATE_COMPLETE: DoEvents: Loop
             
          Set ieDoc = ieApp.Document
     
      'fill in the login form – View Source from your browser to get the control names
        With ieDoc.forms(2)
            .UserName.Value = "name"
            .Password.Value = "pass"
            .Submit
        End With
        'Do While ieApp.Busy: DoEvents: Loop
        'Do Until ieApp.ReadyState = READYSTATE_COMPLETE: DoEvents: Loop
        Application.Wait Now + TimeSerial(0, 0, 20)
        
        'wait here for IE to popup the "Save" option
        
    'close 'er up
        ieApp.Quit
        Set ieApp = Nothing
        Sheets("XXXXX").Select
 
Upvote 0
Thanks for the direction, I've had a play with Fiddler, I don't know exactly what I'm looking for. I think this is far beyond what I can currently understand, however I have found these items :

Code:
GET /vendors.php?vf=vieworders HTTP/1.1
Code:
(under webforms) vf  (value) vieworders
You are sort of looking in the right place in Fiddler. The Inspectors - WebForms tab shows the parameter names and values separated by "&" in the URL query string - the part of the URL after the "?" character. For example page.php?index=23&id=2&actionID=112&a=3268b66be4108e96f2a4e67f2749bbdc&f=/data.csv

But you really need to look at the Inspectors - Raw tab for the relevant request (what Fiddler calls a session in its main window).

When I click on the "Click to Download" button this pops up:
Code:
GET /vendor_report.php?report=custom&q={exceptionally long strong}
Click that request (session) in the main Fiddler window and then click Inspectors - Raw tab. Click 'Open in Notepad' to make it easier to view the data. If that data contains the "Content-Type", "application/x-www-form-urlencoded" header then you will need to send the same form data in your WinHttp GET request as I previously posted. If the data contains a cookie (also seen via Inspectors - Cookies) then you may need to specify the cookie value (obtained from the IE.document) in a "Cookie" header, like this:
Code:
WHTTP.SetRequestHeader "Cookie", IE.document.cookie
In the Chrome Console, this is the response I get:
Code:
Resource interpreted as Script but transferred with MIME type text/html: "https://seal.godaddy.com/getSeal?sealID={long id string}".
Resource interpreted as Document but transferred with MIME type text/csv: "https://www.{noname}store.com/vendor_report.php?report=custom&q={long}==&k={morelong}".

Could the above be causing the issue?
No - the second request is Chrome requesting the csv file download. I'm not familiar with the Chrome developer tools, but just had a play around. Click that 2nd link in the Console and it takes you to the Network tab, where if you click the same path (link) it displays the request headers (the same as shown in Fiddler but in a different layout).

In summary, the technique is to use IE automation to log into the website, navigate to the required page and obtain the URL (href) for the file download link. Then use WinHttpRequest 5.1 to send a GET request to that URL to download file, specifying any headers as required. Then save the responseBody bytes in a file.

If you want to use IE automation for the whole process then there is code at http://www.mrexcel.com/forum/excel-...ile-download-dialog-box-without-sendkeys.html which uses Windows API calls to handle the IE download dialogue. However I've only used it on IE8 and never tried to get it working on IE11.
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,705
Messages
6,173,986
Members
452,541
Latest member
haasro02

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top