VBA for extracting specific data from PDF Docs

xcellrodeo

Board Regular
Joined
Oct 27, 2008
Messages
208
Hi Wonderful MrExcel community

I have some questions & was wondering if anyone has done this before / has any help available, ideally a VBA Script that will do this.
Here are the specifics:
CONTEXT:
I have a folder called Test Certificates.
This folder has around 200 PDF documents in there, all of which have different file names.
Each PDF is the same in terms of:
- Same layout format and has up to 30 pages per PDF document.
- Page 1 contains a 2 column table with multiple rows
- The table on page 1 has Column 1 (left side) which contains the terms below to be searched. Column 2 (right side contains the values to be returned)

OBJECTIVE - What I would like to do:
I would like a VBA script to look through each PDF in the folder (~200 PDF documents) and:
1) Search the following terms on page 1 (which as set out in the Context above, has a table with 2 columns and multiple rows) and return the values immediately to the right of each search term .
'Property Address'
'Post Code'
'Survey Date'
2) Search the following terms in each PDF document (the entire document not just page 1) and count how many times the search term occurs in each document.
'Laundry'
'Boiler Room'
'Block'

HOW SHOULD THE RESULTS OF ACTIONS DESCRIBED IN POINTS 1-3 ABOVE LOOK?
In a spreadsheet called 'Test CERTs List', tab name 'New CERTs', I would like the VBA script to return:
Col A - File name that is being searched
Col B Name of first search term i.e Laundry
Col C Count of occurence of term 'Laundry
Col D Name of second search term i.e Boiler Room
Col E Count of occurence of term 'Boiler Room
Col F Name of third search term i.e Block
Col G Count of occurence of term 'Block'
Col H Search term 'Property Address'
Col I Value of Property Address (see Point 1 description above)
Col J Search term 'Post Code'
Col K Value of Property Address (see Point 1 description above)
Col L Search term 'Survey Date'
Col M Value of Property Address (see Point 1 description above)

Please see attached layout example of final result (grey row 2 is an example row of data to be returned
Hope this all makes sense, if not please let me know and i will do my best to explain further.
Thanks so much for your help with this.
 

Attachments

  • Example of Results layout.JPG
    Example of Results layout.JPG
    78 KB · Views: 21

Excel Facts

Show numbers in thousands?
Use a custom number format of #,##0,K. Each comma after the final 0 will divide the displayed number by another thousand
Any ideas/help on the above would be massively appreciated. Please note I only use the free Adobe Reader version.
 
Upvote 0
Word VBA code (called from Excel) can be used to open each PDF as a Word document and search and extract the required data.

The code in this thread shows the basic method, however it looks like it copies and pastes the whole PDF (Word document) into Excel, so your code will be different.


Another possibility is Power Query.
 
Upvote 0

Forum statistics

Threads
1,221,310
Messages
6,159,173
Members
451,543
Latest member
cesymcox

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top