Multiple PDFs with different numbers of pages

itsgrady

Board Regular
Joined: Sep 11, 2022
Messages: 132
Office Version: 2021
Platform: Windows, MacOS
I’m trying to combine PDFs through Power Query. The PDFs all have the same columns of information.

Some of the PDFs have two pages. I am able to combine the PDFs, but for those with two pages only page 1 is combined; page 2 is not.

The second page of the PDF is a continuation of the table on page 1, but the table on page 2 does not have headers.

How can I combine multiple PDFs that have one or more pages?

I did it through Get Data From Folder. I need to drop a new PDF in the folder each day. Some of the PDFs will have two pages.
 
Hello @itsgrady, it would be nice to see one of those files. Let's suppose that the first page of each PDF contains the table headers in its very first row. Then you could combine the tables from all pages of a file, promote the headers, do the same for each file, and combine the resulting tables at the end. You'd need a custom function to process a single file. It would look something like this:
Power Query:
let
    Source = Folder.Files("full_path_to_your_folder"),
    // this is list of binary contents of our files
    pdf_files = Table.SelectRows(Source, each Text.Lower([Extension]) = ".pdf")[Content],
    // custom function to get pages of each pdf doc, combine tables and promote headers
    get_pages = (bin as binary) => 
        [filter_pages = Table.SelectRows(Pdf.Tables(bin), each [Kind] = "Page")[Data],
        combine_tables = Table.Combine(filter_pages),
        promote = Table.PromoteHeaders(combine_tables)][promote],
    // apply get_pages to the list of file contents and combine tables together
    all_tables = Table.Combine(List.Transform(pdf_files, get_pages))
in
    all_tables
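
To see why combining the pages before promoting headers takes care of the header-less second page, here is a small sketch you can paste into a blank query. The two #table values are made-up stand-ins for the page tables that Pdf.Tables might return for one two-page file (the column names and values are hypothetical), and it assumes both pages are detected with the same number of columns:

```powerquery
let
    // Hypothetical page 1: generic column names, headers sit in the first data row
    page1 = #table({"Column1", "Column2"}, {{"Date", "Amount"}, {"2024-01-01", "10"}}),
    // Hypothetical page 2: same generic column names, but no header row
    page2 = #table({"Column1", "Column2"}, {{"2024-01-02", "20"}}),
    // Stack the pages first so the header row from page 1 ends up on top
    combined = Table.Combine({page1, page2}),
    // Promote that single header row for the whole file
    promoted = Table.PromoteHeaders(combined)
in
    promoted
```

The result should have the columns Date and Amount with one row from each page. If a second page comes back with a different number of columns, the rows will not line up this way and the page tables would need to be aligned before combining.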
 

