# Power Query Does Not Import Multiple PDF Pages



## legalhustler (Oct 15, 2019)

Hi,

I'm using Power Query for Power BI Desktop and when I connect to a single PDF file the last record only shows the data for the first page and not the rest of the pages.  The rest of the data is the same as the first page.  Can Power Query read multiple pages for *one* PDF? If so, how can I get it to work?

Thanks in advance.


----------



## sandy666 (Oct 15, 2019)

I connected a PDF with over 800 pages (tables) and it works well,
but PBD is useless to me


----------



## legalhustler (Oct 15, 2019)

So you can see all the records once you connect to the single PDF, including the last record on the last page?

Why do you find PBD useless? I don't think Power Query for Excel can connect to a PDF data source. I was going to transform the PDF data in PBD then export the data to Excel.


----------



## sandy666 (Oct 15, 2019)

legalhustler said:


> So you can see all the records once you connect to the single PDF, including the last record on the last page?


Yes I see as I said before


legalhustler said:


> Why do you find PBD useless?


Just because it's too heavy and has problems with high resolution. All I need I have in Excel: PQ, PowerPivot


legalhustler said:


> I don't think Power Query for Excel can connect to a PDF data source.


Right, PQ in Excel cannot, but there are a few ways to do it: OCR, Matt Allington's advice, and some more


legalhustler said:


> I was going to transform the PDF data in PBD then export the data to Excel.


Via CSV, not directly to Excel


----------



## legalhustler (Oct 15, 2019)

I think I see the problem. I'm not technically connecting using the "PDF" data source, I'm using "Folder", and when I do that it shows a PDF file with 10 objects, of which 5 are tables (the ones I need to transform) and 5 are unimportant. How do I connect Power Query to the 5 table objects only? I can't seem to select multiple objects. The Sample File drop-down shows all my PDF files, each with a different number of table objects. Can Power Query combine the table objects for each PDF automatically?


----------



## sandy666 (Oct 15, 2019)

remove the unnecessary rows, then TransformData and choose the columns you need from Attributes


----------



## legalhustler (Oct 15, 2019)

Once I connect via "Folder" data source I can select Combine, Load, or Transform.  

If I select Combine it will allow for either "combine & transform" or "load".  If I select "combine & transform", I have to select a sample file and it only allows me to select one table object (can't select multiple) then when I load it, it only shows the first table from each PDF instead of all the tables from each PDF.

If I select Transform, I have the Content and Attributes columns but I don't see any way to select the table objects from each PDF.

I can connect via the "PDF" data source, but I have to manually select each file and select the checkbox next to each object, and then I would probably have to combine all the tables. This seems like it will work, but I don't know how to combine the tables. Table.Combine?
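
For reference, a minimal sketch of what a `Table.Combine` approach could look like for a single PDF. It assumes the file path below (hypothetical) and that the tables detected in the PDF share the same column layout; nothing in this thread confirms that for these files.

```m
let
    // Hypothetical path: replace with the real PDF file.
    Source = Pdf.Tables(File.Contents("C:\Reports\example.pdf")),
    // The PDF connector lists both "Page" and "Table" objects; keep only the tables.
    TablesOnly = Table.SelectRows(Source, each [Kind] = "Table"),
    // The Data column holds one nested table per object.
    // Table.Combine appends a list of tables, matching columns by name.
    Combined = Table.Combine(TablesOnly[Data])
in
    Combined
```

If the tables do not share column names, `Table.Combine` still appends them but fills the non-matching columns with nulls, so some cleanup may be needed afterwards.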


----------



## sandy666 (Oct 15, 2019)

TransformData (_post#6_) == Transform

btw, I know what is there, so describe only what you are doing, not the whole layout


----------



## legalhustler (Oct 15, 2019)

I don't want to assume what you know or don't know, so I like to describe in detail what I'm observing so you or anyone else can comprehend the issue.  Kind of like your quote "I know you know but I forgot my Crystal Ball and don't know what you know" 

If I click the double arrow to expand, it doesn't show the table objects; it shows "Content", "Kind", "Size", "Hidden", "System", etc. I don't see anything related to the objects.


----------



## sandy666 (Oct 15, 2019)

because those are the columns of the tables
Expand all
then transform/clean the data and keep only the data you need

PBD (PQ) is not Click&Go, sometimes you need to know what you want to achieve

maybe it will help: Link


----------



## legalhustler (Oct 15, 2019)

Yeah, I just want to combine all table objects from the PDFs in a folder. The columns of the tables are in the Content column, which when I expand it only shows the records of the first table object of each PDF and is missing the rest of the tables. The only workaround I see is to connect via the "PDF" data source instead of "Folder."


----------



## sandy666 (Oct 15, 2019)

try this way:

Folder
Load
on the left click Data icon
Edit Queries
Remove rows
right click Binary (column content)
Drill down
Column Data
Expand columns

because I don't have many files, I did it for two files (one removed), the second with 12 pages


----------



## legalhustler (Oct 15, 2019)

When I right click and Drill Down on the Content column, it creates a List in one column, and I can drill down into any one Binary and expand its table objects, but I can't seem to drill down into all of the Binaries at once so that the fields from every PDF are combined. Hope I'm articulating it so that it makes sense.


----------



## sandy666 (Oct 15, 2019)

my fault 

right click on the first binary - add as new query then dbl click on the file icon
back to the top 
right click on the second binary - add as new query then dbl click on the file icon
and so on... 
select column Data (with Tables) and expand in each query
now you will have a few queries which can be merged/appended

it will be easier if you merge these PDFs with Adobe Acrobat and import the data from a single PDF file
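
The per-file steps above (one query per binary, expand Data, then append) could also be written as a single query. This is only a sketch: the folder path is hypothetical, and it assumes every table in every PDF has the same columns.

```m
let
    // Hypothetical folder path: replace with the real one.
    Source = Folder.Files("C:\Reports"),
    // Keep only PDF files, regardless of extension casing.
    PdfFiles = Table.SelectRows(Source, (file) => Text.Lower(file[Extension]) = ".pdf"),
    // For each file, read its objects and keep the nested tables as a list.
    WithTables = Table.AddColumn(
        PdfFiles,
        "Tables",
        (file) => Table.SelectRows(Pdf.Tables(file[Content]), (obj) => obj[Kind] = "Table")[Data]
    ),
    // Flatten the per-file lists into one list of tables and append them all.
    Combined = Table.Combine(List.Combine(WithTables[Tables]))
in
    Combined
```

This avoids picking a single sample table in the Combine dialog, because every table object of every file is appended rather than only the first one.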


----------



## legalhustler (Oct 15, 2019)

I have 235 PDFs, so manually adding each as a new query will take too much time. That's a great idea about merging the PDFs with Adobe Acrobat (which I have). Let's see how it turns out. Stay tuned. Thanks!


----------



## sandy666 (Oct 15, 2019)

You are welcome

Impatient person 

have a nice day


----------

