Hello,
I need to pull specific data from nonuniform PDFs and put it in a single table. For example, say I have 100 clients, each with a different mix of tax forms: Schedules A, B, E, F, and H. Some clients have only Schedules A, E, and H; others have B, F, and H. The schedules vary in length, so I can't simply tell PowerQuery to open Table 20 (Page 10), because the data I need from Schedule H sits on page 8 in one file and on page 12 in another. Both files contain Schedule H, though, and Schedule H itself is always the same uniform 4 pages.
How can I use PowerQuery or another tool to pull a specific data item from Schedule H, regardless of where that schedule sits in each client's PDF? I may have thousands of PDFs, and the entry I want is always in the same place within Schedule H but on a different page of the overall file.
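To make the goal concrete, here is a rough Python sketch of the logic I have in mind: find the first page whose text contains the Schedule H heading, then read the value at a fixed offset from that page. Everything specific in it is an assumption for illustration: the marker text `SCHEDULE H`, the field label `Total cash wages`, the helper names, and the use of the `pypdf` library for text extraction. It would also need adjusting if an earlier page merely mentions Schedule H.

```python
import re
from typing import List, Optional

# Hypothetical marker and field pattern; the real form's wording may differ.
SCHEDULE_MARKER = "SCHEDULE H"
FIELD_PATTERN = re.compile(r"Total cash wages\s+([\d,]+\.?\d*)")


def find_schedule_start(page_texts: List[str],
                        marker: str = SCHEDULE_MARKER) -> Optional[int]:
    """Return the 0-based index of the first page containing the marker."""
    for i, text in enumerate(page_texts):
        if marker.lower() in text.lower():
            return i
    return None


def extract_field(page_texts: List[str],
                  page_offset: int = 0,
                  pattern: "re.Pattern[str]" = FIELD_PATTERN,
                  schedule_len: int = 4) -> Optional[str]:
    """Read one value from a page at a fixed offset inside the schedule."""
    start = find_schedule_start(page_texts)
    if start is None:
        return None  # this client has no Schedule H
    target = start + page_offset
    if page_offset >= schedule_len or target >= len(page_texts):
        return None  # offset falls outside the 4-page schedule or the file
    match = pattern.search(page_texts[target])
    return match.group(1) if match else None


def load_page_texts(pdf_path: str) -> List[str]:
    """Extract text per page; assumes the pypdf package is installed."""
    from pypdf import PdfReader
    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]
```

Running `extract_field(load_page_texts(path))` over every file and collecting the results would give the single table, with one row per client.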
Thank you so much,
Jared