fanofstuff
New Member
- Joined
- Feb 1, 2023
- Messages
- 1
- Office Version
- 365
- Platform
- Windows
Hi, i was wondering if i could get some help with a problem i was having, once per month i need to download about 10-15 pdf files (updating them) and manually take data out of the files and add them to an excel spreadsheet.
I have adobe acrobat pro V 10.0 and have been trying to create a macro/vba code to automate at least the entering of numbers into the excel sheet
I was able to run some loops to cycle through the pdf. The issue is that negative numbers do not get printed/shown as negative numbers (as well as some words are printed with "..." instead of some letters but that is less important) and this is extremely important as this has to do with stock performance.
Almost everything in the pdf is in different tables, and there are 10 pages per pdf where only 3 are needed, i wanted to know if there was a way that i could avoid, downloading all 15 pdf's, then flattening them, then exporting them to excel, then using a macro on the excel sheet (if that is even possible)
This works for pdf files, however crucial information is ommited (even if i export to excel), when i flatten it, the info is no longer lost and i can export to excel but its incredibly messy and difficult to sort through as well as takes many extra steps/more time to do the task
Please help!!
*code*
Const pdf_file As String = "path_to_pdf"
Sub read_pdf()
Dim aApp As Acrobat.AcroApp
Dim av_doc As CAcroAVDoc
Dim pdf_doc As CAcroPDDoc
Dim sel_text As CAcroPDTextSelect
Dim i As Long, j As Long
Dim pagenumber, pagecontent, content
Set aApp = CreateObject("AcroExch.App")
Set av_doc = CreateObject("AcroExch.AVDoc")
If av_doc.Open(pdf_file, vbNull) <> True Then Exit Sub
While av_doc Is Nothing
Set av_doc = aApp.GetActiveDoc
Wend
Set pdf_doc = av_doc.GetPDDoc
For i = 0 To pdf_doc.GetNumPages - 1
Set pagenumber = pdf_doc.AcquirePage(i)
Set pagecontent = CreateObject("AcroExch.HiliteList")
On Error Resume Next
If pagecontent.Add(0, 9000) <> True Then Exit Sub
Set sel_text = pagenumber.CreatePageHilite(pagecontent)
On Error GoTo 0
For j = 0 To sel_text.GetNumText - 1
Debug.Print sel_text.GetText(j)
' //Range("A" & Rows.Count).End(x1Up).Offset(1, 0).Value = sel_text.GetText(j)
Next j
Next i
av_doc.Close False
aApp.Exit
Set pagenumber = Nothing
Set sel_text = Nothing
Set av_doc = Nothing
Set aApp = Nothing
End Sub
I have adobe acrobat pro V 10.0 and have been trying to create a macro/vba code to automate at least the entering of numbers into the excel sheet
I was able to run some loops to cycle through the pdf. The issue is that negative numbers do not get printed/shown as negative numbers (as well as some words are printed with "..." instead of some letters but that is less important) and this is extremely important as this has to do with stock performance.
Almost everything in the pdf is in different tables, and there are 10 pages per pdf where only 3 are needed, i wanted to know if there was a way that i could avoid, downloading all 15 pdf's, then flattening them, then exporting them to excel, then using a macro on the excel sheet (if that is even possible)
This works for pdf files, however crucial information is ommited (even if i export to excel), when i flatten it, the info is no longer lost and i can export to excel but its incredibly messy and difficult to sort through as well as takes many extra steps/more time to do the task
Please help!!
*code*
Const pdf_file As String = "path_to_pdf"
Sub read_pdf()
Dim aApp As Acrobat.AcroApp
Dim av_doc As CAcroAVDoc
Dim pdf_doc As CAcroPDDoc
Dim sel_text As CAcroPDTextSelect
Dim i As Long, j As Long
Dim pagenumber, pagecontent, content
Set aApp = CreateObject("AcroExch.App")
Set av_doc = CreateObject("AcroExch.AVDoc")
If av_doc.Open(pdf_file, vbNull) <> True Then Exit Sub
While av_doc Is Nothing
Set av_doc = aApp.GetActiveDoc
Wend
Set pdf_doc = av_doc.GetPDDoc
For i = 0 To pdf_doc.GetNumPages - 1
Set pagenumber = pdf_doc.AcquirePage(i)
Set pagecontent = CreateObject("AcroExch.HiliteList")
On Error Resume Next
If pagecontent.Add(0, 9000) <> True Then Exit Sub
Set sel_text = pagenumber.CreatePageHilite(pagecontent)
On Error GoTo 0
For j = 0 To sel_text.GetNumText - 1
Debug.Print sel_text.GetText(j)
' //Range("A" & Rows.Count).End(x1Up).Offset(1, 0).Value = sel_text.GetText(j)
Next j
Next i
av_doc.Close False
aApp.Exit
Set pagenumber = Nothing
Set sel_text = Nothing
Set av_doc = Nothing
Set aApp = Nothing
End Sub