Sort Pages According To List VBA / JavaScript / Actions

fiberboysa

Board Regular
Joined
Apr 25, 2012
Messages
106
Office Version
  1. 365
Platform
  1. Windows
Hi All and hope you are doing well,

I have a PDF file that contains 500+ pages containing a unique number on each page. First I wanted to sort them with Acrobat PDF Pro DC Actions/JavaScript but its hard and I don't understand Java at all. I also want to clear here that I get this PDF from other source I do not generate it. The person who generate it is totally not interested to help me in this, and I don't know the reason for it as well. I just receive it and it's my headache to reorder them. And this work is repetitive means I receive these PDFs several times in a month.

I have tried to do it with acrobatusers.com/assets/uploads/actions/… which is available at acrobatusers.com/actions-exchange. But problem is that it extracts pages randomly and its code is not understandable to me.

Then I realized that I can open PDFs in MS Word 365 as well and when I opened the same doc in MS Word 365 it is searchable.

Now I want a VBA code or Acrobat DC Pro JavaScript/Action to rearrange these pages according to a list that I provide in XLSX or CSV or TXT etc. format and save it as a new file. Please have a look on following as example.

File Before Sorting Pages
5xJqc.png


Sort List
Uuanl.png


File After Sorting Pages
d0NcC.png
 

Excel Facts

What is the shortcut key for Format Selection?
Ctrl+1 (the number one) will open the Format dialog for whatever is selected.
As you have Acrobat Pro, the Acrobat API is available and can be called from VBA.

For your example, do you know that all those 'UV n' unique values definitely occur on those specific pages in the input PDF? Or do you have to search for each unique value amongst all the pages to ascertain its actual page number?

If the former then the VBA macro just needs to loop through an array of page numbers and insert each page from the input PDF to a new output PDF. For your example, put the required page order in sheet cells:
Required page order
3
2
8
4
5
1
6
10
9
7

The macro would read those cells and use the InsertPages method of the Acrobat API to insert each specific page from the input PDF to the new output PDF. See example code at Merge only specific page from multiple pdf files in folder and save in the same folder with file name "Merged", which would need changing for your requirement.
 
Upvote 0
As you have Acrobat Pro, the Acrobat API is available and can be called from VBA.

For your example, do you know that all those 'UV n' unique values definitely occur on those specific pages in the input PDF? Or do you have to search for each unique value amongst all the pages to ascertain its actual page number?

If the former then the VBA macro just needs to loop through an array of page numbers and insert each page from the input PDF to a new output PDF. For your example, put the required page order in sheet cells:
Required page order
3
2
8
4
5
1
6
10
9
7

The macro would read those cells and use the InsertPages method of the Acrobat API to insert each specific page from the input PDF to the new output PDF. See example code at Merge only specific page from multiple pdf files in folder and save in the same folder with file name "Merged", which would need changing for your requirement.
@John_w Thanks for your reply.
Sorry I forgot to mention that the "UV n" are random and UV n may appear at any page in PDF that needs to be sorted. Means that each UV have to be searched and then added in Merged PDF.
 
Upvote 0
OK I've got following code with help from here and here. Now I can give text to search text cells A2 onwards and get page numbers where they are appearing in PDF document in sheet "Extract PDFs" in cells B2 and onwards.
Now I want to add pageNo in new PDF and save that PDF file as new PDF file.
VBA Code:
Option Explicit
 
Sub FindTextInPDF()
 
    '----------------------------------------------------------------------------------------
    'This macro can be used to find a specific TEXT (more than one word) in a PDF document.
    'The macro opens the PDF, finds the specified text (the first instance), scrolls so
    'that it is visible and highlights it.
    'The macro uses the FindText method (see the code below for more info).
 
    'Note that in some cases it doesn't work (doesn't highlight the text), so in those
    'cases prefer the SearchTextInPDF macro if you have only ONE WORD to find!
 
    'The code uses late binding, so no reference to an external library is required.
    'However, the code works ONLY with Adobe Professional, so don't try to use it with
    'Adobe Reader because you will get an "ActiveX component can't create object" error.
 
    'Written by:    Christos Samaras
    'Date:          04/05/2014
    'e-mail:        xristos.samaras@gmail.com
    'site:          http://www.myengineeringworld.net
    '----------------------------------------------------------------------------------------
 
    'Declaring the necessary variables.
    Dim PDFPath As String
    Dim App As Object
    Dim AVDoc As Object
    Dim pdDoc As Object
    Dim jso As Object
    Dim pageNo As Integer
    Dim No_Of_BLs As Integer
    Dim BL_Loop As Integer
    Dim wb As Workbook
    Dim ws As Worksheet
    Dim PDDocDestination As Acrobat.AcroPDDoc
    Dim PDDocSource As Acrobat.AcroPDDoc
    Dim mergedPDF As String
 
    'Specify the text you want to search.
    'TextToFind = "Christos Samaras"
    'Using a range:
    mergedPDF = "****" 'Change This
    
    'Delete destination (merged) PDF if it already exists - a new PDF will be created
    If Dir(mergedPDF) <> vbNullString Then Kill mergedPDF
 
    'Specify the path of the sample PDF form.
    'Full path example:
    'PDFPath = "C:\Users\Christos\Desktop\How Software Companies Die.pdf"
    'Using workbook path:
    'PDFPath = ThisWorkbook.Path & "\" & "How Software Companies Die.pdf"
    'Using a range:
    PDFPath = ThisWorkbook.Sheets("PDF Search").Range("C7").Value
    
    'Setiing Worksheet And Sheet
    Set wb = ThisWorkbook
    wb.Activate
    Set ws = wb.Sheets("Extract PDFs")
    Worksheets(ws.Name).Activate
    No_Of_BLs = Cells(Rows.Count, 1).End(xlUp).Row
    
    'Check if the file exists.
    If Dir(PDFPath) = "" Then
        MsgBox "Cannot find the PDF file!" & vbCrLf & "Check the PDF path and retry.", _
                vbCritical, "File Path Error"
        Exit Sub
    End If
 
    'Check if the input file is a PDF file.
    If LCase(Right(PDFPath, 3)) <> "pdf" Then
        MsgBox "The input file is not a PDF file!", vbCritical, "File Type Error"
        Exit Sub
    End If
 
    On Error Resume Next
 
    'Initialize Acrobat by creating the App object.
    Set App = CreateObject("AcroExch.App")
    Set PDDocDestination = CreateObject("AcroExch.PDDoc")
    Set PDDocSource = CreateObject("AcroExch.PDDoc")
 
    'Check if the object was created. In case of error release the object and exit.
    If Err.Number <> 0 Then
        MsgBox "Could not create the Adobe Application object!", vbCritical, "Object Error"
        Set App = Nothing
        Exit Sub
    End If
 
    'Create the AVDoc object.
    Set AVDoc = CreateObject("AcroExch.AVDoc")
 
    'Check if the object was created. In case of error release the objects and exit.
    If Err.Number <> 0 Then
        MsgBox "Could not create the AVDoc object!", vbCritical, "Object Error"
        Set AVDoc = Nothing
        Set App = Nothing
        Exit Sub
    End If
 
    On Error GoTo 0
 
 'Create new destination (merged) PDF
    PDDocDestination.Create
 
    'Open the PDF file.
    If AVDoc.Open(PDFPath, "") = True Then
        
        For BL_Loop = 2 To No_Of_BLs
        'Open successful, bring the PDF document to the front.
        'AVDoc.BringToFront
 
        'Use the FindText method in order to find and highlight the desired text.
        'The FindText method returns true if the text was found or false if it was not.
        'Here are the 4 arguments of the FindText methd:
        'Text to find:          The text that is to be found (in this example the TextToFind variable).
        'Case sensitive:        If true, the search is case-sensitive. If false, it is case-insensitive (in this example is True).
        'Whole words only:      If true, the search matches only whole words. If false, it matches partial words (in this example is True).
        'Search from 1st page:  If true, the search begins on the first page of the document. If false, it begins on the current page (in this example is False).
        If AVDoc.FindText(Cells(BL_Loop, 1).Value, True, True, False) = False Then
 
            'Text was not found, close the PDF file without saving the changes.
            'AVDoc.Close True
 
            'Close the Acrobat application.
            'App.Exit
 
            'Release the objects.
            'Set AVDoc = Nothing
            'Set App = Nothing
 
            'Inform the user.
            MsgBox "The text '" & Cells(BL_Loop, 1).Value & "' could not be found in the PDF file!", vbInformation, "Search Error"
        Else
        Set pdDoc = AVDoc.GetPDDoc()
        Set jso = pdDoc.GetJSObject
        pageNo = jso.pagenum + 1
        Cells(BL_Loop, 2).Value = pageNo
        'PDDocSource.Open PDFPath
        'If Not PDDocDestination.InsertPages(PDDocDestination.GetNumPages - 1, PDDocSource, pageNo, 1, 0) Then
        '    MsgBox "Error merging" & vbCrLf & PDFPath & vbCrLf & "to" & vbCrLf & mergedPDF, vbExclamation
        'End If
        End If
        Next BL_Loop
    Else
 'Save destination (merged) PDF
    PDDocDestination.Save 1, mergedPDF
    PDDocDestination.Close
      
    Set PDDocSource = Nothing
    Set PDDocDestination = Nothing
        'Unable to open the PDF file, close the Acrobat application.
        App.Exit
 
        'Release the objects.
        Set AVDoc = Nothing
        Set App = Nothing
 
        'Inform the user.
        MsgBox "Could not open the PDF file!", vbCritical, "File error"
 
    End If
 
End Sub
 
Upvote 0
OK I've got following code with help from here and here. Now I can give text to search text cells A2 onwards and get page numbers where they are appearing in PDF document in sheet "Extract PDFs" in cells B2 and onwards.
Now I want to add pageNo in new PDF and save that PDF file as new PDF file.
VBA Code:
Option Explicit
 
Sub FindTextInPDF()
 
    '----------------------------------------------------------------------------------------
    'This macro can be used to find a specific TEXT (more than one word) in a PDF document.
    'The macro opens the PDF, finds the specified text (the first instance), scrolls so
    'that it is visible and highlights it.
    'The macro uses the FindText method (see the code below for more info).
 
    'Note that in some cases it doesn't work (doesn't highlight the text), so in those
    'cases prefer the SearchTextInPDF macro if you have only ONE WORD to find!
 
    'The code uses late binding, so no reference to an external library is required.
    'However, the code works ONLY with Adobe Professional, so don't try to use it with
    'Adobe Reader because you will get an "ActiveX component can't create object" error.
 
    'Written by:    Christos Samaras
    'Date:          04/05/2014
    'e-mail:        xristos.samaras@gmail.com
    'site:          http://www.myengineeringworld.net
    '----------------------------------------------------------------------------------------
 
    'Declaring the necessary variables.
    Dim PDFPath As String
    Dim App As Object
    Dim AVDoc As Object
    Dim pdDoc As Object
    Dim jso As Object
    Dim pageNo As Integer
    Dim No_Of_BLs As Integer
    Dim BL_Loop As Integer
    Dim wb As Workbook
    Dim ws As Worksheet
    Dim PDDocDestination As Acrobat.AcroPDDoc
    Dim PDDocSource As Acrobat.AcroPDDoc
    Dim mergedPDF As String
 
    'Specify the text you want to search.
    'TextToFind = "Christos Samaras"
    'Using a range:
    mergedPDF = "****" 'Change This
   
    'Delete destination (merged) PDF if it already exists - a new PDF will be created
    If Dir(mergedPDF) <> vbNullString Then Kill mergedPDF
 
    'Specify the path of the sample PDF form.
    'Full path example:
    'PDFPath = "C:\Users\Christos\Desktop\How Software Companies Die.pdf"
    'Using workbook path:
    'PDFPath = ThisWorkbook.Path & "\" & "How Software Companies Die.pdf"
    'Using a range:
    PDFPath = ThisWorkbook.Sheets("PDF Search").Range("C7").Value
   
    'Setiing Worksheet And Sheet
    Set wb = ThisWorkbook
    wb.Activate
    Set ws = wb.Sheets("Extract PDFs")
    Worksheets(ws.Name).Activate
    No_Of_BLs = Cells(Rows.Count, 1).End(xlUp).Row
   
    'Check if the file exists.
    If Dir(PDFPath) = "" Then
        MsgBox "Cannot find the PDF file!" & vbCrLf & "Check the PDF path and retry.", _
                vbCritical, "File Path Error"
        Exit Sub
    End If
 
    'Check if the input file is a PDF file.
    If LCase(Right(PDFPath, 3)) <> "pdf" Then
        MsgBox "The input file is not a PDF file!", vbCritical, "File Type Error"
        Exit Sub
    End If
 
    On Error Resume Next
 
    'Initialize Acrobat by creating the App object.
    Set App = CreateObject("AcroExch.App")
    Set PDDocDestination = CreateObject("AcroExch.PDDoc")
    Set PDDocSource = CreateObject("AcroExch.PDDoc")
 
    'Check if the object was created. In case of error release the object and exit.
    If Err.Number <> 0 Then
        MsgBox "Could not create the Adobe Application object!", vbCritical, "Object Error"
        Set App = Nothing
        Exit Sub
    End If
 
    'Create the AVDoc object.
    Set AVDoc = CreateObject("AcroExch.AVDoc")
 
    'Check if the object was created. In case of error release the objects and exit.
    If Err.Number <> 0 Then
        MsgBox "Could not create the AVDoc object!", vbCritical, "Object Error"
        Set AVDoc = Nothing
        Set App = Nothing
        Exit Sub
    End If
 
    On Error GoTo 0
 
 'Create new destination (merged) PDF
    PDDocDestination.Create
 
    'Open the PDF file.
    If AVDoc.Open(PDFPath, "") = True Then
       
        For BL_Loop = 2 To No_Of_BLs
        'Open successful, bring the PDF document to the front.
        'AVDoc.BringToFront
 
        'Use the FindText method in order to find and highlight the desired text.
        'The FindText method returns true if the text was found or false if it was not.
        'Here are the 4 arguments of the FindText methd:
        'Text to find:          The text that is to be found (in this example the TextToFind variable).
        'Case sensitive:        If true, the search is case-sensitive. If false, it is case-insensitive (in this example is True).
        'Whole words only:      If true, the search matches only whole words. If false, it matches partial words (in this example is True).
        'Search from 1st page:  If true, the search begins on the first page of the document. If false, it begins on the current page (in this example is False).
        If AVDoc.FindText(Cells(BL_Loop, 1).Value, True, True, False) = False Then
 
            'Text was not found, close the PDF file without saving the changes.
            'AVDoc.Close True
 
            'Close the Acrobat application.
            'App.Exit
 
            'Release the objects.
            'Set AVDoc = Nothing
            'Set App = Nothing
 
            'Inform the user.
            MsgBox "The text '" & Cells(BL_Loop, 1).Value & "' could not be found in the PDF file!", vbInformation, "Search Error"
        Else
        Set pdDoc = AVDoc.GetPDDoc()
        Set jso = pdDoc.GetJSObject
        pageNo = jso.pagenum + 1
        Cells(BL_Loop, 2).Value = pageNo
        'PDDocSource.Open PDFPath
        'If Not PDDocDestination.InsertPages(PDDocDestination.GetNumPages - 1, PDDocSource, pageNo, 1, 0) Then
        '    MsgBox "Error merging" & vbCrLf & PDFPath & vbCrLf & "to" & vbCrLf & mergedPDF, vbExclamation
        'End If
        End If
        Next BL_Loop
    Else
 'Save destination (merged) PDF
    PDDocDestination.Save 1, mergedPDF
    PDDocDestination.Close
     
    Set PDDocSource = Nothing
    Set PDDocDestination = Nothing
        'Unable to open the PDF file, close the Acrobat application.
        App.Exit
 
        'Release the objects.
        Set AVDoc = Nothing
        Set App = Nothing
 
        'Inform the user.
        MsgBox "Could not open the PDF file!", vbCritical, "File error"
 
    End If
 
End Sub
@John_w sorry to bother you but can you please help me in adding required pages (pageNo) in Merged PDF.
 
Upvote 0
OK Alhamdulilah I've done it with help of 3 sources. To save the headache of variable like AVDoc, PDDoc and etc. I have divided the process in two macros. Its not perfect but its doing with I wanted it to do.
With Source 1 and Source 2, I've modified the 1st macro to search the required text (Cell A2 and onwards in sheet "Extract PDFs") in PDF and then get the PDF page numbers in Cell B2 and onwards with respect to searched text. If a search value in not found it will leave its page value in Column B blank. If Multiple search values are searched and found then it will extract their page numbers in search order.
Then with Source 3 I've modified the macro to used page numbers Cell B2 and onwards to create PDF of required pages in search order. I will skip blank values.
Following are the two macros respectively.

Macro 1
VBA Code:
Option Explicit

Sub FindTextInPDF()
 
    'Declaring the necessary variables.
    Dim PDFPath As String
    Dim App As Object
    Dim AVDoc As Object
    Dim pdDoc As Object
    Dim jso As Object
    Dim pageNo As Integer
    Dim No_Of_BLs As Integer
    Dim BL_Loop As Integer
    Dim wb As Workbook
    Dim ws As Worksheet
    Dim mergedPDF As String
    Dim x As Integer
 
    'Specify the destination PDF file name
    mergedPDF = ThisWorkbook.Sheets("Setup").Range("B2").Value
    
    'Delete destination (merged) PDF if it already exists - a new PDF will be created
    If Dir(mergedPDF) <> vbNullString Then Kill mergedPDF
 
    'Specify the path of the sample PDF form.
    PDFPath = ThisWorkbook.Sheets("Setup").Range("A2").Value
    
    'Setiing Worksheet And Sheet
    Set wb = ThisWorkbook
    wb.Activate
    Set ws = wb.Sheets("Extract PDFs")
    Worksheets(ws.Name).Activate
    No_Of_BLs = Cells(Rows.Count, 1).End(xlUp).Row
    
    'Check if the file exists.
    If Dir(PDFPath) = "" Then
        MsgBox "Cannot find the PDF file!" & vbCrLf & "Check the PDF path and retry.", _
                vbCritical, "File Path Error"
        Exit Sub
    End If
 
    'Check if the input file is a PDF file.
    If LCase(Right(PDFPath, 3)) <> "pdf" Then
        MsgBox "The input file is not a PDF file!", vbCritical, "File Type Error"
        Exit Sub
    End If
 
    On Error Resume Next
 
    'Initialize Acrobat by creating the App object.
    Set App = CreateObject("AcroExch.App")
 
    'Check if the object was created. In case of error release the object and exit.
    If Err.Number <> 0 Then
        MsgBox "Could not create the Adobe Application object!", vbCritical, "Object Error"
        Set App = Nothing
        Exit Sub
    End If
 
    'Create the AVDoc object.
    Set AVDoc = CreateObject("AcroExch.AVDoc")
 
    'Check if the object was created. In case of error release the objects and exit.
    If Err.Number <> 0 Then
        MsgBox "Could not create the AVDoc object!", vbCritical, "Object Error"
        Set AVDoc = Nothing
        Set App = Nothing
        Exit Sub
    End If
 
    On Error GoTo 0
 
    'Open the PDF file.
    If AVDoc.Open(PDFPath, "") = True Then
        
        For BL_Loop = 2 To No_Of_BLs
        'Open successful, bring the PDF document to the front.
 
        'Use the FindText method in order to find and highlight the desired text.
        'The FindText method returns true if the text was found or false if it was not.
        If AVDoc.FindText(Cells(BL_Loop, 1).Value, True, True, False) = True Then
            Set pdDoc = AVDoc.GetPDDoc()
            Set jso = pdDoc.GetJSObject
            pageNo = jso.pagenum + 1
            Cells(BL_Loop, 2).Value = pageNo
            End If
        Next BL_Loop
    Else
        'Unable to open the PDF file, close the Acrobat application.
        App.Exit
 
        'Release the objects.
        Set AVDoc = Nothing
        Set App = Nothing
 
        'Inform the user.
        MsgBox "Could not open the PDF file!", vbCritical, "File error"
 
    End If
 
End Sub

Macro 2
VBA Code:
Public Sub Merge_PDFs()

    Dim PDDocDestination As Acrobat.AcroPDDoc
    Dim PDDocSource As Acrobat.AcroPDDoc
    Dim matchPDFs As String, mergedPDF As String
    Dim folder As String, PDFfile As String
    Dim No_Of_BLs As Integer
    Dim BL_Loop As Integer
    Dim wb As Workbook
    Dim ws As Worksheet
    
    Call FindTextInPDF
    
    matchPDFs = ThisWorkbook.Sheets("Setup").Range("A2").Value       'CHANGE THIS
   
    mergedPDF = ThisWorkbook.Sheets("Setup").Range("B2").Value     'CHANGE THIS
    
    'Delete destination (merged) PDF if it already exists - a new PDF will be created
    
    If Dir(mergedPDF) <> vbNullString Then Kill mergedPDF
    
    Set wb = ThisWorkbook
    wb.Activate
    Set ws = wb.Sheets("Extract PDFs")
    Worksheets(ws.Name).Activate
    No_Of_BLs = Cells(Rows.Count, 1).End(xlUp).Row

    'Create Acrobat API objects
    
    Set PDDocDestination = CreateObject("AcroExch.PDDoc") 'New Acrobat.AcroPDDoc
    Set PDDocSource = CreateObject("AcroExch.PDDoc") 'New Acrobat.AcroPDDoc
    
    'Create new destination (merged) PDF
    
    PDDocDestination.Create
    
    'Loop through matching PDFs and insert page 2 from each PDF to the merged PDF
    
    folder = Left(matchPDFs, InStrRev(matchPDFs, "\"))
    PDFfile = Dir(matchPDFs)
    While PDFfile <> vbNullString
        PDDocSource.Open folder & PDFfile
        For BL_Loop = 2 To No_Of_BLs
        If Cells(BL_Loop, 2) <> "" Then
        If Not PDDocDestination.InsertPages(PDDocDestination.GetNumPages - 1, PDDocSource, Cells(BL_Loop, 2) - 1, 1, 0) Then
            MsgBox "Error merging" & vbCrLf & folder & PDFfile & vbCrLf & "to" & vbCrLf & mergedPDF, vbExclamation
        End If
        End If
        Next BL_Loop
        PDDocSource.Close
        PDFfile = Dir
    Wend
    
    'Save destination (merged) PDF
    
    PDDocDestination.Save 1, mergedPDF
    PDDocDestination.Close
       
    Set PDDocSource = Nothing
    Set PDDocDestination = Nothing

    MsgBox "Created " & mergedPDF
    
End Sub
 
Upvote 0
Solution

Forum statistics

Threads
1,223,229
Messages
6,170,881
Members
452,364
Latest member
springate

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top