VBA to Combine Differently Formatted Word Files into 1 while Preserving Layouts

agent_maxine

New Member
Joined
Aug 23, 2017
Messages
38
Dear Mr. Excel,

Would like to request your kind assistance on my codes below... I have codes that will allow me to select multiple files and combine them into 1 Word (then exports into 1 PDF). Problem is:
  • Headers/Footers: The header/footer from the first file is used for the entire combined file.
  • Font Types/Sizes/Layouts: It seems to use the Word default template's font type/sizes and disregards the types used in original Word documents (even the first file).

How can I make it so that the Combined File keeps the formatting of individual documents? Codes were adopted from this page:
https://www.datanumen.com/blogs/2-ways-quickly-merge-multiple-word-documents-one-via-vba/

Code:
Sub Combine_Selected_Documents()

Dim dlgFile As FileDialog
Dim nTotalFiles As Integer, nEachSelectedFile As Integer
Dim strFolder As String, myFolder As folder

Dim fso As FileSystemObject
Set fso = CreateObject("Scripting.FileSystemObject")

Set dlgFile = Application.FileDialog(msoFileDialogFilePicker)

With dlgFile
    .AllowMultiSelect = True
    If .Show <> -1 Then
        Exit Sub
    Else
        strFolder = .SelectedItems(1) & Application.PathSeparator
        nTotalFiles = .SelectedItems.Count
    End If
End With

'Add code to check if PDF file exists, if no Word files were selected, if it works with different Word Document names, etc.

Dim myArr
myArr = Split(strFolder, "\")
strFolder = myArr(0)
For i = 1 To UBound(myArr) - 2
    strFolder = strFolder & "\" & myArr(i)
Next i
strFolder = strFolder & "\"
'MsgBox strFolder

Dim objWord As Word.Application
Set objWord = CreateObject("Word.Application")
Dim objDoc As Word.Document
Set objDoc = objWord.Documents.Add
objWord.Visible = False 'Temporary
Set objSelection = objWord.Selection
'objSelection.TypeText ("This is my text in Word Document using Excel")

For nEachSelectedFile = 1 To nTotalFiles
objSelection.InsertFile dlgFile.SelectedItems.Item(nEachSelectedFile)
   
    If nEachSelectedFile < nTotalFiles Then
        objSelection.InsertBreak Type:=wdSectionBreakNextPage
        
    Else
        If nEachSelectedFile = nTotalFiles Then
        objDoc.ExportAsFixedFormat OutputFileName:=strFolder & "Forms.pdf", ExportFormat:=wdExportFormatPDF ', OpenafterPublish:=OpenafterPublish
        objDoc.SaveAs2 (strFolder & "Forms Combined.docx")
        objDoc.Close
        'objWord.Quit
        
        Exit Sub
        End If
    End If
Next nEachSelectedFile

MsgBox "All the Selected Documents have been Combined into 1 PDF File."

End Sub
 
Thank you. I did kill all Excel/Word sessions and the codes ran without any error. Looking at the merged file, I have noticed the following:

Formatting Details that appear to be preserved --> Thank you!
- Headers
- Page Margins

Formatting Details that need fixing:
- Line Spacing: Source documents have 0pt After, Single spacing. The newly merged document has 8pt After, 1.08 Multiple Spacing, which seems to be default for my blank/new Word file. How do I tell Word to use 0pt After, Single spacing for the new document? Or keep the source spacing? (I can certainly update my own Normal Template, but this will need to be the same for other users as well).
- Font Types: Some keep the source types and some don't.
- Paragraph Alignment: Some keep the source types (Justified) and some don't.
- Page/Section Breaks: Some documents have been pasted on the same page (even though it has Section Break). How do I ensure each document will be pasted on the next page?
- Footers: Preserved, with the exception of Page #s . Some original documents were 2 pages and had "Page # of 2" embedded. Now the page numbers are wonky (e.g. "Page 6 of 3", then a few pages down I see "Page 10 of 2", etc.)

Apologies for being a pain; these documents are filed and thus the formatting must be preserved. I have no clue why some of the selected files preserve the formatting and others don't :\
Thank you...
 
Upvote 0

Excel Facts

Enter current date or time
Ctrl+: enters current time. Ctrl+; enters current date. Use Ctrl+: Ctrl+; Enter for current date & time.
In order to tackle the items to fix, was thinking --

1. Line Spacing/Font Types/Paragraph Alignment Issues: During the LayoutTransfer Sub, would I be able to add something like this after With ... PageSetup lines?
(I'm getting Expression error so I don't think they're quite right...)
Code:
With wdDocTgt.Sections.Last.Paragraph
    .LineSpacing = Single
    .SpaceAfter = 0pt
    .Alignment = Justify
    .FontStyle = Times New Roman
    .FontSize = 10pt
End With

2. Page # for Footers: I tried adding a Page # property... getting a "Value out of range" error...
Code:
Dim lPages As Integer
With wdDocSrc.Sections.Last.PageSetup
lPages = .LinesPage

With wdDocTgt.Sections.Last.PageSetup
.LinesPage = lPages
End With
 
Last edited:
Upvote 0
Your formatting inconsistencies in the output are probably due to either:
1. The source documents being created by people/processes that didn't respect the Style definitions of the paragraphs concerned; and/or
2. Being defined differently from the template and/or each other.
Using .Copy & .PasteAndFormat wdFormatOriginalFormatting instead of the .FormattedText method will probably overcome most of the issues you're having with Line/Paragraph Spacing, Fonts, & Paragraph Alignment. However, it also creates other problems, including breaking anything related to references based on Style Definitions (e.g. Tables of Contents, Caption references, StyleRef fields).

Regarding the page #s , there really isn't anything that can be done about that without investing a lot of coding effort in analysing all of the field coding & settings for each header/footer, then figuring out how to preserve that in the combined document.

Regarding the Section breaks, that suggests the source documents have their own 'continuous' Section breaks and Word is having trouble respecting the 'next page' Section breaks that are required to achieve the desired outcome. Short of deleting the 'continuous' Section breaks in the source documents, there's not a lot that can be done about that.

Given the above, you may need to invest in software like Adobe Acrobat Pro and use that to combine the individual PDFs after Word has converted them to that format. That way, issues regarding Line/Paragraph Spacing, Fonts, & Paragraph Alignment, Section breaks & page #s simply don't arise.
 
Upvote 0
I agree with your conclusions... I did try to change the formatting of source documents for testing but the end result was the same.
Initially my approach was to convert every Word file in a folder into individual PDFs [https://www.datanumen.com/blogs/3-macro-ways-quickly-convert-word-documents-pdf-files/], then merge them into 1... but kept getting this:
Run-Time Error '429': ActiveX component can't create object
Set objDoc = Documents.Open(FileName:=strFolder & strFile)

I do have Adobe Pro but most users have Nitro as PDF software. I have yet to find a set of codes that combine PDFs into 1 using Nitro :\

Code:
Sub BatchConvertDocxToPDF()
  Dim objDoc As Document
  Dim strFile As String, strFolder As String
 
  'Initialization
  strFolder = "C:\Users\Desktop\Test\"
  strFile = Dir(strFolder & "*.docx", vbNormal)
 
  'Precess each file in the file folder and convert them to pdf.
  While strFile <> ""
    Set objDoc = Documents.Open(FileName:=strFolder & strFile)
 
    objDoc.ExportAsFixedFormat OutputFileName:=Replace(objDoc.FullName, ".docx", ".pdf"), _
      ExportFormat:=wdExportFormatPDF, OpenAfterExport:=False, OptimizeFor:=wdExportOptimizeForPrint, _
      Range:=wdExportAllDocument, Item:=wdExportDocumentContent
 
    objDoc.Close
    strFile = Dir()
  Wend
End Sub
 
Upvote 0
Try:
Code:
Sub BatchConvertDocsToPDF()
'Note: This code required a reference to Word, via Tools|References in the VBE.
Application.ScreenUpdating = False
Dim strFolder As String, strFile As String
strFolder = GetFolder: If strFolder = "" Then Exit Sub
Dim wdApp As New Word.Application, wdDoc As Word.Document: wdApp.Visible = True
strFile = Dir(strFolder & "\*.doc", vbNormal)
While strFile <> ""
  Set wdDoc = wdApp.Documents.Open(Filename:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
  With wdDoc
    .SaveAs Filename:=strFolder & "\" & Split(strFile, ".doc")(0) & ".pdf", FileFormat:=wdFormatPDF, AddToRecentFiles:=False
    .Close SaveChanges:=False
  End With
  strFile = Dir()
Wend
wdApp.Quit
Set wdDoc = Nothing: Set wdApp = Nothing
Application.ScreenUpdating = True
End Sub
 
Function GetFolder() As String
Dim oFolder As Object
GetFolder = ""
Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
Set oFolder = Nothing
End Function
 
Last edited:
Upvote 0
Actually, am wondering again...
Using .Copy & .PasteAndFormat wdFormatOriginalFormatting instead of the .FormattedText method will probably overcome most of the issues you're having with Line/Paragraph Spacing, Fonts, & Paragraph Alignment. However, it also creates other problems, including breaking anything related to references based on Style Definitions (e.g. Tables of Contents, Caption references, StyleRef fields).

I could probably use the earlier codes for the set of Word Docs without any Style Definitions. How would I go about changing the ".FormattedText" method to ".Copy & .PasteAndFormat wdFormatOriginalFormatting" method?
 
Upvote 0
Try:
Code:
Sub BatchConvertDocsToPDF()
'Note: This code required a reference to Word, via Tools|References in the VBE.
Application.ScreenUpdating = False
Dim strFolder As String, strFile As String
strFolder = GetFolder: If strFolder = "" Then Exit Sub
Dim wdApp As New Word.Application, wdDoc As Word.Document: wdApp.Visible = True
strFile = Dir(strFolder & "\*.doc", vbNormal)
While strFile <> ""
  Set wdDoc = wdApp.Documents.Open(Filename:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
  With wdDoc
    .SaveAs Filename:=strFolder & "\" & Split(strFile, .doc)(0) & ".pdf", FileFormat:=wdFormatPDF, AddToRecentFiles:=False
    .Close SaveChanges:=False
  End With
  strFile = Dir()
Wend
wdApp.Quit
Set wdDoc = Nothing: Set wdApp = Nothing
Application.ScreenUpdating = True
End Sub
 
Function GetFolder() As String
Dim oFolder As Object
GetFolder = ""
Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
Set oFolder = Nothing
End Function

I'm getting a Run-Time Error '438': Object doesn't support this property or method :(
Code:
.SaveAs FileName:=strFolder & "\" & Split(strFile, .doc)(0) & ".pdf", FileFormat:=wdFormatPDF, AddToRecentFiles:=False
 
Upvote 0
It's the little things ...

Change:
Code:
.SaveAs Filename:=strFolder & "\" & Split(strFile, .doc)(0) & ".pdf", FileFormat:=wdFormatPDF, AddToRecentFiles:=False
to:
Code:
.SaveAs Filename:=strFolder & "\" & Split(strFile, ".doc")(0) & ".pdf", FileFormat:=wdFormatPDF, AddToRecentFiles:=False
 
Last edited:
Upvote 0

Forum statistics

Threads
1,224,827
Messages
6,181,194
Members
453,021
Latest member
pingpong7117

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top