Search string in First Page Header of Word docs in Folder (sub folders)

Sleeplol

Board Regular
Joined
Apr 10, 2019
Messages
194
Office Version
  1. 365
  2. 2019
Platform
  1. Windows
Hi,
I'm hoping for help modifying the below (or possibly scrap) script.

Routine:
Return all MS Word File Names from a folder (sub folders would be even better) that contain a string in the First Page Header.

For instance the user would like all files that pertain to "Pellet Tech". So, she adds the folder path to cell Worksheets("Data").Range("I1") and adds the string ("Pellet Tech") to Worksheets("Data").Range("J1")
The Front Header is the below which notes "Pellet Tech" in the Apply to:
1681841881180.png


This script only returns all the file names from a particular folder regardless of extension:
VBA Code:
Sub LoopThroughFiles()

Dim oFSO As Object
Dim oFolder As Object
Dim oFile As Object
Dim i As Integer

Set oFSO = CreateObject("Scripting.FileSystemObject")

Set oFolder = oFSO.getfolder(Worksheets("Data").Range("I1"))

For Each oFile In oFolder.Files

    Worksheets("Data").Cells(i + 1, 1) = oFile.Name

    i = i + 1

Next oFile

End Sub

Thanks for any help with this
 

Attachments

  • 1681841874953.png
    1681841874953.png
    12.5 KB · Views: 28
For funzies, do you want to trial changing this line of code in the code I provided...
Code:
Set oHF = WordApp.ActiveDocument.StoryRanges(7) '7 = wdPrimaryHeaderStory
Using the Instr function will be quicker than using Find. What code are you currently using? To search subfolders, the search need to be recursive which is what it seem like you're doing. We should be able to come up with something if the original part works. Dave
 
Upvote 0

Excel Facts

Highlight Duplicates
Home, Conditional Formatting, Highlight Cells, Duplicate records, OK to add pink formatting to any duplicates in selected range.
Ok, so I made the update using the Instr function
However, it is only returning 1 of the documents with the string in the header.
Also, I'm testing with a folder with 60 word docs and your code is returning a max of 9 documents... Hmmm.

Your Code
VBA Code:
Sub LoopThroughFiles2()
    Dim oFSO As Object, oHF As Object
    Dim oFolder As Object, WordApp As Object
    Dim oFile As Object
    Dim i As Integer
    
    'initialize i to 1
    i = 1
    
    'create Word app
    On Error Resume Next
    Set WordApp = GetObject(, "Word.Application")
    If Err.Number <> 0 Then
        Set WordApp = CreateObject("Word.Application")
    End If
    On Error GoTo 0
    WordApp.Visible = False
    
    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFolder = oFSO.GetFolder(Worksheets("Data").Range("I1"))
    
    For Each oFile In oFolder.Files
        If oFile.Name Like "*" & ".doc" & "*" Then
            WordApp.Documents.Open (oFile)
            Set oHF = WordApp.ActiveDocument.Sections(1).Headers(1)
       
            If InStr(oHF.Range.Text, CStr(Worksheets("Data").Range("J1"))) Then
                Worksheets("Data").Cells(i + 1, 1) = oFile.Name
                i = i + 1
            End If
            WordApp.ActiveDocument.Close SaveChanges:=False
        End If
    Next oFile
    
    WordApp.Quit
    Set WordApp = Nothing
    Set oFolder = Nothing
    Set oFSO = Nothing
End Sub

My tinker code using Instr:
VBA Code:
Sub SearchWordDocs()
    Dim objWord As Object ' Word.Application
    Dim strSearch As String
    Dim strFolderPath As String
    Dim intRow As Integer

    ' Get the search string and folder path from the Info worksheet
    strSearch = Worksheets("Data").Range("J1").Value
    strFolderPath = Worksheets("Data").Range("I1").Value

    ' Display the folder path and search string in a message box
    MsgBox "Folder path: " & strFolderPath & vbCrLf & "Search string: " & strSearch

    ' Initialize the Word application
    On Error Resume Next
    Set objWord = GetObject(, "Word.Application")
    If Err.Number <> 0 Then
        Set objWord = CreateObject("Word.Application")
    End If
    On Error GoTo 0
    objWord.Visible = False ' Set to true if you want to see the Word documents being searched

    ' Loop through all files in the specified folder and its subfolders
    RecursiveFolderSearch objWord, strSearch, strFolderPath

    ' Close the Word application
    objWord.Quit
    Set objWord = Nothing
End Sub

Sub RecursiveFolderSearch(objWord As Object, strSearch As String, strFolderPath As String)
    Dim objFSO As Object ' Scripting.FileSystemObject
    Dim objFolder As Object ' Scripting.Folder
    Dim objFile As Object ' Scripting.File
    Dim strFilePath As String
    Dim intRow As Integer
    Dim objDoc As Object ' Word.Document

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFolder = objFSO.GetFolder(strFolderPath)

    ' Loop through all files in the folder
    For Each objFile In objFolder.Files
        If LCase(Right(objFile.Name, 4)) Like ".doc*" Then ' Check if the file is a Word document
            ' Open the Word document
            strFilePath = objFile.path
            Debug.Print "Opening document: " & strFilePath
            On Error Resume Next
            Set objDoc = objWord.Documents.Open(strFilePath, , , , , , , , , , , True)
            If Err.Number <> 0 Then
                Debug.Print "Error opening document: " & strFilePath
                Err.Clear
                Set objDoc = Nothing
            End If
            On Error GoTo 0

            If Not objDoc Is Nothing Then
                ' Search the front page header of the document for the search string
                With objDoc.Sections(1).Headers(wdHeaderFooterFirstPage).Range
                    Debug.Print "Searching header in document: " & strFilePath
                    If InStr(1, .Text, strSearch, vbTextCompare) > 0 Then
                        ' If the search string is found, add the file path to the Data worksheet
                        intRow = Worksheets("Data").Range("A" & Rows.Count).End(xlUp).Row + 1
                        Worksheets("Data").Range("A" & intRow).Value = strFilePath
                        Debug.Print "Document found: " & strFilePath
                    Else
                        Debug.Print "Search string not found in header of document: " & strFilePath
                    End If
                End With

                ' Close the Word document
                objDoc.Close False 
                Set objDoc = Nothing
            End If
        End If
    Next objFile
End Sub
 
Upvote 0
The actual folder could have hundreds of Word Docs, and depending on the string, could need to return several dozen docs
 
Upvote 0
Maybe trial the story range thing..
Code:
With objDoc.StoryRanges(7)
If InStr(.Range.Text, strSearch) Then
I found some subfolder search code if this part works. Dave
 
Upvote 0
Here is your recommended update, also with removing case sensitivity
VBA Code:
Sub LoopThroughFiles2()
    Dim oFSO As Object, oHF As Object
    Dim oFolder As Object, WordApp As Object
    Dim oFile As Object
    Dim i As Integer
   
    'initialize i to 1
    i = 1
   
    'create Word app
    On Error Resume Next
    Set WordApp = GetObject(, "Word.Application")
    If Err.Number <> 0 Then
        Set WordApp = CreateObject("Word.Application")
    End If
    On Error GoTo 0
    WordApp.Visible = False
   
    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFolder = oFSO.GetFolder(Worksheets("Data").Range("I1"))
   
    For Each oFile In oFolder.Files
        If oFile.Name Like "*" & ".doc" & "*" Then
            WordApp.Documents.Open (oFile)
            Set oHF = WordApp.ActiveDocument.Sections(1).Headers(wdHeaderFooterPrimary).Range
            If InStr(1, oHF.Text, CStr(Worksheets("Data").Range("J1")), vbTextCompare) > 0 Then
                Worksheets("Data").Cells(i + 1, 1) = oFile.Name
                i = i + 1
            End If
            WordApp.ActiveDocument.Close SaveChanges:=False
        End If
    Next oFile
   
    WordApp.Quit
    Set WordApp = Nothing
    Set oFolder = Nothing
    Set oFSO = Nothing
End Sub
 
Upvote 0
Looks like it's capturing everything; I'll try a larger file set
 
Upvote 0
I made a test folder and I'll give it some time with the subfolder search tomorrow... hockey playoffs are on in a few minutes. Dave
 
Upvote 0
Cool! Who is your team? Leafs or Jets?

Ok, so this is the last bit of tinkering I have in me for tonight, it seems to work, I'll test more tomorrow:
1) Error handling incase of a corrupted Word doc
2) Non case sensitivity
3) Subfolder search

Dave, I really appreciate you helping me through this one!!!

Here's the final script (Hopefully LOL)

VBA Code:
Sub LoopThroughFiles2()
'SearchFilesInSubfolders
    Dim oFSO As Object, oHF As Object
    Dim oFolder As Object, WordApp As Object
    Dim oFile As Object
    Dim i As Integer
   
    'initialize i to 1
    i = 1
   
    'create Word app
    On Error Resume Next
    Set WordApp = GetObject(, "Word.Application")
    If Err.Number <> 0 Then
        Set WordApp = CreateObject("Word.Application")
    End If
    On Error GoTo 0
    WordApp.Visible = False
   
    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFolder = oFSO.GetFolder(Worksheets("Data").Range("I1"))
   
    'search through all files in the root folder
    For Each oFile In oFolder.Files
        If oFile.Name Like "*" & ".doc" & "*" Then
            On Error Resume Next
            WordApp.Documents.Open oFile.path
            If Err.Number = 0 Then
                Set oHF = WordApp.ActiveDocument.Sections(1).Headers(wdHeaderFooterPrimary).Range
                If InStr(1, oHF.Text, CStr(Worksheets("Data").Range("J1")), vbTextCompare) > 0 Then
                    Worksheets("Data").Cells(i + 1, 1) = oFile.Name
                    i = i + 1
                End If
                WordApp.ActiveDocument.Close SaveChanges:=False
            End If
            On Error GoTo 0
        End If
    Next oFile
    
    'recursively search through all subfolders
    For Each oFolder In oFSO.GetFolder(Worksheets("Data").Range("I1")).SubFolders
        LoopThroughSubFolders oFolder.path, i, WordApp
    Next oFolder
   
    WordApp.Quit
    Set WordApp = Nothing
    Set oFolder = Nothing
    Set oFSO = Nothing
End Sub

Sub LoopThroughSubFolders(sFolder As String, i As Integer, WordApp As Object)
    Dim oFSO As Object, oHF As Object
    Dim oSubFolder As Object, oFile As Object
    
    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oSubFolder = oFSO.GetFolder(sFolder)
   
    'search through all files in the current subfolder
    For Each oFile In oSubFolder.Files
        If oFile.Name Like "*" & ".doc" & "*" Then
            On Error Resume Next
            WordApp.Documents.Open oFile.path
            If Err.Number = 0 Then
                Set oHF = WordApp.ActiveDocument.Sections(1).Headers(wdHeaderFooterPrimary).Range
                If InStr(1, oHF.Text, CStr(Worksheets("Data").Range("J1")), vbTextCompare) > 0 Then
                    Worksheets("Data").Cells(i + 1, 1) = oFile.Name
                    i = i + 1
                End If
                WordApp.ActiveDocument.Close SaveChanges:=False
            End If
            On Error GoTo 0
        End If
    Next oFile
    
    'recursively call the function to search through all subfolders of the current subfolder
    For Each oSubFolder In oSubFolder.SubFolders
        LoopThroughSubFolders oSubFolder.path, i, WordApp
    Next oSubFolder
   
End Sub
 
Upvote 0
Intermission.... Leaf's are up 3-0 but... there's 2 periods to go. There's always time for another epic collapse. I'm a Leaf's fan, but I'm from Manitoba, so the Jet's are pretty much my favourite. It's going to be a great night! I'll test out your code tomorrow. Dave
 
Upvote 0
A semi successful night of hockey with a Leafs win and a Jets loss but... the Jets are coming home with a series split and the Leafs are leaving town with the split. Anyways, I trialed the storyrange thing but it seemed mired with catastrophic error (XL opened up a backup copy and continued?). I made a few changes to your code and it seems to work pretty slick. I added some corrupt file management and it doesn't seem to be case sensitive for some reason? HTH. Dave
Code:
Option Explicit
Dim WordApp As Object
Sub LoopThroughFiles2()
'Search sht Data "I1" Folder(s)for Word doc headers for string in sht Data "J1"
Dim oFSO As Object, oHF As Object, oFile As Object
Dim oFolder As Object, TFolder As Object, LastRow As Integer

'create Word app
On Error Resume Next
Set WordApp = GetObject(, "Word.Application")
If Err.Number <> 0 Then
    Set WordApp = CreateObject("Word.Application")
End If
WordApp.Visible = False

Set oFSO = CreateObject("Scripting.FileSystemObject")
Set oFolder = oFSO.GetFolder(Worksheets("Data").Range("I1"))

'search through all files in the root folder
For Each oFile In oFolder.Files
    If oFile.Name Like "*" & ".doc" & "*" Then
        On Error Resume Next
        WordApp.Documents.Open oFile.Path
        If Err.Number = 0 Then
            Set oHF = WordApp.ActiveDocument.Sections(1).Headers(1).Range
            If InStr(1, oHF.Text, CStr(Worksheets("Data").Range("J1")), vbTextCompare) > 0 Then
                LastRow = Sheets("Data").Range("A" & Sheets("Data").Rows.Count).End(xlUp).Row
                Worksheets("Data").Cells(LastRow + 1, 1) = oFile.Name
            End If
            WordApp.ActiveDocument.Close SaveChanges:=False
        Else
        On Error GoTo 0
        MsgBox "Corrupt File " & oFile.Name
        End If
    End If
Next oFile

'recursively search through all subfolders
For Each TFolder In oFSO.GetFolder(oFolder).SubFolders
    LoopThroughSubFolders TFolder.Path
Next TFolder

If Err.Number <> 0 Then
On Error GoTo 0
MsgBox "Error 1"
End If
WordApp.Quit
Set WordApp = Nothing
Set TFolder = Nothing
Set oHF = Nothing
Set oFile = Nothing
Set oFolder = Nothing
Set oFSO = Nothing
End Sub

Sub LoopThroughSubFolders(sFolder As String)
Dim oFSO As Object, oHF As Object, LastRow As Integer
Dim oSubFolder As Object, oFile As Object, TFolder As Object

Set oFSO = CreateObject("Scripting.FileSystemObject")
Set oSubFolder = oFSO.GetFolder(sFolder)

'search through all files in the current subfolder
For Each oFile In oSubFolder.Files
If oFile.Name Like "*" & ".doc" & "*" Then
    On Error Resume Next
    WordApp.Documents.Open oFile.Path
    If Err.Number = 0 Then
        Set oHF = WordApp.ActiveDocument.Sections(1).Headers(1).Range
        If InStr(1, oHF.Text, CStr(Worksheets("Data").Range("J1")), vbTextCompare) > 0 Then
            LastRow = Sheets("Data").Range("A" & Sheets("Data").Rows.Count).End(xlUp).Row
            Worksheets("Data").Cells(LastRow + 1, 1) = oFile.Name
        End If
        WordApp.ActiveDocument.Close SaveChanges:=False
    Else
    On Error GoTo 0
    MsgBox "Corrupt File " & oFile.Name
    End If
End If
Next oFile

'recursively call the function to search through all subfolders of the current subfolder
For Each TFolder In oSubFolder.SubFolders
    LoopThroughSubFolders TFolder.Path
Next TFolder

If Err.Number <> 0 Then
On Error GoTo 0
MsgBox "Error 2"
End If
Set oFSO = Nothing
Set oSubFolder = Nothing
Set TFolder = Nothing
Set oHF = Nothing
Set oFile = Nothing
End Sub
 
Upvote 0

Forum statistics

Threads
1,225,073
Messages
6,182,700
Members
453,132
Latest member
nsnodgrass73

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top