Can VBA be used to search a folder of .docx's for text strings and copypaste to Excel?

Sadsmileyface

New Member
Joined
Mar 21, 2013
Messages
11
I've got kind of a complicated problem and I'm not so sure if there is an answer to it.

I have to do a manual process of going through lots of .docx files, finding text strings and recording information in an Excel spreadsheet. It's done daily. The Word files contain (amongst a lot of other things) this information (explanatory parenthesis is mine for the example, no brackets in the actual files):

Inspection Type: (this is a category)
Inspection Classification: (another category)
Inspected: (this is a location name)
Inspector: (this is a username)
Inspection Start: (this is a date / time stamp)
Inspection End: (this is a date / time stamp)

So I would have an Excel sheet with 6 columns, and upon each successive row, the contents I've taken from each document. This takes up a lot of time over a month.

Could a macro do this task- search though a folder of Word files and grab whatever comes after ": " to paste to Excel?
 

Excel Facts

Waterfall charts in Excel?
Office 365 customers have access to Waterfall charts since late 2016. They were added to Excel 2019.
Just in case that doesn't work for you.
Code:
Sub ImportPhrases()
'---------------------------------------------------------------------------------------
' Procedure : ImportPhrases
' Author    : David
' Date      : 12/28/2016
' Purpose   : Imports phrase data from file folder of Word files.
'---------------------------------------------------------------------------------------
'
Dim WS As Worksheet
Dim NextRow As Long
Dim Phrase As Variant
Dim WordWasNotRunning As Boolean
Dim oWord As Word.Application
Dim oDoc As Word.Document
Dim wRng As Word.Range
Dim MyPath As String
Dim FN As Variant

Set WS = ActiveSheet
MyPath = "C:\Users\David\Documents\My Documents\VBA\ExcelForum\"
'Data imports to this row
NextRow = 2
'Or if you want to import to next row
With WS
    NextRow = .Cells(.Rows.Count, 2).End(xlUp).Row + 1
End With
'Get existing instance of Word if it's open; otherwise create a new one

On Error Resume Next

Set oWord = GetObject(, "Word.Application")
If Err Then
    Set oWord = New Word.Application
    WordWasNotRunning = True
End If

On Error GoTo Err_Handler

'Inspection Type: (this is a category)
'Inspection Classification: (another category)
'Inspected: (this is a location name)
'Inspector: (this is a username)
'Inspection Start: (this is a date / time stamp)
'Inspection End: (this is a date / time stamp)

'All phrases seperated by comma.
Phrase = Split("Inspection Type:,Inspection Classification:,Inspected:,Inspector:,Inspection Start:,Inspection End:", ",")

'MyPath = "C:\"

FN = Dir(MyPath & "*.doc?")

Do
    Set oDoc = oWord.Documents.Open(MyPath & FN, , True)
    
    For A = 0 To UBound(Phrase)
        'Use Word's find method for locate phrase.
        Set wRng = oDoc.Range
        With wRng.Find
            .Text = Phrase(A)
            .Forward = True
            .Wrap = wdFindStop
            .MatchCase = False
            .MatchFuzzy = False
            .MatchWholeWord = True
            .MatchWildcards = False
        End With
        
        wRng.Find.Execute
        If wRng.Find.Found Then
            'We found a phrase.
            wRng.Collapse wdCollapseEnd
            wRng.MoveStart wdCharacter, 1
            wRng.MoveEndUntil vbCr
            WS.Range("B" & NextRow) = Application.Trim(wRng)
            NextRow = NextRow + 1
        End If
    Next
    oDoc.Close False
    FN = Dir
Loop Until FN = ""

'Make sure you release object references.
Set oWord = Nothing
Set oDoc = Nothing
Set myDialog = Nothing

'quit
Exit Sub

Err_Handler:
    MsgBox "Word caused a problem. " & Err.Description, vbCritical, "Error: " _
            & Err.Number
    If WordWasNotRunning Then
        oDoc.Close False
        oWord.Quit
    End If
End Sub
 
Upvote 0
Or, based on the problem description, you could use:
Code:
Sub GetWordData()
'Note: this code requires a reference to the Word object model
Application.ScreenUpdating = False
Dim strFolder As String, strFile As String
strFolder = GetFolder
If strFolder = "" Then Exit Sub
Dim wdApp As New Word.Application, wdDoc As Word.Document
Dim WkSht As Worksheet, r As Long, c As Long, StrTmp As String
Set WkSht = ActiveSheet
r = WkSht.Cells(WkSht.Rows.Count, 1).End(xlUp).Row
strFile = Dir(strFolder & "\*.docx", vbNormal)
While strFile <> ""
  Set wdDoc = wdApp.Documents.Open(Filename:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
  With wdDoc
    With .Range
      With .Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "Inspection Type:*Inspection End:*^13"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindstop
        .Format = False
        .MatchWildcards = True
        .Execute
      End With
      Do While .Find.Found
        r = r + 1
        StrTmp = Replace(.Text, vbTab, " ")
        Do While InStr(StrTmp, "  ")
          StrTmp = Replace(StrTmp, "  ", " ")
        Loop
        Do While InStr(StrTmp, vbCr & " ")
          StrTmp = Replace(StrTmp, vbCr & " ", vbCr)
        Loop
        Do While InStr(StrTmp, vbCr & vbCr)
          StrTmp = Replace(StrTmp, vbCr & vbCr, vbCr)
        Loop
        For c = 1 To UBound(Split(.Text, ":"))
          WkSht.Cells(r, c) = Trim(Split(Split(.Text, ":")(c), vbCr)(0))
        Next
        .Collapse wdCollapseEnd
        .Find.Execute
      Loop
    End With
    .Close SaveChanges:=False
  End With
  strFile = Dir()
Wend
wdApp.Quit
Set wdDoc = Nothing: Set wdApp = Nothing: Set WkSht = Nothing
Application.ScreenUpdating = True
End Sub

Function GetFolder() As String
Dim oFolder As Object
GetFolder = ""
Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
Set oFolder = Nothing
End Function
Even that probably has some redundant code. For example, it processes multiple inspection 'sets' in the same document and has code to eliminate tabs, empty paragraphs, etc. from the matched data - none of which may be present in the source files.
 
Upvote 0

Forum statistics

Threads
1,223,803
Messages
6,174,687
Members
452,577
Latest member
Filipzgela

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top