Search closed word document and excel file without opening and check for match

bradyboyy88

Well-known Member
Joined
Feb 25, 2015
Messages
562
Hi,

I am wondering if there is a way to use the find keyword function in a word document and excel file without actually opening it using vba?
 

Excel Facts

When did Power Query debut in Excel?
Although it was an add-in in Excel 2010 & Excel 2013, Power Query became a part of Excel in 2016, in Data, Get & Transform Data.
I will have to assume you mean "Opening in Word" or "Opening with Excel" because the act of loading file contents in to memory is called "Opening". File contents needs to be in memory to perform string evaluation. You can attempt to load the raw contents of a word file or excel file in to memory and do your analysis but it won't be easy. Crack open the file you are talking about with a good text editor or "Hex editor" if you want to see what I mean. Using the Word API from VBA allows you to use the "find" function. That is what the "find" function is. Without the Word or Excel API there is no "Find" function. You can use the "Find" function of Excel on the text from a Word document but you need to get that text string out of the file somehow.

Notice that the most basic way to load file contents in to memory is a command called "Open"

https://stackoverflow.com/questions/11528694/read-parse-text-file-line-by-line-in-vba
 
Upvote 0
I will have to assume you mean "Opening in Word" or "Opening with Excel" because the act of loading file contents in to memory is called "Opening". File contents needs to be in memory to perform string evaluation. You can attempt to load the raw contents of a word file or excel file in to memory and do your analysis but it won't be easy. Crack open the file you are talking about with a good text editor or "Hex editor" if you want to see what I mean. Using the Word API from VBA allows you to use the "find" function. That is what the "find" function is. Without the Word or Excel API there is no "Find" function. You can use the "Find" function of Excel on the text from a Word document but you need to get that text string out of the file somehow.

Notice that the most basic way to load file contents in to memory is a command called "Open"

https://stackoverflow.com/questions/11528694/read-parse-text-file-line-by-line-in-vba

So how would you approach this if you did not want to load the word or excel api? I am just worried about resource usage of this program. I need to go through every excel file and word document in a subdirectory to find if it contains a certain keyword and if it does then I am basically saving the hyperlink to another excel file. . There are hundreds of these files so I just wanted to approach this in the quickest and resource friendly solution possible so I thought a function similar to excel4macro that lets you read from a closed workbook existed in a search functionality. I am open to suggestions.
 
Last edited:
Upvote 0
I have made a program that does something just like you say. It opens each workbook one by one, performs it's function, and then closes the workbook before moving on to the next one. Only one workbook is open at a time. My program takes a few minutes to run and I put a lot of work in to optimizing it. The slowest part is I\O from disk and there is no way to optimize that. It's limited by the read\write speed of your hardware and the functions of your software environment. In my case, I am accessing workbooks on a share and so I have network lag added to my run time. I use "FileSystemObject" which makes calls to the OS for finding the file and I use the built-in "Open" feature of excel because I wouldn't want to re-write either of those functions from scratch.

If you want the absolute fastest then you need stop using VBA for excel and code your own Office file handlers in assembly language. You would be able to make something faster than the built-in Microsoft functions this way but it does require learning assembly language.

It's the words that you are using that make this discussion difficult. excel4macro reads from a closed workbook by opening it. The act of reading a closed workbook is called opening. That's the problem I have with this line of questioning.
 
Upvote 0
So any ideas what could be causing a 5792 run time error with application defined or object defined error? Only word documents in the folders so only the top if statement is being hit. If i remove the documents.open line then everything works but I need to open them so what might be leading to this error? It opens one document then hits the error basically. Also, for a recursive approach like this is it less optimized to createobject at this level or the nest ?

Code:
Sub Search()


    Dim FileSystem As Object
    Dim HostFolder As String


    HostFolder = Worksheets("Search Interface").Range("D2").Value


    Set FileSystem = CreateObject("Scripting.FileSystemObject")
    RecursiveFolderSearch FileSystem.GetFolder(HostFolder)


End Sub


Sub RecursiveFolderSearch(Folder)
    
    Dim SubFolder
    Dim File
    
    Dim obj
    Dim objDoc


    For Each SubFolder In Folder.SubFolders
        RecursiveFolderSearch SubFolder
    Next
    
    For Each File In Folder.Files


        'Check if word document
        If Right(File.Name, 4) = "docx" Or Right(File.Name, 4) = ".doc" Or Right(File.Name, 4) = "docm" Or Right(File.Name, 4) = "dotm" Then
        
            Set obj = CreateObject("Word.Application")
            obj.Visible = False
            Set objDoc = obj.Documents.Open(File.Path)
            
            
            'code to search for keyword
            
            
            obj.Quit
            Set obj = Nothing
            Set objDoc = Nothing
        End If
        
        'Check if excel document
        If Right(File.Name, 4) = "xlsm" Or Right(File.Name, 4) = "xlsx" Or Right(File.Name, 4) = ".xls" Or Right(File.Name, 4) = ".xlt" Then
            
            Set obj = CreateObject("Excel.Application")
            obj.Visible = False
            Set objDoc = obj.Workbooks.Open(File.Path)
            
            
            'code to search for keyword
            
            
            obj.Quit
            Set obj = Nothing
            Set objDoc = Nothing


        End If
    Next
    
End Sub
 
Last edited:
Upvote 0
Ok,
There is a lot going on here. First thing I notice is "obj.Quit" will close word after each doc which is undesirable because then you have to wait for Word to open before you can open the next doc. If we find a Word file we will leave word open until we are done. It's also seen in the excel branch so if it opens an excel doc it will close excel when done. If we are running this macro from excel then the rug is pulled out from under us.

In the Excel section we don't need to create an application object because we can use the default "Application" object. (Unless this macro is running from Word)

Here is something for you to try:

Code:
Sub RecursiveFolderSearch(Folder)
    
    Dim SubFolder As Object
    Dim File As Object

    Dim WordApp As Object
    Dim WordDoc As Object

    For Each SubFolder In Folder.SubFolders
        RecursiveFolderSearch SubFolder
    Next
    
    For Each File In Folder.Files
        Select Case Right(File.name, 4)
            Case "docx" Or ".doc" Or "docm" Or "dotm" 'Check if word document
                'If Word is open we will use it, if not we will open it
                On Error Resume Next
                Set WordApp = GetObject(, "Word.Application")
                On Error GoTo 0
                If WordApp Is Nothing Then Set WordApp = CreateObject("Word.Application")
                'WordApp.Visible = True
                Set WordDoc = WordApp.Documents.Open(File.Path)
            
            'code to search for keyword
            
                WordDoc.Close
                Set WordDoc = Nothing
            Case "xlsm" Or "xlsx" Or ".xls" Or ".xlt" 'Check if excel document
                Application.Workbooks.Open (File.Path)
                
            'code to search for keyword
                           
                Application.Workbooks(File.Path).Close savechanges:=False
        End Select
    Next
End Sub
 
Upvote 0
I keep getting a type 13 mismatch error at
Rich (BB code):
 Case "docx" Or ".doc" Or "docm" Or "dotm" 'Check if word document 
for some reason?

I like the changes you made though. Makes sense not to keep recreating the word api if I do not have to.
 
Last edited:
Upvote 0
Hello bradyboyy88,

The syntax for the Select Case is wrong. It should...

Code:
        Select Case LCase(Right(File.name, 4))        ' Make everything lowercase.
            Case Is ="docx",".doc","docm","dotm"      'Check if word document
                'If Word is open we will use it, if not we will open it
                On Error Resume Next
                    Set WordApp = GetObject(, "Word.Application")
                On Error GoTo 0
                If WordApp Is Nothing Then Set WordApp = CreateObject("Word.Application")
                'WordApp.Visible = True
                Set WordDoc = WordApp.Documents.Open(File.Path)
            
            'code to search for keyword
            
                WordDoc.Close
                Set WordDoc = Nothing
            Case Is="xlsm","xlsx",".xls",".xlt"       'Check if excel document
                Application.Workbooks.Open (File.Path)
                
            'code to search for keyword
                           
                Application.Workbooks(File.Path).Close savechanges:=False
        End Select
 
Last edited:
Upvote 0
The case syntax is much cleaner so thanks for that edit. That fixed the error but now it draws another. So stepping into code it runs through the first loop and opens the first word doc then closes it. Then it goes to open the next one in the loop and hits a 5792 run time error application defined or object defined error at Set WordDoc = WordApp.Documents.Open(File.Path) . This is the same error I ran into before under the original code. Not sure what it could be.
 
Upvote 0

Forum statistics

Threads
1,218,086
Messages
6,140,370
Members
450,285
Latest member
CoryCrandall

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top