Excel file search engine

cknnugget

Board Regular
Joined
Jun 29, 2020
Messages
50
Office Version
  1. 365
Platform
  1. Windows
I'm fairly new to vba and mainly learn from Mr.Excel forums and youtube. No information could be found to make a pseudo search engine in Excel vba to return results from a folder. I have a folder with several sub folders which contain policies in PDF and word documents. In excel, I would like to make basically a search engine where someone can type a keyword and it would look inside of the documents to find the keyword then return hyperlinked results with a short paraphrase for the user to select. Is this feasible? If so does anyone know where I can find information to build it? Or am I asking too much from VBA?
 

Excel Facts

What do {} around a formula in the formula bar mean?
{Formula} means the formula was entered using Ctrl+Shift+Enter signifying an old-style array formula.
try this, run sub "EnterKeyWords" then you can enter multiple keywords with ";" as determine then select folder that you need to search:
VBA Code:
Sub EnterKeyWords()
    Application.ScreenUpdating = False
    Application.DisplayAlerts = False
    Dim Keyword As String
    Dim matched As Long
    Keyword = Application.InputBox(prompt:="Multiple keywords determine with ';'", Title:="What do you find?", Type:=2)
    Set folder = Application.FileDialog(msoFileDialogFolderPicker)
    folder.Title = "Choose parent folder to search"
    If folder.Show <> -1 Then Exit Sub
    xDir = folder.SelectedItems(1)
    Call SearchInFolder(xDir, Keyword, matched)
    MsgBox matched & " results was found", vbInformation
    Application.ScreenUpdating = True
    Application.DisplayAlerts = True
End Sub

Private Sub SearchInFolder(ByVal xFolderName As String, ByVal Keyword As String, ByRef matched As Long)
    Application.ScreenUpdating = False
    Application.DisplayAlerts = False
    Dim xFSO As Object, xFolder As Object, xSubFolder As Object, xFile As Object
    Dim i As Integer
    Dim xName As String, splKeyword() As String
    Dim cIndex As Range
    On Error Resume Next
    splKeyword = Split(Keyword)
    Set xFSO = CreateObject("Scripting.FileSystemObject")
    Set xFolder = xFSO.GetFolder(xFolderName)
    For Each xFile In xFolder.Files
        xName = xFSO.GetAbsolutePathName(xFile)
        For i = LBound(splKeyword) To UBound(splKeyword)
            If InStr(LCase(xName), LCase(splKeyword(i))) Then
                Set cIndex = Cells(Rows.Count, 1).End(xlUp).Offset(1)
                ActiveSheet.Hyperlinks.Add anchor:=cIndex, Address:=xName, TextToDisplay:=splKeyword(i)
                matched = matched + 1
            End If
        Next i
    Next xFile
    For Each xSubFolder In xFolder.SubFolders
        SearchInFolder xSubFolder.Path, Keyword, matched
    Next xSubFolder
    Set xFile = Nothing
    Set xFolder = Nothing
    Set xFSO = Nothing
    Application.ScreenUpdating = True
    Application.DisplayAlerts = True
End Sub
 
Upvote 0
try this, run sub "EnterKeyWords" then you can enter multiple keywords with ";" as determine then select folder that you need to search:
VBA Code:
Sub EnterKeyWords()
    Application.ScreenUpdating = False
    Application.DisplayAlerts = False
    Dim Keyword As String
    Dim matched As Long
    Keyword = Application.InputBox(prompt:="Multiple keywords determine with ';'", Title:="What do you find?", Type:=2)
    Set folder = Application.FileDialog(msoFileDialogFolderPicker)
    folder.Title = "Choose parent folder to search"
    If folder.Show <> -1 Then Exit Sub
    xDir = folder.SelectedItems(1)
    Call SearchInFolder(xDir, Keyword, matched)
    MsgBox matched & " results was found", vbInformation
    Application.ScreenUpdating = True
    Application.DisplayAlerts = True
End Sub

Private Sub SearchInFolder(ByVal xFolderName As String, ByVal Keyword As String, ByRef matched As Long)
    Application.ScreenUpdating = False
    Application.DisplayAlerts = False
    Dim xFSO As Object, xFolder As Object, xSubFolder As Object, xFile As Object
    Dim i As Integer
    Dim xName As String, splKeyword() As String
    Dim cIndex As Range
    On Error Resume Next
    splKeyword = Split(Keyword)
    Set xFSO = CreateObject("Scripting.FileSystemObject")
    Set xFolder = xFSO.GetFolder(xFolderName)
    For Each xFile In xFolder.Files
        xName = xFSO.GetAbsolutePathName(xFile)
        For i = LBound(splKeyword) To UBound(splKeyword)
            If InStr(LCase(xName), LCase(splKeyword(i))) Then
                Set cIndex = Cells(Rows.Count, 1).End(xlUp).Offset(1)
                ActiveSheet.Hyperlinks.Add anchor:=cIndex, Address:=xName, TextToDisplay:=splKeyword(i)
                matched = matched + 1
            End If
        Next i
    Next xFile
    For Each xSubFolder In xFolder.SubFolders
        SearchInFolder xSubFolder.Path, Keyword, matched
    Next xSubFolder
    Set xFile = Nothing
    Set xFolder = Nothing
    Set xFSO = Nothing
    Application.ScreenUpdating = True
    Application.DisplayAlerts = True
End Sub
oh sorry, i have misunderstanding that you need to search for file name
 
Upvote 0
One alternative would be to identify an Object Model you can use for each type of file whose contents you want to search through. For Word documents, there's the Microsoft Word Object Model which you can interact with from any Office app such as Excel. If you have Acrobat, it looks like there might be something similar, although I haven't used it myself. But if you're talking about opening & closing dozens of different files in different apps in succession, and searching their contents... then whenever your user enters a search, he/she is in for a lonnng wait - and meanwhile your user's computer will likely grind to a halt.

Up at the OS level, Windows spends a lot of its "down time" indexing files in the background, so that when you search for something from the Start menu, it doesn't take an eternity even to just iterate over file metadata. (Stop hogging my **** CPU, Microsoft Windows Search Indexer!) You might be able to find an API to interact with Windows Search from VBA. But I have to imagine that's not gonna offer the level of detail you're looking for (i.e. a specific content location within a specific file).

Between "prohibitively slow" and "not detailed enough": in the big picture I think you're ultimately looking to solve the wrong problem here. You mentioned these are "policy" documents. IMHO what you really need is some sort of structure for the data in these documents. If this is for a company: can you use Microsoft SharePoint (or a similar product) to manage them? "SharePoint is primarily sold as a document management and storage system, although it is also used for sharing information through an intranet, implementing internal applications, and for implementing business processes". Features include "tooling and compliance for document and record management, word automation services, and content management" and "search results, search customization, mobile search, 'Did you mean?', OS search integration, Faceted Search, and metadata/relevancy/date/location-based refinement options". An enterprise solution tailored to the underlying business need will inevitably be a lot more sustainable & scalable, vs. any ad-hoc solution (no matter how brilliant) you can code up in Excel VBA.
 
Upvote 0
Solution
Sounds like there are better ways to solve the problem. Thank you for your input.
 
Upvote 0

Forum statistics

Threads
1,223,162
Messages
6,170,432
Members
452,326
Latest member
johnshaji

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top