VBA script: Extracting specific text from a word document and putting them in an excel table

Omar23j

New Member
Joined
Jan 26, 2018
Messages
8
Hi everyone,

I am a beginner when it comes to VBA programming. I only know basic functions and know how to record a macro.
I need your help to facilitate me a task.

I am trying to extract specific information from many word documents and put them in a common excel table.
I attached a sample screenshot of what each word document looks like and removed sensitive data. Please pay attention to the text in red, as this is the text I want to extract through a VBA script:

String #1 - Text at the right of ''Risk Name :''
--> string of variable length

String #2 - The number next to COST EFFECT(k$) ( in this case 200)
Extracting any of the two strings shown will work.

Now I want to extract these 2 strings from about 40 word docx and put them in an excel table where column A will contain 40 rows of each word documents with their corresponding string #1 . And column B will have 40 rows of each word document with corresponding string #2 .

Assume all word docx have the same format and only the strings value changes.
Any help will be greatly appreciated.
Thanks!

word document to extract strings from : https://omar23j.imgur.com/all/?third_party=1

all
 
Last edited:

Excel Facts

Get help while writing formula
Click the italics "fx" icon to the left of the formula bar to open the Functions Arguments dialog. Help is displayed for each argument.
Your links are bad, unfortunately. :sad:

I'll give you some general information, but you'll still have to hit google for details. First, for this application, you would want to manipulate Word from Excel, not the other way around. You'll need to define variables for the Word application object and the Word document object. You'll need some kind of loop to go through the Word file names, opening each one and extracting the information you want.

There is a Range object in Word just like there is in Excel, but there is a big difference. In Excel, a range can be any random area of the spreadsheet, and indeed, can even contain several random areas of the spreadsheet. In Word, however, a range can only be a single consecutive string of text, from this character to that character.

To extract substrings from text, use the InStr function. For example, given the text, "The quick brown fox jumped over the lazy dog," and the assignment to extract "fox" from it, you could do this.
Code:
Dim text as String
Dim found as String
Dim pos as Integer

text = "The quick brown fox jumped over the lazy dog"

pos = InStr(text, "fox")

If pos = 0 Then
    'not found
Else
    found = Mid(text, pos, Len(pos))
End If

I know this is not a solution, but I hope it sets you on the path to one.
 
Upvote 0
You could use an Excel macro like:
Code:
Sub GetTableData()
'Note: this code requires a reference to the Word object model.
'See under the VBE's Tools|References.
Application.ScreenUpdating = False
Dim wdApp As New Word.Application, wdDoc As Word.Document
Dim strFolder As String, strFile As String
Dim WkSht As Worksheet, r As Long
strFolder = GetFolder
If strFolder = "" Then Exit Sub
Set WkSht = ActiveSheet
r = WkSht.Cells(WkSht.Rows.Count, 1).End(xlUp).Row
'Disable any auto macros in the documents being processed
wdApp.WordBasic.DisableAutoMacros
strFile = Dir(strFolder & "\*.doc", vbNormal)
While strFile <> ""
  r = r + 1
  Set wdDoc = wdApp.Documents.Open(Filename:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
  With wdDoc
    With .Tables(1)
      WkSht.Cells(r, 1) = Trim(Split(Split(.Cell(1, 1).Range.Text, "Risk Name:")(1), vbCr)(0))
      WkSht.Cells(r, 2) = Trim(Split(.Cell(5, 2).Range.Text, vbCr)(0))
    End With
    .Close SaveChanges:=False
  End With
  strFile = Dir()
Wend
wdApp.Quit
Set wdDoc = Nothing: Set wdApp = Nothing: Set WkSht = Nothing
Application.ScreenUpdating = True
End Sub
 
Function GetFolder() As String
    Dim oFolder As Object
    GetFolder = ""
    Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
    If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
    Set oFolder = Nothing
End Function
Without access to an actual document, I've had to do a bit of guesswork as to your cell addresses & content, but the code should be pretty close to what you want.
 
Upvote 0

Forum statistics

Threads
1,223,702
Messages
6,173,961
Members
452,539
Latest member
delvey

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top