# VBA script: Extracting specific text from a word document and putting them in an excel table



## Omar23j (Feb 16, 2018)

Hi everyone,

I am a beginner when it comes to VBA programming. I only know basic functions and know how to record a macro.
I need your help to facilitate me a task.

I am trying to extract specific information from many word documents and put them in a common excel table.
I attached a sample screenshot of what each word document  looks like and removed sensitive data. Please pay attention to the text in red, as this is the text I want to extract through a VBA script:

String #1 - Text at the right of ''Risk Name :'' 
--> string of variable length

String #2 - The number next to COST EFFECT(k$) ( in this case 200)
Extracting any of the two strings shown will work.

Now I want to extract these 2 strings from about 40 word docx and put them in an excel table where column A will contain 40 rows of each word documents with their corresponding string #1 . And column B will have 40 rows of each word document with corresponding string #2 .

Assume all word docx have the same format and only the strings value changes.
Any help will be greatly appreciated.
Thanks!

word document to extract strings from : https://omar23j.imgur.com/all/?third_party=1


----------



## Omar23j (Feb 16, 2018)

Link https://imgur.com/a/MS6wT


----------



## RobertSF (Feb 16, 2018)

Your links are bad, unfortunately.  

I'll give you some general information, but you'll still have to hit google for details. First, for this application, you would want to manipulate Word from Excel, not the other way around. You'll need to define variables for the Word application object and the Word document object. You'll need some kind of loop to go through the Word file names, opening each one and extracting the information you want.

There is a Range object in Word just like there is in Excel, but there is a big difference. In Excel, a range can be any random area of the spreadsheet, and indeed, can even contain several random areas of the spreadsheet. In Word, however, a range can only be a single consecutive string of text, from this character to that character.

To extract substrings from text, use the InStr function. For example, given the text, "The quick brown fox jumped over the lazy dog," and the assignment to extract "fox" from it, you could do this.

```
Dim text as String
Dim found as String
Dim pos as Integer

text = "The quick brown fox jumped over the lazy dog"

pos = InStr(text, "fox")

If pos = 0 Then
    'not found
Else
    found = Mid(text, pos, Len(pos))
End If
```

I know this is not a solution, but I hope it sets you on the path to one.


----------



## Macropod (Feb 16, 2018)

You could use an Excel macro like:

```
Sub GetTableData()
'Note: this code requires a reference to the Word object model.
'See under the VBE's Tools|References.
Application.ScreenUpdating = False
Dim wdApp As New Word.Application, wdDoc As Word.Document
Dim strFolder As String, strFile As String
Dim WkSht As Worksheet, r As Long
strFolder = GetFolder
If strFolder = "" Then Exit Sub
Set WkSht = ActiveSheet
r = WkSht.Cells(WkSht.Rows.Count, 1).End(xlUp).Row
'Disable any auto macros in the documents being processed
wdApp.WordBasic.DisableAutoMacros
strFile = Dir(strFolder & "\*.doc", vbNormal)
While strFile <> ""
  r = r + 1
  Set wdDoc = wdApp.Documents.Open(Filename:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
  With wdDoc
    With .Tables(1)
      WkSht.Cells(r, 1) = Trim(Split(Split(.Cell(1, 1).Range.Text, "Risk Name:")(1), vbCr)(0))
      WkSht.Cells(r, 2) = Trim(Split(.Cell(5, 2).Range.Text, vbCr)(0))
    End With
    .Close SaveChanges:=False
  End With
  strFile = Dir()
Wend
wdApp.Quit
Set wdDoc = Nothing: Set wdApp = Nothing: Set WkSht = Nothing
Application.ScreenUpdating = True
End Sub
 
Function GetFolder() As String
    Dim oFolder As Object
    GetFolder = ""
    Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
    If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
    Set oFolder = Nothing
End Function
```
Without access to an actual document, I've had to do a bit of guesswork as to your cell addresses & content, but the code should be pretty close to what you want.


----------

