# Word 2003 - VBA to find specific terms



## Spurious (May 15, 2012)

Hi all,

I am looking to code a macro that goes through a text and finds every word that starts with a capital letter.
Those words represent defined terms.

The problem is that sometimes two or more consecutive words are capitalized and together form the defined term.

Maturity Date would be an example.

So, I need a macro that goes through a text and finds all of those terms. Hiccups include the starts of sentences, punctuation in between, and the varying number of words that make up a defined term.


I am not sure whether I was clear about what I want to code. Please ask if something is unclear.

Thanks!


----------



## Macropod (May 15, 2012)

So how is the macro supposed to be able to tell whether the first word of a sentence, etc. is or is not part of a 'defined term'? And what about names & honorifics - are they 'defined terms'?

Ultimately, to get reliable results, you need to have a means of differentiating your defined terms from other text. This could be by having a list of such terms in another file, or by identifying features such as double quotes, bold text or particular Style names that can be used to identify them.

And of course, once you've got that sorted, what do you want done with them? For some code ideas see: http://social.technet.microsoft.com/Forums/en-US/word/thread/228d49ed-53a4-487f-9829-316f76abbe13


----------



## Spurious (May 15, 2012)

Yeah, I shouldn't have mentioned the problems part, because at the moment I am only looking for a way to get every word that starts with a capital letter.

In the second step, I compare them to an index I've created with all the defined terms (basically, I have already done what the linked thread tries to do).
I am now looking for a way to find defined terms in the text that are not yet indexed.
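
The second step described above could look roughly like this. It is a minimal sketch only: the thread does not show how the index is stored, so the Scripting.Dictionary, the sample terms and the names BuildIndex, IsIndexed and DemoIndexCheck are all assumptions made for illustration.

```
' Sketch only: the index contents and all names here are hypothetical.
Function BuildIndex() As Object
  Dim Dict As Object
  Set Dict = CreateObject("Scripting.Dictionary")
  Dict.CompareMode = vbTextCompare ' compare terms without regard to case
  Dict.Add "Maturity Date", True   ' sample entries, not from a real index
  Dict.Add "Issuer", True
  Set BuildIndex = Dict
End Function

' True if the candidate term (ignoring surrounding spaces) is in the index.
Function IsIndexed(ByVal Term As String, ByVal Dict As Object) As Boolean
  IsIndexed = Dict.Exists(Trim(Term))
End Function

Sub DemoIndexCheck()
  Dim Dict As Object
  Set Dict = BuildIndex()
  Debug.Print IsIndexed("Maturity Date ", Dict)    ' term is in the sample index
  Debug.Print IsIndexed("Redemption Amount", Dict) ' term is not in the sample index
End Sub
```

Any candidate for which IsIndexed returns False would then be a term that still needs to be reviewed.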


----------



## Macropod (May 15, 2012)

Hi Spurious,

Try:

```
Sub GetTerms()
Dim Rng As Range, i As Long, StrTxt As String
' Chr(11) (a manual line break) delimits the collected terms
StrTxt = Chr(11)
With ActiveDocument.Range
  With .Find
    .ClearFormatting
    ' Wildcard search: a capital letter at the start of a word, through to the end of a word
    .Text = "<[A-Z]*>"
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindStop
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = True
    .MatchSoundsLike = False
    .MatchAllWordForms = False
    .Execute
  End With
  Do While .Find.Found
    Set Rng = .Duplicate
    ' Extend the match while the text that follows also starts with a capital
    While Rng.Words.Last.Next.Characters.First Like "[A-Z]"
      Rng.MoveEnd wdWord, 1
    Wend
    ' Add the term only if it is not already in the list
    If InStr(StrTxt, Chr(11) & Trim(Rng.Text) & Chr(11)) = 0 Then
      StrTxt = StrTxt & Trim(Rng.Text) & Chr(11)
      i = i + 1
    End If
    ' Resume the search after the term just processed
    .Start = Rng.End
    .Find.Execute
  Loop
  ' Drop the trailing delimiter
  StrTxt = Left(StrTxt, Len(StrTxt) - 1)
End With
' Append the list after a page break (Chr(12)) at the end of the document
If Len(StrTxt) > 1 Then
  ActiveDocument.Range.InsertAfter vbCr & Chr(12) & "Possible Defined Terms" & StrTxt
End If
MsgBox i & " possible 'Defined Term' expressions found."
End Sub
```


----------



## Spurious (May 16, 2012)

Thank you very much!
Does exactly what I want.

Now I have a further problem: some terms have "(*)" directly afterwards, e.g.
Maturity(i) Date or Maturity(i,t), and other things in parentheses. If there is no space between the word and the parenthesis, it should be counted as part of the term as well (if there is a space, it shouldn't!).

Is it possible to amend this part of the code accordingly?


```
While Rng.Words.Last.Next.Characters.First Like "[A-Z]"
```
 
So basically: if the next character is like [A-Z], do what it does now; if it is (, select everything up to ) and carry on from there.


Thanks again for your help and thanks in advance.


----------



## Macropod (May 16, 2012)

You could do that by changing the Do While ... Loop to:

```
  Do While .Find.Found
    Set Rng = .Duplicate
    With Rng
      While .Words.Last.Next.Characters.First Like "[(A-Z]"
        .MoveEnd wdWord, 1
      Wend
      If InStr(.Text, "(") Then
        .MoveEndUntil ")", wdForward
        .End = Rng.End + 1
      End If
      If InStr(StrTxt, Chr(11) & Trim(.Text) & Chr(11)) = 0 Then
        StrTxt = StrTxt & Trim(.Text) & Chr(11)
        i = i + 1
      End If
    End With
    .Start = Rng.End
    .Find.Execute
  Loop
```


----------



## Spurious (May 16, 2012)

Thanks, that works as a start, but it now ignores spaces between the defined term and the parenthesis.

E.g.
Maturity(i) Date should be a defined term.
Maturity (Day) should not be a defined term.


----------



## Macropod (May 16, 2012)

In that case, change:
While .Words.Last.Next.Characters.First Like "[(A-Z]"
to:
While .Words.Last.Next.Characters.First Like "[A-Z]"
and change:
If InStr(.Text, "(") Then
to:
If .Characters.Last.Next = "(" Then


----------



## Spurious (May 16, 2012)

Unfortunately, the problem still exists.

I tried:

```
If Not .Characters.Last.Next = " (" And .Characters.Last.Next = "(" Then
```
 
but it still ignores the spaces.


----------



## Macropod (May 16, 2012)

That suggests you didn't make the first of the last two changes I suggested. It certainly works in my testing.

In any event, please bear in mind the code is only meant to identify _possible_ terms, not to provide an exact list.


----------



## Spurious (May 16, 2012)

I did make the two changes and your code is working very well; I am now in the process of tweaking it.

Does your code really ignore Maturity (Day), but count Maturity(i)?


----------



## Spurious (May 16, 2012)

OK, after trying a few things and gradually understanding how Word VBA works, I found a solution:


```
If .Characters.Last.Next = "(" And .Characters.Last <> " " Then
```
 
This does what I want.
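
The condition above can be wrapped in a small helper. This is a sketch only: the name HasAttachedParen is made up for illustration, and the explanation that follows is an assumption rather than something confirmed in the thread. MoveEnd wdWord, 1 usually takes a word's trailing space along, so once a term has been extended, the last character of the range may be a space and the character after it the parenthesis. The extra test for a space rules out that case. The check for Nothing covers a range that ends at the very end of the document.

```
' Hypothetical helper (not from the thread): True only when "(" follows
' the range immediately and the range itself does not end in a space.
Function HasAttachedParen(ByVal Rng As Range) As Boolean
  Dim NextChar As Range
  Set NextChar = Rng.Characters.Last.Next
  ' No following character: the range ends the document
  If NextChar Is Nothing Then Exit Function
  HasAttachedParen = (NextChar.Text = "(") And _
    (Rng.Characters.Last.Text <> " ")
End Function
```

With such a helper, the test inside the loop would read: If HasAttachedParen(Rng) Then.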


Thank you very much for your excellent help here!
I will update the thread with other problems I encounter.


----------



## Macropod (May 16, 2012)

Spurious said:


> I did make the two changes and your code is working very well; I am now in the process of tweaking it.
> 
> Does your code really ignore Maturity (Day), but count Maturity(i)?


I've been out for a few hours. Mine counts Maturity(i) as one term and Maturity (Day) as two terms - Maturity and Day.

Here is the complete code:

```
Sub GetTerms()
Dim Rng As Range, i As Long, StrTxt As String
StrTxt = Chr(11)
With ActiveDocument.Range
  With .Find
    .ClearFormatting
    .Text = "<[A-Z]*>"
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindStop
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = True
    .MatchSoundsLike = False
    .MatchAllWordForms = False
    .Execute
  End With
  Do While .Find.Found
    Set Rng = .Duplicate
    With Rng
      While .Words.Last.Next.Characters.First Like "[A-Z]"
        .MoveEnd wdWord, 1
      Wend
      If .Characters.Last.Next = "(" Then
        .MoveEndUntil ")", wdForward
        .End = Rng.End + 1
      End If
      If InStr(StrTxt, Chr(11) & Trim(.Text) & Chr(11)) = 0 Then
        StrTxt = StrTxt & Trim(.Text) & Chr(11)
        i = i + 1
      End If
    End With
    .Start = Rng.End
    .Find.Execute
  Loop
  StrTxt = Left(StrTxt, Len(StrTxt) - 1)
End With
If Len(StrTxt) > 1 Then
  ActiveDocument.Range.InsertAfter vbCr & Chr(12) & "Possible Defined Terms" & StrTxt
End If
MsgBox i & " possible 'Defined Term' expressions found."
End Sub
```


----------



## Spurious (May 16, 2012)

Do you have a newer version of Office? I don't know why, but it didn't work for me. I have it working now, which is fine, and I only added a small thing.

I have another question. Is there a way to select the words (instead of adding them to a string)? Basically, my end goal is for the user to go through the text and highlight/select every term.


----------



## Macropod (May 16, 2012)

Developed in Word 2010, tested in Word 2003. Identical results.

As for going through the document & highlighting, the macro could:
• highlight the 'possible terms' as it finds them, with no user interaction;
• compile the list (without outputting it), then ask the user whether to highlight all instances of each term on a term-by-term basis; or
• skip building the list and simply stop at each possible term as it finds them and ask on a case-by-case basis.
Which do you want to do?


----------



## Spurious (May 16, 2012)

Tried your macro again and it doesn't work, but that doesn't matter.

As for your question:
The 3rd solution would be ideal.


----------



## Macropod (May 16, 2012)

Try:

```
Sub HighlightTerms()
Dim Rng As Range, Rslt
With ActiveDocument.Range
  With .Find
    .ClearFormatting
    .Text = "<[A-Z]*>"
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindStop
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = True
    .MatchSoundsLike = False
    .MatchAllWordForms = False
    .Execute
  End With
  Do While .Find.Found
    Set Rng = .Duplicate
    With Rng
      While .Words.Last.Next.Characters.First Like "[A-Z]"
        .MoveEnd wdWord, 1
      Wend
      If .Characters.Last.Next = "(" And .Characters.Last <> " " Then
        .MoveEndUntil ")", wdForward
        .End = Rng.End + 1
      End If
      .Select
      Rslt = MsgBox("Highlight the marked string:" & vbCr & _
        .Text, vbYesNoCancel, "Defined Term Highlighter")
      If Rslt = vbCancel Then Exit Sub
      If Rslt = vbYes Then .HighlightColorIndex = wdBrightGreen
    End With
    .Start = Rng.End
    .Find.Execute
  Loop
End With
MsgBox "Finished"
End Sub
```


----------



## Spurious (May 16, 2012)

Thank you, works wonderfully!


----------



## Spurious (May 18, 2012)

OK, I have another question:
How do I start at the current cursor position rather than the beginning of the document?
I don't really know how I could incorporate that into your code, because it isn't using Selection.


----------



## Spurious (May 18, 2012)

Spurious said:


> OK, I have another question:
> How do I start at the current cursor position rather than the beginning of the document?
> I don't really know how I could incorporate that into your code, because it isn't using Selection.



Never mind, figured it out.


```
With ActiveDocument.Range
    .Start = Selection.Start
```
 
Easy and logical!
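
For context, here is a short sketch of the same idea as a complete macro. It is illustrative only: the name FindNextTermFromCursor is made up, and the macro merely selects the first capitalized word at or after the cursor instead of running the full highlighting loop from the thread.

```
' Sketch only: selects the first capitalized word at or after the cursor.
Sub FindNextTermFromCursor()
  Dim Rng As Range
  Set Rng = ActiveDocument.Range
  ' Start searching at the cursor instead of the start of the document
  Rng.Start = Selection.Start
  With Rng.Find
    .ClearFormatting
    .Text = "<[A-Z]*>"
    .Forward = True
    .Wrap = wdFindStop
    .MatchWildcards = True
  End With
  ' Execute returns True when a match is found; Rng then covers the match
  If Rng.Find.Execute Then Rng.Select
End Sub
```

In the full macro, the same single change (setting .Start to Selection.Start before the search runs) is all that is needed, because the loop only ever searches forward from the start of the range.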


----------

