How to extract words that start with capital letters?

brvnbld

New Member
Joined
Jun 29, 2022
Messages
16
Office Version
  1. 2021
Platform
  1. Windows
I have a column of Text data that has more than 300 words per cell. I am using MS Office 2021. I like to extract the words that start with capitalization from the cells. What is the formula for that?. Or is there any VBA script?
 

Excel Facts

How can you turn a range sideways?
Copy the range. Select a blank cell. Right-click, Paste Special, then choose Transpose.
Does this work for you
=IF(EXACT(LEFT(A2,1),UPPER(LEFT(A2,1))),A2,"")

column E just shows the formula, just to show it working , not needed otherwise
If you want to stop blanks showing up as true then

=If( A2 ="", "" , IF(EXACT(LEFT(A2,1),UPPER(LEFT(A2,1))),A2,""))

May need to also use ISTEXT , as number will be seen as capital

BUT what is the possible contents of the cells
Would it have a capital and a number
Abc123 ????

some examples would help


SparkLine-ETAF.xlsx
ABCDE
1
2AbcAbcTRUE
3abcd FALSE
4CatCatTRUE
5DogDogTRUE
6cat FALSE
7dog FALSE
Sheet2
Cell Formulas
RangeFormula
B2:B7B2=IF(EXACT(LEFT(A2,1),UPPER(LEFT(A2,1))),A2,"")
E2:E7E2=EXACT(LEFT(A2,1),UPPER(LEFT(A2,1)))
 
Upvote 0
Welcome to the MrExcel board!

If cell A2 contained 4 words that start with capitals, where and how would you want those 4 words presented?

Could the same capitalised word appear more than once in the same cell? If so, should it also be repeated in the result(s)?

5-10 cells of smallish but representative data and the expected results with XL2BB would be a great help too.
 
Upvote 0
I have a column of Text data that has more than 300 words per cell. I am using MS Office 2021. I like to extract the words that start with capitalization from the cells. What is the formula for that?. Or is there any VBA script?
VBA Code:
Function StrExtract(Str As String) As String
'Updateby Extendoffice
    Application.Volatile
    Dim xStrList As Variant
    Dim xRet As String
    Dim I As Long
    If Len(Str) = 0 Then Exit Function
    xStrList = Split(Str, " ")
    If UBound(xStrList) >= 0 Then
        For I = 0 To UBound(xStrList)
            If xStrList(I) = StrConv(xStrList(I), vbProperCase) Then
                xRet = xRet & xStrList(I) & " "
            End If
        Next
        StrExtract = Left(xRet, Len(xRet) - 1)
    End If
End Function

Sorry I should have been clear. I want the extracted words seperated by comma in the adjacent cells.
I tried this function, I want a slight modification to this. If there are consecutive words that capital letters, I want them treated as single data, and be put as together.
Thank you in advance!!

Using XLBB to paste the mini cell makes the cells go way long, Since each cell contains more than 300 words. So Am posting a screenshot. If a long cell is ok to be pasted here, I will paste the mini cell.
 

Attachments

  • excelformhelp.png
    excelformhelp.png
    22.6 KB · Views: 22
Upvote 0
An Option
VBA Code:
Sub test()
    Dim a
    Dim I&, II&
    a = Cells(1).CurrentRegion.Resize(, 1)
    ReDim b(1 To UBound(a))
    With CreateObject("VBScript.Regexp")
        .Pattern = "([A-Z])\w+\W"
        .Global = True
        For I = 1 To UBound(a)
            Set x = .Execute(a(I, 1))
            For II = 0 To x.Count - 1
                b(I) = IIf(b(I) = "", x(II), b(I) & "," & x(II))
            Next
        Next
Cells(1, 2).Resize(UBound(b)) = Application.Transpose(b)
    End With
End Sub
 
Upvote 0
Using XLBB to paste the mini cell makes the cells go way long, Since each cell contains more than 300 words. So Am posting a screenshot. If a long cell is ok to be pasted here, I will paste the mini cell.
You could always use some edited cells that contained, say, 20-30 words and ensure they included any of the unusual situations.
 
Upvote 0
I want the extracted words seperated by comma in the adjacent cells.
... If there are consecutive words that capital letters, I want them treated as single data, and be put as together.
Give this a try. I have assumed data in column A starting in row 2 and results in column B.

VBA Code:
Sub GetCapitalised_1()
  Dim RX As Object
  Dim a As Variant
  Dim i As Long
 
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.Pattern = "( [^A-Z][^ ]*)+"
  With Range("A2", Range("A" & Rows.Count).End(xlUp))
    a = .Value
    For i = 1 To UBound(a)
      a(i, 1) = Replace(RX.Replace(a(i, 1), " "), "  ", ", ")
    Next i
    .Offset(, 1).Value = a
  End With
End Sub

Here is my sample data and results

brvnbld.xlsm
AB
1DataResults
2Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.Lorem Ipsum, Lorem Ipsum, It, It, Letraset, Lorem Ipsum, Aldus PageMaker, Lorem Ipsum.
3
4One capitalised words hereOne
5There are many variations of passages of Lorem Ipsum available.There, Lorem Ipsum
Sheet1


Note that my code would not handle an example like this one of yours:

1656491390503.png
 
Last edited:
Upvote 0
Solution
An Option
VBA Code:
Sub test()
    Dim a
    Dim I&, II&
    a = Cells(1).CurrentRegion.Resize(, 1)
    ReDim b(1 To UBound(a))
    With CreateObject("VBScript.Regexp")
        .Pattern = "([A-Z])\w+\W"
        .Global = True
        For I = 1 To UBound(a)
            Set x = .Execute(a(I, 1))
            For II = 0 To x.Count - 1
                b(I) = IIf(b(I) = "", x(II), b(I) & "," & x(II))
            Next
        Next
Cells(1, 2).Resize(UBound(b)) = Application.Transpose(b)
    End With
End Sub
But the Consecutive words that start with capital lettters must be taken as a single unit.
 
Upvote 0
Give this a try. I have assumed data in column A starting in row 2 and results in column B.

VBA Code:
Sub GetCapitalised_1()
  Dim RX As Object
  Dim a As Variant
  Dim i As Long
 
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.Pattern = "( [^A-Z][^ ]*)+"
  With Range("A2", Range("A" & Rows.Count).End(xlUp))
    a = .Value
    For i = 1 To UBound(a)
      a(i, 1) = Replace(RX.Replace(a(i, 1), " "), "  ", ", ")
    Next i
    .Offset(, 1).Value = a
  End With
End Sub

Here is my sample data and results

brvnbld.xlsm
AB
1DataResults
2Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.Lorem Ipsum, Lorem Ipsum, It, It, Letraset, Lorem Ipsum, Aldus PageMaker, Lorem Ipsum.
3
4One capitalised words hereOne
5There are many variations of passages of Lorem Ipsum available.There, Lorem Ipsum
Sheet1


Note that my code would not handle an example like this one of yours:

View attachment 68172
Thanks you, The brackets do not matter,. It can avoid them,. The consecutive words if extracted then fine.
 
Upvote 0

Forum statistics

Threads
1,223,904
Messages
6,175,295
Members
452,633
Latest member
DougMo

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top