How to obtain a word occuring many times in excel,multiple cells and long sentences

Bosavon

New Member
Joined
Jul 13, 2024
Messages
13
Office Version
  1. 2021
Platform
  1. Windows
  2. Mobile
Hi!
Am having some sort of a report with multiple cells with multiple sentences each and some reccuring words in them
Cell B3 have "The top five in 1500M marathon are Benson, Alphaxad, Nickson, Bedson and Neema"
Cell B5 have "top three in 100M are Bedson, Neema and Benedicto
Cell C3 have " top 3 in jumping are Nickson, Neema and Denis"
And so on

How do I extract names with higher occurrence in all cells without having to split the sentences and VBA
(Sorry if my english isn't concise"
 

Excel Facts

Excel motto
Not everything I do at work revolves around Excel. Only the fun parts.
Do all the sentences have the same structure ?" ...are X,Y and Z"
Are you still on XL 2013?
 
Upvote 0
Do you have a list of possible names to be searched or it is "just any possible word in the list"?
"No vba" is a must or a wish?
 
Upvote 0
Do you have a list of possible names to be searched or it is "just any possible word in the list"?
"No vba" is a must or a wish?
Thx!
No vba is a must as am going to use most of the time on excel android...
Names vary
 
Upvote 0
OK here is the first thing i could come up with for excel 2021 (probably there is a more elegant solution, but it doesn't ocurre to me right now):

We need a helper column (G in our example). Where in cell G8 we concatenate all the text cells with a " " (space) at the end and finally a " " at the start and end. If you would rearrange you source cells to one column that would be easier like
Excel Formula:
=" "&TRIM(SUBSTITUTE(CONCAT(B3:B5&" "), ",", ""))&" "
if you would have all your text in B3:B5 for example.

Then we use the helper cell G10 and the cells below to get the list of words in your text.

Finally we get the only unique list of names in B2, with the number of occurrence of each name in C8.

The names are filtered form the words as follows: The word hast to be longer than 3 characters long, has to start with an uppercase, and doesn't have to start with a number. If there is any word that meets those criteria it will show up in the list.

Maybe a better approach would be to have a list of names and with a formula we get the number of occurrence of each. But I don't know if you have the list of names.
Let me know what you think and we could change it as needed.


Book10.xlsx
ABCDEFG
1
2
3The top five in 1500M marathon are Benson, Alphaxad, Nickson, Bedson and Neematop 3 in jumping are Nickson, Neema and Denis
4
5top three in 100M are Bedson, Neema and Benedicto
6
7
8Neema3 The top five in 1500M marathon are Benson Alphaxad Nickson Bedson and Neema top three in 100M are Bedson Neema and Benedicto top 3 in jumping are Nickson Neema and Denis
9Nickson2
10Bedson2The
11Benson1top
12Alphaxad1five
13Benedicto1in
14Denis11500M
15marathon
16are
17Benson
Sheet1
Cell Formulas
RangeFormula
B8:B14B8=LET(t,$G$8, n,LEN(t)-LEN(SUBSTITUTE(t," ",""))-1, s,SEQUENCE(n), mt,SUBSTITUTE(SUBSTITUTE(t," ","<",s)," ",">",s), st,SEARCH("<",mt),en,SEARCH(">",mt), w,MID(mt,st+1,en-st-1), uw,UNIQUE(FILTER(w,(LEN(w)>3)*EXACT(PROPER(w),w)*NOT(ISNUMBER(VALUE(LEFT(w,1)))))), f,COUNTIF(G10#,uw), SORTBY(uw,f,-1))
C8:C14C8=COUNTIF(G10#, B8#)
G8G8=" "&TRIM(SUBSTITUTE(CONCAT(B3&" ",B5&" ", C3&" "), ",", ""))&" "
G10:G40G10=LET(t, $G$8, n, LEN(t)- LEN(SUBSTITUTE(t, " ", ""))-1, s, SEQUENCE(n), mt, SUBSTITUTE(SUBSTITUTE(t, " ", "<", s), " ", ">", s), st, SEARCH("<", mt), en, SEARCH(">", mt), w, MID(mt, st+1, en-st-1), w )
Dynamic array formulas.
 
Upvote 0
Solution
OK here is the first thing i could come up with for excel 2021 (probably there is a more elegant solution, but it doesn't ocurre to me right now):

We need a helper column (G in our example). Where in cell G8 we concatenate all the text cells with a " " (space) at the end and finally a " " at the start and end. If you would rearrange you source cells to one column that would be easier like
Excel Formula:
=" "&TRIM(SUBSTITUTE(CONCAT(B3:B5&" "), ",", ""))&" "
if you would have all your text in B3:B5 for example.

Then we use the helper cell G10 and the cells below to get the list of words in your text.

Finally we get the only unique list of names in B2, with the number of occurrence of each name in C8.

The names are filtered form the words as follows: The word hast to be longer than 3 characters long, has to start with an uppercase, and doesn't have to start with a number. If there is any word that meets those criteria it will show up in the list.

Maybe a better approach would be to have a list of names and with a formula we get the number of occurrence of each. But I don't know if you have the list of names.
Let me know what you think and we could change it as needed.


Book10.xlsx
ABCDEFG
1
2
3The top five in 1500M marathon are Benson, Alphaxad, Nickson, Bedson and Neematop 3 in jumping are Nickson, Neema and Denis
4
5top three in 100M are Bedson, Neema and Benedicto
6
7
8Neema3 The top five in 1500M marathon are Benson Alphaxad Nickson Bedson and Neema top three in 100M are Bedson Neema and Benedicto top 3 in jumping are Nickson Neema and Denis
9Nickson2
10Bedson2The
11Benson1top
12Alphaxad1five
13Benedicto1in
14Denis11500M
15marathon
16are
17Benson
Sheet1
Cell Formulas
RangeFormula
B8:B14B8=LET(t,$G$8, n,LEN(t)-LEN(SUBSTITUTE(t," ",""))-1, s,SEQUENCE(n), mt,SUBSTITUTE(SUBSTITUTE(t," ","<",s)," ",">",s), st,SEARCH("<",mt),en,SEARCH(">",mt), w,MID(mt,st+1,en-st-1), uw,UNIQUE(FILTER(w,(LEN(w)>3)*EXACT(PROPER(w),w)*NOT(ISNUMBER(VALUE(LEFT(w,1)))))), f,COUNTIF(G10#,uw), SORTBY(uw,f,-1))
C8:C14C8=COUNTIF(G10#, B8#)
G8G8=" "&TRIM(SUBSTITUTE(CONCAT(B3&" ",B5&" ", C3&" "), ",", ""))&" "
G10:G40G10=LET(t, $G$8, n, LEN(t)- LEN(SUBSTITUTE(t, " ", ""))-1, s, SEQUENCE(n), mt, SUBSTITUTE(SUBSTITUTE(t, " ", "<", s), " ", ">", s), st, SEARCH("<", mt), en, SEARCH(">", mt), w, MID(mt, st+1, en-st-1), w )
Dynamic array formulas.
OK here is the first thing i could come up with for excel 2021 (probably there is a more elegant solution, but it doesn't ocurre to me right now):

We need a helper column (G in our example). Where in cell G8 we concatenate all the text cells with a " " (space) at the end and finally a " " at the start and end. If you would rearrange you source cells to one column that would be easier like
Excel Formula:
=" "&TRIM(SUBSTITUTE(CONCAT(B3:B5&" "), ",", ""))&" "
if you would have all your text in B3:B5 for example.

Then we use the helper cell G10 and the cells below to get the list of words in your text.

Finally we get the only unique list of names in B2, with the number of occurrence of each name in C8.

The names are filtered form the words as follows: The word hast to be longer than 3 characters long, has to start with an uppercase, and doesn't have to start with a number. If there is any word that meets those criteria it will show up in the list.

Maybe a better approach would be to have a list of names and with a formula we get the number of occurrence of each. But I don't know if you have the list of names.
Let me know what you think and we could change it as needed.


Book10.xlsx
ABCDEFG
1
2
3The top five in 1500M marathon are Benson, Alphaxad, Nickson, Bedson and Neematop 3 in jumping are Nickson, Neema and Denis
4
5top three in 100M are Bedson, Neema and Benedicto
6
7
8Neema3 The top five in 1500M marathon are Benson Alphaxad Nickson Bedson and Neema top three in 100M are Bedson Neema and Benedicto top 3 in jumping are Nickson Neema and Denis
9Nickson2
10Bedson2The
11Benson1top
12Alphaxad1five
13Benedicto1in
14Denis11500M
15marathon
16are
17Benson
Sheet1
Cell Formulas
RangeFormula
B8:B14B8=LET(t,$G$8, n,LEN(t)-LEN(SUBSTITUTE(t," ",""))-1, s,SEQUENCE(n), mt,SUBSTITUTE(SUBSTITUTE(t," ","<",s)," ",">",s), st,SEARCH("<",mt),en,SEARCH(">",mt), w,MID(mt,st+1,en-st-1), uw,UNIQUE(FILTER(w,(LEN(w)>3)*EXACT(PROPER(w),w)*NOT(ISNUMBER(VALUE(LEFT(w,1)))))), f,COUNTIF(G10#,uw), SORTBY(uw,f,-1))
C8:C14C8=COUNTIF(G10#, B8#)
G8G8=" "&TRIM(SUBSTITUTE(CONCAT(B3&" ",B5&" ", C3&" "), ",", ""))&" "
G10:G40G10=LET(t, $G$8, n, LEN(t)- LEN(SUBSTITUTE(t, " ", ""))-1, s, SEQUENCE(n), mt, SUBSTITUTE(SUBSTITUTE(t, " ", "<", s), " ", ">", s), st, SEARCH("<", mt), en, SEARCH(">", mt), w, MID(mt, st+1, en-st-1), w )
Dynamic array formulas.
Thanks a lot
This is doing it, but I'll have to re-read it a couple of times so as to comprehend how it's doing it...
 
Upvote 0

Forum statistics

Threads
1,223,903
Messages
6,175,286
Members
452,631
Latest member
a_potato

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top