Counting commas between multiple specific strings of text in a single cell

Country_Calc_2

New Member
Joined
Sep 16, 2023
Messages
25
Office Version
  1. 365
Platform
  1. Windows
The objective is to count the total number of investigators per country.

All source text is in a single cell.

Here is the basic logic...

Find Country Name "*Argentina*"

Count the number of commas between the country name and the next colon ":"

Then add 1.

Argentina has 8 commas + 1 = 9

A colon was added manually to count the last country.

Principal Investigators Argentina: J.L.C. Navarro Estrada, E.A. Duronto, S.M. Macin, M.S. Trivi, F. Paoletti, J.P. Albisu Di Gennaro, S.A. Sassone, J.A.A. Egido, and E. Hasbani. Australia: J. Rogers, S. Nicholls, J. Amerena, D. Walters, G. Oldfield, D. Eccleston, G. Nelson, and P. Shetty. 4 Austria: K. Huber, U. Hoppe, and T. Sturmberger. Belgium: D. Schoors, C. Borin, I. Buysschaert, F. De Man, J. Voet, P. Debruyne, F. De Vroey, A. De Wolf, S. Verheye, and J. Castro Rodriguez and G. Flores-Vivian. Brazil: A.A.V. de Paola, L. Moreira Baracioli, L.F. Leite Tanajura, L.N. Maia, S. Zimmermann, O. Dutra, R. Botelho, and D.B. Précoma. Bulgaria: F. Nikolov, E. Docheva, S. Marchev, S. Dzhurova-Stoyneva, Z. Kamenova, I. Petrov, and C. Velikov. Canada: F. Ayala-Paredes, A. Cheema, M. Hartleib, R. Labonte, M. Madan, S. Mehta, G. Martucci, L. Bilodeau, J. Ducas, F. Charbonneau, O. Bertrand, A. MacDougall and I. Billingsley.:Argentina9
Austria3
Belgium10
Brazil7
Canada11
 

Excel Facts

What is =ROMAN(40) in Excel?
The Roman numeral for 40 is XL. Bill "MrExcel" Jelen's 40th book was called MrExcel XL.
How about
Fluff.xlsm
ABC
1
2Principal Investigators Argentina: J.L.C. Navarro Estrada, E.A. Duronto, S.M. Macin, M.S. Trivi, F. Paoletti, J.P. Albisu Di Gennaro, S.A. Sassone, J.A.A. Egido, and E. Hasbani. Australia: J. Rogers, S. Nicholls, J. Amerena, D. Walters, G. Oldfield, D. Eccleston, G. Nelson, and P. Shetty. 4 Austria: K. Huber, U. Hoppe, and T. Sturmberger. Belgium: D. Schoors, C. Borin, I. Buysschaert, F. De Man, J. Voet, P. Debruyne, F. De Vroey, A. De Wolf, S. Verheye, and J. Castro Rodriguez and G. Flores-Vivian. Brazil: A.A.V. de Paola, L. Moreira Baracioli, L.F. Leite Tanajura, L.N. Maia, S. Zimmermann, O. Dutra, R. Botelho, and D.B. Précoma. Bulgaria: F. Nikolov, E. Docheva, S. Marchev, S. Dzhurova-Stoyneva, Z. Kamenova, I. Petrov, and C. Velikov. Canada: F. Ayala-Paredes, A. Cheema, M. Hartleib, R. Labonte, M. Madan, S. Mehta, G. Martucci, L. Bilodeau, J. Ducas, F. Charbonneau, O. Bertrand, A. MacDougall and I. Billingsley.Argentina9
3Austria3
4Belgium10
5Brazil8
6Canada12
Data
Cell Formulas
RangeFormula
C2:C6C2=COLUMNS(TEXTSPLIT(TEXTBEFORE(TEXTAFTER(A$2,B2&":"),":",,,1),","))
 
Upvote 1
Solution
With Power Query

Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Column1", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
    #"Split Column by Delimiter1" = Table.ExpandListColumn(Table.TransformColumns(#"Split Column by Delimiter", {{"Column1", Splitter.SplitTextByEachDelimiter({"."}, QuoteStyle.Csv, true), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
    #"Added Custom" = Table.AddColumn(#"Split Column by Delimiter1", "Custom", each if Text.Contains([Column1],",") then null else [Column1]),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
    #"Filtered Rows" = Table.SelectRows(#"Filled Down", each Text.Contains([Column1], ",")),
    #"Split Column by Delimiter2" = Table.ExpandListColumn(Table.TransformColumns(#"Filtered Rows", {{"Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
    #"Grouped Rows" = Table.Group(#"Split Column by Delimiter2", {"Custom"}, {{"Count of Investigators", each Table.RowCount(_), Int64.Type}}),
    #"Replaced Value" = Table.ReplaceValue(#"Grouped Rows","Principal Investigators","",Replacer.ReplaceText,{"Custom"}),
    #"Renamed Columns" = Table.RenameColumns(#"Replaced Value",{{"Custom", "Country"}})
in
    #"Renamed Columns"

Book9
AB
1CountryCount of Investigators
2 Argentina9
3 Australia8
4 4 Austria3
5 Belgium10
6 Brazil8
7 Bulgaria7
8 Canada12
Table1
 
Upvote 0
Please try this to start
Book2
CDE
4StringCountry# Investigators
5Principal Investigators Argentina: J.L.C. Navarro Estrada, E.A. Duronto, S.M. Macin, M.S. Trivi, F. Paoletti, J.P. Albisu Di Gennaro, S.A. Sassone, J.A.A. Egido, and E. Hasbani. Australia: J. Rogers, S. Nicholls, J. Amerena, D. Walters, G. Oldfield, D. Eccleston, G. Nelson, and P. Shetty. 4 Austria: K. Huber, U. Hoppe, and T. Sturmberger. Belgium: D. Schoors, C. Borin, I. Buysschaert, F. De Man, J. Voet, P. Debruyne, F. De Vroey, A. De Wolf, S. Verheye, and J. Castro Rodriguez and G. Flores-Vivian. Brazil: A.A.V. de Paola, L. Moreira Baracioli, L.F. Leite Tanajura, L.N. Maia, S. Zimmermann, O. Dutra, R. Botelho, and D.B. Précoma. Bulgaria: F. Nikolov, E. Docheva, S. Marchev, S. Dzhurova-Stoyneva, Z. Kamenova, I. Petrov, and C. Velikov. Canada: F. Ayala-Paredes, A. Cheema, M. Hartleib, R. Labonte, M. Madan, S. Mehta, G. Martucci, L. Bilodeau, J. Ducas, F. Charbonneau, O. Bertrand, A. MacDougall and I. Billingsley.: Argentina9
Sheet2
Cell Formulas
RangeFormula
E5E5=LET(Colon1,FIND(D5&":",C5)+11,Colon2,FIND(":",C5,Colon1),Str,MID(C5,Colon1,Colon2-Colon1),LEN(Str)-LEN(SUBSTITUTE(Str,",","")))+1
 
Upvote 0
=COLUMNS(TEXTSPLIT(TEXTBEFORE(TEXTAFTER(A$2,B2&":"),":",,,1),","))
This works great and is can be modified based on the source data's separator. Anyone using this needs to keep in mind the different naming conventions for countries, but this can be fixed with a standardized list and vlookup.
 
Upvote 0
@Country_Calc_2 initially came up with the same solution as @Fluff
However, I think you will find that the count is wrong where some countries have more than one 'and' at the end?
If you agree then try the below.

AutoTraining.xlsm
ABC
1
2Principal Investigators Argentina: J.L.C. Navarro Estrada, E.A. Duronto, S.M. Macin, M.S. Trivi, F. Paoletti, J.P. Albisu Di Gennaro, S.A. Sassone, J.A.A. Egido, and E. Hasbani. Australia: J. Rogers, S. Nicholls, J. Amerena, D. Walters, G. Oldfield, D. Eccleston, G. Nelson, and P. Shetty. 4 Austria: K. Huber, U. Hoppe, and T. Sturmberger. Belgium: D. Schoors, C. Borin, I. Buysschaert, F. De Man, J. Voet, P. Debruyne, F. De Vroey, A. De Wolf, S. Verheye, and J. Castro Rodriguez and G. Flores-Vivian. Brazil: A.A.V. de Paola, L. Moreira Baracioli, L.F. Leite Tanajura, L.N. Maia, S. Zimmermann, O. Dutra, R. Botelho, and D.B. Précoma. Bulgaria: F. Nikolov, E. Docheva, S. Marchev, S. Dzhurova-Stoyneva, Z. Kamenova, I. Petrov, and C. Velikov. Canada: F. Ayala-Paredes, A. Cheema, M. Hartleib, R. Labonte, M. Madan, S. Mehta, G. Martucci, L. Bilodeau, J. Ducas, F. Charbonneau, O. Bertrand, A. MacDougall and I. Billingsley.Argentina9
3Austria3
4Belgium11
5Brazil8
6Canada13
Sheet2
Cell Formulas
RangeFormula
C2:C6C2=COLUMNS(TEXTSPLIT(TEXTBEFORE(TEXTAFTER(SUBSTITUTE(A$2,"and",",")&":",B2&":"),":"),","))-1
 
Upvote 0
Please try this to start
Book2
CDE
4StringCountry# Investigators
5Principal Investigators Argentina: J.L.C. Navarro Estrada, E.A. Duronto, S.M. Macin, M.S. Trivi, F. Paoletti, J.P. Albisu Di Gennaro, S.A. Sassone, J.A.A. Egido, and E. Hasbani. Australia: J. Rogers, S. Nicholls, J. Amerena, D. Walters, G. Oldfield, D. Eccleston, G. Nelson, and P. Shetty. 4 Austria: K. Huber, U. Hoppe, and T. Sturmberger. Belgium: D. Schoors, C. Borin, I. Buysschaert, F. De Man, J. Voet, P. Debruyne, F. De Vroey, A. De Wolf, S. Verheye, and J. Castro Rodriguez and G. Flores-Vivian. Brazil: A.A.V. de Paola, L. Moreira Baracioli, L.F. Leite Tanajura, L.N. Maia, S. Zimmermann, O. Dutra, R. Botelho, and D.B. Précoma. Bulgaria: F. Nikolov, E. Docheva, S. Marchev, S. Dzhurova-Stoyneva, Z. Kamenova, I. Petrov, and C. Velikov. Canada: F. Ayala-Paredes, A. Cheema, M. Hartleib, R. Labonte, M. Madan, S. Mehta, G. Martucci, L. Bilodeau, J. Ducas, F. Charbonneau, O. Bertrand, A. MacDougall and I. Billingsley.: Argentina9
Sheet2
Cell Formulas
RangeFormula
E5E5=LET(Colon1,FIND(D5&":",C5)+11,Colon2,FIND(":",C5,Colon1),Str,MID(C5,Colon1,Colon2-Colon1),LEN(Str)-LEN(SUBSTITUTE(Str,",","")))+1
Thanks Jeffery, this also works - but I think Fluff's formula is going to be easier for most novice users (including myself) to modify as publications use all kinds of different separators.
 
Upvote 0
Glad we could help & thanks for the feedback.
 
Upvote 0
@Country_Calc_2 initially came up with the same solution as @Fluff
However, I think you will find that the count is wrong where some countries have more than one 'and' at the end?
If you agree then try the below.

AutoTraining.xlsm
ABC
1
2Principal Investigators Argentina: J.L.C. Navarro Estrada, E.A. Duronto, S.M. Macin, M.S. Trivi, F. Paoletti, J.P. Albisu Di Gennaro, S.A. Sassone, J.A.A. Egido, and E. Hasbani. Australia: J. Rogers, S. Nicholls, J. Amerena, D. Walters, G. Oldfield, D. Eccleston, G. Nelson, and P. Shetty. 4 Austria: K. Huber, U. Hoppe, and T. Sturmberger. Belgium: D. Schoors, C. Borin, I. Buysschaert, F. De Man, J. Voet, P. Debruyne, F. De Vroey, A. De Wolf, S. Verheye, and J. Castro Rodriguez and G. Flores-Vivian. Brazil: A.A.V. de Paola, L. Moreira Baracioli, L.F. Leite Tanajura, L.N. Maia, S. Zimmermann, O. Dutra, R. Botelho, and D.B. Précoma. Bulgaria: F. Nikolov, E. Docheva, S. Marchev, S. Dzhurova-Stoyneva, Z. Kamenova, I. Petrov, and C. Velikov. Canada: F. Ayala-Paredes, A. Cheema, M. Hartleib, R. Labonte, M. Madan, S. Mehta, G. Martucci, L. Bilodeau, J. Ducas, F. Charbonneau, O. Bertrand, A. MacDougall and I. Billingsley.Argentina9
3Austria3
4Belgium11
5Brazil8
6Canada13
Sheet2
Cell Formulas
RangeFormula
C2:C6C2=COLUMNS(TEXTSPLIT(TEXTBEFORE(TEXTAFTER(SUBSTITUTE(A$2,"and",",")&":",B2&":"),":"),","))-1
Hi Snakehips, I do agree that the count is potentially wrong as there are multiple "and" for some countries and not others. Most publications do this consistently so it is not an issue. I consider this to be a bad source data issue. To further complicate things sometimes the 'and' means that 1 investigator retired and another person took over - so they should be counted as 1 person and not 2 depending on what you are trying to calculate.
 
Upvote 0

Forum statistics

Threads
1,223,884
Messages
6,175,175
Members
452,615
Latest member
bogeys2birdies

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top