Count Distinct across two columns

lljones

New Member
Joined
Aug 24, 2019
Messages
10
Hi - this the last question I have on a project, and it's a tricky one (to me!)

Description: I keeping a spreadsheet for monthly Reading Group that shows month/author(s)/title(s)/page count/author gender/etc.

[TABLE="width: 764"]
<colgroup><col><col span="2"><col><col span="2"><col><col></colgroup><tbody>[TR]
[TD]Month[/TD]
[TD]Author1[/TD]
[TD]Author2[/TD]
[TD]Title1[/TD]
[TD]Title2[/TD]
[TD]TitleCt[/TD]
[TD]BookPageCt[/TD]
[TD]AuthorGender[/TD]
[/TR]
[TR]
[TD="align: right"]1[/TD]
[TD]Steinbeck, John[/TD]
[TD] [/TD]
[TD]The Grapes of Wrath[/TD]
[TD][/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]423[/TD]
[TD]m[/TD]
[/TR]
[TR]
[TD="align: right"]2[/TD]
[TD]Atwood, Margaret[/TD]
[TD] [/TD]
[TD]Cat's Eye[/TD]
[TD][/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]423[/TD]
[TD]f[/TD]
[/TR]
[TR]
[TD="align: right"]3[/TD]
[TD]Gaiman, Neil[/TD]
[TD]Pratchett, Terry[/TD]
[TD]Good Omens[/TD]
[TD][/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]432[/TD]
[TD]m[/TD]
[/TR]
[TR]
[TD="align: right"]4[/TD]
[TD]Selvon, Sam[/TD]
[TD] [/TD]
[TD]The Lonely Londoners[/TD]
[TD][/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]432[/TD]
[TD]m[/TD]
[/TR]
[TR]
[TD="align: right"]5[/TD]
[TD]Lively, Penelope[/TD]
[TD] [/TD]
[TD]Moon Tiger[/TD]
[TD][/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]432[/TD]
[TD]f[/TD]
[/TR]
[TR]
[TD="align: right"]6[/TD]
[TD]Stephanie Meyer[/TD]
[TD]Bram Stoker[/TD]
[TD]Twilight[/TD]
[TD]Dracula[/TD]
[TD="align: right"]2[/TD]
[TD="align: right"]867[/TD]
[TD]f[/TD]
[/TR]
[TR]
[TD="align: right"]7[/TD]
[TD]Austen, Jane[/TD]
[TD] [/TD]
[TD]Persuasion[/TD]
[TD][/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]423[/TD]
[TD]f[/TD]
[/TR]
[TR]
[TD="align: right"]8[/TD]
[TD]Pratchett, Terry[/TD]
[TD] [/TD]
[TD]The Colour of Magic[/TD]
[TD][/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]423[/TD]
[TD]m[/TD]
[/TR]
[TR]
[TD="align: right"]9[/TD]
[TD]Steinbeck, John[/TD]
[TD] [/TD]
[TD]Tortilla Flat[/TD]
[TD][/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]432[/TD]
[TD][/TD]
[/TR]
[TR]
[TD="align: right"]10[/TD]
[TD]Highsmith, Patricia[/TD]
[TD][/TD]
[TD="colspan: 2"]The Talented Mr. Ripley[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]432[/TD]
[TD]f[/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD]


[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]
MonthCt = 10
TitleCt = sum(titlect) = 11
AuthorCt = 9 (male = 4, female = 5)

Problem 1: I want a count of distinct authors across TWO columns. In the sample data above, Steinbeck & Pratchett are repeated and two months have co-authors. Correct count of unique authors = 9.

Problem 2: Gender and other demographic data is kept for the first author only, but I don't want duplicates. I'm managing this by leaving demo data blank for the second or subsequent instance of the author. It's working for me, but is there a better way to do this?

Thanks for your help - you guys are great!
 

Excel Facts

Select all contiguous cells
Pressing Ctrl+* (asterisk) will select the "current region" - all contiguous cells in all directions.
Just an observation . . .
Correct count of unique authors = 9.

Looks like 10 to me, or am I missing something ?
[TABLE="width: 180"]
<colgroup><col></colgroup><tbody>[TR]
[TD]Atwood, Margaret
[/TD]
[/TR]
[TR]
[TD]Austen, Jane
[/TD]
[/TR]
[TR]
[TD]Bram Stoker
[/TD]
[/TR]
[TR]
[TD]Gaiman, Neil[/TD]
[/TR]
[TR]
[TD]Highsmith, Patricia[/TD]
[/TR]
[TR]
[TD]Lively, Penelope
[/TD]
[/TR]
[TR]
[TD]Pratchett, Terry
[/TD]
[/TR]
[TR]
[TD]Selvon, Sam
[/TD]
[/TR]
[TR]
[TD]Steinbeck, John
[/TD]
[/TR]
[TR]
[TD]Stephanie Meyer
[/TD]
[/TR]
</tbody>[/TABLE]
 
Upvote 0
I also came up with 10 authors using this formula:

=SUMPRODUCT((B2:C11<>"")/(COUNTIF(B2:C11,B2:C11&"")))

As far as your second question, you could add another column with the heading: Author2Gender. It might be quite tricky though to figure out the number of unique male authors and the number of unique female authors. If what you have works, you could stick with it though. But I'm sure someone here can figure something out if you want that.
 
Upvote 0
Yes, sorry, I keep confusing myself

Overall count of author instances = 12
Unique author count = 10 <-- this is the count I'm trying to get
 
Upvote 0
Yay! That worked perfectly for the unique authors. Is there a way to do it with B:B/C:C or named ranges, though.

As for the the second question, managing it with blanks works fine for me, but feels kludgy if I ever try to hand the spreadsheet off to someone else. So, if anyone has any ideas without spending too much time on it, I'd appreciate it.
 
Upvote 0
I wouldn't recommend full column references because of performance, but you can use a name range like so:


Book1
ABCDEFGHIJKLM
1MonthAuthor1Author2Title1Title2TitleCtBookPageCtAuthorGenderAuthor2GenderNumber of unique authorsNumber of unique male authorsNumber of unique female authors
21Steinbeck, JohnThe Grapes of Wrath1423m1055
32Atwood, MargaretCat's Eye1423f
43Gaiman, NeilPratchett, TerryGood Omens1432mm
54Selvon, SamThe Lonely Londoners1432m
65Lively, PenelopeMoon Tiger1432f
76Stephanie MeyerBram StokerTwilightDracula2867fm
87Austen, JanePersuasion1423f
98Pratchett, TerryThe Colour of Magic1423m
109Steinbeck, JohnTortilla Flat1432m
1110Highsmith, PatriciaThe Talented Mr. Ripley1432f
Sheet5
Cell Formulas
RangeFormula
K2=SUMPRODUCT((Authors<>"")/(COUNTIF(Authors,Authors&"")))
L2=SUMPRODUCT((Authors<>"")*(Genders="m")/(COUNTIF(Authors,Authors&"")))
M2=K2-L2
Named Ranges
NameRefers ToCells
Authors=Sheet5!$B$2:INDEX(Sheet5!$C:$C,MATCH("zzzzz",Sheet5!$B:$B))
Genders=Sheet5!$H$2:INDEX(Sheet5!$I:$I,MATCH("zzzzz",Sheet5!$B:$B))


If you add an Author2Gender column, the L2 formula will give you the number of unique male authors. You can do the same for female authors, but just subracting (M2) works just as well.
 
Upvote 0
You can do the same for female authors, but just subracting (M2) works just as well.
Until our Reading Group takes on the first non-binary author, that is. ;)

Thanks for all your help, Eric. I will save this thread for future updates, but I've got what I need to go live for now. (I'm building this spreadsheet in Excel then publishing it on Google Docs to share with the group. Haven't quite figured out how to do some things on Google, like named ranges, so I'll keep poking away.)

Really do appreciate your help.
 
Upvote 0

Forum statistics

Threads
1,224,821
Messages
6,181,163
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top