Formula to find average number of occurrences each day in a given month

kcmuppet

Active Member
Joined
Nov 2, 2005
Messages
439
Office Version
  1. 365
Platform
  1. Windows
Hi all,

I'm stuck trying to work out the average number of patients we get per day (each month) from a list of appointments like below.

I started by doing a pivot table with dates as rows grouped by month, but 'Count of Patients' seemed to actually produce a count of rows, not specific / unique patients.

So I resorted to creating a list of unique dates, then counting a filtered list of unique patients against each date, then creating a pivot table of that produced what I think are the right answers that I'm looking for

But how can I do it with a formula for a given month, instead of extra lists and pivot tables?


Book1
BCDEFGHIJ
5DateDayPatientUnique datesUnique Patients Per day
603 Aug 21Tue0000803 Aug 211Pivot table source is Cols F & G
704 Aug 21Wed0006004 Aug 211
805 Aug 21Thu0002405 Aug 211Row LabelsAverage Patients per day
907 Aug 21Sat0006107 Aug 21120214.0
1010 Aug 21Tue0000810 Aug 211Aug1.0
1111 Aug 21Wed0005811 Aug 211Sep3.8
1212 Aug 21Thu0006212 Aug 211Oct5.6
1314 Aug 21Sat0005714 Aug 211Nov5.0
1417 Aug 21Tue0000817 Aug 211Dec4.5
1518 Aug 21Wed0000818 Aug 21120223.9
1619 Aug 21Thu0006319 Aug 211Jan4.9
1721 Aug 21Sat0003321 Aug 211Feb3.7
1824 Aug 21Tue0001524 Aug 211Mar3.9
1925 Aug 21Wed0001425 Aug 211Apr3.3
2026 Aug 21Thu0005726 Aug 211May3.6
2128 Aug 21Sat0000228 Aug 211Grand Total3.9
2201 Sep 21Wed0000101 Sep 215
2301 Sep 21Wed0000202 Sep 212
2401 Sep 21Wed0000304 Sep 211
2501 Sep 21Wed0000407 Sep 212
2601 Sep 21Wed0000508 Sep 213
2701 Sep 21Wed0000509 Sep 212
2801 Sep 21Wed0000111 Sep 218
2901 Sep 21Wed0000214 Sep 213
3001 Sep 21Wed0000315 Sep 214
3101 Sep 21Wed0000416 Sep 214
Sheet1
Cell Formulas
RangeFormula
F6:F162F6=SORT(UNIQUE(Table1[Date]))
C6:C31C6=TEXT([@Date],"ddd")
G6:G31G6=COUNTA(UNIQUE(FILTER(Table1[Patient],Table1[Date]=F6)))
Dynamic array formulas.
 

Excel Facts

Select all contiguous cells
Pressing Ctrl+* (asterisk) will select the "current region" - all contiguous cells in all directions.
Try:

Book3
ABCDEFG
1
2
3
4
5DateDayPatientMonthAverage patients per day
68/3/2021Tue8Aug 20211
78/4/2021Wed60Sep 20215
88/5/2021Thu24
98/7/2021Sat61
108/10/2021Tue8
118/11/2021Wed58
128/12/2021Thu62
138/14/2021Sat57
148/17/2021Tue8
158/18/2021Wed8
168/19/2021Thu63
178/21/2021Sat33
188/24/2021Tue15
198/25/2021Wed14
208/26/2021Thu57
218/28/2021Sat2
229/1/2021Wed1
239/1/2021Wed2
249/1/2021Wed3
259/1/2021Wed4
269/1/2021Wed5
279/1/2021Wed5
289/1/2021Wed1
299/1/2021Wed2
309/1/2021Wed3
319/1/2021Wed4
32
Sheet15
Cell Formulas
RangeFormula
F6:F7F6=UNIQUE(Table1[Date]-DAY(Table1[Date])+1)
G6:G7G6=LET(u,UNIQUE(FILTER(Table1,(MONTH(Table1[Date])=MONTH(F6))*(YEAR(Table1[Date])=YEAR(F6)))),f,FREQUENCY(INDEX(u,0,1),SEQUENCE(31,,F6)),SUM(f)/ROWS(FILTER(f,f>0)))
C6:C31C6=TEXT([@Date],"ddd")
Dynamic array formulas.


The F6 formula gives you a list of unique months. It's actually the 1st of each month, but I formatted it to just show the month/year. The G6 formula you'll need to drag down. It finds the average number of unique patients per day. Note that it only includes days that have patients. It does not include 8/8/2021 or 8/9/2021 in the average.
 
Upvote 0
Wow!

That's quite a formula. If you have time could you help me understand it?

I think the LET function is creating variables u, and array of the dates and patients for the month and year in question, derived from filtering the main table, and f, a list of the number of times a unique patient appears for each of the days in a given month.

I can follow the first part OK:
Excel Formula:
LET(u,UNIQUE(FILTER(Table1,(MONTH(Table1[Date])=MONTH(F6))*(YEAR(Table1[Date])=YEAR(F6))))

The second part is hard for me to follow, in particular the FREQUENCY & INDEX
Excel Formula:
f,FREQUENCY(INDEX(u,0,1),SEQUENCE(31,,F6)),SUM(f)/ROWS(FILTER(f,f>0)))

Excel Formula:
SEQUENCE(31,,F6)
seems to be generating a list of all the 31 days from the start of the given month (which makes me wonder what happens if its in a month with fewer than 30 days, but it seems to work perfectly).

Excel Formula:
FREQUENCY(INDEX(u,0,1),SEQUENCE(31,,F6))
then seems to count the number of times each of the next 31 days from the beginning of the month features in the first list of dates and patients for the month - variable u. I'm wondering if it's possible to to do this using COUNTIF, but I guess it needs to be an array, so I would have next wrestled with FILTER. I read somewhere that FREQUENCY is more efficient/faster - is that why you chose it?

Excel Formula:
SUM(f)/ROWS(FILTER(f,f>0)))
calculates the average of the counts.

An amazing piece of work. Thank you
 
Upvote 0
As you've figured out, the first part creates a variable named u which contains all the rows of the table in the same month and year as the date in F6. The UNIQUE removes rows where the same patient comes in on the same day.

INDEX(u,0,1) is a way to get just the first column of the array. u is the array, 1 is the column we want, and the second parameter is the row that we want. But if we use 0, that tells INDEX to take all of the rows. So now we have a list of dates where we had patients. If there is more than 1 patient for a day, the date is in there more than once.

Now we need to count how many times each day of the month appears in the array. As you noted, COUNTIF only works on ranges, not arrays. FILTER might work, but I think we'd have to do it 31 times, for each day of the month. FREQUENCY does all the sorting for all the days at a time. The fact that it's pretty efficient is just a bonus. I use SEQUENCE to create a list of all the days in the month. These are the "buckets" that FREQUENCY will fill. Yes, it's possible if we have a month with 28 or 30 days, we'll end up including a few days of the next month. But since we've already restricted the input array to days in the current month, all this means is that we'll have a few buckets on the end that will have 0 entries.

So now we have a list of patient-days, a list of days in the month, and FREQUENCY does its magic and fills each bucket with the number of patients we saw on that day. So now we take the array that came out of FREQUENCY and calculate the average, which you've also figured out.

I played around with it a little more and got it a tad shorter, although the logic is pretty much the same:

Book3
FG
5MonthAverage patients per day
6Aug 20211
7Sep 20215
Sheet15
Cell Formulas
RangeFormula
F6:F7F6=UNIQUE(Table1[Date]-DAY(Table1[Date])+1)
G6:G7G6=LET(u,UNIQUE(FILTER(Table1,(Table1[Date]>=F6)*(Table1[Date]<=EOMONTH(F6,0)))),f,FREQUENCY(INDEX(u,0,1),SEQUENCE(31,,F6)),SUM(f)/SUM(SIGN(f)))
Dynamic array formulas.


Hope this helps! :biggrin:
 
Upvote 0
That's fantastic Eric.

About the modifications, EOMONTH, I understand, but what's going on with the SUM(f)/SUM(SIGN(f)))?
 
Upvote 0
Let's say we have an array like this {0;1;0;0;3;5}. The previous way I filtered out the 0s, leaving {1;3;5}. Then ROWS just said I have 3 rows in the array. SIGN returns a -1 if the number is negative, 0 if the number is 0, or 1 if the number is positive. So if you do SIGN({0;1;0;0;3;5}), it returns {0;1;0;0;1;1} and when you SUM that up, you get 3. The way FREQUENCY works, we know we don't have any negative numbers to worry about.

There are 2 or 3 other ways to count non-zero elements in an array. Here's another:

SUM(--(f<>0))

This would return {FALSE;TRUE:FALSE;FALSE;TRUE:TRUE} for the above array. Then the -- converts that to {0;1;0;0;1;1} and the SUM gives us 3 again.

SUM(IF(f<>0,1))

also works. I don't know if there's any real reason to choose one way or another. I've seen people do all of these, I think it comes down to personal preference. I've thought about trying to measure the performance difference for each method, but I haven't gotten around to that yet.
 
Last edited:
Upvote 0
That's very helpful, thank you for taking the time to help us all learn
 
Upvote 0
I kind of hate to keep going on when we have a working solution, but it occurred to me that from an algebraic point of view, since we're calculating an average, we don't need to break the counts into days. In other words, the FREQUENCY splits the counts into days, but the SUM(f) puts them right back together again. Given that, I came up with a still shorter formula:

Excel Formula:
=LET(u,INDEX(UNIQUE(FILTER(Table1,(Table1[Date]>=F6)*(Table1[Date]<=EOMONTH(F6,0)))),0,1),ROWS(u)/ROWS(UNIQUE(u)))

Anyway, I'm done with this now! Glad I could help, and thanks for the feedback! 😎
 
Upvote 0
Solution

Forum statistics

Threads
1,224,816
Messages
6,181,143
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top