Tricky Power Pivot Calculation, summing the occurrence of different system states at different times

GundayMonday

New Member
Joined
May 13, 2013
Messages
4
Hello,

I have a weird problem and I'm not even sure if you can solve it using Power Pivot's functionality.

I'm looking at how ~500 unique IDs are changing over time within a system. The IDs can enter one of three different states within the system (let's call them A, B, and C for simplicity). Every row in my data starts with a time stamp telling what state the unique Id is in at that time. I'm trying to create a column that tells me how many times a specific ID has entered a specific state up to the current time.

So my data looks like this:

[TABLE="width: 500"]
<tbody>[TR]
[TD]Timestamp[/TD]
[TD]Unique ID[/TD]
[TD]State[/TD]
[/TR]
[TR]
[TD]01[/TD]
[TD]123[/TD]
[TD]A[/TD]
[/TR]
[TR]
[TD]01[/TD]
[TD]124[/TD]
[TD]B[/TD]
[/TR]
[TR]
[TD]02[/TD]
[TD]123[/TD]
[TD]A[/TD]
[/TR]
[TR]
[TD]02[/TD]
[TD]124[/TD]
[TD]C[/TD]
[/TR]
[TR]
[TD]03[/TD]
[TD]123[/TD]
[TD]B[/TD]
[/TR]
[TR]
[TD]03[/TD]
[TD]124[/TD]
[TD]B[/TD]
[/TR]
[TR]
[TD]04[/TD]
[TD]123[/TD]
[TD]A[/TD]
[/TR]
</tbody>[/TABLE]

I want to return a column that counts the number of times a specific ID has newly entered each state:
[TABLE="width: 500"]
<tbody>[TR]
[TD]Timestamp[/TD]
[TD]Unique ID[/TD]
[TD]State[/TD]
[TD]A total[/TD]
[TD]B total[/TD]
[TD]C total[/TD]
[/TR]
[TR]
[TD]01[/TD]
[TD]123[/TD]
[TD]A[/TD]
[TD]1[/TD]
[TD]0[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]01[/TD]
[TD]124[/TD]
[TD]B[/TD]
[TD]0[/TD]
[TD]1[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]02[/TD]
[TD]123[/TD]
[TD]A[/TD]
[TD]1[/TD]
[TD]0[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]02[/TD]
[TD]124[/TD]
[TD]C[/TD]
[TD]0[/TD]
[TD]1[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]03[/TD]
[TD]123[/TD]
[TD]B[/TD]
[TD]1[/TD]
[TD]1[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]03[/TD]
[TD]124[/TD]
[TD]B[/TD]
[TD]0[/TD]
[TD]2[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]04[/TD]
[TD]123[/TD]
[TD]A[/TD]
[TD]2[/TD]
[TD]1[/TD]
[TD]0[/TD]
[/TR]
</tbody>[/TABLE]

My pseudo code would look like:

At each time stamp, check and see if the state of each ID has changed. Ignore any time stamp > than the current row.
If it has changed, add 1 to whatever the new state total is for that ID.
If it hasn't changed, keep counts from the last time step.

Is there a way to do this in PowerPivot? Excel I can do it no problem, but my data set has ~9million rows in it ;). I've been googling for the past few hours and trying out all these weird DAX functions, but can't find anything that works.

Can anyone help me figure this out?
 
Last edited:

Excel Facts

What is =ROMAN(40) in Excel?
The Roman numeral for 40 is XL. Bill "MrExcel" Jelen's 40th book was called MrExcel XL.
It's probably a bit of an 'Excel mindset' but the first thing I would do is add a calculated column to determine whether or not the current state if the same as the previous for that [Unique ID]. I'll give you 2 options - would be very interested to know if you have any performance issues and which works better. Theoretically the only impact should be on refresh times.

=if(LOOKUPVALUE(Table1[State],
Table1[Timestamp],
CALCULATE(MAX(Table1[Timestamp]), FILTER(ALL(Table1),table1[Timestamp]<earlier(table1[timestamp]))),
Table1[Unique ID], Table1[Unique ID]
)<>Table1[State],
"New",
BLANK()
)

<earlier(table1[timestamp])&&
=IF(CALCULATE(VALUES(Table1[State]),
TOPN(1,
FILTER(ALL(Table1),table1[Timestamp]<earlier(table1[timestamp])&&
table1[unique ID] = earlier(table1[unique ID])
),
Table1[Timestamp]
)
)<>Table1[State],
"New",
BLANK()
)


Then using that as a flag I can write a relatively straightforward measure. You will probably need a separate Dates/Timestamp table to get decent performance out of a relatively large dataset. The dates table should have a column containing unique, contiguous dates/time units for the required period and be related to the main table.

Code:
CALCULATE(COUNTROWS(Table1),
                             FILTER(ALL(Timestamp),Timestamp[Timestamp]<=MAX(Timestamp[Timestamp])),
                             Table1[A]="New"
                  )

Hope this gets you on the way, I uploaded my quick test model here if it helps: https://docs.google.com/file/d/0Bz5yMU2oooW2S1RlTi1ONzVhZU0/edit?usp=sharing
</earlier(table1[timestamp])&&
</earlier(table1[timestamp])&&
</earlier(table1[timestamp]))),
 
Upvote 0
<earlier(table1[timestamp]))),
Definitely an excel mindset thing </earlier(table1[timestamp]))),
:). Thank you very much! I'm trying this out now on my non-generic data. Extremely useful to have example code to base similar problems off of too!
 
Last edited:
Upvote 0
My data set is currently just shy of 10 million rows. Your first suggestion takes about 30s to calculate and works great. I tried the second one a few times but it never finished (I gave up after ~10 mins). First suggestion works great though :).
 
Upvote 0
<earlier(table1[timestamp]))),
Okay, I have a follow up question now. Jacob's solution totally works, but won't scale the way my dataset's growing with a simple excel pivot table (I'm going to have way more rows than excel can display because of way too many unique IDs).

My thought on a solution is to build some new categories for different states in my power pivot dataset, based on the number of times a unique ID has entered a specific state. Let me explain with a data example:

[TABLE="width: 500"]
<tbody>[TR]
[TD]Timestamp[/TD]
[TD]Unique ID[/TD]
[TD]State[/TD]
[TD]New to State Dummy?[/TD]
[/TR]
[TR]
[TD]01[/TD]
[TD]123[/TD]
[TD]A[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]01[/TD]
[TD]124[/TD]
[TD]B[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]02[/TD]
[TD]123[/TD]
[TD]A[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]02[/TD]
[TD]124[/TD]
[TD]C[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]03[/TD]
[TD]123[/TD]
[TD]B[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]03[/TD]
[TD]124[/TD]
[TD]B[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]04[/TD]
[TD]123[/TD]
[TD]A[/TD]
[TD]1[/TD]
[/TR]
</tbody>[/TABLE]

Same data as before, but now I have an extra column (thanks to Jacob!) that tells me if the Unique ID has newly entered the State for the row's specific timestamp. I'd like to use this dummy variable to figure out how many times a unique ID has newly entered a specific state for all timestamps <= the current row's timestamp. Psuedo-code would look like this:

For a specific unique ID, and a specific state, and all timestamps <= timestamp of current row, sum 'New to State Dummy?'. Would look something like this:
[TABLE="width: 500"]
<tbody>[TR]
[TD]Timestamp[/TD]
[TD]Unique ID[/TD]
[TD]State[/TD]
[TD]New To State Dummy?[/TD]
[TD]New To State Sum[/TD]
[/TR]
[TR]
[TD]01[/TD]
[TD]123[/TD]
[TD]A[/TD]
[TD]1[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]01[/TD]
[TD]124[/TD]
[TD]B[/TD]
[TD]1[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]02[/TD]
[TD]123[/TD]
[TD]A[/TD]
[TD]0[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]02[/TD]
[TD]124[/TD]
[TD]C[/TD]
[TD]1[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]03[/TD]
[TD]123[/TD]
[TD]B[/TD]
[TD]1[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]03[/TD]
[TD]124[/TD]
[TD]B[/TD]
[TD]1[/TD]
[TD]2[/TD]
[/TR]
[TR]
[TD]04[/TD]
[TD]123[/TD]
[TD]A[/TD]
[TD]1[/TD]
[TD]2[/TD]
[/TR]
</tbody>[/TABLE]

Having that new column would let me use the CONCATENATE function to combine my 'State' and my 'New To State Sum' columns to get new categories like A1, A2, B1, B2, etc. Then I could use a pivot table to visualize everything without over loading excel's row limit! (There might be a more elegant way to do this, I'm all ears if you know of it :))

I can't figure out a way to get that 'New To State Sum' to work. I think I'm messing up some syntax in my DAX functions. There was a nice thread with a similar problem a few months back, but I can't get their proposed solutions to work on my dataset:
http://www.mrexcel.com/forum/powerp...records-table-ignoring-blanks-blank-rows.html

Any insight into how to tackle this?</earlier(table1[timestamp]))),
 
Upvote 0

Forum statistics

Threads
1,223,952
Messages
6,175,595
Members
452,657
Latest member
giadungthienduyen

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top