Dynamic Counts

ellison

Active Member
Joined
Aug 1, 2012
Messages
356
Office Version
  1. 365
Platform
  1. Windows
Good afternoon, we are tidying up some data & the data seems to be growing quicker than we are tidying it up!

What we confirm (by reviewing it manually) is whether the relationship between our "old data" and our "new data" is TBC, Yes or No.

***What we'd like to put in are dynamic counts of i) the total number of possibilities ii) # TBC, iii) # Yes iv) # No***

By the way, we have looked at various ways of concatenating info, then doing pivot tables & counts... But we do this so often every day on all sorts of differently updated files that we are hoping there may be a better solution? Maybe involving VBA? The files are generally 20K-50K rows and formula type options don't seem to work I'm afraid.

Our Data is arranged in 3 columns:

NB, the explanation column is only to help illustrate what's here, we wouldn't actually need that!

Thanks for any help

Old DataNew DataStatusTotal # Possibilities# TBC# Yes#NoExplanation (only for illustration)
compositeplasticTBC1100there is only one possible relationship between the old and new data. And it is "TBC".
woodoakTBC2110There are 2 possiblities for "wood" = oak & rustic. And one of those is "TBC" and one is "yes"
woodrusticYes2110ditto above
414878plasticNo1001Sometimes our old data is not a text string, it's an order code which is a number. It has only one possiblity and that is a "no"
raw materialsfinished productNo2011For "raw materials", there are 2 possible "new data's". and one is "yes" and one is "no".
raw materialscheck quarantineYes2011ditto above
 
More optimizing and updated comments.
VBA Code:
Sub Counting_Stuff()

Dim Count_Storage_DCT As Object, Worksheet_Data() As Variant, T As Long, OldData_Main_Key() As Variant, Count_R() As Variant, _
Output_Array() As Variant, Top_Left_Corner_Of_Destination_Range As Range, Y As Long, Status() As Variant, Update_Status_Counts As Boolean, _
Valid_Status() As String, VS_Number As Long, B As Long, NewData_SubKey() As Variant, P_Count_Update As Boolean

'Dim GG As Double: GG = Timer

Worksheet_Data = ActiveSheet.UsedRange.Value 'Loads worksheet data to an array

If UBound(Worksheet_Data, 2) < 3 Then
    MsgBox "Not enough columns."
    Exit Sub
End If

Set Count_Storage_DCT = CreateObject("Scripting.Dictionary")

With WorksheetFunction
    OldData_Main_Key = .Transpose(.Index(Worksheet_Data, 0, 1))
    NewData_SubKey = .Transpose(.Index(Worksheet_Data, 0, 2))
    Status = .Transpose(.Index(Worksheet_Data, 0, 3))
End With

Set Top_Left_Corner_Of_Destination_Range = ThisWorkbook.ActiveSheet.Range("I1") '<<< Edit this if needed

Valid_Status = Split("TBC,YES,NO", ",")     'Upper case versions of valid statuses

For T = 1 To UBound(Worksheet_Data, 1)      'Loop all rows and create dictionary array if Status is valid

    Status(T) = Replace(UCase(Status(T)), " ", vbNullString) 'converts Status to Uppercase version,removes spaces and overwrites array element
    
    On Error Resume Next
    
    Y = WorksheetFunction.Match(Status(T), Valid_Status, 0)  'Find base 0 location of status within array + 1 [Possibility count held at 0]

    If Err.Number <> 0 Then
    
        Err.Clear
        Status(T) = "0" 'Will be used in next for statement to determine whether or not ot execute code
        
    Else
    
        On Error GoTo 0
        
        OldData_Main_Key(T) = Replace(LCase(OldData_Main_Key(T)), " ", vbNullString)  'Old data will be used as a key if the status is valid
        
        NewData_SubKey(T) = Replace(LCase(NewData_SubKey(T)), " ", vbNullString)
        
        Status(T) = "1" 'Will be used in next FOR statement to determine whether or not to execute code
            
        With Count_Storage_DCT
        
            If Not .exists(OldData_Main_Key(T)) Then 'Create a Sub-Dictionary keyed to old data that will hold New Data
            
                .Add OldData_Main_Key(T), CreateObject("Scripting.Dictionary") 'Dictionary within a dictionary
                
                .Item(OldData_Main_Key(T)).Add "Status Counts", Array(0, 0, 0, 0) 'Array order Possibility count TBC count, Yes count, No count
                
            End If
            
            With .Item(OldData_Main_Key(T))         'With sub-Dictionary for current Old_Data
                
                If Not .exists(NewData_SubKey(T)) Then 'If "New Data" key doesn't exist within the created sub-dictionary
                    
                    Count_R = Array(0, 0, 0, 0)  'Array order possibility count, TBC count, Yes count, No count
                    Count_R(Y) = 1               'Overwrite array element based on location of status within array Valid Status base
                                                 'Note Y will be at least  1 so first element of array [0] won't be altered
                    .Add NewData_SubKey(T), Count_R     'Overwrite contents of stored sub-Array within the current sub-dictionary keyed to NEW Data
                    
                    Erase Count_R
                    
                    Update_Status_Counts = True ''Will allow the increment of the relevant status count
                    
                    P_Count_Update = True 'Will allow the increment of the Possibilities count since a NEW sub array
                                          'was added within the current Old Data dictionary
                Else
                    'This section of code is what prevents duplicate counts.
                    Count_R = .Item(NewData_SubKey(T))
                    
                    If Not Count_R(Y) = 1 Then    'Being equal to 1 means that the macro has already
                                                  'recorded the same combination of Old,New and Status.
                        Count_R(Y) = 1            'So test instead if it isn't 1 and change status within array if necessary
                        
                        .Item(NewData_SubKey(T)) = Count_R
                        
                        Update_Status_Counts = True 'Will allow the increment  of the relevant status count
    
                    End If
                    
                    Erase Count_R
                    
                End If
                
                If Update_Status_Counts = True Or P_Count_Update = True Then 'Update counts for all Sub Dictionaries linked to Old Data
                
                    Count_R = .Item("Status Counts")
                    
                    If Update_Status_Counts Then        'Update relevant status count for Unique combo of OD,ND and Status
                        Count_R(Y) = Count_R(Y) + 1
                        Update_Status_Counts = False
                    End If
                    
                    If P_Count_Update Then              'Update possibility count since a unique array[keyed to New Data]
                        Count_R(0) = Count_R(0) + 1     'was added as a combination for old data.
                        P_Count_Update = False
                    End If
                    
                    .Item("Status Counts") = Count_R
                    
                    Erase Count_R
                    
                End If
                
            End With ' End use of .Item(OldData_Main_Key(T))
          
        End With 'End use of Count_Storage_DCT
        
    End If 'End conditional error check

Next T

On Error GoTo 0

VS_Number = UBound(Filter(Status, "1", True), 1) + 2 '# of Valid Statuses..+ because 0 based AND need an extra row to hold headers

ReDim Output_Array(1 To VS_Number, 1 To 6)

For T = 1 To 6                          'Column Headers will be placed in first row of array
    Output_Array(1, T) = Array("Old Data", "New Data", "Total # Possibilities", "# TBC", "# Yes", "# No")(T - 1)
Next T

B = 1: 'B is conditionally used in case a non-valid status is found and so that data output is 1 block rather than several chunks

With Count_Storage_DCT

    For T = 1 To UBound(Worksheet_Data, 1)
    
        If Status(T) = "1" Then                              'Determining if status for the current row is valid
            
            With .Item(OldData_Main_Key(T))
                
                Count_R = .Item("Status Counts")             'Should always exist within the Sub-Dictionary
                
                B = B + 1                                    'Assign to next empty row of Output_Array
        
                For Y = 1 To 6
          
                    If Y <= 2 Then                           'Fill in the Old and New data columns
                    
                        Output_Array(B, Y) = Worksheet_Data(T, Y)

                    Else
                    
                        Output_Array(B, Y) = Count_R(Y - 3)  'Count_R array is base 0 so subtract 3 to compensate
                    
                    End If
              
                Next Y
                
            End With
 
        End If
    
    Next T

End With

Top_Left_Corner_Of_Destination_Range.Resize(UBound(Output_Array, 1), UBound(Output_Array, 2)).Value2 = Output_Array

'Debug.Print Timer - GG

End Sub
 
Upvote 0

Excel Facts

Shade all formula cells
To shade all formula cells: Home, Find & Select, Formulas to select all formulas. Then apply a light fill color.
More optimizing and updated comments.
VBA Code:
Sub Counting_Stuff()

Dim Count_Storage_DCT As Object, Worksheet_Data() As Variant, T As Long, OldData_Main_Key() As Variant, Count_R() As Variant, _
Output_Array() As Variant, Top_Left_Corner_Of_Destination_Range As Range, Y As Long, Status() As Variant, Update_Status_Counts As Boolean, _
Valid_Status() As String, VS_Number As Long, B As Long, NewData_SubKey() As Variant, P_Count_Update As Boolean

'Dim GG As Double: GG = Timer

Worksheet_Data = ActiveSheet.UsedRange.Value 'Loads worksheet data to an array

If UBound(Worksheet_Data, 2) < 3 Then
    MsgBox "Not enough columns."
    Exit Sub
End If

Set Count_Storage_DCT = CreateObject("Scripting.Dictionary")

With WorksheetFunction
    OldData_Main_Key = .Transpose(.Index(Worksheet_Data, 0, 1))
    NewData_SubKey = .Transpose(.Index(Worksheet_Data, 0, 2))
    Status = .Transpose(.Index(Worksheet_Data, 0, 3))
End With

Set Top_Left_Corner_Of_Destination_Range = ThisWorkbook.ActiveSheet.Range("I1") '<<< Edit this if needed

Valid_Status = Split("TBC,YES,NO", ",")     'Upper case versions of valid statuses

For T = 1 To UBound(Worksheet_Data, 1)      'Loop all rows and create dictionary array if Status is valid

    Status(T) = Replace(UCase(Status(T)), " ", vbNullString) 'converts Status to Uppercase version,removes spaces and overwrites array element
  
    On Error Resume Next
  
    Y = WorksheetFunction.Match(Status(T), Valid_Status, 0)  'Find base 0 location of status within array + 1 [Possibility count held at 0]

    If Err.Number <> 0 Then
  
        Err.Clear
        Status(T) = "0" 'Will be used in next for statement to determine whether or not ot execute code
      
    Else
  
        On Error GoTo 0
      
        OldData_Main_Key(T) = Replace(LCase(OldData_Main_Key(T)), " ", vbNullString)  'Old data will be used as a key if the status is valid
      
        NewData_SubKey(T) = Replace(LCase(NewData_SubKey(T)), " ", vbNullString)
      
        Status(T) = "1" 'Will be used in next FOR statement to determine whether or not to execute code
          
        With Count_Storage_DCT
      
            If Not .exists(OldData_Main_Key(T)) Then 'Create a Sub-Dictionary keyed to old data that will hold New Data
          
                .Add OldData_Main_Key(T), CreateObject("Scripting.Dictionary") 'Dictionary within a dictionary
              
                .Item(OldData_Main_Key(T)).Add "Status Counts", Array(0, 0, 0, 0) 'Array order Possibility count TBC count, Yes count, No count
              
            End If
          
            With .Item(OldData_Main_Key(T))         'With sub-Dictionary for current Old_Data
              
                If Not .exists(NewData_SubKey(T)) Then 'If "New Data" key doesn't exist within the created sub-dictionary
                  
                    Count_R = Array(0, 0, 0, 0)  'Array order possibility count, TBC count, Yes count, No count
                    Count_R(Y) = 1               'Overwrite array element based on location of status within array Valid Status base
                                                 'Note Y will be at least  1 so first element of array [0] won't be altered
                    .Add NewData_SubKey(T), Count_R     'Overwrite contents of stored sub-Array within the current sub-dictionary keyed to NEW Data
                  
                    Erase Count_R
                  
                    Update_Status_Counts = True ''Will allow the increment of the relevant status count
                  
                    P_Count_Update = True 'Will allow the increment of the Possibilities count since a NEW sub array
                                          'was added within the current Old Data dictionary
                Else
                    'This section of code is what prevents duplicate counts.
                    Count_R = .Item(NewData_SubKey(T))
                  
                    If Not Count_R(Y) = 1 Then    'Being equal to 1 means that the macro has already
                                                  'recorded the same combination of Old,New and Status.
                        Count_R(Y) = 1            'So test instead if it isn't 1 and change status within array if necessary
                      
                        .Item(NewData_SubKey(T)) = Count_R
                      
                        Update_Status_Counts = True 'Will allow the increment  of the relevant status count
  
                    End If
                  
                    Erase Count_R
                  
                End If
              
                If Update_Status_Counts = True Or P_Count_Update = True Then 'Update counts for all Sub Dictionaries linked to Old Data
              
                    Count_R = .Item("Status Counts")
                  
                    If Update_Status_Counts Then        'Update relevant status count for Unique combo of OD,ND and Status
                        Count_R(Y) = Count_R(Y) + 1
                        Update_Status_Counts = False
                    End If
                  
                    If P_Count_Update Then              'Update possibility count since a unique array[keyed to New Data]
                        Count_R(0) = Count_R(0) + 1     'was added as a combination for old data.
                        P_Count_Update = False
                    End If
                  
                    .Item("Status Counts") = Count_R
                  
                    Erase Count_R
                  
                End If
              
            End With ' End use of .Item(OldData_Main_Key(T))
        
        End With 'End use of Count_Storage_DCT
      
    End If 'End conditional error check

Next T

On Error GoTo 0

VS_Number = UBound(Filter(Status, "1", True), 1) + 2 '# of Valid Statuses..+ because 0 based AND need an extra row to hold headers

ReDim Output_Array(1 To VS_Number, 1 To 6)

For T = 1 To 6                          'Column Headers will be placed in first row of array
    Output_Array(1, T) = Array("Old Data", "New Data", "Total # Possibilities", "# TBC", "# Yes", "# No")(T - 1)
Next T

B = 1: 'B is conditionally used in case a non-valid status is found and so that data output is 1 block rather than several chunks

With Count_Storage_DCT

    For T = 1 To UBound(Worksheet_Data, 1)
  
        If Status(T) = "1" Then                              'Determining if status for the current row is valid
          
            With .Item(OldData_Main_Key(T))
              
                Count_R = .Item("Status Counts")             'Should always exist within the Sub-Dictionary
              
                B = B + 1                                    'Assign to next empty row of Output_Array
      
                For Y = 1 To 6
        
                    If Y <= 2 Then                           'Fill in the Old and New data columns
                  
                        Output_Array(B, Y) = Worksheet_Data(T, Y)

                    Else
                  
                        Output_Array(B, Y) = Count_R(Y - 3)  'Count_R array is base 0 so subtract 3 to compensate
                  
                    End If
            
                Next Y
              
            End With

        End If
  
    Next T

End With

Top_Left_Corner_Of_Destination_Range.Resize(UBound(Output_Array, 1), UBound(Output_Array, 2)).Value2 = Output_Array

'Debug.Print Timer - GG

End Sub
 
Upvote 0
Mr Moshi, you are an absolute legend....

Not only has that given us the EXACT counts which we were expecting but it devoured a set of data 50K rows long in about 10 seconds.

That's absolutely superb - thank you very much indeed.

With Warm Regards!
 
Upvote 0

Forum statistics

Threads
1,225,754
Messages
6,186,827
Members
453,377
Latest member
JoyousOne

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top