VBA to sum parent/child hierarchy values based on given hierarchy level

jahsquare

Board Regular
Joined
Jan 22, 2014
Messages
51
Hi, I have data that looks like columns 1 and 2 of the following:

[TABLE="class: grid, width: 354"]
<tbody>[TR]
[TD]Level[/TD]
[TD]Value[/TD]
[TD]Tree Index[/TD]
[TD]Current Output[/TD]
[TD]Expected Output[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]1[/TD]
[TD]1[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]3[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]1[/TD]
[TD]1.1[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]2[/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]1[/TD]
[TD]1.1.1[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]1[/TD]
[TD]2[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]6[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]1[/TD]
[TD]2.1[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]5[/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]3[/TD]
[TD]2.1.1[/TD]
[TD="align: right"]3[/TD]
[TD="align: right"]3[/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]0[/TD]
[TD]2.1.2[/TD]
[TD="align: right"]0[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD]4[/TD]
[TD]1[/TD]
[TD]2.1.2.1[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]2[/TD]
[TD]3[/TD]
[TD="align: right"]2[/TD]
[TD="align: right"]4[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]1[/TD]
[TD]3.1[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]0[/TD]
[TD]3.1.1[/TD]
[TD="align: right"]0[/TD]
[TD="align: right"]0[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]1[/TD]
[TD]3.2[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]1[/TD]
[/TR]
</tbody>[/TABLE]

The script below works to build column 3 as an index, and I am trying to build column 4 to match the expected values in column 5, where each output is the sum of its own Value plus all direct child values. Obviously I am missing something with that for/do nested loop, I just can't get my head around it. I have done a lot of searching on flattened hierarchies and nested data but and couldn't come up with an answer I could understand.

For each level it should scan downwards, summing Values until it reaches a level that is less than or equal to itself (indicating the end of that branch of the tree).

Also here is an array formula that works to build the expected output in column 5. I just can't use this because there are too many records, computer can't handle it...
=IFERROR(SUM(OFFSET(B2,,,MATCH(TRUE,A3:$A$14<=A2,0))),SUM(B2:$B$14))

Any help is much appreciated!

Code:
Sub CalculateHierarchy()


    Dim rLevels As Range, rLevel As Range, rVals As Range, rVal As Range
    Dim level As Integer, maxLevels As Integer, val As Integer, cur As Integer, i As Integer
    Dim h As String, j As Long, counts() As Integer


    Set rLevels = Range("A2:A" & Range("A1").End(xlDown).Row)
    maxLevels = WorksheetFunction.Max(rLevels)
    
    Set rVals = Range("B2:B" & Range("B1").End(xlDown).Row)
    
    ReDim counts(1 To maxLevels)
    
    cur = 1
    
    For Each rLevel In rLevels
    
        level = rLevel.Value
        If level > cur + 1 Then
            rLevel.Activate
            MsgBox "error at row " & rLevel.Row & " level increase by more than 1"
            Exit Sub
        End If


        h = ""
        counts(level) = counts(level) + 1
        For i = 1 To level
            h = h & "." & counts(i)
        Next
        h = Mid(h, 2)
        
        For i = level + 1 To UBound(counts)
            counts(i) = 0
        Next
        
        j = 0
        For Each rVal In rVals
            val = rLevel.Offset(, 1).Value
            Do Until j = level
                val = val + rLevel.Offset(, 1).Value
                j = j + 1
            Loop
        Next
        
        rLevel.Offset(, 2).Value = h
        rLevel.Offset(, 3).Value = val


        val = 0
        cur = level
        
    Next


End Sub
 
Last edited:

Excel Facts

Workdays for a market open Mon, Wed, Friday?
Yes! Use "0101011" for the weekend argument in NETWORKDAYS.INTL or WORKDAY.INTL. The 7 digits start on Monday. 1 means it is a weekend.
Try this for results in column "E".
Code:
[COLOR=navy]Sub[/COLOR] MG15May26
[COLOR=navy]Dim[/COLOR] Lst [COLOR=navy]As[/COLOR] [COLOR=navy]Long,[/COLOR] n [COLOR=navy]As[/COLOR] [COLOR=navy]Long,[/COLOR] nn [COLOR=navy]As[/COLOR] [COLOR=navy]Long,[/COLOR] num [COLOR=navy]As[/COLOR] Double
Lst = Range("A" & Rows.Count).End(xlUp).Row
[COLOR=navy]For[/COLOR] n = 2 To Lst
    [COLOR=navy]For[/COLOR] nn = n To Lst
[COLOR=navy]        If[/COLOR] Not n = nn And Cells(nn, 1) <= Cells(n, 1) [COLOR=navy]Then[/COLOR]
            [COLOR=navy]Exit[/COLOR] For
        [COLOR=navy]Else[/COLOR]
            num = num + Cells(nn, 2)
        [COLOR=navy]End[/COLOR] If
    [COLOR=navy]Next[/COLOR] nn
Cells(n, 5) = num: num = 0
[COLOR=navy]Next[/COLOR] n
[COLOR=navy]End[/COLOR] [COLOR=navy]Sub[/COLOR]
Regards Mick
 
Last edited:
Upvote 0
Also here is an array formula that works to build the expected output in column 5. I just can't use this because there are too many records, computer can't handle it...
=IFERROR(SUM(OFFSET(B2,,,MATCH(TRUE,A3:$A$14<=A2,0))),SUM(B2:$B$14))
Since you are already working with code, I suspect ..
a) you will want to continue with that method for output you are looking for, and
b) in any case the code will be less burdensome if your data is large
.. however, does this formula (still an array formula) offer any performance advantage?

Excel Workbook
ABCD
21113
3211.12
4311.1.11
51126
6212.15
7332.1.13
8302.1.21
9412.1.2.11
101234
11213.11
12303.1.10
13213.21
14
15
Sum hierarchy
 
Last edited:
Upvote 0
Try this for results in column "E".
Code:
[COLOR=navy]Sub[/COLOR] MG15May26
[COLOR=navy]Dim[/COLOR] Lst [COLOR=navy]As[/COLOR] [COLOR=navy]Long,[/COLOR] n [COLOR=navy]As[/COLOR] [COLOR=navy]Long,[/COLOR] nn [COLOR=navy]As[/COLOR] [COLOR=navy]Long,[/COLOR] num [COLOR=navy]As[/COLOR] Double
Lst = Range("A" & Rows.Count).End(xlUp).Row
[COLOR=navy]For[/COLOR] n = 2 To Lst
    [COLOR=navy]For[/COLOR] nn = n To Lst
[COLOR=navy]        If[/COLOR] Not n = nn And Cells(nn, 1) <= Cells(n, 1) [COLOR=navy]Then[/COLOR]
            [COLOR=navy]Exit[/COLOR] For
        [COLOR=navy]Else[/COLOR]
            num = num + Cells(nn, 2)
        [COLOR=navy]End[/COLOR] If
    [COLOR=navy]Next[/COLOR] nn
Cells(n, 5) = num: num = 0
[COLOR=navy]Next[/COLOR] n
[COLOR=navy]End[/COLOR] [COLOR=navy]Sub[/COLOR]
Regards Mick

Hi MickG, Thank you. While this appears to work on the sample data I am seeing some errors in the tabulation of my actual dataset. I can post a larger sample set perhaps but I am wondering if there is an inherent limit to the number of hierarchy levels (my data has 12 levels) or limit to the number of records (I'm dealing with ~250k records). I'll admit I can't quite parse what your code is doing without comments..
 
Last edited:
Upvote 0
Please post the list of data That it fails on!!, and the expected results.

So it turns out this works fine, I just had some mismatched formatting in my data, and it was missing the top level "1" as the source data starts on level 2. Would it be possible to modify this slightly to
a) convert any numbers stored as text (otherwise I will have to convert manually, i.e. the text-to-columns trick, every time I get new source data)
b) ignore/treat as "0" any blanks "" in the value column
c) sum from the lowest present level (if not 1) ?

As an observation, this takes about 8-10 minutes to run on my data - much better than ~1hr with the formula! Would something like turning off screen updating help, possibly at the expense of using more RAM? Right now it's only using ~40% cpu power and 10% of available ram during processing.
Thanks again.
 
Upvote 0
.. however, does this formula (still an array formula) offer any performance advantage?

Peter, thank you for this. It does appear to work on a subset of my source data, although I am afraid to run it on the full data as with ~250k records, my machine tends to crash/lock up for a long time until it finishes. When I have a break in work tonight I will let it run and see how it does compared to the original formula.

I was using a somewhat similar (though not as simple) approach at one point but wasn't getting much better performance:

=SUM(OFFSET(B2,,,IFERROR(MATCH(0,N(A2<(A3:A$13)),),)))
 
Upvote 0
Try this, it should be a lot faster:-
Not quite sure about (c), but perhaps you can give me more details.
Code:
[COLOR="Navy"]Sub[/COLOR] MG15May49
[COLOR="Navy"]Dim[/COLOR] Ray [COLOR="Navy"]As[/COLOR] Variant, n [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long,[/COLOR] nn [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long,[/COLOR] Num [COLOR="Navy"]As[/COLOR] Double
[COLOR="Navy"]With[/COLOR] ActiveSheet.Cells(1).CurrentRegion.Resize(, 2)
    Ray = .Value
    .NumberFormat = "0.0"
[COLOR="Navy"]End[/COLOR] With
ReDim nray(1 To UBound(Ray, 1))
[COLOR="Navy"]For[/COLOR] n = 2 To UBound(Ray, 1)
    [COLOR="Navy"]For[/COLOR] nn = n To UBound(Ray, 1)
        [COLOR="Navy"]If[/COLOR] Not n = nn And Val(Ray(nn, 1)) <= Val(Ray(n, 1)) [COLOR="Navy"]Then[/COLOR]
            [COLOR="Navy"]Exit[/COLOR] For
        [COLOR="Navy"]Else[/COLOR]
            Num = Num + Val(Ray(nn, 2))
        [COLOR="Navy"]End[/COLOR] If
    [COLOR="Navy"]Next[/COLOR] nn
nray(n - 1) = Num: Num = 0
[COLOR="Navy"]Next[/COLOR] n
Range("E2").Resize(UBound(Ray, 1)).Value = Application.Transpose(nray)
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]
Regards Mick
 
Upvote 0
Try this, it should be a lot faster:-
Not quite sure about (c), but perhaps you can give me more details.

This version appears to break down after about ~380 records, after that all output is "#N/A"

Here is a link to a .csv file of my input data - two columns: Level, Value. (Note 0 values are all blank)

https://www.dropbox.com/s/o3k49za8gt84oam/hierarchy-test-data-2018-05-15.csv?dl=0


Regarding (c), note that the input doesn't have a top-level "1" - the first level is "2". To get your original script working I just inserted a row at the top of the data and set the level to "1", which would then be the root parent of the entire hierarchy. This isn't a big deal, although it would be nice if this wasn't required (mainly because it would allow me to drill into branches of the hierarchy without worrying about whether the root is present).
edit: although having that root level present is a nice check on the success of the script, since the sum of all values in col. B should equal the output sum for level 1.
 
Last edited:
Upvote 0
I edited the file above to add a column including the output from your original script, which is correct.
Strange thing, each time I re-run your faster script, one more record is returned before the #N/A errors start... 370, 371, 372 etc...
 
Last edited:
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,182
Members
453,021
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top