In Excel, Sum hierarchical data based on existence of subdata

0

I have a set of rows like this that display some hierarchical data.

      A  |    B    | C | D | E | F 

1   Task |         | 1 |   | 2 |  
2   Task |         | 1 |   | 2 |  
3   Task |         |   |   | 8 |  
4        | Subtask |   | 2 |   | 4  
5        | Subtask |   | 2 |   | 4  
6   Task |         | 1 |   | 2 |   
7   Task |         | 1 |   | 2 | 

Columns A and B are tasks and subtasks, columns C and D are inputs corresponding to each task/subtask. Columns E and F are transformations columns C & D (here they're simply doubled).

Notice that C3 is empty, because the task is not given a value directly; rather it is the sum of the subtasks. Correspondingly, E3 is =SUM(F4:F5).

Currently I do this manually. I would like a function that would do something like the following pseudocode:

IF(
  NOT(C3=""),
  C3,
  SUM_UNTIL(E:E,NOT(""),F:F)
)

So the cell E3 would:

  • use the value in C3 if C3 is not blank.
  • If C3 is blank, SUM() the cells in F4:F for any number of contiguous blank cells in C.

I am pretty good to very good in excel. Also, I am a programmer, but I don't know VB at all (linux guy), so I'd be happy with a VB solution. I am fluent in JavaScript so that would work too if there's some sort of javascript plugin for excel).

I'm using the latest version of Excel (maybe 2016? The one in the latest business edition of office).

Sir Robert

Posted 2017-04-27T17:53:00.060

Reputation: 183

Answers

1

This may not be the most efficient formula, but this works

=IF(AND(ISBLANK(B3); ISBLANK(C3));
    SUM(INDIRECT("F"&ROW()+1&":
                  F"&MATCH(TRUE; INDEX($B4:$B$16=""; 0); 0)+ROW()-1));
    C3*2)

Explanation

AND(ISBLANK(B3); ISBLANK(C3)); detects when to start sum formula. It requires both cell B3 and C3 to be empty (if you only use C3, then C4 and C5 will also trigger the sum formula, instead of doubling their value)

INDIRECT( ... ) forms the range to be used in the SUM

"F"&ROW()+1&": the range start at F and below the current row (in this example F4)

F"&MATCH(TRUE; INDEX($B4:$B$16=""; 0); 0)+ROW()) and ends at the F row on the next blank cell on column B

Note: please remember that the range starts and ends one row below the current row. Also note the position of absolute signs ($)


Further explanation on MATCH(TRUE; INDEX($B4:$B$16=""; 0); 0)+ROW()-1 I found the formula here

First, INDEX($B4:$B$16=""; 0) creates an array whether B4 = "" and B5 = "" and so on. You can see this array by putting the formula on row 1 on any column, and change 0 to ROW()-1 and dragging the formula down.

FALSE
FALSE
FALSE
TRUE
and so on...

Then, MATCH(TRUE; ... ; 0) will find the first TRUE, that is the first next blank row after this row (returning the index 3)

However, the result is the index of the array 3, instead of row number. So we add the index with current row, 3 + 3, and voila! We have the row number of the next blank row. Just add -1 to make sure we are summing the correct range of F4:F5, not F4:F6.

Vylix

Posted 2017-04-27T17:53:00.060

Reputation: 1 651

Thanks! This worked very well. For anyone else finding it useful: The big trick to remember is that the INDIRECT("F"&ROW()+1":F"... portion will not drag horizontally (the "F" has to be modified manually each time you change rows). – Sir Robert – 2017-04-29T13:04:40.770