How to flag rows with the pattern non-blank, blank, non-blank using Excel

2

0

I would like your help with something I have been trying to figure out for the past two days.

I have data with 1000's of rows and several columns. I want to insert a column with a (array) formula that flags all rows that have "holes"; that is any row that has one or more blank columns in between non-blank columns such as these examples:

enter image description here

My actual data set could have between 3 and 7 columns. For the example above (4 columns) I tried the following formula:

IF(OR(AND(SUM(B13:C13)=0,COUNT(A13,D13)>1),AND(B13=0,COUNT(A13,C13)>1),AND(C13=0,COUNT(B13,D13)>1)),"Issue","OK") 

It seems to work but I need something that can work with more columns, text or numeric data and that accounts for the possibility of other non-target columns surrounding the data. I don't feel confident I can identify all the possible patterns (permutations) and then accurately modify this formula. I need a more robust method for doing this.

daniellopez46

Posted 2015-11-11T02:18:30.570

Reputation: 233

Welcome to Super User. We are not a script writing service. We expect users to tell us what they have tried so far (including any scripts they are using) and where they're stuck so that we can help with specific problems. Questions that only ask for scripts are too broad and are likely to be put on hold or closed. Please read How do I ask a good question?.

– DavidPostill – 2015-11-11T13:12:31.863

My apologies. I edited my post. – daniellopez46 – 2015-11-11T18:46:51.493

How would you count these situations: 1) FY12 and FY13 blank but values for FY14 and FY15 (or the reverse pattern)? 2) Value only in one FY (could be a situation with fewer columns if unlikely here)? – fixer1234 – 2015-11-12T19:18:25.920

@fixer1234 Great questions. 1) This would not be counted. 2) Value only in one FY is possible. That case would not be flagged. – daniellopez46 – 2015-11-12T19:22:52.547

So the first situation also would not be flagged? – fixer1234 – 2015-11-12T19:25:10.470

Only rows with 1 or multiple consecutive blank columns in-between non-blank columns should be flagged. – daniellopez46 – 2015-11-12T19:27:54.153

@fixer1234 yes that is correct the first situation would also not be flagged. – daniellopez46 – 2015-11-12T19:37:05.627

Just discovered that LO Calc (which I used for the solution), has a slight difference that produces an error in Excel. See the revised answer. Sorry if you spent frustrating time trying to get it working. – fixer1234 – 2015-11-13T04:44:47.847

I'm curious. Can you give me some feedback on whether you got my solution working and if it did what you need? If not, we can work on getting you a solution. – fixer1234 – 2015-11-14T09:50:19.193

Will do. But I won't be able to test it until I'm back in the office on Monday. – daniellopez46 – 2015-11-14T15:10:28.130

Answers

1

Here's a general solution. I purposely positioned it in the middle of the sheet to illustrate the handling of blank columns before or after the range.

screenshot

I've used some helper columns to make the solution easier to see and troubleshoot. The formulas get unwieldy if you try to do everything in one formula, although you could consolidate this into a single formula if you're a glutton for punishment.

Rather than re-enter your data, I used four columns (C through F) of the numbers 1 through 4, with different missing element patterns in the rows to verify the various conditions.

Input Data

The first and last data columns are entered as variables, so you can use as many columns as you want, positioned anywhere you want. I arbitrarily picked I1 and K1 to store the first and last column letters.

Identify first populated column

For each row, you need to find the first and last actual value because "holes" are counted only between them. I used column H to identify the first column with a value in the row. I started my data in row 3, so H3 contains:

=MATCH(TRUE,LEN(C3:F3)>0,0)+COLUMN(INDIRECT($I$1&":"&$I$1))-1

EDIT: Note that I tested this is LO Calc, which changes the TRUE in this formula to 1. In Calc, this formula becomes:

=MATCH(1,LEN(C3:F3)>0,0)+COLUMN(INDIRECT($I$1&":"&$I$1))-1

which was the original formula in this answer, and it turns out that the 1 gives Excel indigestion. For Excel, you will need the first formula.

This is an array formula, so it needs to be entered using Ctrl + Shift + Enter.

Embedding the INDIRECT function inside the LEN function doesn't work, so I had to hard code the range. You would need to modify the column range if you use a different number of columns.

The COLUMN function and -1 adjust for the starting position of the range.

Number of values if no holes

Column I is the number of columns from the first value to the last (the potential number of values if there are no holes). Cell I3 contains:

=LOOKUP(9.99E+307,INDIRECT($I$1&ROW()&":"&$K$1&ROW()))+COLUMN(INDIRECT($I$1&":"&$I$1))-H3

The LOOKUP function identifies the last column containing a value. Again, it adds an adjustment for the location of the data range on the sheet. It subtracts the number of first populated column from the last.

Number of populated values

Column J contains the count of values in the populated range. Cell J3 contains:

=COUNT(INDIRECT($I$1&ROW()&":"&$K$1&ROW()))

This gives you what you need to identify your holes.

Identify rows with holes

The desired result is what is in column G. Cell G3 contains:

=IF(I3>J3,"hole","")

EDIT: I did some consolidating and got this to a single formula. You can use this in cell G3:

=LOOKUP(9.99E+307,INDIRECT($I$1&ROW()&":"&$K$1&ROW()))
 +COLUMN(INDIRECT($I$1&":"&$I$1))-MATCH(TRUE,LEN(C3:F3)>0,0)
 -COLUMN(INDIRECT($I$1&":"&$I$1))+1
 >COUNT(INDIRECT($I$1&ROW()&":"&$K$1&ROW()))

I've added line breaks for readability. It you want to copy and paste, you will need to remove those and the extra spacing.

This is an array formula, so you need to enter it with Ctrl + Shift + Enter. The result is just TRUE or FALSE for whether there is a hole:

screenshot

fixer1234

Posted 2015-11-11T02:18:30.570

Reputation: 24 254

This is a great response and works for the example I illustrated. Do you have a version that will work with either numeric or text? No, is an ok answer since I only recently updated my post. BTW my apologies for forgetting to respond earlier. – daniellopez46 – 2016-02-03T01:54:06.527

@daniellopez46: I didn't save the spreadsheet, but looking briefly at the formulas in this answer, I think this will also work for text. If you've reproduced it, just try replacing a row with text values and see what happens. If it works, you've got a solution that also covers your expanded question (and you could consider accepting the answer). If it doesn't work, it would require a new solution. Generally, if you modify a question so that existing answers would no longer be a (complete) solution, it's more appropriate to post a new question (which will also attract fresh readers). – fixer1234 – 2016-02-03T05:28:33.390

1

I came up with a new formula that works better then the original formula I came up with. This is the new formula:

SUMPRODUCT(--NOT(ISBLANK(OFFSET(B2:E2,0,1))),--ISBLANK(B2:E2),--NOT(ISBLANK(OFFSET(B2:E2,0,-1))))+SUMPRODUCT(--(SUM(C2:D2)=0),--(SUM(B2,E2)>0))

enter image description here

I feel more confident this picks up all the non-blank,blank{,blank},non-blank patterns when using 4 columns. But here are the two main issues:

  • it is still a specific solution for a specific number of columns vs something that can work for any number of columns
  • I have to add blank columns to work with the offset function. In this case I added a blank column before and after the range. This can be a problem because I typically have other columns not of interest to the problem at hand that are located before and after the range.

The new formula I came up with isn't an ideal solution but when compared to the original formula it does at least provide a framework that can be more easily adjusted and scaled up to accommodate more columns.

I am sharing this solution in hopes of motivating others to come up with a more ideal solution.

daniellopez46

Posted 2015-11-11T02:18:30.570

Reputation: 233