Setting up a search tool to match any cell containing a 1 in certain digit location(s)

1

I have a search tool I am developing in Excel. In Column "EU", I have cells, each containing a string of 124 digits. These strings consist of 0's and 1's. I have 124 characteristics that produce the 1 or 0..., and users may want to search for 1,2,3,4,5 etc. characteristics. For simplicity, let's say EU1 has string 0100100011.....and I want to find any cell that contains 0100100000 or possibly 0000100010 etc. In other words I want to find any string containing a "1" in a specific digit location, possibly several 1's in several digit locations.......... Example I want to test if any of these 3 combinations.....

0000010000000001000000000000010010000010000000100000010010010000000000101100100000100000000000000000000000000000000000000100 0000000000001010001000000010010010010010000000100000010010010000000000100100100000100100000100000000000000000000000001000100 0000000000000110000010000010001001010000010000100000010010010000000000100100100001000100000100000000000000000000000001000100

have 1's in these locations......

1000000000000000000000000000000000000000000000000000000000000000000001000000000000000000001000100000000000000000000000000000

if there are 1's in other locations that is fine, but not if the test condition has a 1 and the tested locations have a zero in that digit.


Putting it another way, and using more comprehensible examples, pretend that my strings are only five characters long, and that my first five strings are

EU1       10000
EU2       10001
EU3       11000 
EU4       11011
EU5       00001

I want to search for strings where all of a designated list of character positions contain 1s, by comparing each of them to a variable search string that contains 1 at all the positions where I want there to be a 1.  Zeros in the variable are wildcards.  Another way to put it is that the search string acts as a mask.

For example, to find strings whose first two digits are 1s (don’t care about the rest) I would want to specify 11000 and get matches to strings EU3 and EU4.  (This might represent a search for objects that are Angular AND Blue.)  A search on 10000 (all objects that are Angular) should match on all the strings whose first digit is 1, namely EU1, EU2, EU3, and EU4.  11100 (all objects that are Angular AND Blue AND Circular) should yield no match, because no strings contain 1 in the place of the third digit.  Conversely, 00000 should match all the strings, because an all-zeros search string is a wildcard.

In table form:

Search string:   11000       10000       11100       00000
                 ↓↓          ↓           ↓↓↓
EU1              10000     → 10000       10000     → 10000
EU2              10001     → 10001       10001     → 10001
EU3            → 11000     → 11000       11000     → 11000
EU4            → 11011     → 11011       11011     → 11011
EU5              00001       00001       00001     → 00001
                                        (none)       (all)

BearOnFire

Posted 2015-10-22T21:48:40.583

Reputation: 23

I'm sorry that your data are 128 characters long.  A solution for your 128-character problem will probably work for 8 or 18 characters as well, and vice versa.  I suggest that "For simplicity" you scale down your problem so that your strings fit on one line — your actual data hurt my eyes. – Scott – 2015-10-26T02:06:58.797

I have 5 strings. String A)=10000, String B)=10001, String C)=11000 String, D)=11011, String E)=00001. I would like to compare each of these strings to a variable. So if Variable A)=11000 I would get a match to string C and D. If Var A)=10000, match on strings A,B,C, and D. If Var A)=11100, no match, because no strings contain "1" in the place of the 3rd digit. So any time a "1" shows up in the variable it is critical that the string contains the 1 as well. If the Variable A)=00000 then strings A,B,C,D, and E are all a match, as zeros in the variable are wildcards. – BearOnFire – 2015-10-27T03:19:22.283

Answers

0

For simplicity, I put your data into Column E, starting in Row 2.  I put the search string into cell D2.  (Of course D2 and Column E must be formatted as text, or you must type a single quote (') before every number.)

I chose the approach of using Excel’s native search capabilities.  A search string of 11000 means we’re looking for values where the first digit is a 1, the second digit is a 1, and the third through fifth characters are anything.  As you said, zeros in the variable (search string) are wildcards.

The standard representation of this wildcard is 11??? — so I set cell D3 to

=SUBSTITUTE(D2, "0", "?")

The question didn’t specify how the result(s) should be presented.  One way would be, for each value in Column E, to indicate whether it is a match.  We can do this by searching each E value for the wildcard string.  If it is found, SEARCH returns 1, otherwise it returns an error.  So we can get an appropriate TRUE or FALSE value by putting

=NOT(ISERROR(SEARCH($D$3, E2)))

into F2, and dragging it down to cover all the E data.  (Note that FIND doesn’t allow wildcard characters.)

But it might be more convenient to have a list of the indices of the matching values.  We can get that by using MATCH to search the column.  I put a 1 in cell G1 and

=MATCH($D$3, INDEX($E$2:$E$99, G1):$E$99, 0) + G1

into cell G2 (adjusting the 99s to correspond to the last data row) and drag/fill this down.  This searches for the wildcard string (which is in D3) within the E column, starting at the point of the previous match.  This returns the row number of the matches; e.g., “String D” (11011, which is in cell E5), is reported as 5.  If you want “String D” to be reported as 4, that’s an easy adjustment.

Here are the results for the sample data you suggested:

     11000
     10000
     11100
     00000

Of course cells D3 and G1 can be hidden.  And, if you really need to start your data in Row 1, that can be arranged.

P.S. I have tested this with 124-character long values, and both parts of the answer seem to work.

Scott

Posted 2015-10-22T21:48:40.583

Reputation: 17 653

Scott, it's wonderful that you spent so much time on this to help a random stranger. I'm sitting at my desk this morning-all smiles, because I wasn't sure I would find resolution here. I truly appreciate your help. – BearOnFire – 2015-10-27T11:36:53.137

Well, you clarified the question when I asked you to; it’s only fair that I acknowledge your effort.  I’m glad it helped you.  Just out of curiosity, which part of the answer are you using (Column F or G)? – Scott – 2015-10-28T00:56:46.053

I am using column f to hide all rows that do not give a result of true. I ended up with strings of 154 digits, so this would have been a nightmare any other way. Nice work! – BearOnFire – 2015-10-29T12:33:09.050

0

Say we have data in column EU from EU1 to EU100. To find cells with your pattern match use:

Sub Finderr()
   For i = 1 To 100
      v = Cells(i, "EU").Text
      If Mid(v, 1, 1) = "1" And Mid(v, 70, 1) = "1" And Mid(v, 91, 1) = "1" Then
         MsgBox i
      End If
   Next i
End Sub

will output which rows containing matches.

Gary's Student

Posted 2015-10-22T21:48:40.583

Reputation: 15 540

So each search would require the user to program a custom string? It would seem more practical if the user posts a position list (like 1, 70, 91), and the search is derived from that. (Maybe that's the new and improved version 2.0 available in the paid version of the application.) – fixer1234 – 2015-10-23T20:16:26.390

@fixer1234 My code is already compatible with your comments. – Gary's Student – 2015-10-23T20:34:49.560

Yeah, I was referring to the "extra cost" front end so that the test string is built as needed rather than hard coded (freeware vs. commercial versions of your application). – fixer1234 – 2015-10-23T20:43:08.910