How to extract reminder of a match in MS Excel 2003

0

I tried using various combinations of functions but it seems that there's no specific function can return non-searched/matched values. Doing it manually takes days cause of the large amount data that i need to sort through.

I want MS excel 2003 to extract the remainder of List A based on List B.

*List A is 2000 items, List B is only 10-30 max

List A
No.1----1 2 3 4 5 6 (each digit is placed in 1 cell, always 6 digits)
No.2----1 1 2 3 4 5 (each digit is placed in 1 cell, always 6 digits)
No.3----1 3 4 5 6 7 (each digit is placed in 1 cell, always 6 digits)

List B
No.1----1 2 3 (each digit is placed in 1 cell,always 3 digits)
No.2----1 1 4 (each digit is placed in 1 cell,always 3 digits)
No.3----2 3 5 (each digit is placed in 1 cell,always 3 digits)

For example:

In List A, find matches(if any) based on input from List B and return the remainder as output. If no match is found, no output is required.

List A
No.1----1 2 3 4 5 6 (each digit is placed in 1 cell, always 6 digits)

Based on List B
No.1----1 2 3 (match found (1&2&3 is present), then I select the remainder manually # # # 4 5 6 or = 456)
No.2----1 1 4 (no match found (1&1&4 is not present), no output)
No.3----2 3 5 (match found (2&3&5 is present), then i select 1 # # 4 # 6 or output= 146)

I figured that I use COUNT function to count the frequency of each digit 0-9 in every item on List A and then use IF & AND function (specifying which and how many each digit is required to qualify as a match)to tell me which items on List B matches List A.

So, for every item in List A, i require excel to run through the whole List B and the output could range from no output, to max 3 output.

I also want the ability to change the values in List B without altering the formula used to search so that I can use the same worksheet repeatedly with ease.

So far all my other attempts to use other functions have failed to extract the remainder in the manner that I want it to. If you have any suggestions please teach me.

Peter Vermillion

Posted 2014-10-30T07:58:09.167

Reputation: 25

I don't understand how you're picking the numbers from the table. – Raystafarian – 2014-10-30T10:09:49.300

@Raystafarian I refined what i wanted, i hope this is clear enough. – Peter Vermillion – 2014-10-30T12:21:36.860

Each of the characters needs to be evaluated independently? They won't always be in the same order (series) as list B? This is going to be very difficult without multiple helper columns and whatnot, especially if it's excel 2003. A macro might be able to do it better. – Raystafarian – 2014-10-30T14:14:01.200

Your matching rules don't make sense. In most cases, it appears you find a string match and the remainder is any numbers before or after the match. However, your example for the No.2 result looks like the list B 114 was matched to 11##4 to leave 235. – fixer1234 – 2014-10-30T15:20:14.000

So the matching criteria is simply all three digits in list B (always exactly three?), occurring in the same order in List A but not necessarily contiguously? Are both list items always in ascending order? – fixer1234 – 2014-10-31T01:57:15.797

@fixer1234 yes, always exactly 3 not necessarily contiguously.Yes,I painstakingly ordered List A and B to be in ascending order. – Peter Vermillion – 2014-10-31T02:36:41.993

List A is always exactly 6 numbers? – fixer1234 – 2014-10-31T02:51:43.477

@fixer1234 Yes, always exactly 6 digits arranged individually in 6 cells – Peter Vermillion – 2014-10-31T02:52:41.620

Roughly how big are the two lists? I can see a way to do it without VBA, but it would be impractical if the lists were both huge. – fixer1234 – 2014-10-31T03:36:34.613

@fixer1234 2000 items on List A, max 30 on List B – Peter Vermillion – 2014-10-31T04:01:30.437

I caught your comment right before you deleted it. There are 7 Bil people on the earth and problems involving data manipulation have been around for on the order of 70 years. Your question is the first time anyone has presented this problem, at least in this format. And then you have a second one. What are the odds? You've aroused my curiosity; what is the application? Are you doing some type of analysis by a method that could have a wider benefit if others applied it (or are you hogging a clever technique all to yourself)? – fixer1234 – 2014-11-01T15:32:58.263

Answers

1

OK, here is a solution that works but may give you brain damage setting it up. I built it a step at a time, calculating one set of things that were then used by the next calculations. Once I had a working model, I worked backwards, substituting the actual formulas for the cell references so that all of the formulas referred only to your actual lists and not intermediate calculations. The formulas mushroomed. In fact, the first attempt produced formulas that exceeded the cell capacity. I split it up into two tables, the first feeding the second. The tables are very large and you would have gone stark raving bonkers trying to get all of the cell references pointing to the right places to populate the formulas in two directions throughout the tables. So I added some indirect references so the formulas can simply be copied and pasted and will work without manual cleanup. Unfortunately, that produced some pretty big formulas.

I'll explain this as an example located in specific places on a spreadsheet. If you need to locate the pieces elsewhere, edit all of the row and column references in the first cell and then copy and paste to populate the tables. For your own sanity, set up some known examples so you can verify that the first couple of rows and columns in each table are working before you populate the whole thing. Take a couple of prophylactic aspirin and we'll begin.

This is based on your List A in columns A through F with data starting in row 1 (2,000 rows). List B is in columns H through J with data starting in row 1 (30 rows).

The first table starts in L1. This table creates a list of the positions of the List B entries in the List A records. For example:

                          Position:  1 2 3 4 5 6          
    So if a List A record contains:  1 3 3 5 7 9
    and a List B record contains:    1 3     7
    the entry in this table will be: 1 2     5  (stored as a single number: 125)

If the List B record doesn't match the List A record, there will be a #N/A in the cell. The layout of this table is like this:

            [L]     [M]      [N]       [O]  
    [1]             <=======List B Row========>
    [2] List A Row   1        2         3    ...
    [3]     1
    [4]     2
    [5]     3
        ...

You need to actually put the row numbers in as column headings in row 2 of columns M through AP and as row labels in column L. These are what the formulas use as pointers. There are 30 data columns, one for each row of List B entries, and you will have 2,000 rows, representing the entries in List A, starting in row 3. Each cell of the table reflects a List B entry vs. a List A entry. This is the formula for M3:

    =MATCH(INDIRECT("H"&M$2),$A1:$F1,0)&MATCH(INDIRECT("H"&M$2),$A1:$F1,0)
     +MATCH(INDIRECT("I"&M$2),INDIRECT(ADDRESS($L3,MATCH(INDIRECT("H"&M$2),$A1:$F1,0)+1, , )&":$F"&$L3),0)&MATCH(INDIRECT("H"&M$2),$A1:$F1,0)
     +MATCH(INDIRECT("I"&M$2),INDIRECT(ADDRESS($L3,MATCH(INDIRECT("H"&M$2),$A1:$F1,0)+1, , )&":$F"&$L3),0)
     +MATCH(INDIRECT("J"&M$2),INDIRECT(ADDRESS($L3,MATCH(INDIRECT("H"&M$2),$A1:$F1,0)
     +MATCH(INDIRECT("I"&M$2),INDIRECT(ADDRESS($L3,MATCH(INDIRECT("H"&M$2),$A1:$F1,0)+1, , )&":$F"&$L3),0)+1, , )&":$F"&$L3),0)

I've broken the formula up here to make it more readable but it is all one formula. Verify that you have it working in M3 through N4 with some sample data and then copy and paste to fill the table.

The second table starts in AR1. This table is structured the same way:

           [AR]    [AS]      [AT]      [AU]  
    [1]             <=======List B Row========>
    [2] List A Row   1        2         3    ...
    [3]     1
    [4]     2
    [5]     3
        ...

This table works similar to the first--each cell represents the results of a List B record vs. a List A record. This table contains your remainder. So in the example I gave for the first table, the remainder would be 359:

    So if a List A record contains:  1 3 3 5 7 9
    and a List B record contains:    1 3     7
    the remainder is:                    3 5   9

The formula that goes in cell AS3 is:

    =IF(ISNA(M3),"",IF(ISERROR(FIND(COLUMN(INDIRECT("a"&$AR3)),M3)),INDIRECT("a"&$AR3),"")&
     IF(ISERROR(FIND(COLUMN(INDIRECT("b"&$AR3)),M3)),INDIRECT("b"&$AR3),"")&
     IF(ISERROR(FIND(COLUMN(INDIRECT("c"&$AR3)),M3)),INDIRECT("c"&$AR3),"")&
     IF(ISERROR(FIND(COLUMN(INDIRECT("d"&$AR3)),M3)),INDIRECT("d"&$AR3),"")&
     IF(ISERROR(FIND(COLUMN(INDIRECT("e"&$AR3)),M3)),INDIRECT("e"&$AR3),"")&
     IF(ISERROR(FIND(COLUMN(INDIRECT("f"&$AR3)),M3)),INDIRECT("f"&$AR3),""))

Each cell in this table will contain either the remainder or a null character if there was no match.

You wanted to get a summary of the results for each List A record. Since each row of the table represents a List A record, the summary can go at the end of each row of the table. The 30 columns of the table end at column BV, so the results are in column BW. The formula for BW3 will be:

    =AS3&IF(ISBLANK(AS3),""," ")&AT3&IF(ISBLANK(AT3),""," ")& ... &BV3&IF(ISBLANK(BV3),""," ")

Rather than show all 30 terms here, this shows just the first two and the last. Follow the same pattern to add the rest. It builds the result string by concatenating the results of each match. If there is a value in a cell, it adds a space before the next value. If you want a different delimiter, change the space to something else, like comma space. Copy this formula down column BW for all of the rows.

This probably isn't the most useful place for the results. Once you have everything working you can move things around. Actually, if you move anything, you may have massive cleanup of cell references. It would make more sense to just create the output you want in another location and use cell references to refer to what is already set up.

fixer1234

Posted 2014-10-30T07:58:09.167

Reputation: 24 254

@Peter Vermillion - Column L is the row numbers for List A, which the formulas use as an index. The same for column AR. You need to insert those numbers to act like row labels for the tables. The row numbers for List B you need to enter as column headings in row 2, columns M through AP for the first table and columns AS through BV for the second table. For the formulas to work, you need to actually create the row and column headings to look like the examples in the answer. Take more aspirin and try again. – fixer1234 – 2014-11-01T02:00:01.457

taking more aspirin...trying again – Peter Vermillion – 2014-11-01T02:16:19.540

works like a dream – Peter Vermillion – 2014-11-01T15:17:17.390