Excel - power query (or anything else) - join on partial match between fields?

1

In power query (or MS query, or really anything else that would create a refresh-able table that gives me the desired results in excel), how can I join on partial matches between two fields?

I want to do something similar to this SQL statement:

SELECT T1.SIZES AS SIZES,
       T2.PARTNO as PARTNO
FROM TABLE1 AS T1
LEFT JOIN TABLE2 AS T2
ON T2.APPLICATIONS LIKE '% ' T1.SIZES ' %'

For example: say I have TABLE1 which has the following data in field SIZES

  • SIZES
  • P01
  • P02
  • P03
  • P04

And I have TABLE2, which has the following data in fields PART NO and APPLICATIONS

  • PARTNO__________APPLICATIONS
  • 11111______________P01/P02
  • 22222______________P02/P04
  • 33333______________P01/P04

I want to do a left outer join of TABLE 1 on TABLE2, and return any rows which have a partial match between TABLE1.SIZES and TABLE2.APPLICATIONS. The results would be the following:

  • SIZES___________PART NO
  • P01______________11111
  • P01______________33333
  • P02______________11111
  • P02______________22222
  • P03______________null
  • P04______________22222
  • P04______________33333

I tried to follow along this example of implementing a like function in power query, and was able to create the formula, but can't figure out how to use it as criteria for a "query merge" (join).

I'm a total novice in power query...

CBRF23

Posted 2016-08-18T19:52:41.810

Reputation: 343

Answers

1

I would not use a Function for this, I prefer solutions with minimal/no code as they are easier to test and maintain.

I would start with 2 queries, for TABLE1 and TABLE2. I would use the Add Column / Add Custom Column button on each to add a column. I would call it Dummy Match Key and set the formula as

= 1

Now back on the TABLE1 query, I would add a Merge step and join to TABLE2, using the Dummy Match Key, then Expand the original columns from TABLE2. Now you have a monster cross-join of every row from TABLE1 vs every row from TABLE2. Don't panic. Even at scale PQ handles this scenario well, better IMO than SQL or specialist ETL tools which tend to need huge amounts of memory.

Anyway, next I would use Add Columns / Add Conditional Column and specify:

New Column Name: Matched Row

If (column) APPLICATIONS contains (column) SIZES then (value) Yes

This will give you a column with a value of "Yes" in only the matched rows. Filter for those values and remove any columns you don't want and you are done.

Total lines of code = 0.

Mike Honey

Posted 2016-08-18T19:52:41.810

Reputation: 2 119