Google Sheets RegexReplace and Empty Cells

0

With the help of this website and a couple of others I've nearly tackled this work related project that I originally thought far beyond my capabilities. This community has been amazingly helpful and want to thank everyone for that. Last thing I'm stuck on is currently every time I add/update data on my sheet it takes an excessive amount of time to re-calculate all the formulas being used. I believe the main reason why this is happening is due to lots of blank cells between data in one of my columns.

Here's the deal. I have a google sheet with 2 pages. Page one called Raw has 3 columns for importing "raw" data copy/pasted from elsewhere. Page two called Clean is 3 columns collecting the cleaned up data that I need. Column A and C on the clean sheet are fine and don't have many/any blank cells. Column B due to the source of the raw data and what I'm extracting out of it I'm left with significant amount of blank cells between the strings of data I actually need. Here is the formula that I'm using in column B of my clean data page: =IFERROR(REGEXREPLACE(Raw!B2,"[$][0-9]{1,4} · ","")) which as you can probably see pulls specific strings from the B column of the raw page. All the other info that doesn't match the regex leaves me with blank cells in my clean sheet.

Can anyone recommend the best way to eliminate the empty cells on the clean sheet while still pulling the info I need? Slow re-calculation is the last hurdle I need to solve before I can deploy this to use at work and make my day to day life so much easier.

Thanks in advance for any suggestions and sorry for the wall of text!

Joel

Posted 2019-02-23T06:46:31.237

Reputation: 23

Are the raw data columns unrelated? That is, for any one row of raw data, are the columns of this particular row related to one another? The OP statements' columns A and C on clean do not have many blanks and do not propagate the blank rows of column B where the expression is not found seems to imply a row on the clean sheet will consist of data from different rows of the raw sheet. – Ted D. – 2019-02-23T16:34:37.820

That is correct they are unrelated. I left out specifics in the interest of not making a gigantic wall of text but to put it simply the raw data columns are holding data I copy/paste from 3 different websites. I work at a small mom and pop style retail store and we list all of our inventory on our website as well as FB marketplace and LetGo platforms. We use identical titles for each item on each website. This sheet pulls the titles from all 3 sites and compares them against eachother to check for discrepancies and ensure accurate listings on all 3 sites. – Joel – 2019-02-23T20:40:55.327

Answers

0

Google spreadsheet solution since REGEXREPLACE is not available in Excel. Add an IFERROR wrapper. Copy down so row( $A1) will increment.

=ARRAYFORMULA( INDEX( REGEXREPLACE( $B$2:$B$1000, "[$][0-9]{1,4} · ", ""), SMALL( IF( REGEXREPLACE( $B$2:$B$1000, "[$][0-9]{1,4} · ", "") <> "", ROW($B$2:$B$1000) - ROW($B$2) + 1, ""), ROW($A1))))

Ted D.

Posted 2019-02-23T06:46:31.237

Reputation: 750

This formula actually does work and removed the blank cells from the column in question. It had a side effect of totally breaking the conditional formatting I had setup to highlight when there was discrepancies between the 3 columns so I'll have to play around with it and see if I can fix that issue. Thanks a bunch my friend! – Joel – 2019-02-23T23:43:09.873

0

I would think it easier to make a modification to your original approach. IFERROR isn't really appropriate as it doesn't return an error if there is no match.

Change:

=IFERROR(REGEXREPLACE(Raw!B2,"[$][0-9]{1,4} · ","")) 

To:

=REGEXREPLACE(Raw!B2,"[$][0-9]{1,4} · (.*)","$1")

Basically, create a capturing group for the "keep" stuff by surrounding it in parenthesis.

(.*)

Then drop the match stuff at the beginning. Then you reference the capturing group with:

$1

This will result similar to as follows:

Cell B1 Formula: =REGEXREPLACE(A1,"[$][0-9]{1,4} · (.*)","$1")

  |         A         |       B        |
  |____________________________________|
1 | $55 · Barley      | Barley         |
2 | $2 · Squid        | Squid          |
3 | Bread             | Bread          |
4 | $27 · Gun Powder  | Gun Powder     |
5 | 3" of Filament    | 3" of Filament |

Anything that doesn't match [$][0-9]{1,4} · will pass through your RegEx filter.

Tim C

Posted 2019-02-23T06:46:31.237

Reputation: 1

First of all, welcome to Super User! We are always glad to help, but you apparently have two Super User accounts: this one and this one. Please take the time to utilize the following Help Center tutorial and ask the Super User staff to merge your accounts: I accidentally created two accounts; how do I merge them?

– Run5k – 2020-01-11T18:30:21.050