Parse RegEx in Excel

0

Suppose I have some Excel spreadsheet with the following column:

A-345 hello

hola B-389

nice C-6231

what A-12287 who

just A-7819 C-45556

As you can see - I have a text combined with some RegEx consists of A,B or C, hyphen and some number with up to 5-6 digits. Some of these expressions appear more than once in some lines.

I would like to parse this RegEx (which varies somewhat between lines) to another column. If there are multipile expressions in some line (like the 5th line in my example), then I would like to seperate them by any character which is not the hyphen (space would do it).

Is there any built-in function in Excel that can allow me to do so? Maybe some VBA script?

I found this thread, but I would like to see first whether I can do it without any external add-on or software. Does this specific add-in supports what I need?

gbi1977

Posted 2017-08-08T18:47:19.723

Reputation: 139

What is your expected output? – DavidPostill – 2017-08-08T21:05:31.577

@DavidPostill a seperated column with the regex (A-345, B-389, etc.) in each line. if there are multipile expressions in one line (like the given 5th line) then I would like to have all of them in this single column, seperated by space, or a comma. – gbi1977 – 2017-08-09T20:22:12.093

Answers

1

If you want a built in function, you can do this array formula:

=TEXTJOIN(" ",TRUE,IFERROR(--LEFT(TRIM(MID(SUBSTITUTE(A1,"-",REPT(" ",999)),ROW(INDIRECT("1:" & LEN(A1)-LEN(SUBSTITUTE(A1,"-",""))))*999,999)),FIND(" ",TRIM(MID(SUBSTITUTE(A1,"-",REPT(" ",999)),ROW(INDIRECT("1:" & LEN(A1)-LEN(SUBSTITUTE(A1,"-",""))))*999,999))&" ")-1),""))

Being an array formula it needs to be confirmed with Ctrl-Shift-Enter instead of Enter when Exiting edit mode. If done correctly then Excel will put {} around the formula.

Note:

This will return the first number grouping after each -

enter image description here


TEXTJOIN() is available with a subscription to Office 365 Excel. If you do not have that then add this code to a module attached to the workbook and use the formula as described above:

Function TEXTJOIN(delim As String, skipblank As Boolean, arr)
    Dim d As Long
    Dim c As Long
    Dim arr2()
    Dim t As Long, y As Long
    t = -1
    y = -1
    If TypeName(arr) = "Range" Then
        arr2 = arr.Value
    Else
        arr2 = arr
    End If
    On Error Resume Next
    t = UBound(arr2, 2)
    y = UBound(arr2, 1)
    On Error GoTo 0

    If t >= 0 And y >= 0 Then
        For c = LBound(arr2, 1) To UBound(arr2, 1)
            For d = LBound(arr2, 1) To UBound(arr2, 2)
                If arr2(c, d) <> "" Or Not skipblank Then
                    TEXTJOIN = TEXTJOIN & arr2(c, d) & delim
                End If
            Next d
        Next c
    Else
        For c = LBound(arr2) To UBound(arr2)
            If arr2(c) <> "" Or Not skipblank Then
                TEXTJOIN = TEXTJOIN & arr2(c) & delim
            End If
        Next c
    End If
    TEXTJOIN = Left(TEXTJOIN, Len(TEXTJOIN) - Len(delim))
End Function

Scott Craner

Posted 2017-08-08T18:47:19.723

Reputation: 16 128