Extract number from complex string if it falls in a given range

4

I need an Excel formula (or VBA macro) that will allow me to extract a value from a string. The string is a sequence of words, separated by spaces, in a single cell. I want the word (representing a bike size) that is either

  • a number (presumably an integer, but this is not specified) between 47 and 60 (or some other range, perhaps to be specified dynamically), or
  • one of the strings "sm", "med", or "lg".

I expect that there will be exactly one qualifying word in the string, so any reasonable error-handling response to

  • no qualifying words, or
  • multiple qualifying words

will be acceptable. The size may be at various positions in the string. Examples:

Cervelo P2 105 5800 56 '15                        the number 56 is the desired result
Cervelo P2 105 54 6000 '15                        the number 54 is the desired result
Cervelo P3 105 5800 60 '15                        the number 60 is the desired result
Cervelo P2 105 5800 sm '15                        the string sm is the desired result

I'm interested only in whole words, so 58 (substring of "5800") does not qualify.

Right now I am stripping off the '15 and then extracting the last two digits. But this approach works only if the bike size is the second to last value. However, as shown above, there are cases where the size is at other positions in the string.

How can I do this with a formula or VBA macro in Excel?

carroll

Posted 2015-08-23T15:55:38.650

Reputation: 43

1How do you know the number of interest is the 56 and not the 58? What if, as in this case, the string contains more than one match? Every solution I can think of would match the first instance it finds that meets the criteria. Also, you suggest two ranges in the same question. Is the range intended to be entered as variables so it can be changed without modifying formulas everywhere? – fixer1234 – 2015-08-23T18:26:07.430

Thanks for your help. Each record will only have one occurance of – carroll – 2015-08-25T04:33:21.830

a number that is in the range of 44 - 62. These are bike sizes - so if I can define this range(44-62), the formula will only find one number in this range for each record Thanks Carroll – carroll – 2015-08-25T04:37:24.730

Then can you clarify your example? As described, both the 58 and the 56 meet your criteria. It will be hard for people to provide useful answers if they don't fully understand the requirements. – fixer1234 – 2015-08-25T04:39:31.893

(1) You say that this question is about Excel.  So, please clarify: is that string a single cell value, with embedded spaces, or is it six cell values?  (2) Rather than say, "there are cases where the size is at various positions in the string", please show some additional examples of what the data might look like.  (3) And what's the range?  You have said 44-62, 47-60, and 47-62. – Scott – 2015-08-25T07:25:45.280

fixer1234 - my original post was answered by jcbermu with a VBA program which works great. I have spent the last two days testing the program and had tried to answer the questions asked by scott and others. Today I tried to get back with jcbermu to say the his VBA solution works great and ask a couple of questions, but the post had been closed. I will try to clarify further; two records might be 1. Cervelo P2 105 54 6000 '15 and 2. Cervelo P3 105 5800 60 '15 - where the bike sizes are 54 and 60. I want a solution that will extract the 54 and 60 and ignore other # hence the range44-62. – carroll – 2015-08-26T06:59:03.570

1“I am … not sure what I did wrong.” Seriously?  I asked you to clarify the range, since you had specified multiple values.  Not only did you leave both 47-60 and 47-62 in the question, but you added 44-62 (which you had previously mentioned in a comment) to the question.  (Also, the example inputs (and corresponding desired results) that you put in a comment today should have been edited into the question, and the remarks that you edited into the question (“I am not sure what to do next. I am also not sure what I did wrong.”) should have posted in comments (or not at all).  … (Cont’d) – Scott – 2015-08-26T17:03:42.583

(Also, since the general subject area of a question (e.g., Excel) is identified by the tags, we prefer not to repeat it in the title.)  But you’re new to Super User (note, btw, that we refer to it as a two-word name), so it’s understandable that you haven’t mastered all of our subtle rules in four days.  I have voted to reopen the question.  (It would still be nice if you would clarify the range, and not have conflicting values.) – Scott – 2015-08-26T17:04:51.060

@carroll: The site works a little different from others you might be used to, so a quick response on your comment about jcbermu answering the question. SU's purpose is to build a knowledgebase of questions and answers for other people to reference. In essence, you contribute a question in exchange for possibly getting an answer yourself. What's important is that the question be understandable by others. Your getting an answer is great, but not the driver. Scott did a great job of clearing up the ambiguity and I voted to reopen. – fixer1234 – 2015-08-26T17:48:05.370

fixer1234 - thanks to everyone for their help and suggestions for clarification. Next time I will be able to respond more effectively – carroll – 2015-08-28T16:07:25.267

Answers

5

I made a solution using VBA:

Public Function BikeSize(MinSize As Integer, MaxSize As Integer, datainput As String)
    Dim dataoutput() As Variant
    ReDim dataoutput(0)
    BikeSize = 0
    datasplitted = Split(datainput, " ")
    arraysize = UBound(datasplitted)
    j = 1
    For i = 0 To arraysize
        m = datasplitted(i)
        If m >= MinSize And m <= MaxSize Then
            ReDim Preserve dataoutput(j)
            dataoutput(j) = m
            j = j + 1
        End If
        If m = "sm" Or m = "med" Or m = "lg" Then
            ReDim Preserve dataoutput(j)
            dataoutput(j) = m
            j = j + 1
        End If
    Next i
    totalresults = UBound(dataoutput)
    Select Case totalresults
        Case 0
            BikeSize = 0
        Case 1
            BikeSize = dataoutput(totalresults)
        Case Else
            For i = 1 To totalresults
                wrongresult = wrongresult & dataoutput(i) & " - "
            Next i
            BikeSize = wrongresult
    End Select
End Function

You have to open Macros /Visual Basic go to Modules _> Add Module and paste the code on the right side (also see How do I add VBA in MS Office?).

Then if your string is on cell A1, your min value on C1, your max value on D1 then on B1 you simply put =BikeSize(C1,D1,A1) to get the result.

If the string doesn't have a matching number it outputs a zero 0.

If the string has more than one matching number it outputs all matching separated by a dash.

It also recognizes sm, med and lg.

jcbermu

Posted 2015-08-23T15:55:38.650

Reputation: 15 868

Thanks for your help - I will try it and let you know, Carroll – carroll – 2015-08-25T04:40:06.753

jcbermu - your VBA solution works great. Sorry for the delay in getting back to you. I do have a couple of questions, but I got a message from fixer1234 that the post was closed. I am new at this and not sure of the status of this post – carroll – 2015-08-26T06:17:54.190

@carroll Great!!! What do you want to know? – jcbermu – 2015-08-26T08:11:34.530

jcbermu, Thanks much for the solution. Questions 1) Where you specify the range 44-62, can I reference cells in two adjacent columns so I can chance the low and hi of the range as I copy the "bikesize" VBA down the spreadsheet. 2) Can I also extract sm, med,, lg in this same VBA program. Sometimes the bike size is expressed using sm, med, lg instead of 54 or 58, or 48, etc....Obviously, I know little about VBA programming, but thanks to you, I am learning Your help is very appreciated and your explanation of how to insert the VBA program really helped - where do I rate your solution? – carroll – 2015-08-26T14:46:03.023

@carroll I modified my answer (code and explanations) to reference to cells. Now on B1 you have to put for example: =BikeSize(C1,D1,A1) if C1 is the cell of Min size, D1 is the cell of the Max size and A1 the string. It also recognizes "sm", "med" and "lg". – jcbermu – 2015-08-26T15:07:04.827

1

@carroll: You can express your appreciation/approval for an answer by "accepting" it (click on the check mark to the left) and/or upvoting it (click on the upward-pointing arrowhead).  Notes: only the author of a question can accept an answer, and you can accept only one answer per question.  Anybody who has a reputation of 15 or more can upvote any number of posts.  See the Help Center for more information.

– Scott – 2015-08-26T17:12:57.250