Excel extract substring from string

22

5

I am looking for a way to extract a variable length substring from a string.

My cells will look something like:

ABC - DEF
ABCDE - DEF
ABCD - ABC

I want to split the string at the - character, so the cells will become:

ABC
ABCDE
ABCD

This should be done with a formula and not VBScript.

I am using Excel 2010

EDIT

I found that the dataset doesn't always contain the - character, meaning there should be no change.

Pieter van Niekerk

Posted 2011-11-17T13:40:49.530

Reputation: 643

Answers

26

This problem can be broken down into two steps:

  1. Find the index in the string of your desired split character (in this case, "-" or " - ").
  2. Get the prefix substring from the beginning of the original text to the split index.

The FIND and SEARCH commands each would return the index of a given needle in a haystack (FIND is case-sensitive, SEARCH is case-insensitive and allows wildcards). Given that, we have:

FIND(search_text, source_cell, start_index)

or in this case:

FIND(" - ", A1, 1)

Once we have the index, we need the prefix of source_cell to do the "split". MID does just that:

MID(source_cell, start_index, num_characters)

Putting them both together, we have:

=MID(A1,1,FIND(" - ",A1,1))

with A1 having text of ABC - DEF gives ABC.

Andrew Coleson

Posted 2011-11-17T13:40:49.530

Reputation: 1 835

7

Expanding upon Andrew's answer based on your edit: to find the character string to split at, we are using the FIND function. If the FIND fails to locate the string given, it returns a #VALUE? error. So we will need to check for this value and use a substitute value instead.

To check for any error value including #VALUE, we use the ISERROR function, thus:

=ISERROR(FIND(" - ", A1, 1))

that will be true if the FIND function can't find the " - " string in the A1 cell. So we use that to decide which value to use:

=IF(ISERROR(FIND(" - ", A1, 1)), A1, MID(A1, 1, FIND(" - ", A1, 1)))

That says that if the find command returns an error, use the unmodified A1 cell. Otherwise, do the MID function that Andrew already provided.

Hellion

Posted 2011-11-17T13:40:49.530

Reputation: 241

2

Thank you @AndrewColeson for your answer.

So just to add to that, if you want everything on the right hand side of the -, use this code:

=MID(A1,LEN(B1)+3,LEN(A1))

Which is:

A1 = ABC - DEF
B1 = =MID(A1,1,FIND(" - ",A1,1))
    B1 = ABC
Therefore A1 = DEF

This code is great for if you have an undefined number of characters after the -.

For instance:

If you have:

ABC - DEFG
AB - CDEFGH
...

Kevdog777

Posted 2011-11-17T13:40:49.530

Reputation: 437

this does the trick also =MID(a1,FIND("-",a1)+1,LEN(a1)) without any "extra-cellular" references ) – Mikey – 2016-03-18T18:01:59.153

1

Here is a very simple way to extract the 5th character from the left from a text string in Excel:

Suppose the character string ABCDEFGHIJ is stored in cell A1 in an Excel Spreadsheet, then the following formula

=RIGHT(LEFT(A1,5),1)

produces the 5th character from the left in the string, namely E.

Ken Brown

Posted 2011-11-17T13:40:49.530

Reputation: 11

(Is there no MID function in Excel? I'm sure it exists in legacy Visual Basic, and it would be something like =MID(A1, 5, 1)) – Valmiky Arquissandas – 2014-08-20T01:56:39.183

0

The following formula will remove a substring from [TEXTCOLUMN_1]

e.g: if you want to convert -./thumb/hello.jpg to thumb/hello.jpg then use the following formula

=SUBSTITUTE([TEXTCOLUMN_1],LEFT([TEXTCOLUMN_1],[NUM_OF_CHARACTERS]),)

[TEXTCOLUMN_1] = the column name which you want to change [NUM_OF_CHARACTERS] = number of characters from left side which you want to remove

If you want to remove from right side then use following

=SUBSTITUTE([TEXTCOLUMN_1],RIGHT([TEXTCOLUMN_1],[NUM_OF_CHARACTERS]),)

Alam Zaib

Posted 2011-11-17T13:40:49.530

Reputation: 1