Extracting multiple email address from excel cell /excel file

0

Is there a way to extract multiple email address from excel cell /excel file

    |   A   |   B                                                                  
----|-------------
1   |HTML   |
    |Content|
    |   1   |
----|-------|-----
2   |HTML   |
    |Content|
    |   2   |
----|-------|-----
N Rows|     |

More or less the content is the same in columns

HTML Content in column one row one


cid=9498&peid= **xyz9000@gmail.com** &mbrid=cf7cc75d395cbf901f73967b1267db42 style="text-decoration: none;  color: blue;">Update</a> contact/membership info.<font style="font-size:14px" face="Verdana" color="#000000">&nbsp;&nbsp;|&nbsp;&nbsp;<font style="font-size:12px" face="Verdana" color="#3c3c3c"><a href=https:///fe_web_member_subscription_verify.php?oeid= 
**navee@centra.com**&cid=9498 style="text-decoration: none;  color: blue;">Subscribe</a> to receive requirements & resumes
                                                                    </font>        
                                                            </p>
                                                       </td>
                                            </tr>

                              <tr>
                                                   <td colspan="2"  valign="middle">
                                                            <p style="margin-left:0px;margin-right:0px;margin-top:0px; margin-bottom:0px;"> 
                                                                    <font style="font-size:15px" color="#3C3C3C" face="Verdana" >
                                        <br><u>From</u>:<br>
                                               Navee ,<br>
                                               Centra<br>                                            
                                                                           qwerty@centra.com<br>
                                                                           <span style="background-color: #FFFF00"><b>Reply to:</b></span> &nbsp;&nbsp;navee@centra.com<br><br>

                                                                    </font>        
                                                            </p>
                                                  </td>
                                            </tr>  

                                                  <tr>
                                                    <td  colspan="2" valign="middle">
                                                            <br><p><strong>Big 

HTML Content in column one row two


cid=9498&peid= **xyz9000@gmail.com** &mbrid=cf7cc75d395cbf901f73967b1267db42 style="text-decoration: none;  color: blue;">Update</a> contact/membership info.<font style="font-size:14px" face="Verdana" color="#000000">&nbsp;&nbsp;|&nbsp;&nbsp;<font style="font-size:12px" face="Verdana" color="#3c3c3c"><a href=https:///fe_web_member_subscription_verify.php?oeid= 
**navee@centra.com**&cid=9498

I tried to use online tools available on the internet but they are of no use / they extract only the first available email address in the excel field.

https://www.extendoffice.com/product/kutools-for-excel/excel-extract-email-address-from-string.html

There are n number of rows.

Is there a way to extract multiple email address from excel cell

Required output - extract emails and save them in a separate spread sheet or in the adjacent column

=========================


Input Format 3

#

For the following HTML code present in the excel cell I'm able to extract only the first occurence of the email that I get but not all the multiple email address mentioned.

            <p align="center"  style="margin-left:0px;margin-top:30px; margin-right:0px;margin-bottom:25px;">
                <table width="1024" cellspacing="0" cellpadding="0"  border="0" >
                        <tr>
                        <td colspan="2"  style="border-bottom: solid 1px #ff7373; " valign="bottom">
                            <p style="margin-left:0px;margin-right:0px;margin-top:0px; margin-bottom:0px;"> 
                                <font style="font-size:12px" color="#3C3C3C" face="Verdana" >
                                    <a href=https://house.com/powerhouse/fe_web_member_unsubscription.php?cid=4870&teid=volcan9@gmail.com&mbrid=cf7cc75d395cbf901f73967b1267db42 style="text-decoration: none; color: red;">Remove/unsubscribe</a><font style="font-size:14px" face="Verdana" color="#000000">&nbsp;&nbsp;|&nbsp;&nbsp;<font style="font-size:12px" face="Verdana" color="#3c3c3c"><a href=https://house.com/powerhouse/fe_web_member_update.php?cid=4870&peid=volcan9@gmail.com&mbrid=cf7cc75d395cbf901f73967b1267db42 style="text-decoration: none;  color: blue;">Update</a> contact/membership info.<font style="font-size:14px" face="Verdana" color="#000000">&nbsp;&nbsp;|&nbsp;&nbsp;<font style="font-size:12px" face="Verdana" color="#3c3c3c"><a href=https://house.com/powerhouse/fe_web_member_subscription_verify.php?oeid=mod.s@solutionsinc.com&cid=4870 style="text-decoration: none;  color: blue;">Subscribe</a> to receive requirements
                                </font> 
                            </p>
                            </td>
                    </tr>

                      <tr>
                        <td colspan="2"  valign="middle">
                            <p style="margin-left:0px;margin-right:0px;margin-top:0px; margin-bottom:0px;"> 
                                <font style="font-size:15px" color="#3C3C3C" face="Verdana" >
                                        <br><u>From</u>:<br>
                                       mod,<br>
                                       Solution<br>                                            
                                       mod.s@solutionsinc.com<br>
                                       <span style="background-color: #FFFF00"><b>Reply to:</b></span> &nbsp;&nbsp;mod.s@solutionsinc.com<br><br>

                                </font> 
                            </p>
                        </td>
                    </tr>  

                        <tr >
                            <td  width="800" valign="top"  style="border-top: solid 1px #333399; ">
                                <p align="justify" style="margin-left:0px;margin-right:0px;margin-top:0px; margin-bottom:0px;"> 
                                <font style="font-size:12px" color="#666666" face="Verdana">
                                     <a href="http://www.res.com/#!products/c5n6" style="text-decoration: none; color: red" title="Click to view more details">.</a>
                                </font>
                            </p>
                            </td>

                                <td    valign="top" style="border-top: solid 1px #333399; ">
                                    <p  align="right" style="margin-left:0px;margin-right:0px;margin-top:0px; margin-bottom:0px;">  
                                <font style="font-size:10px" color="#3C3C3C" face="Verdana">

                                </font>
                            </p>
                                </td>
                        </tr>
                </table><br>
            </p>

Matt

Posted 2018-02-11T15:34:45.987

Reputation: 11

Yes there are several ways of doing this. kutools is supposed to work for multiple email addresses. What other options have you tried, and what is the issue. Which of the various tags (unix, vba, google-spreadsheets, excel) do you want to use? What kind of output do you want? – Ron Rosenfeld – 2018-02-11T18:16:30.560

@RonRosenfeld Required output - extract emails and save them in a separate spread sheet or in the adjacent column. I would like to use the excel vba or which ever one is easier – Matt – 2018-02-12T04:56:34.713

Answers

1

This User Defined Function (UDF) will return an array of the existing email addresses in the referenced cell.

The individual addresses can then be retrieved in a variety of methods:

  • Use the INDEX function
  • Enter the function as an array across several cells
  • Use it in a macro to perform specialized actions that you may require.

The character of an email address is determined by the regex pattern sPat and described in detail below.

You put the UDF into a regular module

Option Explicit
Function ExtrEmail(S As String) As String()
    Dim sTemp() As String
    Dim RE As Object, MC As Object, M As Object
    Const sPat As String = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b" 'Regex for emails
    Dim I As Long

'Initialize Regex engine
Set RE = CreateObject("vbscript.regexp")
With RE
    .Pattern = sPat
    .Global = True
    .ignorecase = True
    If .test(S) = True Then
        Set MC = .Execute(S)
        ReDim sTemp(1 To MC.Count)
        I = 0
        For Each M In MC
            I = I + 1
            sTemp(I) = M
        Next M
    End If
End With
ExtrEmail = sTemp
End Function

One way of using it is shown in the screenshot below (using the INDEX function). The formula is entered in B1 and filled to the right and down. The IFERROR clause causes it to return a null string if there are fewer matches than cells filled.

Edit: Picture changed to show all three examples

enter image description here

Regex Pattern Explanation

Emailaddress

\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b

Options: Case insensitive; ^$ don’t match at line breaks

Created with RegexBuddy

Ron Rosenfeld

Posted 2018-02-11T15:34:45.987

Reputation: 3 333

@Matt If it is working for you, could you please mark my answer as accepted? Please read What should I do when someone answers my question?

– Ron Rosenfeld – 2018-02-13T11:41:14.660

I have tried your solution for the Input format 3. Unfortunately your solution is not working. It is extracting only the first occurence of the email. Not all the multiple emails present in the cell are being extracted. Not sure if the issue is because of the input data or I'm executing it wrong - =IFERROR(INDEX(ExtrEmail($A1),COLUMNS($A:A)),"") – Matt – 2018-02-19T21:38:48.367

@Matt Seems to work OK here with your just added Input Format 3 data. And your formula appears OK, assuming your data is A1. I'll post a screenshot of my results with all your info – Ron Rosenfeld – 2018-02-19T22:05:21.557