Remove/Change specific HTML tags in Notepad++

1

I have found many similar posts, but none of them answers my question. I would like to replace, remove, or change the opening and closing tags that contain a specific keyword. In this case I am trying to remove all tags with href="#" in them:

<a href="#">leave this text</a>
<a class="" id="" href="#">leave this text too</a>

<a href="http://......">Dont remove this tag!</a>

I have this code, but I can't figure out how to keep the text:

find: <a[^h]*href="#"[^>]*> (skip content) </a>
replace: (same content)
or
replace: <a href="something"> (same content) </a>

Alfonso

Posted 2015-10-14T10:15:00.333

Reputation: 113

By note++ you mean notepad++? – DavidPostill – 2015-10-14T10:26:30.510

Yes, I'm sorry. Notepad++ – Alfonso – 2015-10-14T10:29:50.650

That was too easy! ;) – DavidPostill – 2015-10-14T10:35:44.767

Answers

0

I am trying to remove all tags containing href="#"

  • Menu "Search" > "Replace" (or Ctrl + H)

  • Set "Find what" to <a .*?href="#">(.*?)</a>

  • Set "Replace with" to \1

  • Enable "Regular expression"

  • Click "Replace All"


Before:

<a href="#">leave this text</a>
<a class="" id="" href="#">leave this text too</a>
<a href="http://......">Dont remove this tag!</a>

After:

leave this text
leave this text too
<a href="http://......">Dont remove this tag!</a>
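
The Before/After result above can also be checked outside the editor. The following is a minimal sketch in Python, which is an assumption of this example (the answer itself only uses the Notepad++ Replace dialog). Python's `re` module is not the engine Notepad++ uses, but for a pattern this simple the behaviour should be the same. The variable names are illustrative only.

```python
import re

# Pattern from the answer: "Find what" <a .*?href="#">(.*?)</a>
pattern = re.compile(r'<a .*?href="#">(.*?)</a>')

# Sample data from the question.
before = (
    '<a href="#">leave this text</a>\n'
    '<a class="" id="" href="#">leave this text too</a>\n'
    '<a href="http://......">Dont remove this tag!</a>'
)

# "Replace with" \1 keeps only the captured link text.
after = pattern.sub(r'\1', before)
print(after)
```

Because `.` does not match a newline by default, each match stays on a single line, which mirrors leaving ". matches newline" unchecked in Notepad++.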

As pointed out by AFH in a comment, there is a better regular expression that will catch cases that were not included in the sample data.

  • Set "Find what" to <a .*?href="#" .*?>(.*?)</a>

    This matches tags that have further attributes after href="#" (and before the closing >). Because of the literal space after href="#", it does not match tags in which href="#" is the last attribute.

    Note:

    It will fail to work correctly if there is a > in the value of a later attribute (before the > that closes the <a tag).
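
To see how the pattern from the note behaves, here is a small sketch in Python (an assumption of this example; the thread itself only concerns Notepad++). The `variant` pattern is hypothetical and does not come from the thread: it uses `[^>]*` so that attributes are optional on both sides of href="#". The test strings are made up for illustration.

```python
import re

# Pattern quoted in the note above (literal space after href="#").
with_space = re.compile(r'<a .*?href="#" .*?>(.*?)</a>')

# Hypothetical variant, not from the thread: [^>]* keeps the match inside
# one tag and allows zero or more attributes around href="#".
variant = re.compile(r'<a [^>]*href="#"[^>]*>(.*?)</a>')

trailing_attr = '<a href="#" class="x">text</a>'   # attribute after href
href_last = '<a href="#">text</a>'                 # href is the last attribute
gt_in_value = '<a href="#" title="a>b">text</a>'   # ">" inside a value

print(with_space.sub(r'\1', trailing_attr))  # tag removed, text kept
print(with_space.sub(r'\1', href_last))      # unchanged: no space after href="#"
print(variant.sub(r'\1', href_last))         # tag removed, text kept
print(variant.sub(r'\1', gt_in_value))       # still breaks on ">" in a value
```

The last call shows the limitation described in the note: the first > inside the title value is taken as the end of the tag, so part of the attribute leaks into the output.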



DavidPostill

Posted 2015-10-14T10:15:00.333

Reputation: 118 938

I can't see any reason why your search string would not have worked with the examples the questioner gave, although his answer below does allow for clauses after the href as well as before: for this you would need <a .*?href="#" .*?>(.*?)</a>. In fact, this should be a better solution, as his would fail if there were another clause containing h before the href (e.g. in the class or id value). Both would fail if there were a > in the value field of a subsequent clause, and I see no easy fix for this (unlikely) case. – AFH – 2015-10-14T14:47:42.190

@AFH Agree with all of the above. However, my answer worked with the data the OP provided as input. I will add a note with your regular expression just for completeness. Thanks for your comment. – DavidPostill – 2015-10-14T14:49:56.427

0

Thank you, David, for the answer! But the code <a .*?href="#">(.*?)</a> did not find any matches in my file. It may be because of some other configuration or a different version of Notepad++. I had to use this code:

 Find:    <a[^h]*href="#"[^>]*>(.*?)</a>
 Replace: <a href="new_url">\1</a>
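
As a rough check of this Find/Replace pair, here is a sketch in Python (an assumption of this example, since the thread only uses Notepad++). It applies the pattern to the sample data from the question and to one made-up tag whose class value contains the letter h, the case AFH's comment warns about.

```python
import re

# Find and Replace strings from this answer.
find = re.compile(r'<a[^h]*href="#"[^>]*>(.*?)</a>')
replace = r'<a href="new_url">\1</a>'

samples = [
    '<a href="#">leave this text</a>',
    '<a class="" id="" href="#">leave this text too</a>',
    '<a href="http://......">Dont remove this tag!</a>',
]
rewritten = [find.sub(replace, s) for s in samples]
for line in rewritten:
    print(line)

# Made-up input: [^h]* cannot get past the "h" in "highlight",
# so the tag is left untouched.
with_h = '<a class="highlight" href="#">text</a>'
print(find.sub(replace, with_h))
```

The first two sample links get the new href and keep their text, the third is left alone, and the made-up tag is missed because of the h in its class value.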

Alfonso

Posted 2015-10-14T10:15:00.333

Reputation: 113

My answer was correct with the data you provided as input. I don't believe the version of Notepad++ will make any difference with this kind of regular expression. – DavidPostill – 2015-10-14T14:49:06.867