Regexp engine misbehaves in Atom

0

I'm trying to remove all lines starting with whitespace characters from a big text file using Atom. The regular expression I use is ^[\s]+.*$. The problem is, it selects not only lines starting with whitespace, but also one line after them. The file is in UTF-8 and most characters are Cyrillic. What am I doing wrong?


John Ashpool

Posted 2015-10-03T12:36:59.473

Reputation: 189

Answers

0

  • Goal: remove every line that begins with whitespace, together with the newline at its end.
  • Pattern to use: ^\n|(^[ \t]+.*\n*)
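  • Remark: [\s] matches any whitespace character, including newlines, so ^[\s]+ can run past the end of an empty or whitespace-only line and pull the following line into the match. [ \t] matches only spaces and tabs, so it stays on one line.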
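
To see the difference outside the editor, here is a minimal sketch in Python. It assumes Python's re module with re.MULTILINE treats these particular constructs the same way Atom's search does (Atom uses JavaScript regular expressions, so this is an approximation, not Atom's own engine). The sample text and the helper names matched_spans and remove_matches are made up for illustration.

```python
import re

# Hypothetical sample: one indented line, one empty line, and plain lines.
SAMPLE = "first\n  indented\n\nkeep me\nlast\n"

ORIGINAL = r"^[\s]+.*$"            # pattern from the question
SUGGESTED = r"^\n|(^[ \t]+.*\n*)"  # pattern from this answer


def matched_spans(pattern, text):
    """Return the full text of every match, with ^ and $ working per line."""
    return [m.group(0) for m in re.finditer(pattern, text, flags=re.MULTILINE)]


def remove_matches(pattern, text):
    """Delete every match, as a replace-all with an empty replacement would."""
    return re.sub(pattern, "", text, flags=re.MULTILINE)


# On the empty line, [\s]+ consumes the newline itself and .* then takes
# the next line, so "keep me" ends up inside the second match.
print(matched_spans(ORIGINAL, SAMPLE))

# [ \t]+ cannot cross a line break; the trailing \n* also takes the
# empty line that follows the indented one.
print(repr(remove_matches(SUGGESTED, SAMPLE)))
```

With this sample, the question's pattern matches "keep me" together with the empty line before it, while the suggested pattern leaves "keep me" alone.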

user193661

Posted 2015-10-03T12:36:59.473

Reputation: 499

This matches only non-empty lines. However, empty lines can later be removed with ^(?:[\t ]*(?:\r?\n|\r))+. The problem is solved. – John Ashpool – 2015-10-03T13:08:01.517

Ok, this should do it all: ^\n|(^[ \t]+.*\n*) – user193661 – 2015-10-03T18:27:15.950