4 chars with slashes, 2 without
In the TXR language's regex engine, an empty character class [] matches no character, and therefore no string. It behaves this way because the character class requires a character match, and when it is empty it specifies that no character can satisfy it.
Another way is to invert the "set of all strings including empty" regex /.*/ using the complement operator: /~.*/. The complement of that set contains no strings at all, and so cannot match anything.
This is all documented in the man page:
nomatch
The nomatch regular expression represents the empty set: it
matches no strings at all, not even the empty string. There is
no dedicated syntax to directly express nomatch in the regex
language. However, the empty character class [] is equivalent
to nomatch, and may be considered to be a notation for it. Other
representations of nomatch are possible: for instance, the regex
~.* which is the complement of the regex that denotes the set of
all possible strings, and thus denotes the empty set. A nomatch
has uses; for instance, it can be used to temporarily "comment
out" regular expressions. The regex ([]abc|xyz) is equivalent to
(xyz), since the []abc branch cannot match anything. Using [] to
"block" a subexpression allows you to leave it in place, then
enable it later by removing the "block".
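For readers without TXR at hand, the "block a branch" idea from the man page can be sketched in Python. This is only an analogy, not TXR: Python's re module has no complement operator and rejects a literal empty class [], so the sketch assumes the empty negative lookahead (?!) as a stand-in for nomatch. The helper names never_matches and same_language are made up for this illustration, and the check is a brute-force test over short strings, not a proof.

```python
import re
from itertools import product

# Assumption: (?!) plays the role of TXR's [] here. It is an empty
# negative lookahead, which fails at every position.
NOMATCH = r"(?!)"


def all_strings(alphabet, max_len):
    """Yield every string over alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def never_matches(pattern, alphabet="abcxyz", max_len=3):
    """True if pattern finds no match in any short test string."""
    rx = re.compile(pattern)
    return all(rx.search(s) is None for s in all_strings(alphabet, max_len))


def same_language(p1, p2, alphabet="abcxyz", max_len=3):
    """True if p1 and p2 fully match exactly the same short test strings."""
    r1, r2 = re.compile(p1), re.compile(p2)
    return all(
        (r1.fullmatch(s) is None) == (r2.fullmatch(s) is None)
        for s in all_strings(alphabet, max_len)
    )


# The blocked branch can never succeed, so only xyz remains.
blocked = r"(?:" + NOMATCH + r"abc|xyz)"
plain = r"(?:xyz)"

print(never_matches(NOMATCH))
print(same_language(blocked, plain))
```

Removing the NOMATCH prefix from the first branch re-enables it, which mirrors removing the [] "block" in the TXR example.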
The slashes are not part of the regex syntax per se; they are just punctuation which delimits regexes in the S-expression notation. Witness:
# match line of input with x variable, and then parse that as a regex
#
$ txr -c '@x
@(do (print (regex-parse x)) (put-char #\newline))' -
ab.*c <- input from tty: no slashes.
(compound #\a #\b (0+ wild) #\c) <- output: AST of regex
This inspired a question from me. I'm going to wait a few days though. Don't want 2 regex questions active at the same time – Cruncher – 2014-01-13T21:31:59.417
"Valid" according to which implementation? I've just found an amusing one that Perl is okay with (and that is valid according to the only RE grammar I can find), but that grep and Python's re module refuse. – jscs – 2014-01-13T22:24:13.463
Yes, which dialect(s) of regex? There are many many different ones. – hippietrail – 2014-01-14T04:53:22.680
But what about Presidents' names? http://xkcd.com/1313/ – Carl Witthoft – 2014-01-14T14:05:59.143
@CarlWitthoft You need to be a program to participate in that contest: http://codegolf.stackexchange.com/q/17718/2180 – boothby – 2014-01-14T16:18:30.293
@boothby I might be an AI :-) – Carl Witthoft – 2014-01-14T16:32:49.800
Which syntax? Perl or POSIX? – Braden Best – 2014-02-05T18:37:48.237