Regular Expression to find a pangram

1

Awhile back I encountered a simple programming puzzle to determine whether a string was a pangram, or a perfect pangram. (Although it's worth noting that this particular challenge gave a different meaning than the standard definition of a "perfect pangram," to mean any sentence where every letter appears n times for some positive n). And it was a fun challenge. But recently it got me wondering whether it would be possible to solve this challenge purely with regular expressions. If it is, I expect it will be a pretty gnarly regex. But I'm not even sure it is possible.

Still, if anyone is able to do it, it would be the folks here at Code Golf. So here's the challenge:

Write a regular expression which can determine whether or not a given string is a pangram. You can assume that all non-alpha characters have already been stripped out (or don't, if you really want to challenge yourself), and that everything has been transformed into lowercase.

Bonus challenge: write another regular expression to determine whether the given string is a "perfect pangram" by the definition given above.

Examples of an ordinary pangram:

thequickbrownfoxjumpsoverthelazydog

Example of perfect pangrams:

abcdefghijklmnopqrstuvwxyz

jocknymphswaqfdrugvexblitz

squdgykilpjobzarfnthcwmvexhmfjordwaltzcinqbuskpyxveg

I'm not really sure if this is even possible just with regex alone. If you find that it is, kudos!

soapergem

Posted 2018-09-25T21:33:23.690

Reputation: 263

Question was closed 2018-09-26T03:05:54.200

2Pity that Martin is on leave – Luis Mendo – 2018-09-25T21:53:24.280

1Pangram, not anagram – soapergem – 2018-09-25T22:02:30.227

Quoted verbatim from the question above (read the question): "any sentence where every letter appears n times for some positive n." – soapergem – 2018-09-25T22:05:57.683

8Let me rephrase, as my previous comment was a mess: 1) This needs a winning criterion. Code golf? 2) Bonuses are almost always a bad idea. There should be a single task/objective to the challenge. Having two is confusing, and probably useless, as people will go fot the easier one. 3) What flavours of regexes are allowed? – Luis Mendo – 2018-09-25T22:20:57.937

2

Not sure what regexes are able to do, but the grammar is certainly not regular which can easily be proven with the Pumping Lemma (just set $w = a^p b^p \cdots z^p$) - so it's likely a regex won't not do the job (it's regular-expression after all).

– ბიმო – 2018-09-26T16:09:27.180

Thinking about this a bit more: I don't have background in theoretical CS/automata theory, so I could be missing something, but doesn't the pumping lemma allow you to duplicate the entire string? In your example, ww would also be a valid perfect pangram. – Jack Brounstein – 2018-09-26T19:24:25.360

@BMO I'm well aware that regex is not well-suited for this task... but that's part of the fun, and why I'm asking here at code golf. – soapergem – 2018-09-26T19:26:23.963

Answers

5

Assuming you're using a flavor of regex that include look-aheads, detecting a pangram is straightforward, but tedious. Using Python syntax:

r"(?=.*a)(?=.*b)(?=.*c)(?=.*d)(?=.*e)(?=.*f)(?=.*g)(?=.*h)(?=.*i)(?=.*j)(?=.*k)(?=.*l)(?=.*m)(?=.*n)(?=.*o)(?=.*p)(?=.*q)(?=.*r)(?=.*s)(?=.*t)(?=.*u)(?=.*v)(?=.*w)(?=.*x)(?=.*y)(?=.*z).*"

https://regex101.com/r/Z7NjEQ/2

The (?=.*a) block tests if what follows after that point matches .*a; another way to say that is, is this expression followed by any characters and then an a, or just, does this include an a? Then repeat that check for every letter of the alphabet.

Without lookaheads, it's still possible, but even more tedious, because you have to account for every possible ordering of the alphabet. If you were limited to the letters abc, here's what it might look like:

r".*(a.*b.*c|a.*c.*b|b.*a.*c|b.*c.*a|c.*a.*b|c.*b.*a).*"

https://regex101.com/r/OnXQe7/1/

This checks for an a, any set of characters, a b, any set of characters, then a c, OR an a, any set of characters, a c, any set of characters, then a b, OR...you get the idea.

You could use the same approach to check for the standard meaning of perfect pangram (every character exactly once), but not the modified version used here.

Jack Brounstein

Posted 2018-09-25T21:33:23.690

Reputation: 381

This doesn't work for abcdefghijklmnopqrstuvwxyza..

– ბიმო – 2018-09-26T16:06:35.177

That link shows abcdefghijklmnopqrstuvwxyza as matching the regex, which is correct, though? I'm specifically answering the ordinary pangram question, not the perfect pangram. – Jack Brounstein – 2018-09-26T16:11:31.850

Nvm, I misunderstood the question.. I thought that an ordinary pangram is just a generalized pangram where $n = 1$. Sorry about that! – ბიმო – 2018-09-26T16:15:38.063

No worries. If you want a perfect pangram with n=1, you could specify that the matching string has exactly 26 characters: https://regex101.com/r/Z7NjEQ/4 You could also extend this to work for any given n, but I don't know of a way to extend it to all n.

– Jack Brounstein – 2018-09-26T16:22:20.853

True, yeah I think it's not possible for all n (see my comment on the challenge itself), though I'd be happy to be proven wrong. – ბიმო – 2018-09-26T16:30:04.700

1

I believe you're right for pure regular expressions, but many regex engines have support for additional features/types of match that might allow it. Regular expressions shouldn't be able to match palindromes, either, but PCRE can do it. I don't know enough about those advanced features to see how it would work, but I'd lean very slightly toward it being possible.

– Jack Brounstein – 2018-09-26T18:09:18.223

1

Regexp: 45 43 20 19 18 bytes

(?!.*(.).*\1).{26}

Assuming the input is [a-z] (lowercase, only alphabets a to z, spaces etc removed)

A byte saved, thanks to Neil

FatalError

Posted 2018-09-25T21:33:23.690

Reputation: 119

1If you're assuming the input format then . instead of \w? – Neil – 2018-09-25T22:41:04.077

@Neil oh, thanks, stupid me – FatalError – 2018-09-25T23:23:28.260

@Shaggy "You can assume that all non-alpha characters have already been stripped out", just notified about the assumption – FatalError – 2018-09-25T23:24:11.857

1This doesn’t work for either the regular challenge or the bonus. Doesn’t match thequickbrownfoxjumpsoverthelazydog (regular pangram) or squdgykil... (the bonus). – FrownyFrog – 2018-09-26T00:29:23.983