In "XKCD #936: Short complex password, or long dictionary passphrase?", Jeff claimed that password cracking with "dictionary words separated by spaces", or "a complete sentence with punctuation", or "leet-speak numb3r substitution" is "highly unlikely in practice".
However, there are certainly John the Ripper (JtR) rulesets that perform l33t-speak substitutions and "pre/append punctuation or 1-4 digits". Some are even included in the default set, and more are discussed e.g. at https://www.owasp.org/images/a/af/2011-Supercharged-Slides-Redman-OWASP-Feb.pdf, which talks about cracking ~50K "corporate" passwords using such techniques; http://contest-2010.korelogic.com/rules.html has some of the specific JtR rules used. I can't see any reason why an attacker wouldn't use them.
Jeff used rumkin.com in part to justify his claim that Tr0ub4dor&3 is in practice as secure as a 4-word-from-2K-wordlist passphrase. But rumkin.com doesn't seem to take into account l33t-speak substitutions in determining the entropy.
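To make the gap concrete, here is a rough sketch of the arithmetic (my own illustration, not how rumkin.com actually computes its score). It assumes the passphrase words are drawn uniformly from a 2048-word list, and that a naive checker treats every character as an independent draw from the 94 printable non-space ASCII characters. The helper name `uniform_entropy_bits` is made up for this example.

```python
import math

def uniform_entropy_bits(choices: int, picks: int) -> float:
    """Entropy in bits of `picks` independent, uniform draws from `choices` options."""
    return picks * math.log2(choices)

# Four words chosen uniformly from a 2048-word (2K) list.
passphrase_bits = uniform_entropy_bits(2048, 4)

# Naive per-character model: each character of Tr0ub4dor&3 is treated as a
# random draw from 94 printable non-space ASCII characters (an assumption
# about what a simple checker does, not a statement about any specific site).
naive_bits = uniform_entropy_bits(94, len("Tr0ub4dor&3"))

print(f"passphrase: {passphrase_bits:.1f} bits")
print(f"naive per-character estimate: {naive_bits:.1f} bits")
```

Under the per-character model the 11-character password comes out well ahead of the 44-bit passphrase, whereas the comic itself puts Tr0ub4dor&3 at roughly 28 bits once you account for it being a mangled dictionary word. That discrepancy is exactly what I'd like a checker to capture.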
So my question is: Are there any password strength checkers which take account of the limited entropy added by l33t-speak and similar substitutions and concatenations?
Ideally the rating would closely correspond with the amount of time that an actual attacker (using "state of the art" techniques, not just character-by-character brute-forcing) would take to find the password. Those techniques would include l33t-speak transformations, common passwords, wordlists, etc.
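As a very rough sketch of the kind of scoring I have in mind (an illustration only, not an existing tool): undo the common manglings, look the result up in a wordlist, and charge only a few bits for each mangling step. Everything here is an assumption for the sake of the example: the tiny `WORDLIST`, the `LEET` table, the 4-character suffix limit, and the bit costs are all made up, and the suffix stripping would mis-handle words that end in a substituted letter.

```python
import math

# Hypothetical stand-ins; a real checker would load large cracking wordlists.
WORDLIST = {"troubador", "password", "correct", "horse", "battery", "staple"}
LEET = {"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}
LEET_TABLE = str.maketrans(LEET)
LEETABLE = set(LEET.values())   # letters that have a l33t-speak replacement
MAX_SUFFIX = 4                  # assumed limit on appended digits/punctuation

def de_mangle(password: str) -> tuple[str, str]:
    """Strip up to MAX_SUFFIX trailing non-letters, then undo l33t-speak and case."""
    core, suffix = password, ""
    while core and not core[-1].isalpha() and len(suffix) < MAX_SUFFIX:
        suffix = core[-1] + suffix
        core = core[:-1]
    return core.translate(LEET_TABLE).lower(), suffix

def estimate_bits(password: str, wordlist_size: int = len(WORDLIST)) -> float:
    """Crude entropy estimate that charges little for predictable manglings."""
    core, suffix = de_mangle(password)
    if core not in WORDLIST:
        # Unknown base word: fall back to the naive per-character model.
        return len(password) * math.log2(94)
    bits = math.log2(wordlist_size)               # which base word was chosen
    bits += 1                                     # first letter capitalised or not
    bits += sum(c in LEETABLE for c in core)      # 1 bit per letter that could be substituted
    bits += len(suffix) * math.log2(42)           # 10 digits + 32 ASCII punctuation marks
    return bits

# Pretend the base word came from a 65536-word dictionary.
print(de_mangle("Tr0ub4dor&3"))
print(round(estimate_bits("Tr0ub4dor&3", wordlist_size=65536), 1))
```

With these made-up costs the mangled dictionary word scores far below what the per-character model gives it. A real checker would also need to handle prefixes, multiple words, common-password lists and so on, which is why I'm hoping something like this already exists.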
Open-source non-web-based preferred, for obvious reasons. While a good password generation algorithm (by definition) is secure even when the attacker has seen other passwords produced by the same algorithm, the idea is to point naive users at this checker, and the whole point is that their passwords are unlikely to have been generated via a sound algorithm.
Extra points if it learns from passwords submitted to it... unless it then automatically breaks into the users' PayPal accounts and uses the money to fund Skynet...