Suppose I use MT19937 to choose random words out of (say) the Diceware word list. I know MT19937 is not considered a cryptographically secure PRNG, but Wikipedia suggests the weakness is rather uninteresting for this purpose:
The algorithm in its native form is not cryptographically secure. The reason is that observing a sufficient number of iterations (624 in the case of MT19937, since this is the size of the state vector from which future iterations are produced) allows one to predict all future iterations.
Since it seems rather unlikely that I would need anything like that many iterations to generate a strong passphrase, I don't see how an attacker could exploit this weakness, assuming my password is securely hashed and salted to industry standards (e.g. PBKDF2 with 10k+ rounds and a unique salt per account). It looks as if this is only considered a weakness because it makes MT19937 unsuitable for use in long-running cryptographic protocols and other situations where an attacker could plausibly observe a lot of iterations.
Since MT19937 has a much longer period than the number of possible passphrases, intuitively it sounds like it ought to be safe for this kind of usage. But I don't like to rely on intuition.
My naive estimate of the passphrase's entropy is log2(N!/(N - r)!) for N = the number of words in the dictionary and r = the number of words in the passphrase (we're sampling without replacement; if sampling with replacement the numbers go up a bit, but not by much, and I don't want a passphrase with repeated words anyway). This is simultaneously large enough to be impractical to attack, and substantially smaller than 19937 bits for reasonable values of N and r (e.g. N=2^12 and r=5), so it looks safe. But I didn't account for MT19937's use because I don't know how to do that.
Are there any known or reasonably likely attacks on passphrases generated, salted, and hashed in this fashion? In particular, does knowledge of the PRNG used allow us to significantly reduce the entropy?