The spammers are automatically generating new comments by taking existing comments and running them through a thesaurus program that replaces words with synonyms or related parts of speech. The result is a sentence which makes sense, but has word choices that no native speaker would ever make:
Where else may I am getting ...
is clearly not something a native speaker would write, but
Where else could she be getting...
is, and can be transformed by a simple substitution of pronouns and synonyms into the spam text.
This way, even if anti-spam forces have a huge database of known spam comments, the spammers can generate a practically unlimited supply of new ones that are plausibly English.
I had long suspected this was the case, but I recently got proof. I now occasionally get comment spam containing the entire substitution script; it'll be something like:
I can't [believe/understand/comprehend] the [great/superior/amazing] [content/information/data]...
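To make the mechanism concrete, here is a minimal sketch of how a script like that could be expanded into individual comments. I have not seen the spammers' actual tool, so the bracket-and-slash syntax is taken only from the sample above, and the `expand` function and the shortened template are my own illustration (the real script's tail is elided above, so I have not tried to reproduce it).

```python
import itertools
import re

# Matches one bracketed slot such as "[great/superior/amazing]".
SLOT = re.compile(r"\[([^\[\]]+)\]")

def expand(template):
    """Yield every comment a bracket/slash template can produce (hypothetical helper)."""
    # Splitting on a pattern with one capture group alternates
    # literal text, slot contents, literal text, and so on.
    parts = SLOT.split(template)
    literals = parts[0::2]
    choices = [slot.split("/") for slot in parts[1::2]]
    for combo in itertools.product(*choices):
        out = [literals[0]]
        for word, literal in zip(combo, literals[1:]):
            out.append(word)
            out.append(literal)
        yield "".join(out)

# Illustrative template, shortened from the sample in the text.
template = ("I can't [believe/understand/comprehend] the "
            "[great/superior/amazing] [content/information/data]")
variants = list(expand(template))
print(len(variants))  # 27
print(variants[0])    # I can't believe the great content
```

Three slots with three choices each already give 27 distinct comments; every extra slot multiplies the count again, which is why a database of known spam strings does not keep up.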
Since the spammers were likely non-English speakers to begin with, they didn't notice they were sending the script rather than the output.
If you examine a large enough corpus of spam, you can pretty easily figure out what algorithms they're using. It would be an interesting challenge in reverse engineering to write a program that deduces the algorithms used from the corpus.
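As a rough idea of what such a program might look like in its very simplest form, here is a sketch that recovers a substitution template from a handful of comments. It leans on two strong assumptions of mine: every substitution swaps exactly one word for one word, and comments with the same word count came from the same template. A real corpus would need proper clustering and alignment; the function name `infer_templates` and the sample corpus are made up for illustration.

```python
from collections import defaultdict

def infer_templates(comments):
    """Group comments by word count and report which word positions vary."""
    groups = defaultdict(list)
    for comment in comments:
        words = comment.split()
        groups[len(words)].append(words)

    templates = []
    for rows in groups.values():
        if len(rows) < 2:
            continue  # a single comment tells us nothing about the slots
        slots = []
        # zip(*rows) walks the comments column by column, one word position at a time.
        for column in zip(*rows):
            seen = sorted(set(column))
            if len(seen) == 1:
                slots.append(seen[0])  # fixed text
            else:
                slots.append("[" + "/".join(seen) + "]")  # a substitution slot
        templates.append(" ".join(slots))
    return templates

# Made-up corpus in the style of the sample script.
corpus = [
    "I can't believe the great content",
    "I can't understand the superior data",
    "I can't comprehend the amazing information",
]
templates = infer_templates(corpus)
print(templates[0])
# I can't [believe/comprehend/understand] the [amazing/great/superior] [content/data/information]
```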
I ask because when I first saw it, I thought perhaps they were being genuine but inarticulate.
They fooled you once. It probably won't happen again!
Commenter TildalWave points out:
none of the sample spam messages OP posted actually endorse any products, or are otherwise promoting any other cause.
Well, let me give you an example. Here's a comment that arrived a few minutes ago on my blog:
user name: cuisinart compact toaster review
user url: toasterovenpicks.com
user email: jeffryshuler@2-mail.com
user IP: 37.59.34.218
Comment contents:
One in particular clue for that bride and groom essential their
own absolutely new everything, actually a surname burned which has a mode,
which render nearly girl thankful recognizing their refreshing surname
therefore distinctively printed.
The product is promoted in the user's metadata, not in the content of the comment. The content is just an attempt to get past the spam filter. (I suspect that in this case the text is not a mutation of an existing text but rather generated by a Markov process over a corpus of documents about wedding planning.)
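For readers who have not met the idea, a Markov text generator can be sketched in a few lines. This is only an illustration of the general technique, not a claim about what this particular spammer runs; the tiny wedding-themed corpus, the order of 2, and the function names are all my own assumptions.

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each run of `order` words to the words seen following it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain

def generate(chain, length=30, seed=None):
    """Random-walk the chain to produce up to `length` words."""
    rng = random.Random(seed)
    state = rng.choice(list(chain))
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state was never followed by anything
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out)

# Made-up stand-in for a corpus of wedding-planning documents.
corpus = (
    "the bride and groom choose a gift for the bride and a gift for the groom "
    "and the groom and bride choose a surname printed on a gift for the family"
)
chain = build_chain(corpus, order=2)
print(generate(chain, length=20, seed=1))
```

Every three-word window in the output occurred somewhere in the corpus, so the text is locally plausible, but nothing keeps the sentence coherent as a whole. That is the same flavor as the comment above.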
Obviously anti-spam forces are on to this one too, which is why this was in my spam filter. My spam filter (Akismet) on average lets through one spam for every 705 submitted, which is a miss rate of about 0.14%. Again, that's what spammers are going for; they know that roughly 99.9% of their work will never be seen by anyone. They're trying to randomly explore the space of false negatives in spam filters, a space which is getting quite small indeed.