66

I fairly often happen across forums spammed with messages such as:

Arugula (Eruca sativa) is an quarterly green, pretended or roquette. It's been Traditional times, overclever 20 flat has be useful to "foodie" movement.Before impediment 1990s, thrill was norm harvested foreign wild. Colour has naturalized reactionary world, on top of everything elseloftier Europe addition North America. Arugula is all round Mediterranean region, wean away from Morocco and Portugal, eastern Lebanon plus Turkey. Roughly India, adult seeds are songeffortless Gargeer. Solvent is scour (Brassicaceae) family, rod is quite a distance rocket, which is public ...

What is the purpose behind such spam? It's annoying, yes, but presumably the spammer has some goal beyond mere annoyance that justifies the effort. I don't see any URLs or hot links in the message, and no apparent "funny" formatting that might exploit something.

Is this somehow trying to influence web crawlers? (And, if so, to what purpose?) Does it somehow exploit some sort of weakness in the forum software? What?

Added: Not really related to the original question -- more of a tangential comment, but I thought it would be worthwhile to keep it in the same place, in case someone else comes looking:

The nature of the "strange" posts on the forum I'm mainly thinking about (http://forums.finehomebuilding.com/) has largely changed. What we get now (once or twice a week) are posts that parrot details from previous posts in the thread (often a very old thread), or perhaps details gained from a web search on the thread's topic. They are generally pointless (at best of a "me too" nature), and the English, while technically proper, is a hair stilted and clearly not that of a native English speaker (neither British, American, Indian, nor African, all of whose dialects I'm at least passingly familiar with).

My best guess is that these are people, probably in China, who are learning English and are using the forum as a sort of test, to see if their post goes undetected. I don't know, however, if this is simply a game, a test for an English class, or a test/practice for a wannabe spammer. (It's unlikely that they're trying to "curry favor" with the spam filter, as the thing ("Mollom") is notoriously flaky and happily lets spam through on the first try while rejecting legitimate posts.)

But wait -- there's more!!

For about the past year the forum of which I speak has been regularly (at least weekly, and sometimes several times a day -- twice so far this morning) bombarded with posts such as:

Kitchen Units For Sale. Thirty Ex Display Kitchens To Clear. www. e x d i s p l a y k i t c h e n s 1 .co.uk £ 595 Each with appliances.

(URL slightly corrupted so as to not encourage these folks.)

Apparently this is a major spammer operating out of Europe (and our forum is about 99% US-oriented), so it's pointless at best. The oddest thing is that the constant spamming has apparently "poisoned" the URL for Google (and likely other search engines) such that you have to pretty much spell out the URL to get a "hit".

(The other odd thing, of course, is that the system operators seem incapable of blocking this, even though the URL is always the same.)

Another question --

Since, as I observed earlier, the "kitchen spam" posts (seen on dozens of other BBs as well) have apparently "poisoned" the associated web site for Google, is it possible that the spam is actually intended to do this, and is instigated by someone (a competitor?) who wishes ill for that web site?

Raedwald
Hot Licks
  • 1
    I wouldn't try to make sense of forum spam, I've seen some which would make a great sales pitch and not say what they were saying or provide a link of any sort. – Inverted Llama Mar 11 '12 at 08:53
  • Don't people have better things to do? The lengths they will go to to try to sell something are beyond absurd and self-defeating. They could earn more money mowing lawns! –  Jun 27 '16 at 01:16

4 Answers

106

They are trying to do Bayesian poisoning.

By sending lots of legitimate words along with a few words that are typical of spam, like viagra, they gradually lower the spam score the filter assigns to those words.

This means that after a while they can get real spam with links through the filter.
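To make the mechanism concrete, here is a minimal sketch of a toy Bayesian filter. It is not the code of any real filter: the class name, the smoothing, and the sample messages are all assumptions made for illustration. It also assumes the filter retrains on the poisoned posts as if they were legitimate, which a real filter may not do.

```python
from collections import Counter


class ToyBayesFilter:
    """Hypothetical per-token Bayesian scorer, for illustration only."""

    def __init__(self):
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text, is_spam):
        # Count each distinct token once per message.
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_counts.update(tokens)
            self.n_spam += 1
        else:
            self.ham_counts.update(tokens)
            self.n_ham += 1

    def spamicity(self, token):
        # Add-one smoothing avoids division by zero for unseen tokens.
        p_spam = (self.spam_counts[token] + 1) / (self.n_spam + 2)
        p_ham = (self.ham_counts[token] + 1) / (self.n_ham + 2)
        return p_spam / (p_spam + p_ham)


f = ToyBayesFilter()
for _ in range(10):
    f.train("cheap viagra click here", is_spam=True)
    f.train("how do I frame a basement wall", is_spam=False)
before = f.spamicity("viagra")

# Poisoning: mostly innocent text carrying one spammy word,
# accepted (by assumption) as legitimate and fed back into training.
for _ in range(20):
    f.train("arugula is a leafy green viagra from the mediterranean", is_spam=False)
after = f.spamicity("viagra")

print(f"before={before:.2f} after={after:.2f}")
```

In this toy model the score for "viagra" drops once the word has appeared in enough accepted posts, which is the effect the spammer is hoping for.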

logicalscope
Lucas Kauffman
28

In my experience, this sort of spam makes up the first few posts of a newly created user. After a few posts of this sort, the normal sort with links included starts up.

My guesses as to the purpose are:

  1. Fooling anti-spam software that concentrates on first posts.
  2. Getting the first ten posts out of the way so they can post links. Some forum software enforces this.
  3. Search engine keyword stuffing. I don't see any obvious keywords in your sample but I have in the forums I run.
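As an illustration of guess 2, here is a minimal sketch of a "no links until N posts" rule. The function name, the regular expression, and the threshold of ten are assumptions for the example, not the behaviour of any particular forum package.

```python
import re

# Assumed, deliberately crude link detector: http(s):// or www. prefixes.
LINK_RE = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)

# Assumed threshold, matching the "first ten posts" mentioned above.
MIN_POSTS_BEFORE_LINKS = 10


def accept_post(prior_posts: int, body: str) -> bool:
    """Return True if the post passes the hypothetical link gate."""
    if prior_posts >= MIN_POSTS_BEFORE_LINKS:
        return True  # established users may post links
    return LINK_RE.search(body) is None  # new users: link-free posts only


print(accept_post(0, "Arugula is a leafy green."))
print(accept_post(3, "Buy now at http://example.com/kitchens"))
print(accept_post(10, "Buy now at http://example.com/kitchens"))
```

Under a rule like this, ten link-free filler posts are all a new account needs before links are allowed, which would explain the meaningless text.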
Ladadadada
  • 2
    Here are the words related to cannabis in the post: sativa, green, thrill, harvest, Morocco, seed, natural[ized] – alecail Nov 10 '13 at 16:51
13

(Disclaimer: I am in the anti-spam industry but I am not officially representing my employer.)

There are two types of spam in this question.

 

The first two examples ("arugula" and "parroted comments") are Bayesian poisoning.

Bayesian poisoning is an attempt to hide spam content among ham (legitimate) content, with the aim of confusing machine-learning spam filters. It does not actually work.

 

The third example ("kitchen units") has nothing off-topic (e.g. random quotes like the first two examples), and is quite brief. Bayes poisoning is defined by its off-topic or non-sequitur content and is almost always quite verbose, so this is not Bayes poisoning.

Kitchen Units For Sale. Thirty Ex Display Kitchens To Clear. www. e x d i s p l a y k i t c h e n s 1 .co.uk £ 595 Each with appliances.

This is snowshoe spam, which is named after the giant basket-like shoes that distribute your weight across the snow and thus prevent sinking into the snow with each step. This leaves a lighter footprint and is therefore harder to track. Snowshoe spam aspires to similarly tread lightly and be harder to notice.

(URL slightly corrupted so as to not encourage these folks.)

That caveat is actually important. Snowshoe spam tends not to obfuscate its links much (since that makes victims less likely to click). Instead, the domain is used so briefly that the spam has already arrived in your inbox by the time URI DNSBLs can blacklist it.
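For readers unfamiliar with URI DNSBLs, here is a minimal sketch of how such a lookup is usually formed: the domain found in the message is prepended to the blocklist's zone and resolved, and an answer in 127.0.0.0/8 conventionally means "listed". The zone name dbl.example.org and the helper names are hypothetical, and the resolver is injectable so the example runs without network access.

```python
import socket


def dnsbl_query_name(domain: str, zone: str) -> str:
    """Build the name to resolve, e.g. spamdomain.example.dbl.example.org."""
    return f"{domain.strip('.').lower()}.{zone.strip('.').lower()}"


def is_listed(domain: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the (hypothetical) blocklist zone lists the domain."""
    try:
        answer = resolve(dnsbl_query_name(domain, zone))
    except OSError:
        # Resolution failure (e.g. NXDOMAIN) is treated as "not listed".
        return False
    return answer.startswith("127.")


# Fake resolvers standing in for the blocklist before and after it catches up.
def not_yet_listed(name):
    raise socket.gaierror("NXDOMAIN")


def listed_later(name):
    return "127.0.1.2"


zone = "dbl.example.org"  # hypothetical zone, not a real service
print(is_listed("spamdomain.example", zone, resolve=not_yet_listed))
print(is_listed("spamdomain.example", zone, resolve=listed_later))
```

The two fake resolvers stand in for the timing problem: a check made when the message arrives finds nothing, and the listing only appears afterwards.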

Snowshoe spam generally has a short body, is selling something, and pretends to be a somewhat legitimate marketer. The current generation of snowshoe is limited to morally clean items (like kitchen units or garden hoses) rather than morally questionable items (like porn or drugs), but this could easily change.

Originally, snowshoe spam was very low volume in order to evade notice from spam traps, but spammers have learned that because trap-fed filters (such as DNSBLs) take a few minutes to propagate their knowledge, very high volume would work just fine if the entire spam campaign completed first. This fits the "tread lightly" principle that got this class of spam named even though it's less applicable nowadays.

Adam Katz
2

The post might rank well for a certain keyword in Google. A few days after the post is written the author can add a link to the signature of the account.

Christian