25

Mitigating Cheating & Voter Fraud in Online Contests…

We run online contests of various sorts that involve users voting on entries (usually one vote per user per day). The prizes range from hundreds to thousands of dollars. Over the last four years we have encountered a number of ways people try to cheat, and have implemented countermeasures in each case. As it stands, we use the following measures:

Authentication
A user must create an account and authenticate (log in). This rules out anonymous vote stuffing.

Email Confirmation
A user must confirm their email address by clicking a link in a system email to confirm they own and have access to their address. This rules out creating accounts en masse using random (not necessarily valid) email addresses. It also slows down the process a little for one account, and a lot if you're trying to create many.

No Gmail Address Aliases
Users cannot use instant alias addresses such as localpart+suffix@gmail.com. That slows down potential cheaters.
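
As a rough illustration of the alias check, here is a minimal sketch in Python. The helper names `has_plus_suffix` and `canonical_email` are hypothetical, and the handling of dots and of `googlemail.com` goes beyond what is described above; it is an assumption about how one might also collapse other Gmail variants when looking for duplicates.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an address into (local part, domain), lowercased."""
    local, _, domain = address.strip().lower().rpartition("@")
    if not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return local, domain


def has_plus_suffix(address: str) -> bool:
    """True if the local part carries a '+suffix' alias."""
    local, _ = split_address(address)
    return "+" in local


def canonical_email(address: str) -> str:
    """Collapse Gmail variants to one form for duplicate detection.

    Assumption: only Gmail domains are normalized; other providers
    may treat '+' and '.' as significant characters.
    """
    local, domain = split_address(address)
    if domain in {"gmail.com", "googlemail.com"}:
        local = local.split("+", 1)[0]  # drop the '+suffix'
        local = local.replace(".", "")  # Gmail ignores dots in the local part
        domain = "gmail.com"
    return f"{local}@{domain}"
```

Rejecting at signup with `has_plus_suffix` and comparing `canonical_email` values across existing accounts are two separate uses; the second catches variants that the first lets through.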

Additional Measures
We routinely audit our signups and voting rosters for strings of email addresses that come from the same private domain (user1@smithfamily.com, user2@smithfamily.com, etc.). We also look for similar names, usernames, and "local-parts" of email addresses across domains.
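
Part of this audit could be automated along the following lines. This is only a sketch: the list of public providers, the thresholds, and the function names are all assumptions, and the pairwise comparison is quadratic, so it suits a nightly batch over recent signups rather than the whole user table.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Assumption: a small, incomplete list of public mail providers to ignore.
PUBLIC_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com", "outlook.com"}


def clustered_private_domains(emails, min_accounts=3):
    """Map each non-public domain with many signups to its local parts."""
    by_domain = defaultdict(list)
    for email in emails:
        local, _, domain = email.lower().rpartition("@")
        if domain not in PUBLIC_DOMAINS:
            by_domain[domain].append(local)
    return {d: ls for d, ls in by_domain.items() if len(ls) >= min_accounts}


def similar_local_parts(emails, threshold=0.8):
    """Return pairs of addresses whose local parts look alike (O(n^2))."""
    parts = [(e, e.lower().rpartition("@")[0]) for e in emails]
    pairs = []
    for i in range(len(parts)):
        for j in range(i + 1, len(parts)):
            ratio = SequenceMatcher(None, parts[i][1], parts[j][1]).ratio()
            if ratio >= threshold:
                pairs.append((parts[i][0], parts[j][0]))
    return pairs
```

Anything these functions return is a candidate for manual review, not proof of fraud; a family sharing a private domain would be flagged too.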

We also show voting result updates on a daily basis, so there's no instant feedback. That way, if someone is trying to cheat, it will take a day to see any results, and unless they went big, they won't know for sure if their method was successful. We try to be as much of a "black box" as possible.

Needless to say, this is all exhausting and getting harder and harder to scale up. We need an easier solution to ensure that we get a lot closer to "one person, one vote" in our contests, while not burdening the user beyond need in the process.

We have explored the possibility of using SMS to attach mobile numbers to accounts in order to verify the person; the jury is still out on this approach: https://ux.stackexchange.com/questions/15980/ways-to-avoid-online-contest-voting-fraud-is-sms-account-verification-too-much/, and Can SMS text messages be used to verify a person's unique identity through a short code system?.

Some suggestions have included using credit cards, mail-in verification, reputation points … but these are all much too onerous for our target users.

What more can we do automatically in the back end to 1) identify cheaters, and more importantly 2) prevent them from even cheating?


UPDATE

We decided not to make our fraud protection leak-proof because one of the developers pointed out "the harder you make it for people to cheat, the harder it is to detect cheating." Instead, we are utilizing a medieval Chinese hunting technique called a three-sided Battue. By making it very difficult to cheat in most ways, but relatively easy to slip through in other ways, we know exactly what to look for and eliminate before the voting results are updated.

We look for patterns in votes, such as one contestant receiving a string of evenly timed votes, then look at the accounts associated with those votes. If there's a pattern to the accounts, we eliminate those accounts, and the associated votes disappear with them.
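
One way to express "evenly timed" is a low coefficient of variation in the gaps between consecutive votes for the same entry. The sketch below assumes timestamps in seconds; the function name and both thresholds are illustrative, not the values we actually use.

```python
from statistics import mean, pstdev


def looks_evenly_timed(timestamps, min_votes=5, max_cv=0.1):
    """Flag a vote sequence whose gaps are suspiciously regular.

    timestamps: vote times for one entry, in seconds (any order).
    max_cv: largest std-dev / mean of the gaps still treated as "even".
    """
    ts = sorted(timestamps)
    if len(ts) < min_votes:
        return False  # too few votes to judge
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    average = mean(gaps)
    if average == 0:
        return True  # all votes at the same instant
    return pstdev(gaps) / average <= max_cv
```

A script that adds random jitter will defeat this particular check, so it works best as one signal among several before looking at the accounts behind the votes.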

Taj Moore
  • an interesting method I've seen is to use multiple types of storage to track users. "evercookies" Is a name for them I think. http://samy.pl/evercookie/ They just have to use the same browser. – WalterJ89 Jan 17 '12 at 06:58
  • While that sounds tempting, the privacy ramifications are a little too dim. – Taj Moore Jan 17 '12 at 15:47
  • 3
    Voting "online", "secret", "cheat-proof": pick two. – o0'. Jun 28 '13 at 15:47
  • @o0'. You can use cryptographically blinded signatures. – forest Dec 11 '17 at 04:18
  • I found this pretty interesting: https://www.evoting-blog.ch/en/pages/2019/public-hacker-test-on-swiss-post-s-e-voting-system ; this voting system is indeed designed to be complex (even to use) but I could not find any obvious vulnerability to exploit it (I was part of the test program). You may want to check how they implemented it. – Overmind Apr 08 '19 at 08:33
  • Ask for the equivalent of Italy's CF (taxpayer code), which is unique for every person. If the winner's full data does not match the CF, they can't redeem the prize. It comprises letters for name, surname, birthday, and city of birth, plus a final check digit (up to 52 people [26 males and 26 females] could be born in the same town on the same day with the same name and surname without a collision). – DDS Sep 24 '20 at 14:14

7 Answers

16

You're attempting to uniquely identify something through a system that was somewhat designed to not require a unique identity. Short of tying and validating to a 3rd party (usually physical) identifier, this is impossible. Instead, your best bet is to restrict voting altogether, in a manner that discourages automation and/or quick results.
If your system supports it, see about weighting the votes based on some criteria:

  1. 'Age' of the account (since signup). Disallow votes for anything under some age (say, since the start of the contest).
  2. Number of entries (in other contests).
  3. Account activity (is the account visited/used regularly?) - comments in forums or on entries, etc.
  4. Percentage of votes by user in contests since signup (may reveal 'targeted' behaviour). Perhaps by number of different vote recipients?
  5. Throttle votes accepted per-entry based on total votes per-contest. Anything above a threshold gets flagged for review.

These all should be doable from a database/reporting side, negating the need for any fancy javascript or other client-side code (which can be tricked).
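
To make the weighting idea concrete, here is a minimal sketch in Python. The `Voter` fields, the cut-offs, and the 0.5 multipliers are all assumptions chosen for illustration; in practice the inputs would come from reporting queries and the numbers would need tuning against real data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Voter:
    signed_up: datetime
    entries_in_other_contests: int
    forum_comments: int
    distinct_recipients: int  # how many different entries this user voted for
    total_votes: int


def vote_weight(voter: Voter, contest_start: datetime) -> float:
    """Return a weight in [0, 1] for one voter's votes."""
    # Criterion 1: accounts created after the contest started do not count.
    if voter.signed_up >= contest_start:
        return 0.0
    weight = 1.0
    # Criteria 2 and 3: no other participation on the site at all.
    if voter.entries_in_other_contests == 0 and voter.forum_comments == 0:
        weight *= 0.5
    # Criterion 4: many votes, all aimed at a single recipient.
    if voter.total_votes >= 5 and voter.distinct_recipients == 1:
        weight *= 0.5
    return weight
```

Criterion 5 (per-entry throttling) is a property of the contest rather than of a voter, so it would live in a separate check that compares each entry's vote count with the contest total.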

Clockwork-Muse
  • 1
    Really good answer with a really creative approach. Being so restrictive users will get discouraged to cheat. I would even dare to add "paying" to the list (in cases where it makes sense). – Alpha Jan 18 '12 at 21:46
  • The idea of number 1 and 3 didnt cross my mind, nice thinking! I'm creating a teacher & student kind of social network and i really want to prevent students creating fake accounts and/or abuse teachers. Is there any other ideas you could gift me with ? – xperator Jun 30 '15 at 21:38
  • 1
    @xperator - students and teachers at what level? If you're at the college level, most colleges/universities issue `.edu` email addresses. Send a confirmation email there, and you're golden. Is this for a single school district or institution? Then you probably already have access to the roster - just pre-generate the accounts, and hand them out. But the problem is this question (and thus answer) was focused in a different direction than you're actually looking; you more want to prevent duplicate accounts, while the original post was about vote fraud (slightly different). – Clockwork-Muse Jul 01 '15 at 11:25
  • Yes its different from original post but the main concern is the same. Its about stopping some people from misusing the system. And yup its at college level and its not associated to a specific place. Its independant from any organization. I wanna start off by handing out registration invite codes to few teachers i know and then expand & charge them if the feedback turns out good. But the registration is open to public and any students can join. They follow the teacher page and they can get in touch with them and get notified for homeworks, schedule change,etc.. – xperator Jul 01 '15 at 12:09
  • 1
    @Clockwork-Muse Actually, most colleges/universities are not allowed to use the .edu tld unless they are located in the U.S. – kojow7 Mar 09 '16 at 15:15
  • 1
    @kojow7 - ouch, you're right. Still, it was likely that the poster was asking about an American product... – Clockwork-Muse Mar 09 '16 at 22:42
4

You're going to want to check IPs (though obviously this isn't a large hurdle, you should still do it)

You'll want a captcha to prevent (at least weaker) automated cheating

You're going to want to (in the end) either manually or automatically check for large groupings of signups during small time frames (automation/stuffing)

Disallowing "free" email services would obviously take a large chunk out of cheating, but it would also probably take an even bigger chunk out of your user base depending

If you made them enter in more personal information it would at least help with verification (name + address + phone would add some verification even if you don't call/sms the phone if it is listed anywhere)

If at all possible you're going to want to have similarity detection across the board, not just for emails (to catch lazy cheaters who maybe repeat a similar email later, a name in an email, or vice-versa)

But as far as preventing cheating on what is ostensibly a web form you're going to need some sort of unique identifier which is of course going to be a much larger overhead for your users, more security needed for your site, and will cause a number of users to become leery

SMS is more verifiable than email accounts though since it really isn't that hard to sign up for 20,000+ hotmail/gmail/yahoo etc. email accounts, that's ignoring throw away email accounts/forwarders.

There's also the option of automated calls (which rules out some, but not all of the SMS security issues)
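
The automatic check for large groupings of signups in small time frames can be sketched as a sliding window over signup times. The window length, the limit, and the function name below are assumptions, and timestamps are taken to be seconds.

```python
from collections import deque


def signup_bursts(timestamps, window_seconds=600, max_signups=20):
    """Return the signup times at which the trailing window is over the limit."""
    window = deque()
    flagged = []
    for t in sorted(timestamps):
        window.append(t)
        # Drop signups that fell out of the trailing window.
        while window[0] <= t - window_seconds:
            window.popleft()
        if len(window) > max_signups:
            flagged.append(t)
    return flagged
```

Running the same function per IP address or per email domain, instead of over all signups, narrows down where a burst came from.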

doyler
4

If I am correct in saying that you issue a live paper check to the winner, one measure to put in place is the mailing address. If you structure it as radio stations do and allow only one entry per household/address, this could prevent some cheating. As for preventing cheating in the first place, you probably want to look at the online gaming industry, though note that even tools such as PunkBuster and VAC have failed to prevent cheating in games completely. A good practice is to perform weekly audits and evaluate the time a user spends on a page before voting; this is available in GA (Google Analytics), as far as I recall. Set a threshold for how long a user must be on a page before voting; this could also help to detect scripts that are clicking links.

Woot4Moo
  • We're concerned about voters more than entrants. But a time delay to vote is a new idea. We do have to thwart both bots and people with too much time on their hands. – Taj Moore Jan 12 '12 at 17:34
4

I agree with @makerofthings7 that U-Prove, together with appropriately verifiable claims from a single provider like the government, is the closest thing we have to what you eventually want. Users will (hopefully) want support for "unidirectional identities" rather than a globally-unique ID like an OpenID which allows relying sites to collaborate and track you across the internet. See more at the Laws of Identity.

In the meantime, requiring that users register via something like (pick one) Facebook Connect or Google Plus would allow you to leverage the (controversial) efforts of those big players to get folks to use real names, or at least "one-to-a-customer".

But of course as others have said, no matter what you do, given enough incentive folks will find a way to register multiple times. So it really comes down to the other factors in your threat model - e.g. what is the risk of excluding people who don't want to go thru the hassle of registering, what is your ultimate bottom line based on, etc.

nealmcb
  • 1
    Requiring either FB or G+ will make you "co-responsible" for anything evil these services do. And in the end, requiring a "real identity" this way just encourages people to make-up cheap, fake, online "real life" identities. – curiousguy Jun 26 '12 at 01:03
3

The following beta idea may be helpful when it's adopted by government agencies... Microsoft is working on something called U Prove that allows a government to publish claims to others.

http://www.microsoft.com/u-prove

The idea is that the citizen would authenticate to the government STS and that would redirect back to your website with a unique ID and possibly other identifying information.

makerofthings7
  • 1
    How does this differ from OpenID? Or is this OpenID plus some tie into government identification? – Gilles 'SO- stop being evil' Jan 16 '12 at 19:13
  • 1
    My understanding is that it's based on SAML, WS-Fed / WS-Trust where the personal data (or hash) is sent to a 3rd party – makerofthings7 Jan 16 '12 at 21:40
  • 2
    @Gilles It is quite different from OpenID in that it has much better security, and can be configured so that the user is anonymous and can't be matched with accounts created by the same U Prove user on other sites. Whereas with OpenID the user reveals a globally unique id. Look up the "laws of identity" - http://www.identityblog.com/?p=354 OpenID only allows for an "omni-directional" identifiers, but U Prove also supports "unidirectional" identifiers which are only revealed to a specific relying party. – nealmcb Jan 17 '12 at 02:42
  • @makerofthings7 It is more than a concept in planning. But the problem is there aren't enough folks demanding this - either web users looking for more anonymity, or web sites looking for a way to identify physical individuals via e.g. a government claim, and/or willing to give up the ability to track individuals across web sites (which brings in more advertising bucks). So this contest use case is the sort of thing we need to get to a better identity system – nealmcb Jan 17 '12 at 03:00
2

I think there is a simple and low-tech solution. You introduce three requirements into the terms of the contest:

  1. Each user must specify their name and address when they create the account.
  2. You restrict the contest to one entry per household. You use the address to eliminate multiple entries with the same address.
  3. You put in the terms of the contest that, when a winner is selected, the check will only be issued to a person of the name and address found in the winning account. If there is no person of that name at that address, then the prize is re-allocated to someone else. When someone wins, you do something to verify the address of the winner: e.g., send registered, certified mail to that address and require the winner to prove they received the mail; or require them to show a copy of their driver's license showing that address on the license.

What this does is prevent someone from cheating. They can create as many accounts as they want, but if they use their own address, they'll only be able to submit one entry per contest; and if they use some other address, they won't be able to collect the prize if their account is chosen (since it's not their address, and the address of the winner will be verified before they get their prize).

It is not perfect -- depending upon your procedure for verifying the address of the winner, a cheater may still be able to create several accounts, each one listing the address of a different friend -- but it might be good enough in practice.

D.W.
  • Many legitimate voters live in the same household in our contests: a girl enters; her mother, father, sister, brother all vote for her. That's behavior we want. Her boyfriend the hacker who creates accounts, bots, or whatever to jook her numbers; that's what we don't want. The actual winner will not be the person who votes the most, but garners the most votes. – Taj Moore Feb 23 '12 at 18:37
  • @tajmo, You could modify my proposal to accommodate your circumstances: confirm the identity of the winner as part of the prize-awarding process (e.g., check their ID; make out a check to that name, so it can't be cashed by anyone else). – D.W. Feb 24 '12 at 05:25
  • 1
    We're not concerned with confirming winners; we're concerned with confirming voters. – Taj Moore Feb 24 '12 at 16:46
  • @tajmo, it sounds like you haven't absorbed my proposal. I apologize if I didn't communicate it clearly. The beauty of my proposal is you don't need to be concerned with confirming voters, because that is not actually your real goal. Your *real* goal is to prevent cheating and multiple voting. My answer demonstrates that you can prevent cheating and multiple voting *without* confirming voters, if instead you confirm winners. Confirming winners is a *lot* easier than confirming voters, because there are many more voters than winners. – D.W. Feb 24 '12 at 18:07
  • 5
    I can attest that confirming winners doesn't prevent cheating; in our contests, applicants show up in person before the contest begins. They are confirmed physically, with ID, and by affidavit. And yet there is voter fraud. One thing has nothing to do with the other. – Taj Moore Feb 24 '12 at 18:14
  • This is a beautiful solution for giving an equal (random) chance of winning to all real participants, where it doesn't depend on voting. Unfortunately, this solution doesn't help if any voters can plan to be less likely to win, because an attacker can use fake addresses *F1, F2, etc...* for voters that are less likely to win and a real address *R* for a voter that is more likely to win. The fake voters know that they are less likely to win, but they can still vote to help *R*. – Krubo Dec 05 '18 at 19:26
1

Suppose you have a contest, e.g. "What is your favourite day of the week?" You create a copy of your user table, say cont_101, and a table with the contest answers, say cont_ans_101.

When a user votes, a flag in cont_101 is set to true to record that the user has voted, and the counter in cont_ans_101 is incremented by one.

When a user tries to vote, you simply check the flag in the copied user table (cont_101): if it is false, you count the vote; otherwise you could ban the user.

  • 1
    That would require you to uniquely identify a user. The problem is that it's easy to create a large number of accounts, and vote once with each of them. – CodesInChaos Jan 14 '12 at 23:35
  • 1
    This is also somewhat terrible from a database design point of view - It's much better to store a cross-ref table between user and answer. Data-set size is smaller, and it's much harder to 'forget' somebody (for example, a user who signed up _after_ the user table was copied). – Clockwork-Muse Jan 18 '12 at 18:53
  • Yes, you are right; from a database design point of view it is not elegant. – Stelios Joseph Karras Jan 19 '12 at 06:13
  • 2
    Why do you need to copy the user table? As far as I can see, you just need a table that records which user participated in which contest: user_contest (id PK, user_id FK, contest_id FK). And of course the contest_vote table (id PK, contest_id FK, answer ??). – Hendrik Brummermann Jan 24 '12 at 07:19
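
The cross-reference design suggested in these comments can be sketched with Python's built-in `sqlite3` module. The single-table layout and column names below are an assumption that merges the two proposed tables for brevity; the point is that a `UNIQUE (user_id, contest_id)` constraint lets the database refuse a second vote without any copied user table.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with one vote row per user per contest."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """
        CREATE TABLE contest_vote (
            id         INTEGER PRIMARY KEY,
            user_id    INTEGER NOT NULL,
            contest_id INTEGER NOT NULL,
            answer     TEXT    NOT NULL,
            UNIQUE (user_id, contest_id)
        )
        """
    )
    return db


def cast_vote(db, user_id: int, contest_id: int, answer: str) -> bool:
    """Record a vote; return False if this user already voted in the contest."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO contest_vote (user_id, contest_id, answer) "
                "VALUES (?, ?, ?)",
                (user_id, contest_id, answer),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

As CodesInChaos points out, this only enforces one vote per account; it does nothing about one person holding many accounts.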