
First of all, I'm new to this whole crypto thing. Here's my scenario:

I have a client app that sends a string (20 chars max) to my server. The server then checks that the name is indeed within the char limit, then logs the name, time, and IP into an SQLite3 database.

My problem is that I have a troll who is hell-bent on destroying my site. At first there was no protection or filtering on submitted usernames. I've now added the char filter, and I also made sure to only accept one name per IP address. That all works, but now he has created a bot that gets him a new proxy every time before making the GET request.

I've changed as much as I think I can server-side; I think I need to roll out a client-side application patch. Also note that this process happens once a minute until the client app is closed.

I need to verify that the GET request is coming from my client app, and not a bot. This guy is pretty hell-bent on pissing in the Cheerios of me and anyone who's trying to use the site legitimately.

Idea 1: random token checks. Process: the client app (shortened to "app") sends an unhashed token request, and the server replies with a random string. The app hashes the string and replies to the server with both the hashed and unhashed strings. The server does the same hashing on the unhashed string and compares the result to the hashed one. If everything matches, it adds the name to the database.

The problems with that: A) I have to make sure the unhashed string is not used more than once per period of time, though I could add used strings to a DB and use cron or something to purge it every now and then. B) I have to keep my hashing algorithm completely protected, both in the app and on the server. If they cracked the algorithm (or just identified which hashing method it uses), they could build a bot with that accounted for.
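For concreteness, Idea 1 could be sketched like this (a minimal sketch; SHA-256 stands in for whatever hash the app would actually embed, and all function names are made up for illustration):

```python
import hashlib
import hmac
import os

def issue_challenge() -> str:
    """Server: hand out a random, single-use challenge string."""
    return os.urandom(16).hex()

def client_answer(challenge: str) -> str:
    """Client app: hash the challenge and send it back alongside the original."""
    return hashlib.sha256(challenge.encode()).hexdigest()

def server_verify(challenge: str, answer: str) -> bool:
    """Server: recompute the hash and compare in constant time."""
    expected = hashlib.sha256(challenge.encode()).hexdigest()
    return hmac.compare_digest(expected, answer)
```

Note that this only holds up while the algorithm stays secret, which is exactly problem B: anyone who extracts the hash scheme from the binary can compute `client_answer` themselves.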

I'd use something like a CAPTCHA, but it's just a background thread that routinely submits the GET request; there is no user interaction with it at all.

Any help or pointers are greatly appreciated, thanks if you've made it through this much text...

coltonon

  • How do you know it's a bot and not a legitimate client? – Gudradain Jul 07 '17 at 19:05
  • The app is a mod for a game to mod a mode back into it. It gets the player's name from a memory address, then sends that to the server. I doubt a bunch of people with 20-char randomized hex strings all used it at the same time from nearly identical IP addresses lol. – coltonon Jul 07 '17 at 21:09
  • @coltonon oh, so he's using the same IP range. In that case, I think you can do a ban by IP range. – Goose Jul 07 '17 at 21:10
  • Meta-question: Is it permissible on this exchange to suggest a counterattack, or to point out an available method of counterattack? – QuadmasterXLII Jul 07 '17 at 22:42
  • "I've changed as much as I think I can server side, I think I need to roll out a client side application patch." ... Never trust that the input is from a legitimate source. Client-side security is not security at all. – Kaithar Jul 08 '17 at 01:23
  • All I'm saying is, the threat model here is "The adversary is executing our binaries, then reverse engineering them." Would it be so wrong to find out whether our adversary is executing our binaries in a virtual machine, by, say, adding a step that deletes system32? – QuadmasterXLII Jul 08 '17 at 01:44
  • Don't "only accept one name per IP address". Accept as many as you like, but only store one in the database. Then the troll doesn't know it's not working. – user253751 Jul 08 '17 at 09:12
  • There's always the possibility of hiring / enlisting the troll as a stress-tester / white-hat. If they're a "for the lolz" troll it might make them leave you alone ("They actually _want_ me to do this? Boring."), and if not you may get somebody better for your system than twenty [security.se] questions. – wizzwizz4 Jul 08 '17 at 11:08
  • It's not a solution but rather a base idea: what does a phone have that the bot doesn't? Use that to make sure the message comes from a phone. – Pedro Lobito Jul 09 '17 at 00:40

6 Answers


You cannot. It is unfortunate, but you cannot. Whatever is running as the client (barring some situations you are almost certainly not in involving TPM) can, if someone is sufficiently motivated, be completely understood. Someone can disassemble it, emulate it, patch it - there's virtually nothing you can do about this.

What you need to do is not look at authenticating the client, but rather at authenticating the USER. Look at OAuth2/OpenID Connect/similar: make them log in with Gmail or Facebook before using your app. If this is a background process, allow them to register on your website (using OAuth2, etc.) to get an API key. That API key is unique to them and identifies them to you, so if you see abuse, you can tie it back to their Gmail/Facebook/whatever account.

If you can't do this, well, then you're stuck with just making it harder. You can hope they are not sufficiently motivated to decompile your app and get your hash algorithm, but in the long run, that's not a winning game.

crovers

  • An alternative option to Gmail and Facebook is text-message authentication, provided by services such as Twilio. – Goose Jul 07 '17 at 21:09
  • Unfortunately user authentication isn't an option here; I might have to get pretty crafty. – coltonon Jul 07 '17 at 21:18
  • CAPTCHA? He might be able to do it manually though, if he's doing it at a low enough rate that he can get a new IP between registrations. – SomeoneSomewhereSupportsMonica Jul 08 '17 at 08:07
  • As part of "just making it harder", if possible, I'd try to hide from the attacker whether he is successful or not. Make it always look like he is ahead of you, so he wouldn't know. – Draex_ Jul 08 '17 at 08:40
  • Definitely use a CAPTCHA. – Jul 08 '17 at 14:22
  • *"make them log in with gmail or facebook before using your app"* I will **not** use a website that requires this. If you need my identity so badly, come check my ID card in person. If you just want to know whether I'm human, use a CAPTCHA, which is a test to "[...] Tell Computers and Humans Apart" (i.e. exactly fit for the purpose). – Luc Jul 08 '17 at 19:25
  • @Luc It's not needing your identity, it's verifying that you have an identity. It's one thing to sit there and fill in CAPTCHAs while your bot does the rest of your work, but requiring a verifiable login adds another layer of complexity aimed at slowing down malicious users. – zzarzzur Jul 09 '17 at 00:57
  • @Luc In this case, the intended user appears to be a computer, not a human. But we need to ensure that computer is being controlled by a non-malicious human. I know of no way to do this other than by verifying the person and issuing a key tied to that person. – crovers Jul 10 '17 at 02:06

HashCash

You could incorporate proof of work into the system: use something like HashCash to require the user to spend, say, one second of CPU time to message your server. This could be as simple as requiring the user to send a nonce with the message such that, when the nonce and message are hashed together, the hash ends with five zeroes. There is a tradeoff: if your client app uses 1% CPU, then your opponent will be able to send 100 times as many messages as a normal user, which may be too many. If your client uses 100% CPU, your opponent will only be able to send messages at the normal rate, but your real users will be annoyed.

So, the full system would look something like this:

Client:

Generate the GET request the same way you currently generate it, but with an extra field called Nonce. Hash the GET request and see if the hash ends in x zeroes. If it doesn't, randomly pick a new Nonce and try again. If it does, send the request to the server.

Server:

Only accept GET requests that, when hashed, end in the correct number of zeroes. Also, only accept GET requests timestamped within the last five minutes, and keep a table of recently accepted GET requests to check against, so that each request is only accepted once.
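That client/server pair can be sketched in a few lines (an illustrative sketch: SHA-256 over the request string, "ends in x zeroes" checked on the hex digest, names made up):

```python
import hashlib
import itertools

def find_nonce(request: str, zeroes: int) -> int:
    """Client: try nonces until sha256(request + nonce) ends in `zeroes` hex zeroes."""
    target = "0" * zeroes
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{request}&nonce={nonce}".encode()).hexdigest()
        if digest.endswith(target):
            return nonce

def check_nonce(request: str, nonce: int, zeroes: int) -> bool:
    """Server: a single hash is enough to verify the work."""
    digest = hashlib.sha256(f"{request}&nonce={nonce}".encode()).hexdigest()
    return digest.endswith("0" * zeroes)
```

Each extra required zero multiplies the client's average work by 16 while the server still computes a single hash; the timestamp and replay-table checks described above would sit alongside this.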

QuadmasterXLII

  • Uhhhh what's a "nonce"? – Tim Jul 07 '17 at 22:24
  • The server sends a random blob of data. The client then has to compute SHA1(blob + its own random data). The client's random data is the nonce. The client will need to guess random data until the SHA1 hash ends in five 0's. Since SHA1 isn't reversible, this requires significant CPU to brute-force. The client sends back the random data it found that makes the SHA valid, and the server can do a single test to pass/fail it. A single test is fast, but generating the right hash client-side may take several seconds and billions of guesses. – Bryan Boettcher Jul 07 '17 at 22:27
  • @Tim https://en.wiktionary.org/wiki/nonce#Etymology_3 – Ry- Jul 08 '17 at 12:53
  • @Ryan Ahh, it has a rather different meaning in my mind. – Tim Jul 08 '17 at 12:55

Rate limiting, plain and simple.

You need to determine just how important the service is. If it's a service your application depends on, it needs to have security implemented. That could take the form of having the user log in through their browser, which then saves a file to their home directory. Your app would then read the API key from the file and sign all its requests with it.
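The signing step could look something like this (a hypothetical sketch using an HMAC over the query string; the key is whatever the browser login saved to the user's home directory):

```python
import hashlib
import hmac

def sign(api_key: bytes, query: str) -> str:
    """Client: sign each request's query string with the per-user API key."""
    return hmac.new(api_key, query.encode(), hashlib.sha256).hexdigest()

def verify(api_key: bytes, query: str, signature: str) -> bool:
    """Server: look up the user's key, recompute, and compare in constant time."""
    return hmac.compare_digest(sign(api_key, query), signature)
```

Because the key is issued per user after a browser login, abuse traces back to an account rather than an IP address.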

If it's more of an analytical type of service, then you don't have much of a choice; even Google Analytics contains spam data. First focus on reducing the amount of spam before you try to completely kill it off. If someone is sending 5+ requests a minute from an IP, block them temporarily: first institute a 5-minute ban, then 15, and so on. When an IP gets banned, log it and flag the other entries that were made just before it, and you should be able to filter out as much bot data as possible. If you were to go even further and put your site behind a service like Cloudflare, you could use their API to block the IP address, ensuring that requests don't even hit your server after it's banned.
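The escalating ban schedule could be tracked with something as simple as the following (an illustrative in-memory sketch; in practice you'd persist it and hook it into your web server or Cloudflare's API):

```python
import time

BAN_STEPS = [5 * 60, 15 * 60, 60 * 60]  # 5 min, 15 min, 1 hour, ...

class BanList:
    """Tracks per-IP offences and escalates the temporary ban each time."""

    def __init__(self):
        self.offences = {}      # ip -> number of bans issued so far
        self.banned_until = {}  # ip -> unix time the current ban expires

    def is_banned(self, ip, now=None):
        now = time.time() if now is None else now
        return self.banned_until.get(ip, 0) > now

    def record_abuse(self, ip, now=None):
        """Ban the IP; each repeat offence moves one step up the schedule."""
        now = time.time() if now is None else now
        n = self.offences.get(ip, 0)
        duration = BAN_STEPS[min(n, len(BAN_STEPS) - 1)]
        self.offences[ip] = n + 1
        self.banned_until[ip] = now + duration
        return duration
```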

However, if the "troll user" has access to a large enough botnet, even big companies are susceptible.

zzarzzur

Accepting only one submission per IP address is not a very effective solution: IP addresses are not statically tied to individuals. Device fingerprinting (particularly when combined with other approaches such as evercookies and ASNs) gives a better indicator of who is behind the IP address. There is some discussion of what appears to be the same problem here. The way you would apply fingerprinting to an app is rather different from a browser, though.

However, what is missing from the discussion there, and also from this question, is any description of what the value of the service is, or of the impact of people submitting multiple requests when they are only expected to make one. Consider this site: anyone can sign up for an account, but we don't get the right to change other people's submissions without proving we can add value to the service as a whole. Actions are attributed and everyone gets the chance to show their [dis]approval and comment. Wikipedia has a similar, but more formal, approach to content management. A very effective way to deal with slowloris and DNS amplification attacks is to simply have the capacity to handle the extra load without it impacting your other users. There are limits to what is possible with this approach, but if the only impact here is extra storage, then adding more capacity is not all that expensive (though you are limited by your choice of an SQLite backend).

If the defining characteristic of an attack is a high rate of submissions from the same IP address or relating to the same account, then you have a basis for detecting the behaviour automatically, and for reducing the impact on your site by treating those requests differently, e.g. just responding with a random value instead of hashing and storing.

If you provided more information about how this person's activity impacts your site and the nature of the service, perhaps we could give more specific advice on a wider strategy.

symcbean
  • The server is something like a "server browser" for games, where the client app sends the server the user's name, telling everyone else that this guy is hosting. I found a public API that checks whether the sender's IP address is behind a VPN or is even a bot, so I can now filter out and block those. Also, I only allow one submission per IP; any new submissions just overwrite the previous one. I'm hoping blocking VPNs will take care of the spammers, and won't require a client patch either. – coltonon Jul 08 '17 at 02:23

I noticed you mentioned in a comment that the IPs are similar. If this is the case, you can ban the IP range he appears to be using.

If he's dedicated, he can use IPs outside of that range. As crovers said, the best solution is to authenticate the user, using something like gmail, facebook, or SMS.

You can require the client to do work, such as QuadmasterXLII suggested with HashCash. Just ensure that the client is doing significantly more work than the server and that the work requested of non malicious users is acceptable.

Or you can fight a continuous battle of patches to stay one step ahead. These are the options as I see them, in the order in which I would try them.

Goose
  • This solution is rather intrusive. Some legitimate users may be unwilling to use Google™ and Facebook™ and _portable surveillance devices._ Using either of those is, fortunately, not required by law yet, and there are very real downsides to using them, which matter more for some people. – Display Name Jul 08 '17 at 14:19
  • @SargeBorsch Agreed. I'm one of those users that refuse to login with Facebook or Google. Security is often about trade offs though so I offer it as an option, because maybe it makes sense for his user audience. – Goose Jul 08 '17 at 14:22

General tip: anything you put in the client, they can simulate.

> I'd use something like a captcha, but it's just a background thread that routinely submits the get request, there is no user interaction with it at all.

But unless you're making a virus, someone intentionally sets this up. At that point, you can have them register the client and enter a CAPTCHA as part of the process. (E.g. each client gets a session token from the server, which just stores whether the CAPTCHA was filled in correctly.)

There aren't an infinite number of IP addresses one can use. In IPv4 you'll want to ban the IP addresses used by a spammer and flag their subnets. In IPv6 you'll want to ban /64s and flag their subnets.

For example, if 80.100.131.150 was spamming you, you'd ban that address and flag 80.100.0.0/15 (on Linux, `whois 80.100.131.150 | grep ^route` will show you the route). Often, home IP addresses are rotated nightly and VPS owners can get a new one assigned (either by picking one, or at random), but the ISP owns a finite number of subnets. This way you force the spammer to take action to get their IP address refreshed, and once they do, they'll probably still be in a flagged subnet.
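The flagging bookkeeping is straightforward with Python's ipaddress module (a rough sketch: a fixed /15 for IPv4 and /48 for IPv6 stand in for the route prefix you'd actually pull out of whois, and the threshold is illustrative):

```python
import ipaddress
from collections import Counter

flags = Counter()  # flagged subnet -> number of flags collected

def containing_subnet(ip):
    """Map an address to the prefix we flag (stand-in for the whois route)."""
    addr = ipaddress.ip_address(ip)
    prefix = 15 if addr.version == 4 else 48
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)

def flag(ip):
    """A spamming IP adds one flag to its whole subnet."""
    net = containing_subnet(ip)
    flags[net] += 1
    return str(net)

def is_subnet_banned(ip, threshold=3):
    """Ban the subnet outright once it has collected enough flags."""
    return flags[containing_subnet(ip)] >= threshold
```

Note how a "fresh" address the spammer obtains after a rotation still maps to the already-flagged subnet.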

The subnets with flags get extra CAPTCHAs. Not one but five CAPTCHAs upon registration (or something like that). Once a subnet collects two or three flags, ban it completely. Bonus points if you email the abuse address of the subnet.

There are more finite resources you can do this with, like email addresses. It's a small effort for users to activate their email address, but a spammer will get tired when they have to enter twenty CAPTCHAs because both their subnet and their email domain (e.g. mail.ru) have flags.

Note that there will be casualties: anyone using the same ISP as a spammer will get banned as well (once it has collected enough flags).

Luc