
I thought I was being clever by over-engineering a mobile app (just for my own learning) that scales out using only distributed technologies (AWS Lambda, S3, and DynamoDB in this case). I want to allow apps to register themselves in the database, but I'm trying to limit the database queries and writes I need to perform, in case someone tries to drive my costs up by spamming me with database activity. I experimented with a proof-of-work scheme in which one API handler generates a signed assignment (all in memory, no database) that the mobile app then has to perform; a second API call verifies the result before writing the new registration to the database. Fine, I'll do some work, but they have to do more. That isn't much of a burden for a cooperating client, but it might deter a malicious one.
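A minimal sketch of the stateless scheme described above, assuming HMAC-SHA256 for the signature and a "leading zero bits" hash puzzle for the work. The key, difficulty, window length, and all function names are hypothetical choices for illustration, not what my app actually uses.

```python
import hashlib
import hmac
import os
import time

SERVER_KEY = b"demo-secret"   # hypothetical: distributed to every handler via config
DIFFICULTY_BITS = 12          # assumed difficulty; would be tuned for the slowest client
VALIDITY_SECONDS = 300        # assumed validity window


def _sign(payload: bytes) -> str:
    return hmac.new(SERVER_KEY, payload, hashlib.sha256).hexdigest()


def issue_assignment(now=None):
    """First handler: build a signed challenge entirely in memory."""
    now = time.time() if now is None else now
    nonce = os.urandom(16).hex()
    expires = int(now) + VALIDITY_SECONDS
    payload = f"{nonce}:{expires}:{DIFFICULTY_BITS}"
    return {"payload": payload, "signature": _sign(payload.encode())}


def _meets_difficulty(payload: str, solution: int, bits: int) -> bool:
    # The work is valid if SHA-256(payload:solution) starts with `bits` zero bits.
    digest = hashlib.sha256(f"{payload}:{solution}".encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - bits) == 0


def solve(assignment):
    """Client side: brute-force a counter until the hash is small enough."""
    payload = assignment["payload"]
    bits = int(payload.rsplit(":", 1)[1])
    solution = 0
    while not _meets_difficulty(payload, solution, bits):
        solution += 1
    return solution


def verify(assignment, solution, now=None):
    """Second handler: check signature, expiry, and work without a database."""
    now = time.time() if now is None else now
    payload = assignment["payload"]
    if not hmac.compare_digest(_sign(payload.encode()), assignment["signature"]):
        return False  # payload was not issued by us, or was tampered with
    _nonce, expires, bits = payload.split(":")
    if now > int(expires):
        return False  # assignment expired
    return _meets_difficulty(payload, solution, int(bits))
```

Verification costs the server one HMAC and one hash, while the client needs about 2^DIFFICULTY_BITS hashes on average. Note that `verify` alone will accept the same solved assignment repeatedly until it expires.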

But then I realized that without tracking submissions, a malicious client can do the work for real once, then spam me with the same solution for the remainder of the assignment's validity window and still force my database lookups. The window would have to be at least as long as the worst case on my slowest supported client hardware, which leaves plenty of time for trouble.

Further, because my database is merely eventually consistent, I don't know how to avoid double-issued requests to isolated replication nodes in any case.

There's no shared state or even host affinity between the event-driven Lambda API handlers. DynamoDB is (I think) only eventually consistent. Memcached or Redis (offered by ElastiCache) are possible options for shared state.

These are relevant:

Is there any benefit at all to adding a proof-of-work? Or is any additional complexity merely deferring 100% of the problem that is unavoidable (while also introducing new points of failure)?

Are there any anonymous registration schemes that are compatible with eventual consistency and that require more of the registrant than of the registry?

Jason Kleban

1 Answer

You are asking several very different questions.

I'm trying to limit the database queries and writes that I need to perform in case someone tries to make my costs very high

You should have a gateway (e.g. Amazon API Gateway in front of Lambda) that supports rate limiting. With that in hand, you can configure the number of allowed requests. Generic gateways are available if the standard one is not suitable or available, e.g. API Umbrella.
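To illustrate what such a gateway does, here is a rough sketch of a per-client token bucket. It is an in-memory, single-process illustration only; the class names, the `client_id` key, and the rate and capacity values are assumptions, not any gateway's actual API.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity` tokens."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerClientLimiter:
    """Keeps one bucket per client so one noisy client cannot starve the rest."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, client_id, now=None):
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = self.buckets[client_id] = TokenBucket(self.rate, self.capacity, now)
        return bucket.allow(now)
```

With `PerClientLimiter(rate=1, capacity=2)`, a client may burst two requests and then gets one more per second; other clients are unaffected.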

... how to avoid double-issued requests to isolated replication nodes

You do this by configuring a write quorum. The more nodes in the quorum, the better the consistency but the slower the writes (a trade-off related to the CAP theorem).
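The quorum arithmetic can be sketched as follows, assuming the usual model of N replicas where a write must be acknowledged by W nodes and a read must consult R nodes. These symbols and function names are illustrative and do not come from any particular database's configuration.

```python
from itertools import combinations


def quorums_always_overlap(n, w, r):
    """Rule of thumb: every read set meets every write set when W + R > N."""
    return w + r > n


def overlap_by_enumeration(n, w, r):
    """Brute-force check: does every possible write set share a node with every read set?"""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )
```

For example, with N = 3, choosing W = 2 and R = 2 guarantees that a read sees at least one node holding the latest write, whereas W = 1 and R = 1 does not.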

There's no shared state ...

This is by design: shared state is a performance killer, especially in distributed environments.

oleksii
  • Thanks. I'll have to find out how to enable rate limiting for AWS Lambda endpoints; it's not immediately apparent from either the Lambda or the API Gateway control panel. Hopefully it can limit per client and not across all clients (or DoS just becomes easier). I'll look into how to implement a write quorum in AWS DynamoDB. I realize that shared state is a killer, which is why I'm trying to learn techniques for avoiding it, and which compromises are unavoidable and which are possible with the right configuration. – Jason Kleban Jan 11 '16 at 14:29
  • So are you saying that with per-client (per-IP?) rate limiting, a proof-of-work scheme would potentially provide real benefit to a distributed registration scheme? – Jason Kleban Jan 11 '16 at 14:53
  • I think yes, there is a benefit, because even reading the database to ensure the registration hasn't been written before is still cheaper than actually writing, and imposing a cost on the client for writing (allocating a new account) still makes sense. – Jason Kleban Jan 15 '16 at 15:36