1

According to this YouTube video:
https://www.youtube.com/watch?v=oIkhgagvrjI&feature=youtu.be&t=7m19s
YouTube video view counts are frozen at 300 until they're verified, but they sometimes stop at 301 or even as high as 310 because several identical requests arrive at the same time.
In that case it is not a big problem, but let's assume the following scenario: Bob has an account with a $10 balance and he makes 2 withdrawal requests at the same instant, 23:20:00:999ms.

Dirty Bob
Question:
How does a server-side programming language handle this type of request?
Can two requests access a variable or any sort of data and change its value at the same time?
Is the first request to arrive handled first while the second waits for its turn?

Question:
Could it happen only by chance or 100% of the time?

Question:
What type of vulnerability is this? Does it have a specific definition?

  • 2
    The server is usually using a transactional database, so only one transaction is performed at a time (they are serialized). – Aria Aug 13 '16 at 21:05
  • yea i knew about that, but that was an example –  Aug 13 '16 at 21:21

2 Answers

2
  1. The way that a programming language runtime ensures that concurrent requests targeting the same state don't interfere with one another ultimately rests on a mechanism called "compare-and-set" (also known as compare-and-swap), which has to be offered by the hardware on which the code is running.

    "Compare and set" works as follows: two threads each say to the hardware, "compare the value of variable X to 300, then, if equal, set X to 301", and the hardware ensures that only one of those instructions is executed at a time. The first succeeds and X becomes 301; the second finds that X no longer equals 300, so it fails and leaves X unchanged.

    That said, there is much, much more to the story. The compare-and-set mechanism is offered by CPUs for data held in the memory of a single machine. That data still has to make it durably onto a disk. There are countless ways in which the builders of a system may fail to orchestrate all the moving parts required for this machinery to complete successfully, leading to the accounting going wrong.

  2. Nothing happens by chance, but there are many classes of design, implementation and operational flaws that can cause concurrent systems to behave with less than desired consistency.

  3. There is no single name for vulnerabilities of this kind, and not even a single name for the subfield of computer science in which work relevant to this kind of problem occurs. Concurrency, distributed systems, consistency models, transaction isolation: googling any of those terms will yield rich veins to study to learn more.

    In the context of the two cited examples, YouTube video view counts and bank transactions, what's important to understand is that these two problems have different requirements.

    It doesn't matter so much if YouTube freezes views at 300 or 310 or even 500 or some other magic number. The limit is arbitrary and was chosen by Google to optimize their workflow. Slight miscounts have no economic or other impact.

    Miscounts in bank transactions, on the other hand, of course have a direct economic impact.

    It's also important to understand that the work required of programmers and distributed system engineers to get the accounting exactly right is significantly different from, and much more difficult than, the work required to get the accounting approximately right.

    In sum, these two problems have different requirements, and have different solutions.

    The lack of consistency in the solution behind YouTube view counting is a reflection of requirements, not an implication that other solutions at Google, or in any other accounting context, behave the same way.
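
To make the compare-and-set description from point 1 concrete, here is a minimal sketch. Python does not expose the hardware instruction directly, so the `AtomicCell` class below is a hypothetical stand-in that uses a lock to play the role of the CPU's atomicity guarantee; `withdraw` is likewise an illustrative name, not anything from a real banking system.

```python
import threading


class AtomicCell:
    """Software stand-in for a hardware compare-and-set word (illustrative only)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # plays the role of the CPU's atomicity guarantee

    def load(self):
        with self._lock:
            return self._value

    def compare_and_set(self, expected, new):
        # Atomically: write `new` only if the current value still equals `expected`.
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


# The view-count example from the text: two identical requests, only one wins.
views = AtomicCell(300)
first_ok = views.compare_and_set(300, 301)
second_ok = views.compare_and_set(300, 301)  # X is no longer 300, so this fails


def withdraw(balance, amount):
    """Retry loop: re-read and re-check whenever another thread got in first."""
    while True:
        current = balance.load()
        if current < amount:
            return False  # insufficient funds
        if balance.compare_and_set(current, current - amount):
            return True


# Bob's two simultaneous $10 withdrawals against a $10 balance.
balance = AtomicCell(10)
results = []
threads = [
    threading.Thread(target=lambda: results.append(withdraw(balance, 10)))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever thread loses the race sees its `compare_and_set` fail, re-reads the balance, and is refused. This only covers state in one process's memory; as noted above, durability and multiple machines are a separate problem.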

Jonah Benton
1

How does a server side programming language handle this type of request?

Most programming languages don't deal with this themselves. Instead, this problem is generally solved by databases, which are designed to handle these kinds of issues using transactions.

Now the question shifts to: how do databases solve this? In a variety of ways. Older relational databases lock records so that only one request can modify a particular value at a time. Modern relational databases use a technique called MVCC (multiversion concurrency control), which uses a log-based data structure to let multiple readers and multiple writers access the same record safely and concurrently, and to detect update conflicts and roll back when they occur. Some non-relational, distributed databases drop the requirement for strict data consistency and instead permit temporarily inconsistent data, with an "eventually consistent" guarantee.
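
As a rough illustration of letting the database do the work, here is a small sketch using SQLite from Python's standard library. The `accounts` table and the `withdraw` helper are made up for this example. The key point is that the balance check and the debit are a single `UPDATE` statement inside a transaction, so there is no gap between "check" and "write" for a second request to slip into.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts VALUES ('bob', 10)")
conn.commit()


def withdraw(conn, name, amount):
    """Debit `amount` only if the balance covers it; check and write are one statement."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ? AND balance >= ?",
            (amount, name, amount),
        )
        # rowcount is 1 if the row matched and was debited, 0 if funds were insufficient
        return cur.rowcount == 1


first = withdraw(conn, "bob", 10)
second = withdraw(conn, "bob", 10)
final_balance = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'bob'"
).fetchone()[0]
```

The two calls here run one after the other on a single connection, so this only shows the shape of the statement. With real concurrent connections, the database's own locking or MVCC machinery is what orders the two updates.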

At the most basic level, the primitive used is a CPU instruction called compare-and-swap. At a higher level, you have a data structure called a write-ahead log (or journal), which records changes in an append-only file before they are incorporated into the current data structure. At the distributed infrastructure level, you have various consensus algorithms like Paxos or Raft, and a number of specialized distributed ledgers, to ensure consistent accounting across multiple machines.
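
To give a feel for the write-ahead log idea, here is a toy sketch. The `LoggedCounter` class, the JSON line format and the `counter.wal` file name are all assumptions made for this example; real databases use far more elaborate formats. Each change is appended to the log and flushed to disk before the in-memory value is touched, so the state can be rebuilt after a crash by replaying the log.

```python
import json
import os
import tempfile


class LoggedCounter:
    """Toy counter whose changes are journaled before being applied (illustrative only)."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.value = 0
        self._replay()

    def _replay(self):
        # Rebuild the in-memory state from the append-only log, if one exists.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path) as log:
            for line in log:
                self.value += json.loads(line)["delta"]

    def add(self, delta):
        # 1. Append the intended change to the log and force it to disk.
        with open(self.log_path, "a") as log:
            log.write(json.dumps({"delta": delta}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Only then apply it to the current in-memory state.
        self.value += delta


log_path = os.path.join(tempfile.mkdtemp(), "counter.wal")
counter = LoggedCounter(log_path)
counter.add(300)
counter.add(1)

# Simulate a restart: a fresh object recovers the state purely from the log.
recovered = LoggedCounter(log_path)
```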

The YouTube view counter likely employs a distributed infrastructure with an "eventually consistent" counter rather than a fully consistent one. This means that at any given moment, the counter on each replica may not reflect the real global view count, but if the world stopped accessing YouTube and the system were given a bit of time to reach a stable state, the distributed infrastructure would eventually converge to the real view count.
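
How YouTube actually implements this is not something I know, but one well-known way to build such a counter is a grow-only counter (a "G-Counter"). The sketch below is only an illustration of that general technique; the `GCounter` class and the replica names `us` and `eu` are invented for the example. Each replica counts only its own increments, and replicas merge by taking the per-replica maximum.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by per-slot maximum."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> number of increments seen from that replica

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Taking the maximum per slot makes merging safe to repeat and reorder.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


us, eu = GCounter("us"), GCounter("eu")
us.increment(200)
eu.increment(101)
before = (us.value(), eu.value())  # each replica only knows about its own views

# Once traffic pauses and the replicas exchange state, they agree.
us.merge(eu)
eu.merge(us)
after = (us.value(), eu.value())
```

Before the merge the two replicas report 200 and 101; afterwards both report 301. Neither replica ever had to wait for the other in order to accept a view.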

Can two requests access a variable or any sort of data and change its value at the same time? Is the first request to arrive handled first while the second waits for its turn?

With databases that employ locks, only one request can access a given piece of data at a time, and the request that comes later has to wait for the first one to finish. With databases that employ MVCC, handling multiple requests concurrently is possible, but write conflicts get detected and the request that commits later gets aborted. With a distributed, eventually consistent counter, multiple concurrent writers are possible, but you sacrifice a consistent view of the data.
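
The "later commit gets aborted" behaviour can be sketched with a simple version check. This is optimistic concurrency control in miniature, not real MVCC: the `VersionedRecord` and `ConflictError` names are hypothetical, and the code is single-threaded, so the commit step is not itself protected against true parallel access.

```python
class ConflictError(Exception):
    """Raised when a record changed between a request's read and its commit."""


class VersionedRecord:
    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def commit(self, new_value, read_version):
        # Abort if someone else committed since this request read the record.
        if read_version != self.version:
            raise ConflictError("record was modified by another request")
        self.value = new_value
        self.version += 1


account = VersionedRecord(10)

# Both requests read the same snapshot: balance 10, version 0.
balance_a, version_a = account.read()
balance_b, version_b = account.read()

# Request A commits first and succeeds.
account.commit(balance_a - 10, version_a)

# Request B commits second with a stale version and is aborted.
try:
    account.commit(balance_b - 10, version_b)
    b_aborted = False
except ConflictError:
    b_aborted = True
```

After the abort, request B would normally retry from the start, read the new balance of 0, and be refused for insufficient funds.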

Question: Could it happen only by chance or 100% of the time?

It happens all the time: not necessarily 100% of the time, but often enough that when an application needs to scale, you need to take it into account.
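
Whether it goes wrong on any given pair of requests depends on how their steps happen to interleave. The sketch below forces the unlucky interleaving by hand, with no threads involved, for a naive "check, then write" withdrawal with no locking or transaction. The variable names are made up for the example.

```python
balance = {"bob": 10}
paid_out = 0

# Step 1: both requests read the balance before either one has written.
seen_by_a = balance["bob"]
seen_by_b = balance["bob"]

# Step 2: request A checks its (still correct) copy and withdraws.
if seen_by_a >= 10:
    balance["bob"] = seen_by_a - 10
    paid_out += 10

# Step 3: request B checks its now stale copy and withdraws as well.
if seen_by_b >= 10:
    balance["bob"] = seen_by_b - 10
    paid_out += 10
```

Bob ends up with $20 paid out from a $10 account, and the stored balance is 0 rather than negative, so nothing in the data even hints that something went wrong. If request B had read after request A wrote, the second withdrawal would have been refused, which is why the bug shows up only some of the time.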

Question: What type of vulnerability is this? Does it have a specific definition?

When handled well, there should not be a vulnerability. An application needs to define the level of consistency guarantee that it needs. Financial applications generally need a strong consistency guarantee, while the YouTube view count can afford looser consistency in exchange for greater performance.

Lie Ryan