16

It's increasingly common to see major attacks on governments and corporations attributed to a specific country or group. Typically we don't know for certain, but it's at least suspected.

Given the general anonymity of the internet and the ability to hide behind proxies, how do security experts go about determining exactly who perpetrated an attack?

Anders
  • A global passive adversary could see where the attack originates and compare it to the same behavior at the destination. Of course, the adversary need not be literally *global*, but broad enough that the attacker's origination point can be detected. Law enforcement can subpoena ISPs to achieve this, whereas individuals and corporations need more clever techniques, and perhaps can never be as certain as law enforcement. – James Mishra Feb 09 '15 at 20:46
  • Note that attribution is different than blame. For the former a body of evidence is required. – SilverlightFox Feb 10 '15 at 15:02
  • The term is "attribution". You want to know how they can attribute the code to a certain party. – schroeder Feb 28 '21 at 10:59

6 Answers

15

As someone who runs personal honeypots and used to defend a massive global corporation, I can tell you that any attack leaves fingerprints. Styles of commands or command sequence, coding style of malware, as well as the paths used by attackers can all point in a direction of an attacker.

For example, I was able to positively identify someone trapped in my honeypot because they used their real name as their password (they didn't know I was recording their keystrokes). Using various correlation methods, I was able to link the pseudonym they used on the site where they distributed malware to their real name, and even found that they had used the same pseudonym 2 years earlier on a singles site, in a profile they had since deleted (but Google's long memory did not forget).

Once you start studying live attacks, you can really start to see the people "behind the keystrokes" and that's one reason why I continue to operate honeypots. I think I can tell whether an attacker is Asian or Eastern European, simply by their methods, and not by their IP. If I had enough data from a known attacker, I believe I would be able to recognize their actions in a new environment.

schroeder
5

There is no good way to determine clearly who made an attack, or even whether an attack was performed by a nation-state or, as Bruce Schneier puts it, "a couple of guys".

That we live in the world where we aren't sure if any given cyberattack is the work of a foreign government or a couple of guys should be scary to us all.

For physical attacks, if a tank comes rolling into your country, you know it's the army of a nation-state because people don't have tanks. The same isn't true for cyber attacks. Nations and "A couple of guys" use the same tools for cyber attacks.

Also remember that sophisticated attackers aren't dumb. They'll be deceptive with the IP address they use, so you can't rely on that. An attacker can be physically in the US but control a set of computers in Russia or China from which to launch attacks. Attribution is largely done by motive: who is interested in spying on or attacking whom.

Some will try to attribute an attack to specific attackers based on the tools used. That was what some were doing with the Sony hack, tying it to North Korea, but there is widespread disagreement within the security community about this.

Steve Sether
  • There is one counter example to your tank analogy, which is "Russia's" "little green men" in Georgia. –  Feb 12 '15 at 17:42
  • @Ian Yes, that's a good counter-example (though I think you mean Ukraine). But it's another example of subterfuge where a government tries to hide the fact that they did it through using common weapons. Of course it didn't work out so well since humans aren't as easy to disguise as code is. – Steve Sether Feb 12 '15 at 18:04
  • No, Georgia was the first instance of Russia and the little green men, Ukraine is just a repeat offense. –  Feb 12 '15 at 18:06
4

Computer-based attack attribution works like the attribution of any other illegal activity: it requires a significant amount of investigation, gathering clues, corroborating information, attempting to eliminate false leads and recognize right ones, etc.

On the attackers' side

The attacker may cover his tracks using two main techniques: plausible deniability and false flag.

Plausible deniability

Plausible deniability aims at non-attribution by making the attacker's identity unclear. It relies notably on using off-the-shelf, widely available tools and techniques, and on carefully removing all metadata and potential clues.

The CIA's Development Tradecraft DOs and DON'Ts from the "Vault 7" leak is a perfect example of how to implement plausible deniability in malicious software.

False flag

False flag (in the case of a government entity we can also talk of a black op) aims at misattribution by deliberately and actively forging clues designed to deceive investigators (or simply the targets) into attributing the attack to a different actor and/or a different motivation.

The least subtle ways of doing so would be using the impersonated actor's IP range as the attack source, starting a malware spread in its country, or reusing malicious software that has already been publicly attributed to it.

Depending on who the attacker attempts to deceive, more subtle ways may be needed. A good example is a consequence of @Schroeder's answer. Running a honeypot, he is able to fingerprint an attacker:

Styles of commands or command sequence, coding style of malware, as well as the paths used by attackers can all point in a direction of an attacker.

The very same knowledge can then be maliciously reused to mimic these lesser-known, not publicly available fingerprints and lead the adversary to attribute the attack to an origin of your choosing.

On the investigators' side

The investigators may have access to various kinds of information, gathered either passively or actively, depending on their position.

The investigators' main hope is that the attacker makes mistakes, and the longer and more complex the attack, the higher the probability of such mistakes. It is also common, as part of a live investigation, to "push" the attacker into a mistake by isolating him on a specially designed network or feeding him specially crafted data.

A classic example of such crafted data is Clifford Stoll's story, one of the earliest forensic investigations, dating back to the 1980s, in which, among other things, Cliff generated a batch of fake but large documents to entice the attacker into keeping the connection open long enough to be traced.

Private entities

Analyzing the data directly related to the attack is always possible. This involves analyzing the tools, the actions, but also any other information available around the attack.

For instance, consider the case of Alex Tapanaris, who forgot to remove his name from a .pdf document announcing operations by the Anonymous group. It may sound silly, but mistakes are often silly details that were overlooked because, sooner or later, everyone makes a mistake (the attacker may be under pressure, in a hurry, sick, etc. Yes, attackers are human beings too!).

Some attacks are claimed by an arbitrarily named group. In this case, information from all of the group's activity, every target included, can be merged to increase the chances of finding such a mistake, which could allow attribution to a physical entity behind the "brand name". That's why some group names, used purely as branding for propaganda, remain tied to a single operation.

Police departments

Police departments can resort to more information collection techniques.

They have access to Internet providers' logs, may have access to server storage as long as it is located in a jurisdiction they cover (though this is usually unlikely), and can also track finances in cases involving money (ransomware, for instance).

But they can also gather information actively. The CIA guide mentioned above existed for a reason, and implanting malicious software in the data collected by the attackers, on their own C2 server, or on a website identified as used by them (such as a forum or a file repository) is an effective use of spying software as part of a criminal investigation.

Government intelligence agencies

For the largest cases, a government can resort to counterintelligence techniques. We began to touch on such techniques when mentioning active investigation above. Here we find the same techniques, but pushed much further and with fewer legal restrictions, as we enter the "national security" realm.

Counterintelligence also includes more-or-less "passive" large-scale data collection techniques (not so passive per se, as the collection systems are often malicious software actively implanted in various targets and critical infrastructures). Such data constitutes a database which can be queried in various ways, for instance to correlate the movements and research of some identified individuals with the movements and techniques used by an unidentified group of attackers.

Conclusion

While attack attribution relies on a body of evidence of various natures and origins, when you read announcements closely you see that attack attributions are more often "suspicions" than definitive attributions. Notwithstanding the efforts deployed, due to the very nature of computer-based attacks there is usually no material evidence (in the legal sense of the term) available.

Moreover, public attribution may also sometimes be tainted by diverging interests, like the race between private security companies to be the first to publish an analysis of a particular attack, or a political agenda that may influence governmental communication.

The best advice at the end could be to stick to the facts, avoiding over-interpretation or blindly trusting an affirmation just because of a nearby logo.

Always keep your head and your own judgment.

WhiteWinterWolf
2

There's one thing to keep in mind as well: the simple fact that hackers are hackers. Some work for governments directly yet moonlight for organized crime; some work for organized crime and are contracted by governments for a short period of time. In that environment, it's challenging to really properly attribute an attack based on who's doing it.

In that case, it makes more sense to attribute based on what the goal is. Nations want to bother their adversaries; organized crime sees itself as a business and really just wants to make money.

If BadGuy123 hacks a power grid but doesn't do anything, he's likely either doing reconnaissance or working for a nation-state. If he hacks a power grid and then demands money to go away, most likely he's working for organized crime. If he hacks a power grid and causes a little mischief, he may be advertising his skills. If he hacks a power grid and causes a blackout to a major metropolitan area, then most likely that's a nation at work.

Different groups have different attack signatures as well. Some don't do a very good job of covering their tracks. Some deliberately remove info that is not complimentary about their country. Some target specific countries or exclude certain countries. By looking at the pattern, it's possible to get an idea which country is behind an attack.

That said, in contrast to WWII, BadGuy123 isn't wearing a uniform and advertising his allegiance. So while we have ideas about who is behind what, it's really very difficult to know for a fact.

baldPrussian
1

From the sophistication of the malware code

State-sponsored groups are backed by governments, which are usually willing to invest a lot more money in developing a cyber arsenal than an individual or even a group of criminals would. This means that they will often have more developers, and more skilled and experienced ones. This larger group will be able to develop malware with more and better features (e.g., better ability to evade endpoint detection and response software, ability to exfiltrate data more stealthily, better anti-analysis features) than malware developed by the typical lone black hat. Thus if a particular piece of malware has advanced capabilities and sophisticated code, it is safe to say the malware was the work of an advanced group, usually a state-sponsored APT.

Targets

The people targeted by the malware often say a lot about the threat actor behind it. If malware infects machines everywhere indiscriminately, it's often the work of a criminal group looking to infect as many people as possible to maximize their profit. However, if the malware is found selectively on the devices of high-level diplomats or journalists, then this is likely a campaign by a state-sponsored actor looking to gain inside information on another nation's diplomatic activities.

Reuse of code and infrastructure

Malware development takes time and effort. So once a group has developed a particular set of capabilities, they are prone to reuse or repurpose the same code. If a malware analyst can identify significant similarities between code in a newly discovered malware sample and a previously discovered or leaked malware from state-sponsored groups, it's easy to make the connection.
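As a rough illustration of what "significant similarities between code" can mean in practice, here is a minimal sketch that compares two samples by the overlap of their byte n-grams (Jaccard similarity). This is an assumed, simplified stand-in for real code-similarity tooling, not the method any particular analyst uses; the function names and the n-gram size of 4 are choices made for the example.

```python
def byte_ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of all n-byte substrings found in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Share of n-grams common to both samples (0.0 = disjoint, 1.0 = identical sets)."""
    grams_a, grams_b = byte_ngrams(a, n), byte_ngrams(b, n)
    if not grams_a and not grams_b:
        return 0.0  # both samples too short to compare
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical usage with two sample files on disk:
# with open("new_sample.bin", "rb") as f1, open("known_sample.bin", "rb") as f2:
#     print(jaccard_similarity(f1.read(), f2.read()))
```

A high score only says the two files share a lot of raw bytes; shared libraries, packers, or deliberately copied code can all inflate it, so it is a lead to investigate rather than proof of common authorship.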

Infrastructure (i.e., command and control servers used to distribute, control and update malware) is often reused too. Often this is because (anonymously) obtaining and setting up new servers is too much of a hassle to repeat in every new campaign. New malware that uses the same infrastructure as previously attributed malware is a big giveaway too.
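Checking for reused infrastructure boils down to intersecting the indicators seen in a new sample with those recorded for earlier campaigns. The sketch below assumes you already have such indicator sets; the campaign names, domains, and IP addresses are made up for illustration (documentation-reserved ranges), and `shared_infrastructure` is a hypothetical helper.

```python
def shared_infrastructure(new_indicators, known_campaigns):
    """Map each known campaign to the C2 indicators it shares with the new sample."""
    matches = {}
    for campaign, indicators in known_campaigns.items():
        overlap = set(new_indicators) & set(indicators)
        if overlap:
            matches[campaign] = overlap
    return matches


# Illustrative data only: none of these refer to real campaigns or servers.
known = {
    "campaign-A": {"c2.example.com", "192.0.2.10"},
    "campaign-B": {"update.example.net", "198.51.100.7"},
}
print(shared_infrastructure({"192.0.2.10", "203.0.113.5"}, known))
```

As with code reuse, an overlap is a strong hint but not conclusive: servers get resold, compromised, or deliberately borrowed for a false flag.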

Metadata in files

If the above techniques do not provide sufficient clues to determine the exact group behind a piece of malware, analysts will look at metadata in the malware's executables/payloads. For example, when an executable is compiled, the compiler usually attaches a timestamp to it to indicate the time of compilation. If sufficient samples containing timestamps are found, they can be used to infer the time zone from which the group operates. Compiled executables can also contain file paths from the attacker's machine, which can provide further clues.

The catch here is that metadata can be forged in order to mislead analysts about the origin of the malware, and hence less confidence should be placed in it when attributing malware.
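To make the time zone inference concrete, here is a minimal sketch. It assumes the compile timestamps have already been extracted and converted to UTC datetimes, and that the developers work roughly 09:00 to 17:59 local time; both the working-hours window and the function name are assumptions for the example. It simply tries every whole-hour UTC offset and keeps the one that puts the most timestamps inside working hours.

```python
from datetime import datetime, timedelta, timezone

WORK_START, WORK_END = 9, 17  # assumed local working hours (inclusive hour values)


def guess_utc_offset(compile_times):
    """Return (offset_hours, hits) for the UTC offset that places the most
    compile times within the assumed working hours."""
    best_offset, best_hits = None, -1
    for offset in range(-12, 15):  # whole-hour offsets from UTC-12 to UTC+14
        shift = timedelta(hours=offset)
        hits = sum(WORK_START <= (t + shift).hour <= WORK_END for t in compile_times)
        if hits > best_hits:
            best_offset, best_hits = offset, hits
    return best_offset, best_hits


# Raw epoch timestamps can be converted with:
# datetime.fromtimestamp(raw_timestamp, tz=timezone.utc)
```

With only a handful of samples, many offsets will tie, and since the timestamps themselves can be forged, the result is at best one weak clue among others.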


These are some of the more obvious and reliable techniques of attributing malware to state-sponsored groups, but this is nowhere near an exhaustive list.

schroeder
nobody
  • I think it's fine here. That's why I merged the Q's – schroeder Mar 01 '21 at 08:36
  • @schroeder Is it? My answer focuses on how attacks are identified as state-sponsored in the first place, while this question only asks about how the attack is attributed to a specific nation. – nobody Mar 01 '21 at 09:02
  • To be able to know that it's state-sponsored requires the same investigation as determining which state. To know one, you end up knowing the other. – schroeder Mar 01 '21 at 09:31
0

My answer is based upon a home network being attacked.

This involves cross-referencing time stamps, analyzing hops, motive, payoff, and method (MD5 and SHA1 of the malware used).
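For the MD5 and SHA1 part, a short sketch using Python's standard `hashlib` module is shown below. The function name and the chunk size are choices made for this example; the resulting hashes can then be looked up against whatever malware databases you have access to.

```python
import hashlib


def file_hashes(path, chunk_size=65536):
    """Return (md5_hex, sha1_hex) of a file, reading it in chunks
    so that large samples do not need to fit in memory."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


# Hypothetical usage:
# md5_hex, sha1_hex = file_hashes("suspicious_sample.bin")
```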

From my work in attribution: I look at the Wireshark capture, export it as CSV, and use an Excel macro to isolate variations in IP for IP dissection using iptables. If I have an idea of when the attack occurred, I pull the most common value matching that time stamp. I then re-analyze the data (the raw CSV from Wireshark) and pull the anomalies to see whether the final suspected IP comes up, and I compare the two analyses in Excel by plotting them in a chart. If I think the attacker is still in my network, I pull all RAM using AccessData FTK Imager and also image the hard drive for forensic analysis.
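The "most common IP around the time of the attack" step can also be sketched in Python instead of an Excel macro. This is not the exact macro described above, just an assumed equivalent: it expects a Wireshark CSV export with the default "Time" and "Source" columns, where "Time" is seconds since the start of the capture, and `top_sources` is a hypothetical helper name.

```python
import csv
from collections import Counter


def top_sources(csv_file, start=None, end=None, n=5):
    """Count packets per source address in a Wireshark CSV export,
    optionally limited to a time window, and return the n most common."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        t = float(row["Time"])
        if start is not None and t < start:
            continue
        if end is not None and t > end:
            continue
        counts[row["Source"]] += 1
    return counts.most_common(n)


# Hypothetical usage:
# with open("capture.csv", newline="") as f:
#     print(top_sources(f, start=120.0, end=300.0))
```

If your capture uses a different time display format (for example absolute date and time), the `float()` conversion will need to be replaced with the matching parser.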

The next step is to "follow your own footsteps", which means reopening your browser and going to each site you visited, one by one, while capturing the "netstat -n" output each time.
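Comparing the "netstat -n" snapshots by eye gets tedious, so here is a small sketch that diffs two saved outputs and reports remote endpoints that only appear in the later one. It assumes the usual column layouts (Linux prints numeric Recv-Q and Send-Q columns after the protocol, Windows does not); the helper names are made up for this example.

```python
def remote_endpoints(netstat_output):
    """Extract the foreign address column from `netstat -n` text output."""
    endpoints = set()
    for line in netstat_output.splitlines():
        parts = line.split()
        if not parts or not parts[0].lower().startswith(("tcp", "udp")):
            continue  # skip headers and non-TCP/UDP lines
        # Linux: Proto Recv-Q Send-Q Local Foreign [State]; Windows: Proto Local Foreign [State]
        idx = 4 if len(parts) > 4 and parts[1].isdigit() else 2
        if len(parts) > idx:
            endpoints.add(parts[idx])
    return endpoints


def new_endpoints(before, after):
    """Remote endpoints present in the later snapshot but not the earlier one."""
    return remote_endpoints(after) - remote_endpoints(before)


# A snapshot can be captured with, for example:
# subprocess.run(["netstat", "-n"], capture_output=True, text=True).stdout
```

Take one snapshot before opening a site and one after; anything in the difference that you cannot tie to the site you just visited is worth a closer look.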

Finally you look at all analyzed data and draw your supported conclusion, though it is 'supported' and not final.

I imagine that businesses, governments and corporations have much more successful and efficient methodologies in place.