Computer-based attack attribution works like the attribution of any other illegal activity: it requires a significant amount of investigation, which means gathering clues, corroborating information, eliminating false leads and recognizing the right ones.
On the attackers' side
The attacker may cover his tracks using two main techniques: plausible deniability and false flag.
Plausible deniability
Plausible deniability aims at non-attribution by making the attacker's identity unclear. It relies notably on using off-the-shelf, widely available tools and techniques, and on carefully removing all metadata and any other potential clue.
The CIA's Development Tradecraft DOs and DON'Ts document, from the "Vault 7" leak, is a perfect example of how to implement plausible deniability in malicious software.
False flag
A false flag (in the case of a government entity we can also speak of a black op) aims at misattribution by deliberately and actively forging clues designed to deceive investigators (or simply the targets) into attributing the attack to a different actor and/or a different motivation.
The least subtle ways of doing so would be to use the impersonated actor's IP range as the attack source, to start spreading the malware from that actor's country, or to reuse malicious software which has already been publicly attributed to that actor.
Depending on who the attacker attempts to deceive, more subtle ways may be needed. A good example is a consequence of @Schroeder's answer. Running a honeypot, he is able to fingerprint an attacker:
Styles of commands or command sequence, coding style of malware, as well as the paths used by attackers can all point in a direction of an attacker.
The very same knowledge can then be maliciously reused to mimic these lesser-known, non-public fingerprints and lead the adversary to attribute the attack to an origin of your choosing.
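To make the fingerprinting idea a bit more concrete, here is a minimal sketch of one possible way to compare two honeypot sessions by the order in which commands were typed. The function names, the sample sessions and the choice of bigrams plus Jaccard similarity are my own assumptions for illustration; this is not how @Schroeder or any particular product does it.

```python
def bigrams(commands):
    """Return the set of consecutive command pairs seen in a session."""
    return {(a, b) for a, b in zip(commands, commands[1:])}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def session_similarity(session_a, session_b):
    """Compare two sessions by the command sequences they share."""
    return jaccard(bigrams(session_a), bigrams(session_b))


# Hypothetical sessions, as they might be recorded by a honeypot.
known = ["uname -a", "id", "cat /etc/passwd", "wget", "chmod +x", "./x"]
new = ["uname -a", "id", "cat /etc/passwd", "curl", "chmod +x", "./x"]
other = ["ls", "pwd", "whoami", "ps aux"]

print(round(session_similarity(known, new), 2))    # 0.43
print(round(session_similarity(known, other), 2))  # 0.0
```

Note that a score like this only says "these sessions look alike". Anyone who knows the habits being measured can reproduce them on purpose, which is exactly the false-flag risk described above.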
On the investigators' side
The investigators may have access to various kinds of information, gathered either passively or actively, depending on their position.
The investigators' main hope lies in the attacker's mistakes: the longer and more complex the attack, the higher the probability of such mistakes. It is also common, as part of a live investigation, to "push" the attacker into making a mistake by isolating him on a specially designed network or by feeding him specially crafted data.
A classic example of such crafted data is Clifford Stoll's story, one of the earliest documented forensic investigations, dating back to the 1980s, in which, among other things, Cliff generated a batch of fake but large documents to entice the attacker into keeping the connection open long enough to be traced.
Private entities
Analyzing the data directly related to the attack is always possible. This involves analyzing the tools, the actions, but also any other information available around the attack.
For instance, it is worth noting the case of Alex Tapanaris, who forgot to remove his name from a .pdf document announcing operations by the Anonymous group. It may sound silly, but mistakes are often silly details that were overlooked because, sooner or later, everyone makes a mistake (the attacker may be under pressure, in a hurry, sick, etc. Yes, attackers are human beings too!).
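As an illustration of how little effort this kind of clue can take to find, here is a rough sketch that looks for an /Author entry in the raw bytes of a PDF. It is deliberately naive and rests on assumptions: it only handles an uncompressed Info dictionary with a literal string value, and the sample bytes and the name in them are made up. A real analysis would use a proper PDF or metadata tool, which also covers compressed objects and XMP metadata.

```python
import re

# Matches "/Author (some text)" where the value is a PDF literal string.
# Escaped characters such as "\)" are accepted inside the parentheses.
AUTHOR_RE = re.compile(rb"/Author\s*\(((?:\\.|[^\\)])*)\)")


def pdf_author(data: bytes):
    """Return the raw /Author value found in the PDF bytes, or None.

    Escape sequences are left as-is; this is only a first look.
    """
    match = AUTHOR_RE.search(data)
    if match is None:
        return None
    return match.group(1).decode("latin-1")


# Hypothetical document fragment, not a complete valid PDF.
sample = b"%PDF-1.4\n1 0 obj\n<< /Title (Press release) /Author (Jane Doe) >>\nendobj\n"
print(pdf_author(sample))
```

In practice you would read the file with `open(path, "rb").read()` and pass the bytes in. The point is simply that an author field nobody thought about is readable by anyone who gets hold of the document.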
Some attacks are claimed by an arbitrarily named group. In this case, the information from all of the group's activities, across every target, can be merged to increase the chances of finding a mistake that allows attribution to the physical entity behind the "brand name". That is why some group names, which really serve as branding for propaganda, remain tied to a single operation.
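The merging step described above can be sketched as a simple cross-reference: collect the observable details of each operation claimed under the same name, then keep those that show up more than once. Everything below (the function name, the operation labels and the indicator strings) is hypothetical and only meant to show the idea.

```python
from collections import defaultdict


def recurring_indicators(operations):
    """Map each indicator to the operations it appears in.

    Only indicators seen in at least two operations are kept, since
    those are the ones that link operations together.
    """
    seen = defaultdict(set)
    for op_name, indicators in operations.items():
        for indicator in indicators:
            seen[indicator].add(op_name)
    return {ind: sorted(ops) for ind, ops in seen.items() if len(ops) > 1}


# Hypothetical observations gathered from three claimed operations.
operations = {
    "op-a": {"packer:upx", "active-hours:09-17", "contact:box1@example.org"},
    "op-b": {"active-hours:09-17", "contact:box2@example.org"},
    "op-c": {"packer:upx", "active-hours:09-17"},
}

for indicator, ops in sorted(recurring_indicators(operations).items()):
    print(indicator, ops)
```

Each recurring detail is weak on its own, but the more operations share a name, the more such details accumulate, which is the reason for using a fresh name per operation.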
Police departments
Police departments can resort to additional information-collection techniques.
They have access to Internet providers' logs, may have access to server storage as long as the servers are located in a jurisdiction they cover (which is usually unlikely), and also have financial tracking capabilities for cases involving money (ransomware, for instance).
But they can also gather information actively. The CIA guide mentioned above exists for a reason: implanting malicious software in the data collected by the attackers, on their own C2 server, or on a website they are known to use (such as a forum or a file repository) is an effective use of spying software as part of a criminal investigation.
Government intelligence agencies
For the largest cases, a government can resort to counterintelligence techniques. We began to touch on such techniques when mentioning active investigation above. Here we find the same techniques, but pushed much further and with fewer legal restrictions, since we are entering the realm of "national security".
Counterintelligence also includes more-or-less "passive" large-scale data collection techniques (it is not so passive per se, as the collection systems are often malicious software actively implanted in various targets and critical infrastructures). Such data constitutes a database which can be queried in various ways, for instance to correlate the movements and research of some identified individuals with the movements and techniques of an unidentified group of attackers.
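One very simplified way to picture such a correlation query is to compare when an identified individual is active with when the unidentified attackers are active. The sketch below does this with hour-of-day profiles. The function names, the sample hours and the overlap measure are assumptions chosen for illustration, not a description of what any agency actually runs.

```python
from collections import Counter


def hour_profile(hours):
    """Return the share of events falling in each hour of the day (0-23)."""
    if not hours:
        return [0.0] * 24
    counts = Counter(hours)
    return [counts.get(h, 0) / len(hours) for h in range(24)]


def profile_overlap(hours_a, hours_b):
    """Overlap of two activity profiles: 0.0 (disjoint) to 1.0 (identical)."""
    pa, pb = hour_profile(hours_a), hour_profile(hours_b)
    return sum(min(x, y) for x, y in zip(pa, pb))


# Hypothetical event hours (for example, extracted from log timestamps).
attackers = [9, 10, 10, 11]
person_1 = [9, 10, 11, 11]
person_2 = [22, 23, 0, 1]

print(profile_overlap(attackers, person_1))  # 0.75
print(profile_overlap(attackers, person_2))  # 0.0
```

A high overlap is at best a reason to look closer: plenty of unrelated people keep the same hours, and an attacker who expects this kind of analysis can shift his schedule on purpose.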
Conclusion
While attack attribution relies on a body of evidence of various natures and origins, when you read the announcements closely you see that attack attributions are more often "suspicions" than definitive attributions. Notwithstanding the efforts deployed, due to the very nature of computer-based attacks there is usually no material evidence (in the legal sense of the term) available.
Moreover, public attribution may sometimes be tainted by diverging interests, such as the race between private security companies to be the first to publish an analysis of a particular attack, or a political agenda which may influence governmental communication.
In the end, the best advice may be to stick to the facts, avoiding over-interpretation and not blindly trusting a claim just because of a nearby logo.
Always keep your head and your own judgment.