Well first, I think it's good to keep in mind that the "sophisticated attack" trope is one of the most common ones trotted out today by company executives (and government agency heads) who are acknowledging large-scale compromises. Sometimes, once more facts about how a breach actually occurred become known, it turns out that the attacker really was sophisticated in its methods. But very, very often, that turns out not to be true. So one might reasonably wonder whether Yahoo!'s "nation-state" comment should be taken as an interesting variation on this standard-procedure P.R. strategy rather than as a solid product of investigation. At this stage, it's not wrong to start with some skepticism about whether this really is a nation-state operation.
(Of course, 100 percent of the time the implication being made by a hacked organization with the "sophisticated attacker" point is that the organization could not possibly have been reasonably expected to prevent what happened. Because "sophisticated". And there is almost no doubt that Yahoo's "nation-state" comment is similarly intended to serve a buck-passing function in this manner. But I digress...)
Still... Yahoo's statement could well turn out to have some real basis. So it's certainly useful to talk about how an investigation might start to determine whether an attack was from a "nation-state actor", and what that even really means.
To start to answer the question: There are all kinds of ways that investigators can look at evidence related to a hack and start to draw some conclusions about whether a nation-state actor (intelligence agencies, mostly), or somebody linked to a nation-state (like a private cyber "militia" that a government covertly supports), was behind it.
One factor that gets looked at, just as in investigations of crimes and disputes of every conceivable kind, is motive. In cyberattacks where the attacker wouldn't have an obvious and easy way to profit financially from the attack, looking at whether a nation-state actor or nation-state-linked organization might have had reason to pull it off can make some sense. In particular, when we're talking about attacks against the U.S. government, U.S. interests, or major U.S. companies, some attention immediately starts to shift to whether a group connected to one of the four persistent U.S. adversaries that have very active in-government or state-linked offensive cyber units (Russia, China, North Korea, and Iran) might be involved.
Another factor that one might consider is how "sophisticated" the attackers would have needed to be to conduct the attack the way they did. What tools and what Tactics, Techniques, and Procedures (TTPs) they used. And especially whether in their attack they used tools and TTPs that haven't been widely seen before in the information security community. Developing, or buying, the novel attack methods and assets that are often used in genuinely sophisticated attacks takes resources. Lots of resources. (Money, access to highly skilled people, etc.)
Now, what are some organizations that have the most such resources, strong motivations to do lots of information gathering against other organizations, and also effective legal impunity (well, for all intents and purposes) in conducting attacks? Governments. Governments do. Therefore, when you see an attack that actually did require some advanced capabilities, many people tend to jump to looking at nation-state actors.
Of course, the problem is that not all technically sophisticated attacks are done by nation-state actors. Far, far from it. The flip side is that hackers working for nation-state units very, very often use completely common and mundane methods against less well-defended targets that don't require cutting-edge attacks. (Which is the vast majority of them.)
Which leads us to a third, and often much, much more reliable, factor for attributing an attack to a nation-state actor or nation-state-linked group: clues gathered about the actual electronic infrastructure and software an attacker used.
Put bluntly, people are sometimes lazy and/or careless and make sloppy mistakes, nation-state hackers among them. And they leave behind clues during one attack that link that event to other attacks. Including attacks where the party responsible is already either known or strongly suspected.
How do they get lazy?? Well, among other ways...
- They reuse the servers they launch attacks from, command-and-control servers, servers they exfiltrate hacked data back to, proxy servers, and other elements of attacker infrastructure across different targets. Ideally, an attacker would use different infrastructure for each attack effort. But that's a nuisance to do, and it's easy to just keep reusing what's already set up. Which leads to things like the OPM hack being initially detected by the U.S. government when Chinese hackers reused a command-and-control server that was already known to the feds' Einstein intrusion detection system as malicious.
- They reuse malware and other tools between different attacks more often than is wise. (Admittedly, this must be a hard one to get away from; nobody has the resources to develop or buy completely new versions of all their malware, exploits, attack tools, methods, etc. for every one of hundreds/thousands/more targets that an attacker goes after.)
- They unnecessarily leave tools that shouldn't be exposed, exposed, and someone finds them. (Ahem, looking at you, NSA.)
And, no doubt, in countless other ways.
In short, they unintentionally leave clues behind that investigators find.
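To make the reuse point a bit more concrete, here is a minimal Python sketch of the kind of cross-referencing an investigator might do: take the indicators seen in a new incident (IP addresses, domains, malware file hashes) and check which previously catalogued campaigns share any of them. Everything in it is hypothetical: the `IndicatorSet` class, the function names, the campaign labels, and the sample data (which uses documentation-reserved IP ranges and placeholder hashes). Real investigations work from much richer data than simple set intersections.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IndicatorSet:
    """Indicators collected for one incident or one known campaign (hypothetical model)."""
    name: str
    ips: set[str] = field(default_factory=set)
    domains: set[str] = field(default_factory=set)
    file_hashes: set[str] = field(default_factory=set)


def shared_indicators(incident: IndicatorSet, campaign: IndicatorSet) -> dict[str, set[str]]:
    """Return the indicators both sets have in common, grouped by type.

    Types with no overlap are left out of the result.
    """
    overlap = {
        "ips": incident.ips & campaign.ips,
        "domains": incident.domains & campaign.domains,
        "file_hashes": incident.file_hashes & campaign.file_hashes,
    }
    return {kind: values for kind, values in overlap.items() if values}


def rank_candidates(incident: IndicatorSet, campaigns: list[IndicatorSet]):
    """List campaigns sharing at least one indicator, largest overlap first."""
    results = []
    for campaign in campaigns:
        overlap = shared_indicators(incident, campaign)
        count = sum(len(values) for values in overlap.values())
        if count:
            results.append((campaign.name, count, overlap))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Made-up sample data: reserved documentation IPs and placeholder hashes.
new_incident = IndicatorSet(
    name="new-incident",
    ips={"192.0.2.10", "198.51.100.7"},
    domains={"updates.example.net"},
    file_hashes={"hash-aaa"},
)
known_campaigns = [
    IndicatorSet("campaign-A", ips={"192.0.2.10"}, domains={"updates.example.net"}),
    IndicatorSet("campaign-B", ips={"203.0.113.5"}, file_hashes={"hash-zzz"}),
]

for name, count, overlap in rank_candidates(new_incident, known_campaigns):
    print(name, count, overlap)
```

In this made-up data, only "campaign-A" shares anything with the new incident (one server address and one domain). An overlap like that is a lead for investigators to follow, not proof of who was responsible.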