I use dancer-shell (dsh) on a small cluster. It executes shell commands in parallel on a list of hostnames. If you have a lot of systems, dsh can be limited by network throughput, so it offers a tree topology option in which hosts call other hosts, and so on, as long as they all have dsh installed.
Dsh is nice, but it has some drawbacks. If your network or systems are busy, some commands will fail, leaving some systems in one state and others in another. You can try to write idempotent commands and just keep rerunning them until every host comes back OK, I guess. But the common solution is to have some sort of agent which tries to bring the system from its current state to the desired one; this approach lets the agent retry if a transient network error occurs. This is the approach cfengine/puppet/chef take to varying degrees.
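
To make the "idempotent command, rerun until everything is OK" idea concrete, here is a rough Python sketch. It is not how dsh or any of those tools work internally. The names `converge` and `ssh_apply` are made up for illustration, and it assumes passwordless ssh to each host and a command that is safe to run more than once.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def ssh_apply(host, command="true"):
    """Hypothetical runner: run an idempotent command on `host` over ssh.

    Assumes key-based ssh access. Returns True only if the command exits 0.
    """
    try:
        result = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, command],
            capture_output=True,
            timeout=60,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def converge(hosts, apply, max_rounds=5, delay=0.0, workers=16):
    """Run `apply(host)` in parallel, retrying only the hosts that failed.

    Returns the list of hosts still failing after `max_rounds` rounds.
    `apply` must be idempotent, since a host may be hit more than once.
    """
    def safe_apply(host):
        # Treat any exception from the runner as a failure to be retried.
        try:
            return bool(apply(host))
        except Exception:
            return False

    pending = list(dict.fromkeys(hosts))  # drop duplicates, keep order
    for round_no in range(max_rounds):
        if not pending:
            break
        if round_no and delay:
            time.sleep(delay)  # give a busy network a moment before retrying
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(safe_apply, pending))
        pending = [host for host, ok in zip(pending, results) if not ok]
    return pending
```

Something like `failed = converge(["node1", "node2"], ssh_apply, delay=2.0)` would then leave you with just the hosts that never came back OK. This only retries from one central machine, though; an agent on each host has the advantage that it keeps trying even when the central machine cannot reach it.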