First of all, there is a very important distinction between being able to detect a "Snowden-like" actor and being able to prevent one. As far as I have seen, Beehive makes no claims about preventing one; rather, it promises to alert you when suspicious activity is happening in your network. Sure, detection is not as good as prevention, but it's still considered a "holy grail" in some research communities.
With that said, I'm extremely doubtful that Beehive can meet those expectations. Machine learning does quite well at extracting complex patterns from large piles of reliably labeled data. For example, differentiating between pictures of cats and dogs is something we can all do 99+% of the time, yet if I had to write down the exact algorithm that takes in 100x100 pixels and decides cat vs. dog, I'd have no idea where to start. But I can supply 100,000 such labeled images and let ML methods figure out a rule that reliably differentiates between the two based on those pixel values. If I do things right, the rule the ML method learns should even work on new images of cats and dogs, assuming no huge changes in the new data (e.g., if I only used Labs and tabby cats in the training data and then ask it to identify a terrier... good luck). That's ML's strength.
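To make that workflow concrete, here's a minimal sketch of the "supply labeled examples, let ML find the rule" idea. Everything in it is a stand-in: the data is synthetic noise with a planted signal rather than real photos, and the model is plain logistic regression on raw pixels, roughly the simplest thing that could work; the point is the workflow, not the model.

```python
# Minimal sketch: hand the algorithm labeled examples, let it learn the rule.
# Synthetic stand-in data, NOT real cat/dog photos.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Pretend these are grayscale 100x100 images flattened to vectors,
# with labels 0 = cat, 1 = dog. (Only 2,000 here to keep it fast.)
n_samples, n_pixels = 2_000, 100 * 100
labels = rng.integers(0, 2, size=n_samples)
# Fake a weak class-dependent signal so the model has something to learn.
images = rng.normal(loc=labels[:, None] * 0.1, scale=1.0, size=(n_samples, n_pixels))

X_train, X_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.2, random_state=0
)

clf = LogisticRegression(max_iter=200)
clf.fit(X_train, y_train)  # the ML method "figures out the rule" from labels

# Held-out accuracy only means much if new data looks like the training data --
# the Labs-and-tabbies vs. terrier caveat from above.
print("held-out accuracy:", clf.score(X_test, y_test))
```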
Determining "suspicious behavior" is a much more difficult issue. We don't have 100,000's of samples of confirmed bad behavior, and we don't even really have 100,000's of samples of confirmed good behavior! Worse yet, what was a good ML method that worked yesterday doesn't work today; unlike cats and dogs in photos, adversaries try really hard to trick you. Most people I know working on ML for cyber security have accepted that the idea of purely automated detection is beyond our grasp at the moment, but perhaps we can build tools to automate very specific repetitive tasks that a security analyst needs to do over and over, thus making them more efficient.
With that said, the authors of Beehive seem to have skipped that lesson and claim that they've solved the problem. I'm highly suspicious of the reported performance, especially given that the methods they suggest are the first ones an ML researcher would think to try, and those methods have routinely been rejected as not useful. For example, they suggest using PCA to identify outliers in logs. This, and variations on it, has been tried hundreds of times, and the result is always that the security analyst turns off the "automated detection" because the false positives cost far more time than the tool saves.
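For reference, here's what that generic PCA-for-outliers recipe usually looks like. To be clear, this is not Beehive's actual pipeline (the paper doesn't give enough detail to reproduce it); it's the textbook version, run on made-up feature names and synthetic data, shown only to illustrate where the false-positive flood comes from.

```python
# Generic PCA-based outlier detection over log-derived features.
# Synthetic data and invented features, for illustration only.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Pretend each row summarizes one user-day from the logs: counts of logins,
# distinct hosts contacted, bytes uploaded, off-hours activity, etc.
X = rng.lognormal(mean=0.0, sigma=1.0, size=(5000, 20))

X_scaled = StandardScaler().fit_transform(X)

# Keep the top components that explain most of the variance; whatever is
# poorly reconstructed from them gets declared an "outlier".
pca = PCA(n_components=5).fit(X_scaled)
reconstructed = pca.inverse_transform(pca.transform(X_scaled))
residual = np.linalg.norm(X_scaled - reconstructed, axis=1)

# Flag the top 1% by reconstruction error. On real enterprise logs a cutoff
# like this tends to surface ordinary-but-unusual activity (backups, new
# employees, sysadmins), which is exactly the false-positive flood described
# above.
threshold = np.quantile(residual, 0.99)
flagged = np.where(residual > threshold)[0]
print(f"flagged {len(flagged)} of {len(X)} user-days as 'suspicious'")
```

Nothing in that recipe knows anything about intent; it only knows "unlike the bulk of the data," and in a real network most unusual things are benign.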
Of course, in all these methods, the devil is in the details, and the details of these types of methods never really get exposed in published work ("we used PCA to look for outliers in server logs" is an extremely vague statement). It's always possible that they have some super clever way of preprocessing the data before applying their methods that didn't make it into the paper. But I'd be willing to bet my right arm that no user of Beehive will be able to reliably differentiate between "Snowden-like" behavior and non-adversarial, real-world use of a network in real time.