10

In our Python app, we are using pickle.load to load a file named perceptron.pkl. An HP Fortify static scan flags a high-severity vulnerability, "Dynamic Code Evaluation - Unsafe Pickle Deserialization", at that line.

How can I remediate this? Is there any way we can safely load a pickle?

Anders
Pro

3 Answers

14

The Python manual comes with a warning about the pickle module:

Warning The pickle module is not secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.

This warning should be taken very seriously. If you unpickle untrusted data, an attacker will be able to execute arbitrary code on your system. That is bad. Very bad.

The key part of the quote here is "untrusted or unauthenticated source", though. If you are just loading a static file that you trust no malicious actor would have control over, unpickling is safe. For instance, if you store the file together with your source code and with the same access restrictions, an attacker could just as well modify the source code as the pickle file. So the unpickling does not have to be a security risk.

What if you can't trust the file? You have two options:

  1. Switch to a format not vulnerable to code execution, such as JSON.
  2. Create a restricted unpickler using the find_class method (see the manual).

My guess is that #2 would probably be quicker to implement, as it requires fewer changes to your code. But it is also a risky strategy, as I suspect it is very easy to make a minor mistake that opens you up to a vulnerability. If you want to minimize the risk, I would go with #1.
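For reference, option #2 might look roughly like the sketch below, which follows the pattern shown in the pickle documentation. The allow-list contents and the names `ALLOWED_GLOBALS` and `restricted_loads` are illustrative assumptions, not something taken from the question's code.

```python
import builtins
import io
import pickle

# Hypothetical allow-list: only these (module, name) pairs may be resolved.
# Everything else, including eval, exec and os.system, is rejected.
ALLOWED_GLOBALS = {
    ("builtins", "range"),
    ("builtins", "complex"),
    ("builtins", "set"),
    ("builtins", "frozenset"),
}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global (class or function).
        if (module, name) in ALLOWED_GLOBALS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads, but only resolves globals on the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

To load something like a perceptron object you would have to add its class (and whatever it references) to the allow-list, and every entry you add becomes part of the attack surface. That is exactly where the "minor mistake" risk comes from.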

Anders
  • 2
    You can also sign the pickle file and check that the signature is valid. But that requires the code to be adapted. – Spack Apr 18 '18 at 12:09
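The signing idea from the comment above could be sketched with the standard library's `hmac` module. This is only an illustration under assumptions: the helper names `dump_signed` and `load_signed` are made up, and it presumes a secret key that is stored somewhere an attacker who can modify the pickle file cannot read.

```python
import hashlib
import hmac
import pickle

TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def dump_signed(obj, key: bytes) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 tag computed over the payload."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def load_signed(blob: bytes, key: bytes):
    """Verify the tag first; unpickle only if it matches."""
    tag, payload = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)
```

Note that this only proves the file was produced by someone holding the key. It does not make the pickle format itself any safer, so the key has to be protected at least as well as the code.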
3

Is there any way we can safely load a pickle?

You've asked for any way, but the answer depends not only on the way, but also on the pickle in question and on what you mean by "safely."

Unless you mean something like "reasonably safely, given that I'll always know the provenance of the pickle," the answer is probably "no."

However, here are some questions to which the answer is "yes":

  • Can I safely load a pickle if I'm 100% sure that I wrote it and it hasn't been modified in transit?
  • Can I safely load a pickle if the source is trusted and I've checked that the file from which I'm loading the pickle is indeed from that source?
  • Can I safely achieve the same thing as loading a pickle by using completely safe de/serialization logic in the vast majority of cases?

So, the first question to ask yourself is: does the third of these apply to you? Can you serialize and deserialize in a different way?
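As a rough illustration of what "a different way" can mean: if the pickled object is essentially a list of numbers, JSON may be enough. The sketch below assumes the perceptron boils down to a list of weights plus a bias, which is a guess about the model and not something stated in the question; the function names are hypothetical.

```python
import json


def model_to_json(weights, bias) -> str:
    """Serialize the model parameters as plain JSON data."""
    return json.dumps({"weights": list(weights), "bias": bias})


def model_from_json(text: str):
    """Parse and validate the parameters; nothing here can execute code."""
    data = json.loads(text)
    # Coerce explicitly so unexpected types fail loudly instead of propagating.
    weights = [float(w) for w in data["weights"]]
    bias = float(data["bias"])
    return weights, bias
```

Parsing JSON only ever produces dicts, lists, strings, numbers, booleans and None, so a malicious file can at worst give you wrong numbers, not code execution.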

If not, do either of the first two apply?

If not, there is a project about which I just learned at PyCon called "Pikara" - it aims to "make unpickling objects as safe as it ever is going to be." It's apparently named for a pickled Polish dish; I suggested that "kimchi" might be equally apt. :-)

If an alternative method of serialization isn't ideal for your implementation, you might consider checking it out: https://github.com/latacora/pikara

Another answerer also posted an alternate unpickling method, but I can't vouch for it (although I took a good look today and it is at least interesting - I'll check back when the docs come of age).

If this answer hasn't given you 100% confidence in the way forward, then let me ask a follow-up: what are you actually trying to unpickle here?

jMyles
1

Only the default unpickler is unsafe. You can write a modified unpickler that's safe, or use one that someone else already wrote, such as picklemagic: https://github.com/CensoredUsername/picklemagic

user176454