
Every month, I receive an encrypted Portable Document Format (PDF) file with my payslip. I can open the file for reading with my password. Without providing my password, I cannot open the file at all.

Does that mean whoever created the PDF file has access to my password in a recoverable format? Or can they encrypt the PDF by using the hash or some other non-recoverable function or pair of my password, similar to how public/private key encryption works?

The format is PDF-1.4.

Note that, as I understand it, my question is different from this question. The other question is about a PDF that can be opened with restrictions, such as restrictions to copy-paste or printing. My question is about a PDF file that cannot be opened at all before decryption. Secondly, the answers there do address that the security is weak, but do not answer my core question whether this encryption implies the creator of the PDF has access to a recoverable password (plaintext or encrypted).

Edit: I mean to ask whether it is stored on their system. I do not mean to ask whether it is stored in the file.

gerrit
  • Maybe my [answer here](http://security.stackexchange.com/questions/95781/what-security-scheme-is-used-by-pdf-password-encryption-and-why-is-it-so-weak/95784#95784) could be helpful? – Begueradj Sep 29 '15 at 10:59
  • @Begueradj As I understand it, that question is about PDF locking (limiting functionality), not about PDF encryption (unable to open at all prior to unlocking). Does your answer there apply to both cases (limiting functionality, and proper encryption)? – gerrit Sep 29 '15 at 11:09
  • @Begueradj Although the [document linked from there](https://www.cs.cmu.edu/~dst/Adobe/Gallery/anon21jul01-pdf-encryption.txt) does appear to suggest that the answer to my question is *No, it doesn't*. – gerrit Sep 29 '15 at 11:25
  • @gerrit please note that that particular document refers to PDF-1.3 (Adobe Acrobat 4.x and below) – feral_fenrir Sep 29 '15 at 11:43
  • Why *wouldn't* the creator of the PDF have the encryption password, if they used it to encrypt the document in the first place? And why would anyone *other* than the creator know how or if they were storing it afterwards? – Xander Sep 29 '15 at 11:44
  • @Xander Perhaps the PDF is encrypted using a hash/key calculated from the password, and perhaps they store only that hash/key. – gerrit Sep 29 '15 at 11:47
  • @Xander In theory, one could architect a system which would, upon decryption, try both the password and (after failure) its hash. This would allow someone storing only the hash of the user's password to create a file which can be decrypted with the original password. But I highly doubt something like that is implemented with PDF. – Cthulhu Sep 29 '15 at 11:47
  • Based on the available docs, it looks like the PDF creator needs access to your plaintext password, since without it they cannot encrypt the file (it is symmetric RC4 encryption, so the encryption and decryption keys are the same). BUT that does not mean that they are storing your password in clear text. They can still encrypt the password and then store it in the database. – jhash Sep 29 '15 at 12:09
  • How could we know how they `store` your password? Are you asking if they have access to your clear-text password when they create the PDF? They must. – Neil Smithline Sep 29 '15 at 12:39
  • @NeilSmithline I'm not asking if they store my password. I'm asking if, from the information provided, it follows that they necessarily *must* store my password. If they must have access to my clear-text password when they create the PDF, then it follows they also must store my password in a recoverable state. – gerrit Sep 29 '15 at 13:24
  • Dumb question… is it necessary to store anything at all (we are talking the user password here, the one used to open the file)? The password has to be entered/defined when the document gets encrypted, and it has to be entered again for decryption. – Max Wyss Sep 29 '15 at 16:57
  • @MaxWyss I've configured my password on the site once, months ago, and since then they send me a document every month, that I can decrypt with this password. So they need to store something somewhere. – gerrit Sep 29 '15 at 17:41
  • OK, this sounds like either dynamic encryption, using your password in their records, or they are using a certificate (which you created when you specified your password). In any case, the password is not saved in the document, but in their system. – Max Wyss Sep 29 '15 at 18:42
  • Why the downvote? – gerrit Sep 30 '15 at 11:45

1 Answer


Whether this encryption implies the creator of the PDF has access to the plaintext password?

In my opinion, the answer to your question is: They are storing your password in a recoverable format.

I'm assuming this is the scenario (because you were not clear about it): You have an account on a website. You are generating a PDF document using the application. This document can be opened with the password of your account on the website (note: this is the password to open, not the password to edit/copy/print). Is this correct?

Encrypting a PDF is usually done with RC4 or AES, both of which are symmetric algorithms. This means that the same key is used to encrypt and to decrypt your PDF document.

Now this key has to be generated from your password, and that key is then used to encrypt the PDF. Only then would you be able to input your password and decrypt it.
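To illustrate what "symmetric" means here, below is a minimal sketch of RC4 in plain Python. The key derivation in it (a bare MD5 of the password) is a simplifying assumption for illustration only, not what a PDF writer actually does, and the names `rc4` and `toy_key_from_password` are mine.

```python
import hashlib


def rc4(key: bytes, data: bytes) -> bytes:
    """RC4 stream cipher; the same call both encrypts and decrypts."""
    # Key-scheduling: permute the state array under control of the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Keystream generation, XORed byte by byte with the input.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def toy_key_from_password(password: str) -> bytes:
    """Hypothetical, simplified derivation: NOT the real PDF algorithm."""
    return hashlib.md5(password.encode("utf-8")).digest()


key = toy_key_from_password("my payslip password")
ciphertext = rc4(key, b"Salary: confidential")
plaintext = rc4(key, ciphertext)  # same key, same function
```

Whoever encrypts needs exactly the same key as whoever decrypts, so the sender must be able to reproduce that key every time a new document is created.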

Hope this answers your core question as to whether your application's password is being stored in a recoverable format.

Edit: Please refer to the following additional documents:

  1. PDF Reference for PDF-1.4 - Page 71 onwards
  2. PDF Security
  3. The undocumented algorithm

From these three links, this is what I've gathered:

  1. PDF-1.4 uses RC4 with keys of up to 128 bits for encryption.
  2. Adobe's reference manual says that the algorithm used for generation of the key is undocumented.
  3. The encryption key is generated from the user password (password to open), the owner password (password to edit) and the permissions.
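As far as I can tell from the documents above, the key computation for the standard security handler looks roughly like the sketch below. Treat it as my reading rather than a reference implementation: the function name `derive_key` and its parameter names are my own, the O entry (derived from the owner password), the permission flags and the file identifier are passed in as given values, and I assume the 128-bit variant that repeats the MD5 step 50 times.

```python
import hashlib
import struct

# Fixed 32-byte padding string used to pad or truncate the user password.
PAD = bytes.fromhex(
    "28bf4e5e4e758a4164004e56fffa0108"
    "2e2e00b6d0683e802f0ca9fe6453697a"
)


def derive_key(user_password: bytes, o_entry: bytes, permissions: int,
               file_id: bytes, key_len: int = 16) -> bytes:
    """Sketch of the RC4 key derivation (assumed 128-bit variant)."""
    # Pad (or truncate) the password to exactly 32 bytes.
    padded = (user_password + PAD)[:32]
    md5 = hashlib.md5()
    md5.update(padded)
    md5.update(o_entry)  # value derived from the owner password
    # Permission flags as a 4-byte little-endian integer.
    md5.update(struct.pack("<I", permissions & 0xFFFFFFFF))
    md5.update(file_id)  # first element of the document's file ID
    digest = md5.digest()
    # Assumed strengthening step: re-hash 50 times.
    for _ in range(50):
        digest = hashlib.md5(digest[:key_len]).digest()
    return digest[:key_len]


# Hypothetical inputs, for illustration only.
o_entry = bytes(32)
key_september = derive_key(b"secret", o_entry, -44, b"file-id-2015-09")
key_october = derive_key(b"secret", o_entry, -44, b"file-id-2015-10")
```

If this reading is right, the key depends on the password and on per-document values, so `key_september` and `key_october` differ whenever the file identifier differs. A single precomputed key could then not simply be reused for every monthly document.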

Additionally from the information you have given, I do not think that the key is initially created and stored. Most applications use a plugin module to create PDFs which would generate the keys and encrypt on the go.

Thus, I think your password is stored in a recoverable format.

feral_fenrir
  • Your assumed scenario is close — it is my employer sending me sensitive but non-critical information (e-payslip). The password is not used for any other purpose. Is the key calculated from the password exclusively, or from a combination of password and other information (that changes between documents)? If the former, couldn't they encrypt it using only the key, eliminating the need to store the plaintext password? – gerrit Sep 29 '15 at 11:48
  • To summarize, virtually any question in the form of "Does X mean that Y is storing my password in plain text?" can only be correctly answered "No." So, regardless of the specific wording, any answer anchored on the word "Yes" is misleading at best. – Xander Sep 29 '15 at 12:21
  • @Xander Thank you for the feedback. I agree with your points. My understanding of the OP's posts, from his edits and feedback, was that his query was about whether the creator of the PDF is using his password or the hash of the password to encrypt the PDF. My answer was directed at clarifying that. – feral_fenrir Sep 29 '15 at 12:30
  • @gerrit Please add information from your comment to the question. A series of PDFs encrypted over a period of time, all using the same pre-determined password is a very different scenario from a single PDF, where the password could simply be thrown away afterwards. – Xander Sep 29 '15 at 12:32