I edited a PDF file in Acrobat, which left some metadata in it, and I am wondering:

  • What metadata could external parties discern from the final file if I were to send it to them?

  • What steps are required to strip metadata that is not readily visible from a PDF file?

Sir Muffington
Ohan
  • 1
    This is a very wide topic. The usual metadata includes document properties (your name, editing time, etc.) and possibly the ID field at the end of the file. Other data that is not immediately visible can be stored in the XMP xpacket object. There are also several techniques for embedding information so that it is not immediately visible and resists editing, for example to detect tampering or track document sources. – LSerni Jan 30 '22 at 17:51
  • 1
    This is a very broad question. It starts from asking what metadata might be there, then asking for the *most comprehensive guide or utility* for stripping metadata. The latter is essentially asking for recommendations for external tools/resources - which can change over time, is subjective and therefore off-topic here. – Steffen Ullrich Jan 30 '22 at 18:02

1 Answer

Most of the metadata you should be worried about resides in 2 places:

  1. Document Properties:

Document Properties

You can delete those entries directly in that window. There is also an "Additional Metadata" button, which mostly exposes information such as which software was used, possibly the editing time, the authors, etc. Go ahead and remove those too.

  2. File Properties: right-click the file, click "Properties", and then open "Details" (assuming you are using Windows). There is an option there to have the metadata removed.

Since you mention Adobe Acrobat in particular, there are two options for having the software do the clean-up for you. I am attaching a screenshot from an older version, but it shouldn't be much different in newer versions.

Document Protection

The "Sanitize Document" option simply removes everything (even hidden text). The "Remove Hidden Information" option analyzes the document and lets you choose what to remove. There is a catch, though: in some cases, if a page contains both images and text, the whole page is flattened and converted into an image, with some loss of quality.
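
If you want a rough, independent check of what is still in the file after cleaning, a small script can scan the raw bytes for the usual metadata markers (the document information entries, the trailer ID, and an XMP packet). The following is only a minimal sketch: the function names and the marker list are my own assumptions, not part of Acrobat or any library, and a plain byte scan will miss anything stored in compressed object streams or in an encrypted file, so an empty report does not prove the file is clean.

```python
import re

# Byte patterns that commonly indicate metadata in a PDF file.
# Illustrative only: this list is an assumption and is not exhaustive.
MARKERS = {
    "info_dictionary": rb"/Info\s+\d+\s+\d+\s+R",   # trailer reference to the Info dictionary
    "author": rb"/Author\s*[(<]",
    "creator": rb"/Creator\s*[(<]",
    "producer": rb"/Producer\s*[(<]",
    "creation_date": rb"/CreationDate\s*[(<]",
    "mod_date": rb"/ModDate\s*[(<]",
    "file_id": rb"/ID\s*\[",                        # ID array near the end of the file
    "xmp_packet": rb"<\?xpacket\s+begin=",          # start of an XMP metadata packet
}


def scan_pdf_metadata(data: bytes) -> dict:
    """Count how often each marker appears in the raw bytes of a PDF."""
    return {name: len(re.findall(pattern, data)) for name, pattern in MARKERS.items()}


def report(path: str) -> None:
    """Print every marker that was found at least once in the file at `path`."""
    with open(path, "rb") as fh:
        counts = scan_pdf_metadata(fh.read())
    for name, count in counts.items():
        if count:
            print(f"{name}: {count} occurrence(s)")
```

Calling `report("cleaned.pdf")` (a hypothetical file name) on the original and on the sanitized copy gives a quick before/after comparison. Treat it as a sanity check alongside Acrobat's own tools, not as a replacement for them.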

ARGYROU MINAS