
I have been analyzing a PDF that I suspect contains malicious content. Until now, I've trusted automated tools to determine whether a PDF was safe to open. However, my eyes have been opened to all the encryption and obfuscation techniques in the wild today, so I've started reviewing my PDFs manually using tools like these and PDFStreamDumper. I have also looked at the PDF specifications located here.

Everywhere I look, no one seems to explain the purpose of the /<Abbreviation> directives. Take, for example, the excerpts below, starting with the header.

I can't find what /JT references, or why the /GoTo does not specify a location.

The second object specifies /Cn and /V, but I can't find these either.

The third object uses /Dt and /JTM, neither of which appears in the PDF specification. Can someone give me some direction? I'm willing to do the research, but I'm not sure what I'm looking at besides abbreviated commands contained in an object. Is there a cheat sheet that lists these directives and their purpose?

Header

<<

    /JT 2 0 R
    /OpenAction 
    <<

        /D [ 9 0 R /Fit ]
        /S /GoTo

    >>

    /Outlines 8683 0 R
    /PageLabels 8875 0 R
    /PageLayout /SinglePage
    /PageMode /UseOutlines
    /Pages 5437 0 R
    /Type /Catalog
>>

Second Object

<<

    /A [ 3 0 R ]
    /Cn [ 4 0 R ]
    /V 1.1
>>

Third Object

<<

    /Dt (D:20101223094432)
    /JTM (Distiller)
>>

Note: I ran the file through VirusTotal, which raised just a few red flags. The PDF conforms to the 1.7 specification.

techraf
Ccorock
  • One thing you could try is to scan it with an antivirus scanner, but this is not fool-proof, and I realize it doesn't answer your question, so I left it as a comment. – Jonathan Nov 10 '14 at 20:42
  • @Jonathan, Yeah, I bumped into a high schooler's blog a few days back, and he was able to make a clean sweep through all 54 AVs using some ridiculous obfuscation. I'm really more interested in digging into the file than in using it. – Ccorock Nov 10 '14 at 22:39
  • /Dt seems to be a date-time stamp, i.e. 23 December 2010 @ 09:44:32 –  Nov 11 '14 at 16:54
  • Thanks for the clue. The examples I posted are just a dip of the 800+ objects in the PDF. I really need some kind of reference rather than guessing at the examples in the post. – Ccorock Nov 11 '14 at 18:31
  • JT is usually associated with CAD/3D (http://en.wikipedia.org/wiki/JT_%28visualization_format%29#File_structure and here are the file format specs http://www.plm.automation.siemens.com/de_de/Images)/JT_v95_File_Format_Reference_Rev-A_tcm73-111987.pdf – munkeyoto Nov 16 '14 at 23:23
  • There are two people I consider very versed in PDF shenanigans, I would recommend that you take some time to read their material: - [Didier Stevens](http://blog.didierstevens.com/) - [Ange Albertini](https://speakerdeck.com/ange) – wireghoul Nov 16 '14 at 23:04
  • According to PDF Specification 1.7 by Adobe, the /GoTo you showed is malformed. The location is specified by that /D right above it, but it should be AFTER the /GoTo (page 654). But before we go that LOOONG way into the specification, which version have you been using? Which version was used in the construction of the document you are analyzing? Without that information no one will be able to help with that. – DarkLighting Dec 02 '14 at 18:42

1 Answer


This link is for PDF-aware developer tools: http://www.adobe.com/devnet/pdf.html Specifically, the 1.7 reference that DarkLighting mentions is: http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/pdf_reference_1-7.pdf

Section 3.2.4 of the document seems to address your question:

    3.2.4 Name Objects
    A name object is an atomic symbol uniquely defined by a sequence of characters.
    Uniquely defined means that any two name objects made up of the same sequence of
    characters are identically the same object. Atomic means that a name has no internal
    structure; although it is defined by a sequence of characters, those characters are
    not considered elements of the name.

    A slash character (/) introduces a name. The slash is not part of the name but is
    a prefix indicating that the following sequence of characters constitutes a name.
    There can be no white-space characters between the slash and the first character in
    the name. The name may include any regular characters, but not delimiter or white-space
    characters (see Section 3.1, “Lexical Conventions”). Uppercase and lowercase letters are
    considered distinct: /A and /a are different names. The following examples are valid
    literal names:
        /Name1
        /ASomewhatLongerName
        /A;Name_With-Various***Characters?
        /1.2
        /$$
        /@pattern
        /.notdef

So it would seem that /JT, /Cn, /V, and so on are name objects used as keys within a PDF dictionary object (delimited by double angle brackets, << ... >>). In your examples, all of these "unidentified" elements are contained within dictionary objects. See Section 3.2.6 for a more detailed description of dictionaries.
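To make the key/value structure concrete, here is a minimal Python sketch that reads a dictionary like the ones in the question into nested Python objects. The names TOKEN and parse are my own, not from any PDF library, and the tokenizer is an assumption-laden simplification: it ignores hex strings, streams, comments, and #-escapes in names, and it will raise on unbalanced input.

```python
import re

# Simplified PDF tokens: dictionary/array delimiters, literal strings,
# names (slash-prefixed), and any other run of regular characters.
TOKEN = re.compile(
    r"<<|>>|\[|\]|\((?:[^()\\]|\\.)*\)|/[^\s/\[\]<>()]*|[^\s/\[\]<>()]+"
)

def parse(tokens, i=0):
    """Parse one object starting at tokens[i]; return (value, next_index)."""
    tok = tokens[i]
    if tok == "<<":                      # dictionary: alternating key, value
        result, i = {}, i + 1
        while tokens[i] != ">>":
            key = tokens[i]              # keys are always name objects
            result[key], i = parse(tokens, i + 1)
        return result, i + 1
    if tok == "[":                       # array: any number of values
        items, i = [], i + 1
        while tokens[i] != "]":
            item, i = parse(tokens, i)
            items.append(item)
        return items, i + 1
    # "N G R" is an indirect reference to object N, generation G
    if (tok.isdigit() and i + 2 < len(tokens)
            and tokens[i + 1].isdigit() and tokens[i + 2] == "R"):
        return f"{tok} {tokens[i + 1]} R", i + 3
    return tok, i + 1                    # name, number, or literal string

catalog_text = """
<< /JT 2 0 R
   /OpenAction << /D [ 9 0 R /Fit ] /S /GoTo >>
   /Type /Catalog >>
"""
catalog, _ = parse(TOKEN.findall(catalog_text))
print(list(catalog))            # the top-level keys
print(catalog["/OpenAction"])   # the nested action dictionary
```

Each name directly inside << ... >> in a key position is just a label for the value that follows it; what the label means depends on which kind of dictionary it sits in.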

It's also conceivable that these are part of the PDF extensibility options described in 2.2.8:

Additionally, PDF provides means for applications to store their own private 
information in a PDF file. This information can be recovered when the file is 
imported by the same application, but it is ignored by other applications. 
Therefore, PDF can serve as an application’s native file format while its 
documents can be viewed and printed by other applications. Application-specific 
data can be stored either as marked content annotating the graphics objects in 
a PDF content stream or as entirely separate objects unconnected with the PDF content.

Basically, it's hard to tell what the various non-standard entries mean without going through each one and decoding it, either manually or with a self-developed automation tool.
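As a rough idea of what such a self-developed triage tool could look like, the Python sketch below collects every name in a few dictionaries and reports the ones missing from a reference list. KNOWN_NAMES and unknown_names are hypothetical names of mine, and the set is only the handful of names from the question's header that I recognize from the 1.7 reference, not a complete list. It is deliberately crude: a name's meaning depends on the dictionary it appears in, so a flat set can only point you at things worth looking up.

```python
import re

# Assumption: a tiny, hand-picked subset of names that the PDF 1.7
# reference defines. A real tool would need a much larger list, ideally
# keyed by dictionary type.
KNOWN_NAMES = {
    "Type", "Catalog", "Pages", "Outlines", "PageLabels",
    "PageLayout", "SinglePage", "PageMode", "UseOutlines",
    "OpenAction", "S", "GoTo", "D", "Fit",
}

LITERAL_STRING = re.compile(r"\((?:[^()\\]|\\.)*\)")
NAME = re.compile(r"/([^\s/\[\]<>()]+)")

def unknown_names(pdf_text, known=KNOWN_NAMES):
    """Return the sorted names in pdf_text that are not in `known`."""
    # Drop literal strings first so a slash inside (...) is not
    # mistaken for the start of a name.
    without_strings = LITERAL_STRING.sub("", pdf_text)
    return sorted(set(NAME.findall(without_strings)) - known)

sample = """
<< /JT 2 0 R /OpenAction << /D [ 9 0 R /Fit ] /S /GoTo >>
   /PageMode /UseOutlines /Pages 5437 0 R /Type /Catalog >>
<< /A [ 3 0 R ] /Cn [ 4 0 R ] /V 1.1 >>
<< /Dt (D:20101223094432) /JTM (Distiller) >>
"""
print(unknown_names(sample))
```

Run over all 800+ objects, a list like this at least narrows down which names need to be researched by hand.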

I disagree with DarkLighting regarding the /GoTo comment. PDF renderers should read the entire contents of the dictionary before taking any action. The PDF specification does not state that the order of entries is important, only that both "/S /GoTo" and "/D [some kind of destination]" are present. In your example, the destination is the page object referenced by 9 0 R (object 9, generation 0), and /Fit asks the viewer to fit the whole page in the window.
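The order-independence point can be illustrated with plain Python dictionaries standing in for the parsed action. This is only a sketch: describe_goto is a hypothetical helper of mine, and the dictionaries are written by hand rather than produced by a real PDF parser.

```python
def describe_goto(action):
    """Describe a go-to action once the whole dictionary is available."""
    if action.get("S") != "GoTo":
        return None                      # some other action type
    destination = action.get("D")
    if not destination:
        return None                      # /D is required for /GoTo
    page_ref, fit_mode = destination[0], destination[1]
    return f"show page object {page_ref} using /{fit_mode}"

# The entry order from the question, and the order DarkLighting expected.
as_written = {"D": ["9 0 R", "Fit"], "S": "GoTo"}
reordered = {"S": "GoTo", "D": ["9 0 R", "Fit"]}

print(describe_goto(as_written))
print(describe_goto(reordered))
print(as_written == reordered)   # dictionaries compare equal regardless of order
```

Because the lookup happens by key after the whole dictionary has been read, it makes no difference whether /D comes before or after /S.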

Nick