How to detect malicious JavaScript in a PDF file?

Question

Is having any JavaScript in a PDF file dangerous by definition, or is it only dangerous when specific functions (for example eval) are used? If the latter, which JavaScript functions are dangerous in a PDF?

In other words: when should JavaScript in a PDF file be considered dangerous?

Bob Ortiz · Accepted Answer · 2022-07-07T12:28:16.120

I did some additional searching and found an interesting research paper (easily readable and just 12 pages), titled Detecting Malicious JavaScript in PDF through Document Instrumentation.

In the paper, the authors introduce a context-aware approach to detect and confine malicious JavaScript in PDF through static document instrumentation and runtime behavior monitoring.

The following quotes give insight into how their detection system approaches malicious PDF detection.

Detection architecture

Our system consists of two major components, front-end and back-end, working in two phases. In Phase-I, the front-end component statically parses the document, analyzes the structure, and finally instruments the PDF objects containing JavaScript. Then, in Phase-II when an instrumented document is opened, the back-end component detects suspicious behaviors of a PDF reader process in context of JavaScript execution and confines malicious attempts.

Phase-I Static Analysis and Instrumentation

For suspicious PDF, the front-end first parses the document structure and then decompresses the objects and streams. A set of static features are extracted in this process. When a document has been decompressed, the front-end will instrument it and add context monitoring code for JavaScript. In some cases, if the document is encrypted using an owner’s password, i.e., a mode of PDF in which the document is readable but non-modifiable, we need to remove the owner’s password. With the help of PDF password recovery tools like [28], this can be done easily and very fast.
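
To make the "parse, decompress, extract static features" step more concrete, here is a minimal Python sketch. It is not the authors' implementation: it assumes an unencrypted file whose streams are plain Flate (zlib) data, ignores other filters and hex-escaped names, and the function names are made up for this example. It only counts /JS and /JavaScript name tokens, which is one plausible static feature.

```python
import re
import zlib
from typing import List

# Body between the "stream" and "endstream" keywords of a PDF object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)
# /JS and /JavaScript name tokens, but not longer names such as /JSON.
JS_NAME_RE = re.compile(rb"/(?:JS|JavaScript)(?![A-Za-z0-9])")


def inflate_streams(pdf_bytes: bytes) -> List[bytes]:
    """Return the decompressed body of every stream that zlib can inflate."""
    inflated = []
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates trailing bytes after the zlib data.
            inflated.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            # Not Flate-compressed (or damaged); skipped in this sketch.
            continue
    return inflated


def count_js_markers(pdf_bytes: bytes) -> int:
    """Count /JS and /JavaScript names in the raw file and inflated streams."""
    chunks = [pdf_bytes] + inflate_streams(pdf_bytes)
    return sum(len(JS_NAME_RE.findall(chunk)) for chunk in chunks)
```

A non-zero count only says that the file declares JavaScript somewhere; it says nothing about whether that script is malicious.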

Phase-II Runtime Detection

The back-end component works in two steps, runtime monitoring and runtime detection. When an instrumented PDF is loaded, the context monitoring code inside will cooperate with our runtime monitor, which tries to collect evidence of potential infection attempts. When Javascript executes to the end or a critical operation occurs, the runtime detector will compute a malscore. If the malscore exceeds a predefined threshold, the document will be classified as malicious.

Credits to: Daiping Liu and Haining Wang from the College of William and Mary and Angelos Stavrou from George Mason University.
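
The quotes do not say how the malscore is computed, so the following Python sketch only illustrates the general idea of scoring collected evidence against a threshold. The feature names, weights, and threshold are all hypothetical and invented for this example; they are not taken from the paper.

```python
from typing import Dict, Iterable

# Hypothetical evidence weights, invented for this illustration only.
FEATURE_WEIGHTS: Dict[str, float] = {
    "js_present_in_static_scan": 1.0,
    "memory_spike_during_js": 2.0,
    "network_access_during_js": 3.0,
    "process_creation_during_js": 4.0,
}
MALSCORE_THRESHOLD = 5.0  # hypothetical cut-off


def malscore(observed: Iterable[str]) -> float:
    """Sum the weights of the distinct features that were observed."""
    return sum(FEATURE_WEIGHTS.get(name, 0.0) for name in set(observed))


def is_malicious(observed: Iterable[str]) -> bool:
    """Classify as malicious when the score exceeds the threshold."""
    return malscore(observed) > MALSCORE_THRESHOLD
```

With these made-up numbers, network access plus process creation during JavaScript execution scores 7.0 and is flagged, while JavaScript presence alone scores 1.0 and is not.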

score 0 · Answer 2 · edited Jul 25 '16 at 10:58

Short answer: Acrobat JavaScript is essentially harmless.

Longer answer: Acrobat JavaScript (the JavaScript in PDF) is essentially harmless because it runs in its own secured environment, and has very little access to the outside world.

It is possible, however, to craft calls to a server (for example using the getURL() or launchURL() method) which may be able to run malicious code, but that code has to be placed on the server in one way or another. In addition, certain protocols are blocked in those methods, for example javascript:.

As has been mentioned, there is no correlation between potential danger and PDF file size. It is very easy to create PDFs larger than 10 MB, or even 100 MB, without any issues (in the prepress world, 100 MB is considered "normal", if not "small").

edited Jul 25 '16 at 10:58

Bob Ortiz

answered Jul 22 '16 at 18:06

Max Wyss

5

Longerer answer (definitely a word) that may be worth mentioning: Acrobat JavaScript is as harmless as the environment is secure. Unfortunately secure run time environments have, historically, had... problems. – AstroDan Jul 22 '16 at 18:11
Acrobat will also prompt the user before making calls like `launchURL()`. Adobe seems to have a pretty good guide on JavaScript security: https://www.adobe.com/devnet-docs/acrobatetk/tools/AppSec/javascript.html#javascript-invoked-urls – Brandon Haugen Jul 26 '16 at 02:20
…or the call is set in a higher-privilege environment (aka application- or folder-level JavaScript), which does require an active action (in the form of installation) by the user. – Max Wyss Jul 26 '16 at 14:48
You could argue that everything is essentially harmless but I don't think that adds to the discussion. Here is some detailed analysis of PDF attacks from BlackHat: https://www.blackhat.com/docs/eu-14/materials/eu-14-Esparza-PDF-Attack-A-Journey-From-The-Exploit-Kit-To-The-Shellcode.pdf – HackSlash Feb 21 '19 at 17:32
I just want to add that now 5 years later it's a very real threat: https://blog.talosintelligence.com/2020/11/vulnerability-spotlight-multiple.html – Travis Feb 23 '22 at 20:51

score 0 · Answer 3 · answered Feb 09 '20 at 18:18

There can be legitimate JavaScript in a PDF. An invaluable tool is pdfinfo, part of poppler-utils. First of all, it tells you whether a PDF file contains any JavaScript at all. If it does, pdfinfo -js extracts the full JavaScript text.

Minimal programming skill is enough to tell malicious code from innocuous scripts like this:
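
For batch checks, the two pdfinfo calls can be wrapped in a few lines of Python. This is only a sketch: it assumes pdfinfo is on the PATH and that its output contains a "JavaScript:" line with yes or no, as the poppler versions I know of print; the function names are my own.

```python
import subprocess


def has_javascript(pdfinfo_output: str) -> bool:
    """Read the 'JavaScript:' field from pdfinfo's text output."""
    for line in pdfinfo_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "JavaScript":
            return value.strip().lower() == "yes"
    return False  # field missing: treated as "no JavaScript reported"


def run_pdfinfo(path: str, extract_js: bool = False) -> str:
    """Run pdfinfo (adding -js to dump the scripts) and return its stdout."""
    cmd = ["pdfinfo"] + (["-js"] if extract_js else []) + [path]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Typical use would be has_javascript(run_pdfinfo("suspect.pdf")) followed, if it returns True, by run_pdfinfo("suspect.pdf", extract_js=True) to read the scripts; "suspect.pdf" is a placeholder name.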

    if (app.viewerVersion < 9.0)
    {
       if (app.alert(ADBE.Reader_string_Need_New_Version_Msg, 1, 1) == 1)
          this.getURL(ADBE.Reader_Value_New_Version_URL + ADBE.SYSINFO, false);
       ADBE.Reader_Value_Asked = true;
    }

Otherwise, resorting to a JavaScript analysis tool or site can help.

mootmoot · Answer 4 · 2016-07-25T07:28:41.910

-3

Short Answer : NO.

A complex PDF file may embed JavaScript to validate a form that lets the user enter data, click, and submit it when the file is opened in a suitable PDF reader. Normally a massive PDF file (>5 MB) with JavaScript is quite safe; anything over 10 MB is a cost burden for an attacker hosting the file. Nevertheless, there are always exceptions, e.g. a PDF padded with empty junk to make it huge, then zipped to a smaller size so that the victim unzips it voluntarily.

Focusing on eval is of little use in detecting PDF malware. With JavaScript, it is easy to construct eval without the word ever appearing in the source.

For example, one can use Hieroglyphy to hide it.
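
A small Python sketch shows why a literal search for eval falls short. The JavaScript samples are generic illustrations held as plain strings and never executed; whether each one behaves the same inside Acrobat is an assumption, and they are far simpler than what Hieroglyphy produces.

```python
def mentions_eval(js_source: str) -> bool:
    """Naive check: does the literal text 'eval' appear in the script?"""
    return "eval" in js_source


# Illustrative JavaScript snippets, kept as strings for scanning only.
PLAIN = 'eval("app.alert(1)");'
# Builds the name at run time, so the word never appears in the source.
SPLIT_NAME = 'var f = this["ev" + "al"]; f("app.alert(1)");'
# Reaches the Function constructor, which also turns a string into code.
VIA_FUNCTION = '[]["constructor"]["constructor"]("app.alert(1)")();'

SAMPLES = {"plain": PLAIN, "split": SPLIT_NAME, "function": VIA_FUNCTION}
RESULTS = {label: mentions_eval(source) for label, source in SAMPLES.items()}
```

Only the "plain" sample is flagged; the other two reach the same capability without the string eval, which is why keyword matching alone is a weak signal.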

edited Jul 25 '16 at 07:28

answered Jul 22 '16 at 12:54

mootmoot

2

what does the size of the PDF document have to do with anything? – dandavis Jul 22 '16 at 16:23
@dandavis: because malicious files are always hosted on free hosting services, and those service providers impose a bandwidth cap. – mootmoot Jul 22 '16 at 16:31
2

"_malicious files are always hosted on free hosting services_" ?!!?!??! no, just no... – dandavis Jul 22 '16 at 16:33
@dandavis: There are always exceptions; milestones vary. Money talks. – mootmoot Jul 22 '16 at 16:55
I have to agree with dandavis about the size/host part. If money counts, then a professional cyber criminal makes sure his "free host" isn't taken down and instead arranges a stable infrastructure for stealing money! It would also be more convenient to have a smaller file. The smaller the better. I think size and host don't matter in this case. – Bob Ortiz Jul 22 '16 at 16:59
Agreed on the short answer; the long answer contains a lot of babble, however. – Max Wyss Jul 22 '16 at 17:52
