49

Sometimes I'm interested in what's behind a malicious website. How do I stay on the safe side if I decide to inspect it? I'm looking for methods that are quicker and simpler than running the website in a virtual machine.

Should I use cURL and view the HTML source in a file viewer? Should I simply view the source in the browser using view-source:http://malicious-website/? Are those safe?
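For context, a minimal cURL-only sketch of the "download, never render" approach (the URL below is just the placeholder from the question):

```shell
# Hedged sketch: fetch the raw HTML only; nothing is rendered or executed.
safe_fetch() {
  # $1 = URL to fetch, $2 = output file
  # --max-time caps how long a slow/tarpit server can hold us;
  # --max-redirs 0 refuses to follow redirects automatically.
  curl --silent --show-error \
       --max-time 10 \
       --max-redirs 0 \
       --output "$2" \
       "$1"
}

# Usage (placeholder URL from the question):
#   safe_fetch 'http://malicious-website/' page.html
#   less page.html   # a pager or plain text editor will not execute scripts
```

This only addresses the "is fetching itself safe" half of the question; as the answers below point out, what the server chooses to send you is a separate problem.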

Mirsad
  • See here: http://security.stackexchange.com/questions/117740/how-can-i-safely-check-whether-an-email-link-leads-to-a-malicious-website – hamena314 Apr 27 '16 at 07:29
  • https://www.browserstack.com/ has a free trial. You can inspect the code with a normal browser inspector. – myol Apr 27 '16 at 19:33

6 Answers

35

Why not just send the URL to VirusTotal? Accessing a malicious website can be tricky, and so can fetching it with curl, wget, or links -dump, depending on how the malicious content is served up. For example:

<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{HTTP_USER_AGENT} ^.*(winhttp|libwww\-perl|curl|wget).* [NC]
 RewriteRule ^(.*)$ - [F,L]
</IfModule>

Using mod_rewrite, I can feed you non-malicious pages, send you elsewhere, or do whatever else I'd like. Further, I can change payloads: instead of serving you something malicious, I can swap in a harmless "Hello World" JavaScript. This may trick you into thinking my malicious website is harmless.

Normally, when I have to visit a malicious website, I use a virtualized sandbox running Burp Suite for interception, a Squid proxy server, and a few other tools (NoScript, Ghostery, etc.). What is the ultimate purpose of visiting, outside of curiosity?

munkeyoto
  • I want to do it manually; VirusTotal doesn't show some of the details I'm interested in. I'm also not sure how VT handles obfuscated JavaScript and other hiding methods. False positives are another problem in this story. – Mirsad Apr 26 '16 at 17:35
  • 4
    I +1'd for the mention of rerouting using mod_rewrite and using a virtualized sandbox. But I agree with the OP's mention of how VirusTotal is not right for their purposes. Although it returns a lot of useful information it is not always guaranteed to be detected nor does it return a detailed analysis. As a test I once obfuscated my own code that exploited a then known browser exploit and VT never picked it up. Such would be a job for a virtualized sandbox like you mentioned. – Bacon Brad Apr 26 '16 at 20:14
  • 5
    After all, you made a good point about user agents, but there is a solution for that, so by tweaking .curlrc the door can become open for analyizing. `user-agent = "Mozilla/4.0 (Mozilla/4.0; MSIE 7.0; Windows NT 5.1; SV1; .NET CLR 3.0.04506.30)"` and sometimes also tweaking referer can be useful, `referer = "http://www.google.com/search?hl=en&q=web&aq=f&oq=&aqi=g1"` – Mirsad Apr 26 '16 at 21:49
  • 7
    As @mirsad said, your URL-rewriting tricks are interesting but no match for my UA spoofery. – cat Apr 26 '16 at 23:18
  • Curl headers like the user agent can also be set via command line. – Alexander O'Mara Apr 27 '16 at 14:37
  • @AlexanderO'Mara you're missing the gist of my comment. I am aware that user agents can be changed in curl, wget, etc. The point is to illustrate that malware authors can enable sending SPECIFIC types of pages based upon who/what is viewing the site – munkeyoto Apr 27 '16 at 14:40
  • I understand, but this feature of curl is also quite useful for scanning for such malicious behavior. You could quickly check whether malicious code is served specifically to MS IE and no other browser. – Alexander O'Mara Apr 27 '16 at 14:42
  • Great post. I'd love to see some sort of complete setup for a VM sandbox. Though there are probably pre-built images these days... – Rick Jan 12 '22 at 19:47
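The user-agent cloaking discussed in this answer and its comments can be probed from the command line. A hedged sketch, assuming curl and diff are available; the URL and user-agent strings below are placeholders:

```shell
# Sketch: fetch the same URL under two User-Agent strings and compare,
# to spot servers that vary content by client.
fetch_as() {
  # $1 = user-agent string, $2 = URL, $3 = output file
  curl --silent --show-error --max-time 10 \
       --user-agent "$1" --output "$3" "$2"
}

compare_agents() {
  # $1 = URL (placeholder in the usage below)
  fetch_as 'curl/8.0' "$1" /tmp/as-curl.html
  fetch_as 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:115.0) Gecko/20100101 Firefox/115.0' \
           "$1" /tmp/as-browser.html
  if diff -q /tmp/as-curl.html /tmp/as-browser.html >/dev/null; then
    echo 'identical: no user-agent cloaking observed'
  else
    echo 'different: server varies content by user agent'
  fi
}

# Usage: compare_agents 'http://malicious-website/'
```

Note that identical responses prove nothing by themselves: as the answer explains, the server could just as easily key on source IP, referer, or cookies instead of the user agent.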
14

Visiting a malicious site is often hit or miss because you're talking to THEIR software that THEY control. You have no real control over it, no matter what you do. It could appear non-malicious for a long time and then hit you. It could try to hit you as soon as you visit. It could...

Because there are literally infinite possibilities for how a site could be malicious, you can't really ever be sure. All you can do is use some sort of burner equipment, explore, and still never trust the site. Ever. For any site. The danger, no matter what protocol you use, is that in the end you will be visiting their server in some way, so you open yourself up to payloads at every level of the OSI model. Even if you just want to see the OPTIONS headers, that still means an open connection. It's really a catch-22.


Remember, the web is built on trust: I trust you to keep me safe. Just in case, though, I'm still going to run antivirus software and let other people visit first. If they stay safe long enough, I guess I'll visit you.

And then there's the chance those could get hacked. Then your trust is broken.


Worse yet is trying to inspect the code. Sure, you get a copy, but a copy of what? It's in a site's best interest to appear non-malicious for as long as possible. Often the source is completely innocuous until it downloads the payload from some non-flagged off-site location that would pass most tests. So then you're stuck hunting down every single link and source file and reading through or analyzing those too, which is costly over time.

TL;DR:

You can never trust a site completely. Not even Google. The actual malicious part of the site can be put into anything, anywhere on the site. Sure you can safely inspect the source, but then all you might get is a false sense of security.

If you absolutely MUST do this, use a burner machine or a VM that you can destroy the instant it becomes infected. The payload could be anywhere (an HTML file, or an off-site JS/CSS/vector/app/image/CSV/JSON file...). If you can't trust the site based on reputation, you can't trust the site at all.

Robert Mennell
  • I'm not speaking about visiting, but more about inspecting a potentially malicious website. For instance, I want to check (inspect) what's behind a URL that I found in my server logs or in a spam folder. – Mirsad Apr 26 '16 at 18:04
  • 2
    ... Then you visit it. There is LITERALLY no other way to get that code than to talk to the server, get the code and inspect it. Otherwise you have ot rely on what other people tell you. There are services for that. NEVER visit a website from a server! – Robert Mennell Apr 26 '16 at 18:05
  • 4
    A website that you finally trust may be hacked. Then you're off by one. – ott-- Apr 26 '16 at 18:16
  • 3
    TL/DR; You can not prove that a website is not malicious for anybody. Very good point! – Noir Apr 26 '16 at 19:27
3

It's hard to inspect websites by analyzing their source code, because some sites hide code within it. You might want to try reputation-based analysis instead.

You can add an add-on to your browser to analyze a site before you click it. One example is WOT (Web of Trust), a plug-in: https://www.mywot.com/

You can also send the URL to a free URL scanner such as http://zulu.zscaler.com/, a risk-analyzer tool that inspects the website itself.

You can also try http://urlquery.net/index.php

The most common reputation-based analysis sites are http://www.urlvoid.com/ and https://www.virustotal.com/
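VirusTotal also exposes a public API. As a sketch, under the assumption of the API v3 scheme where a URL is identified by its unpadded URL-safe base64 encoding, a report could be fetched roughly like this (the API key is a placeholder you would supply yourself):

```shell
# Sketch of a VirusTotal API v3 URL lookup; x-apikey is a placeholder.
vt_url_id() {
  # VT API v3 uses the unpadded URL-safe base64 of the URL as its identifier.
  printf %s "$1" | base64 -w0 | tr '+/' '-_' | tr -d '='
}

# Usage:
#   curl --silent \
#        --header "x-apikey: $VT_API_KEY" \
#        "https://www.virustotal.com/api/v3/urls/$(vt_url_id 'http://example.com/')"
```

This gives you the aggregated scan verdicts as JSON, with the same caveats raised in the comments above: no detection on VT does not mean the site is clean.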

vulnerableuser
  • urlquery.net even gives you a screenshot of the web page. – Sphinxxx Apr 26 '16 at 22:55
  • I haven't been using WOT ever since they've been labeling w3schools as 95% trustworthy and child-safe. – John Dvorak Apr 27 '16 at 05:38
  • 4
    WOT doens't even analyze anything. Users _vote_ for a site's trustworthiness and child safety. – John Dvorak Apr 27 '16 at 05:41
  • @JanDvorak: And this is why WOT has had more and more false positives lately, as users flag sites as malicious solely on ideological grounds. Blogs, sites of politicians or political parties, and many religious or philosophical sites are all filled with ideologically motivated red flags. – vsz Apr 27 '16 at 11:45
  • do **NOT install** the so called "Web of Trust" (**WOT**), it can/must be considered as malware!!! - https://thehackernews.com/2016/11/web-of-trust-addon.html – DJCrashdummy Nov 11 '16 at 07:19
1

Others have mentioned that just retrieving a URL can provide crucial information to the host; usually, that information is that you have read an email.

That said, one option for inspecting a web page securely that is not too uncomfortable or complicated is to open the URL in a text-based browser like lynx (under Windows, perhaps in a Cygwin environment). Lynx does not process JavaScript and does not immediately display images or other non-text content, which, as far as I can see, makes it invulnerable to most attacks targeting mainstream platforms (I'd appreciate comments on that, though).

0

I like @Robert Mennell's answer, but I'll add that there is one way to see what the site is running: yank the disk and inspect it in another machine. That way, you're less likely to be misled by a rootkit that's causing the OS to lie to you. Of course, the drive firmware could still be lying to you, but that's a pretty specialized rootkit.

Adam Shostack
  • Worse yet, you've also infected your other machine with that rootkit if it is in the firmware. That seems like a bad idea; hence the burner machine would be the way to go. – Robert Mennell Apr 26 '16 at 22:37
-1

A suggestion -- and I'm not sure whether it's actually safe -- is to run the site through Google Translate, specifying a target language that you can minimally understand at a glance (German, Dutch, etc.). I've done that for suspicious pages (and also translated to English some Chinese and Indian web pages sent by customers that I was worried about). It appears to me that Google takes the risk and generates what you see, so as long as you don't click on anything it seems safe -- of course, I could be way wrong about that. Google may say "site redirected you too many times", but if so, that may itself be an indicator that the site is not safe.
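For illustration only, the translation-proxy URL can be built by percent-encoding the suspect address. The endpoint and parameter names below are assumptions based on the long-standing translate.google.com URL pattern, which Google has changed over the years, so verify before relying on it:

```shell
# Hypothetical sketch: build a Google Translate proxy URL for a page.
translate_url() {
  # Percent-encode the target URL (python3 is used only for the encoding).
  local encoded
  encoded=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=""))' "$1")
  echo "https://translate.google.com/translate?sl=auto&tl=en&u=${encoded}"
}

# Usage: translate_url 'http://malicious-website/'
```

Even through such a proxy, any link you click may take you directly to the original site, so the "don't click on anything" caveat above still holds.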

This doesn't solve the "inspect the site's source code" aspect, but it may give additional info and clues as to whether further methods to view the source code are warranted.