Could you write custom data to a file of a specific extension and use it to hack a program once it opens the file?

Question

From what I understand file extensions don't really affect the data contained within them at all. They just give your computer a hint of what the data is, how it's structured and your computer then finds the best program to deal with that specific file type.

So my question is: if you could write custom data to, for instance, a .png file that actually contains different values than what a normal .png file is composed of, then have a program open it, could you get it to do something malicious?

A 100% correct program should simply reject the file. Unfortunately we don't actually use formal methods to *prove* that our programs are correct but we just rely on tests and thus *most* programs actually have various vulnerabilities that allow this. — Bakuriu, Apr 06 '16 at 10:49
You don't really have to look very far to accomplish something malicious. Think about macros in the MS products. — MonkeyZeus, Apr 06 '16 at 12:39
I don't think this is totally clear from the top responses, but NORMALLY this isn't possible. If I took a regular executable and renamed it .png, all that would happen is that my image viewer would try to interpret the code as if it were png data, and either say it can't display the file or display gibberish. A virus would need to target a specific flaw or vulnerability in the image viewer that can cause it to break *just by reading certain data*. After achieving this, it may be able to execute code. — octern, Apr 06 '16 at 16:00

Mike Ounsworth · Accepted Answer · 2016-04-06T17:37:30.303

File extensions

The file extension actually has absolutely nothing to do with the data in the file or how that data is structured. Windows likes to make you think the extension is somehow magical - it's not, it's just part of the file name, and tells Windows which program to launch when you open the file. (Linux/Android and MacOS/iOS still use file extensions a bit, but not nearly to the same degree that Windows does.)

You are completely correct that you can dump some data into a file, call it virus.png and it'll get opened by an image viewer. Call it virus.docx and it'll get opened by MS Word.
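To make the "extension is just part of the name" point concrete, here is a minimal Python sketch. The 8-byte PNG signature is the standard one; the helper names `looks_like_png` and `demo` are made up for this illustration, and real viewers do far more validation than this.

```python
import os
import tempfile

# The fixed 8 bytes that every real PNG file starts with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(path):
    """Return True if the file starts with the PNG signature, whatever its name is."""
    with open(path, "rb") as f:
        return f.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


def demo():
    """Write arbitrary bytes to a file named virus.png, then compare name and content."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "virus.png")
        with open(path, "wb") as f:
            f.write(b"MZ this is not image data")
        extension = os.path.splitext(path)[1]
        return extension, looks_like_png(path)
```

Calling `demo()` shows the mismatch: the name ends in `.png`, but the content check fails because nothing forced the bytes to match the name.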

Unexpected data

If you take a well-written program and feed it a file containing data that it's not expecting, nothing exciting should happen. The program should give an error about a "corrupted file" or something similar and move on with its life. The problem happens when the program is not well-written - usually due to some small bug like a programmer forgetting to check the bounds of an array, forgetting to check for null pointers, or forgetting to put braces { } on an if statement.

Even if there is a bug, 99.999...% of malformed data will get the "corrupted file" error. Only if you construct the data very carefully can you get something malicious to happen. For a concrete example, see the section on StageFright below.

(Thanks to @octern's comment for this).
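As a rough sketch of what "well-written" means here, consider a made-up record format: a 4-byte big-endian length followed by that many payload bytes. The format, the `CorruptedFile` exception and `read_record` are all assumptions for illustration, not taken from any real program.

```python
import struct


class CorruptedFile(Exception):
    """Raised when the input does not match the expected layout."""


def read_record(data):
    """Parse a hypothetical record: 4-byte big-endian length, then the payload."""
    if len(data) < 4:
        raise CorruptedFile("truncated header")
    (declared,) = struct.unpack(">I", data[:4])
    available = len(data) - 4
    # The bounds check: never trust a length that came from the file itself.
    if declared > available:
        raise CorruptedFile(f"declared {declared} bytes, only {available} present")
    return data[4:4 + declared]
```

Python would not corrupt memory even without the check, but in C the equivalent unchecked copy is exactly the kind of small omission that turns a "corrupted file" error into an exploitable bug.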

Malicious payloads in innocent-seeming files

Yes, what you're describing is actually a common attack vector - hence the general fear of opening unknown email attachments.

As an attacker, if you know that there's a vulnerability in a specific program, say the default Windows image viewer, then you can construct a malicious file designed to exploit it. Usually this means that you know that a certain line of code in the viewer does not check the bounds of an array, so you build a malformed .png specifically designed to do a buffer overflow attack and get the program to run code that you inserted.

PNG exploits

For example, here's a vulnerability report about the open source library libpng [CVE-2004-0597].

Multiple buffer overflows in libpng 1.2.5 and earlier, as used in multiple products, allow remote attackers to execute arbitrary code via malformed PNG images in which (1) the png_handle_tRNS function does not properly validate the length of transparency chunk (tRNS) data, or the (2) png_handle_sBIT or (3) png_handle_hIST functions do not perform sufficient bounds checking.
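To give a feel for the kind of check that report is about, here is a hedged Python sketch that walks the chunks of a PNG (4-byte length, 4-byte type, payload, 4-byte CRC) and rejects a tRNS chunk longer than 256 bytes, the largest palette a PNG can have. This is not libpng's actual code or its actual fix; the function names are invented for the example, and the CRC is not verified.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_PALETTE_ENTRIES = 256  # a PNG palette holds at most 256 entries


def iter_chunks(data):
    """Yield (chunk_type, payload_length) for each chunk in the PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG: bad signature")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4  # chunk header + payload + CRC
        if end > len(data):
            raise ValueError(f"chunk {ctype!r} runs past the end of the file")
        yield ctype, length
        pos = end


def check_trns(data):
    """Reject files whose tRNS chunk claims more entries than any palette can hold."""
    for ctype, length in iter_chunks(data):
        if ctype == b"tRNS" and length > MAX_PALETTE_ENTRIES:
            raise ValueError(f"tRNS length {length} exceeds {MAX_PALETTE_ENTRIES}")
    return True


def make_chunk(ctype, payload):
    """Build a chunk for testing (hypothetical helper, CRC included for completeness)."""
    crc = zlib.crc32(ctype + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + ctype + payload + struct.pack(">I", crc)
```

A file built with `make_chunk(b"tRNS", b"\x00" * 300)` is rejected by `check_trns`, which is the "corrupted file" outcome a careful parser should produce for an oversized chunk.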

Aside: a Common Vulnerabilities and Exposures (CVE) is a way to track known vulnerabilities in public software. The list of known vulnerabilities can be searched here: https://cve.mitre.org/cve/cve.html

If you search the CVEs for "png" you will find hundreds of vulnerabilities and attacks just like the one you imagined in your question.

Android StageFright

The StageFright Android vulnerability (reported to Google in April 2015 and publicly disclosed that July) was very similar: there was a buffer overflow vulnerability in Android's core multimedia library, and by sending a malformed audio/video file by MMS (multimedia message), an attacker could get complete control of the phone.

The original exploit for this vulnerability was for an attacker to send a 3GPP audio/video file that looked like a valid audio/video file, except that one of the integer fields in the metadata was abnormally large, causing an integer overflow. If the large "integer" actually contained executable code, this could end up being run on the phone, which is why this kind of vulnerability is called an "arbitrary code execution vulnerability".
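The integer overflow part can be sketched in a few lines of Python. Python integers never wrap, so the sketch masks to 32 bits to mimic C-style unsigned arithmetic. The `+ 1` and both function names are illustrative assumptions; this is not the actual StageFright code.

```python
UINT32_MASK = 0xFFFFFFFF


def alloc_size_unchecked(declared_size):
    """Mimic 32-bit unsigned `declared_size + 1`, which silently wraps around."""
    return (declared_size + 1) & UINT32_MASK


def alloc_size_checked(declared_size):
    """Same computation, but refuse sizes whose sum would not fit in 32 bits."""
    if declared_size + 1 > UINT32_MASK:
        raise OverflowError("size field too large")
    return declared_size + 1
```

With a size field of `0xFFFFFFFF`, the unchecked version computes a buffer size of 0. A parser that allocates that tiny buffer and then copies the "declared" amount of data into it writes far past the end, which is where attacker-controlled bytes can end up somewhere they should not be.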

PDF and MS Word exploits

If you search the CVEs for "pdf" or "word" you'll find a whole pile of arbitrary code execution vulnerabilities that people have been able to exploit with those file types (wow - a number of very recent ones for Word too, neat). That's why .pdf and .docx are commonly used as email attachments that carry viruses.

About the StageFright vulnerability, it isn't an SMS but an MMS. SMS is short for Short Message Service, while MMS is short for Multimedia Messaging Service. This means that you can't send a video over SMS. (You can, however, send very basic MIDI sounds and black and white images over SMS). Also, a more recent variant, according to the provided Wikipedia link, also uses MP3 files (audio). Another attack vector is actually the Normal.dot file, for Microsoft Word. A macro added to it will be executed before a new document is opened. — Ismael Miguel, Apr 06 '16 at 08:24

score 5 · Answer 2 · answered Apr 05 '16 at 22:44

Well, this question is quite broad. But in short:

File extension

Quite a lot of viruses simply exploit (or exploited) Windows' great feature of hiding file extensions. E.g. the file someimg.jpg.exe would appear as someimg.jpg to the user, who would open it without knowing they were executing malicious code.
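A rough Python sketch of how a mail filter or script might spot that trick. The suffix lists are illustrative and far from exhaustive, both function names are invented for this example, and `displayed_name` simplifies how Windows actually decides what to hide.

```python
from pathlib import PurePath

# Illustrative lists only; real filters use much longer ones.
EXECUTABLE_SUFFIXES = {".exe", ".scr", ".com", ".bat", ".cmd", ".pif"}
DECOY_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".pdf", ".docx", ".txt"}


def displayed_name(filename):
    """Name as the user would see it if the final extension is hidden."""
    return PurePath(filename).stem


def is_disguised_executable(filename):
    """True if a harmless-looking extension is followed by an executable one."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    return (
        len(suffixes) >= 2
        and suffixes[-1] in EXECUTABLE_SUFFIXES
        and suffixes[-2] in DECOY_SUFFIXES
    )
```

For `someimg.jpg.exe`, `displayed_name` gives `someimg.jpg` and `is_disguised_executable` returns `True`, while a plain `photo.jpg` or `setup.exe` is not flagged.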

More specific hacks

There have been (and probably still are) some drive-by viruses that exploited a bug in the Windows image library to execute code hidden inside a JPG. Flash has been known to be prone to buffer-overflow attacks as well: by inserting a frame that was larger than specified, an attacker could cause a buffer overflow that allowed code execution. PDF files can embed JavaScript code, which can be used maliciously as well, and Adobe Reader had a bug that allowed execution of embedded executables from within the PDF file. The list could be continued endlessly.

These kinds of hacks require precise knowledge of specific software and its weak points. But there's no way to simply rename a .exe to .jpg and hope that the image library will execute it, or anything like that.