At which point can a system be compromised when downloading archived data from an untrusted source?

Question

If I download archived data from a possibly untrusted source, at which point am I at risk of harming my system:

1. Initially downloading and saving the archived data (still packed)
2. Unpacking the archived data
3. Executing any file from the unpacked archive

At point 3 I will obviously be at risk, but what about 1-2?

About the best someone can answer without a concrete case is, *"it depends"*. Without details like how you are downloading and how you are unpacking, it is hard to say. For example, I believe cURL, Wget and some browsers unpack the ZIP for you, unless you take special measures to avoid the behavior. Or, an email client could unpack the ZIP file for you for previewing contents. However, using OpenSSL to fetch it will just save it to the filesystem. — , Jul 17 '19 at 18:14
...I can't help but ask if you can trust that the source will actually provide the file in step #1, and not do anything else (such as attempt to sneak something into your system instead of or before sending the archive itself). — Justin Time - Reinstate Monica, Jul 18 '19 at 18:33
@jww neither the cURL nor the wget documentation mentions that they unpack ZIP files automatically. Could you clarify your statement? — AlexD, Jul 22 '19 at 19:49
@AlexD - I remember seeing the issue in passing on one of the GNU mailing lists. See [\[Bug-wget\] New wget (1.19.2): Unexpected download behaviour for gzip-compressed tarballs (HTTP-header dependent)](https://lists.gnu.org/archive/html/bug-wget/2017-11/msg00000.html). Wget started decompressing archives automatically. I don't know if it was a new option that was "on by default", or if it was hard coded behavior that cannot be changed, or if it was a mistake. — , Jul 23 '19 at 01:33
@AlexD - I think a more interesting case is Browsers and Email clients. They will prefetch links, start opening files and interpreting content without user interactions. Gutmann warns about the embedded Turing machines in his book *[Engineering Security](http://www.cs.auckland.ac.nz/~pgut001/pubs/book.pdf)*, p.197: *"A better use of the time and effort required for user education would have been to concentrate on making the types of documents that are sent as attachments purely passive and unable to cause any action on the destination machine."* — , Jul 23 '19 at 02:04
@jww this automatic decompression with wget only happens when it is explicitly requested with the `--compression` switch (the default setting is disabled) and the server response includes a 'Content-Encoding: gzip' header. This can lead to surprises if a server sends a `tar.gz` with such a header. — AlexD, Jul 23 '19 at 19:03

score 25 · Accepted Answer · answered Jul 17 '19 at 07:25

1 should not present any danger as long as the file is just saved somewhere and no attempts to open it with anything are made. If you view it even with a text editor, there's already a small danger of exploits.
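
To illustrate what "just saved somewhere" can look like, here is a minimal Python sketch that copies raw bytes to a new file without ever parsing them. The name `save_opaque`, the chunk size, and the size limit are all made up for illustration, and `stream` stands for any file-like object that yields bytes. It does nothing about bugs in the network stack, the TLS library, or an antivirus scanner watching the file.

```python
import os

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size


def save_opaque(stream, dest_path, max_bytes):
    """Copy raw bytes from stream into a new file without interpreting them."""
    # O_EXCL refuses to overwrite an existing file; mode 0o600 sets no execute bit.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, "O_BINARY", 0)
    fd = os.open(dest_path, flags, 0o600)
    written = 0
    try:
        with os.fdopen(fd, "wb") as out:
            while True:
                chunk = stream.read(CHUNK_SIZE)
                if not chunk:
                    break
                written += len(chunk)
                if written > max_bytes:
                    raise ValueError("download exceeds the size limit")
                out.write(chunk)
    except Exception:
        # Do not leave a partial file behind.
        os.remove(dest_path)
        raise
    return written
```

The point is only that nothing in this loop looks at the content: no decompression, no type guessing, no preview.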

In the case of 2 there are vulnerabilities and exploits, so there are dangers. Some examples of such possible scenarios:

- Arbitrary file writes caused by .tar.gz archive symbolic link (symlink) vulnerabilities that are exploited because of how Bower (a popular web package manager) extracts such archives
- CVE-2018-20250 is an absolute path traversal vulnerability in unacev2.dll, the DLL file used by WinRAR to parse ACE archives that has not been updated since 2005. A specially crafted ACE archive can exploit this vulnerability to extract a file to an arbitrary path and bypass the actual destination folder. In its example, CPR is able to extract a malicious file into the Windows Startup folder.
- CVE-2018-20252 and CVE-2018-20253 are out-of-bounds write vulnerabilities during the parsing of crafted archive formats. Successful exploitation of these CVEs could lead to arbitrary code execution.
- Zip Slip which attackers might use to target files they can execute remotely, such as parts of a website, or files that a computer or user are likely to run anyway, like popular applications or system files.
- Helm Chart Archive File Unpacking Path Traversal Vulnerability.
- CVE-2015-5663 - the file-execution functionality in WinRAR before 5.30 beta 5 allows local users to gain privileges via a Trojan horse file with a name similar to an extension-less filename that was selected by the user.
- CVE-2005-3262 allows remote attackers to execute arbitrary code via format string specifiers in a UUE/XXE file, which are not properly handled when WinRAR displays diagnostic errors related to an invalid filename

There are plenty more examples and databases with such vulnerabilities, and even though most of them got fixed in later versions of the software, a risk still exists.

Therefore, 2 is risky and should be handled with care.
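
To make the path traversal cases (Zip Slip and similar) a bit more concrete, here is a minimal Python sketch of the kind of check an extractor needs before it writes anything. The name `safe_extract` is made up for illustration, and the sketch only covers path traversal in ZIP files, not symlink tricks or memory bugs in the parser. Python's own `zipfile` module already strips `..` components and leading slashes on extraction; this version rejects the whole archive instead.

```python
import os
import zipfile


def safe_extract(zip_path, dest_dir):
    """Extract zip_path into dest_dir, refusing entries that would land outside it."""
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            target = os.path.realpath(os.path.join(dest_root, member.filename))
            # A name such as "../../evil.sh" or "/etc/passwd" resolves outside dest_root.
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in archive: {member.filename!r}")
        # Only extract once every entry has passed the check.
        archive.extractall(dest_root)
```

Checking all entries first and extracting afterwards means a bad archive leaves nothing behind on disk.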

answered Jul 17 '19 at 07:25

Overmind


Another interesting threat when unpacking untrusted archives is [zip bombs](https://en.wikipedia.org/wiki/Zip_bomb): specially crafted archives which seem small but unpack to huge amounts of data. – Philipp Jul 17 '19 at 08:11
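
A rough Python sketch of a pre-check for this: compare the sizes declared in the ZIP directory before extracting anything. The function name `looks_like_zip_bomb` and both thresholds are arbitrary assumptions, not recommended values. The declared sizes come from the archive itself, so an attacker controls them, and nested archives are not inspected, which makes this a first filter only.

```python
import zipfile

MAX_TOTAL_BYTES = 100 * 1024 * 1024  # hypothetical cap on unpacked size
MAX_RATIO = 100                      # hypothetical cap on unpacked/packed ratio


def looks_like_zip_bomb(zip_path, max_total=MAX_TOTAL_BYTES, max_ratio=MAX_RATIO):
    """Return True if the declared sizes in the ZIP directory look suspicious."""
    with zipfile.ZipFile(zip_path) as archive:
        infos = archive.infolist()
    total_unpacked = sum(info.file_size for info in infos)
    total_packed = sum(info.compress_size for info in infos)
    if total_unpacked > max_total:
        return True
    # Guard against division by zero for archives holding only empty files.
    if total_packed > 0 and total_unpacked / total_packed > max_ratio:
        return True
    return False
```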
I think it would be even more rare (and bordering on extreme paranoia), but it might be possible that the act of downloading (or copying) the file could exploit a vulnerability in a web browser/wget/curl/rsync/etc. I'm not sure if there's ever been an example of this. – mbrig Jul 17 '19 at 16:36
If the system has an antivirus installed, the archive itself could be crafted to exploit a bug in the file-scanning routines that run when the file is downloaded too. See for example some of the CVEs for Windows Defender: https://www.cvedetails.com/vulnerability-list/vendor_id-26/product_id-9767/Microsoft-Windows-Defender.html – Paul Belanger Jul 17 '19 at 18:14
@mbrig Well, in the end points 1 and 2 are mostly "identical" from a security perspective. They both deal with programs handling data in some way. If we include the possibility of bugs that allow an attacker to perform arbitrary code execution, it can happen in both case 1 and 2. Downloading a file may be a "simpler" action with fewer chances of errors, but if you think of all the bugs in networking software (think Heartbleed for example...) it's absolutely not out of this world. If you don't care about software bugs then opening a file in a text editor is fine, only execution is an issue. – Bakuriu Jul 17 '19 at 18:34
Well, just visiting a site before you start the download can also exploit many browser vulnerabilities. I'm assuming you don't have a direct URL and just run a `wget`. – Nelson Jul 18 '19 at 01:36
Depending on the system, thumbnailers, previewers, and indexers could be exploited. – forest Jul 18 '19 at 01:59
Note that all of the vulnerabilities from the above answer are known vulnerabilities that have already been patched AFAIK, so if you're careful enough about keeping your software up to date, you are most likely only going to be affected by 0-day vulnerabilities. And it is unlikely anyone is going to burn a 0-day on a random zip archive meant for public use. Those are usually saved for high profile targets. – Nzall Jul 18 '19 at 07:39
@Philipp I don't consider infinite unpackers a threat. I have played with such things since the MS-DOS era and it's just something to have fun with; even inexperienced people will see that an unpacking is eating all their space. Harm can actually be done with a thumbs preview that crashes explorer.exe for example. – Overmind Jul 18 '19 at 08:07
Whether 1 is a vulnerability also depends on where you save the downloaded data. – Federico Poloni Jul 18 '19 at 10:45
Actually, you could (in theory) exploit a browser or even Windows, just by viewing the icon of an exe file. The icon may be crafted to exploit some unknown vulnerability (under/overflow) just with the icon. A self-extracting exe or an msi can have such icons. In the case of 7zip, the self-extracting exe shows an icon similar to any other compressed file. That could easily pass by as a compressed file. – Ismael Miguel Jul 18 '19 at 15:53
@Bakuriu As for the download being simpler, that is arguably true, with the caveat that it assumes that neither the browser, its extensions, the operating system, nor other privileged processes on the system attempt to do anything more clever with the foreign input than copy and store operations at this point. This assumption is of course violated by many system configurations, and ironically the biggest culprit is probably anti-malware services. Granted, hopefully those are running well-sandboxed code, but it's still an attack surface and exploits of this kind have been found before. – MttJocy Jul 18 '19 at 16:50

score 9 · Answer 2 · edited Jun 16 '20 at 09:49

In theory all of these places could be exploited. I am not going to go into specific exploits available as these change constantly with archive format and moving tech:

Initially downloading and saving the archived data (still packed)

It is unlikely, but it is possible that your download manager / web browser has some kind of exploitable vulnerability. You say the source is untrusted, therefore the server could try to attack your download program using flaws in its implementation or weaknesses in the file transfer protocol you are using. These exploits are rare but not unheard of. But fundamentally, unless you are certain your software is entirely unexploitable, any network connection with a malicious server could result in an attack.

You can somewhat mitigate this by sandboxing the download software with only minimal permissions and access needed to the location you wish to download to and the network stack. This mostly mitigates this weakness assuming your OS permission model or sandboxing software do not also have exploits.

Unpacking the archived data

There have been numerous attacks over the years that use poisoned archive files to run arbitrary code on a system by exploiting weaknesses in the archive format or decompression software. These are probably more common than the above weakness.

The main protections are again to give the extraction program minimal permissions and potentially to sandbox it, so that it can do minimal damage if it is attacked successfully. Caveats above apply.

Executing any file from the unpacked archive

This is obviously enormously risky, and the same issues as running any malicious software apply. It is relatively easy for software, when run explicitly, to break many sandboxes and permission system protections, so all bets are off. You can have some safety running the software in a hardened VM, but this still doesn't fully protect you, short of using an airgapped machine to run the programs which is then destroyed.

TLDR

All of these steps are fairly risky, but each successive step is probably more dangerous than the last.

edited Jun 16 '20 at 09:49

Community

answered Jul 17 '19 at 18:58

Vality

When downloading, a system should only read and write ASCII blocks so there is no content interpreted and you can't sabotage that in any way. The process which leads to the download is the one that can actually be exploited. So you don't practically use the actual archive for exploiting, therefore this type of risk is unrelated to the initial question. – Overmind Jul 18 '19 at 08:11
@Overmind some download managers could peek at the first bytes of the binary file to guess the MIME type (if not correctly provided by the server); a vulnerability in this code could be exploited by a specially crafted file. – Falco Jul 18 '19 at 09:48
If you craft that it will become an invalid archive (if there was one in the 1st place) so we practically have a crafted file designed against a specific DM (download manager), not a true archive / not archived data as the topic suggested. – Overmind Jul 18 '19 at 11:47
@Overmind: Download data is **binary,** not ASCII. – JRE Jul 18 '19 at 14:20
@Overmind the OP said specifically that the server was untrusted. I'm therefore assuming it can send whatever invalid file or protocol data it wants. This could include invalid files, invalid length and invalid protocol metadata. While a download manager is a relatively simple tool and should not be vulnerable to attacks, I would never say this is impossible. Whenever a network protocol is not entirely trivial there are generally attacks possible on the network stack or application by sending specially crafted packets. And at least some software is likely vulnerable. – Vality Jul 18 '19 at 15:41
@Overmind for example https://exchange.xforce.ibmcloud.com/vulnerabilities/45711 an exploit that attacks a download manager by sending specially crafted metadata and headers. A number of these exist in download managers. – Vality Jul 18 '19 at 15:45
On many systems the download stage is also where any potential malicious code gets its first crack at the attack surface presented by the host's anti-malware solution (at least on hosts running a realtime anti-malware service), since those often accept bytes from the network stack in real time and pass them as arguments to complex functions. Granted, this sort of code is generally audited to a higher standard than most, and often attempts are made to sandbox it to limit the scope of exploits should they be found, but exploitable bugs can still creep in, so there is a non-zero risk here too. – MttJocy Jul 18 '19 at 17:02
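
For the type guessing mentioned in the comments above (a download manager peeking at the first bytes of a file), this is roughly what such sniffing looks like, as a minimal Python sketch. The function name `sniff_archive_type` and the table are made up for illustration; the signatures are the well-known leading bytes of each format. Real sniffers handle many more formats, and that extra parsing of attacker-supplied bytes is exactly where the bugs live.

```python
# Well-known leading bytes ("magic numbers") of a few archive formats.
MAGIC_SIGNATURES = {
    b"PK\x03\x04": "zip",
    b"\x1f\x8b": "gzip",
    b"7z\xbc\xaf\x27\x1c": "7z",
    b"Rar!\x1a\x07": "rar",
}


def sniff_archive_type(first_bytes):
    """Guess an archive type from the first bytes of a file, or return None."""
    for signature, name in MAGIC_SIGNATURES.items():
        if first_bytes.startswith(signature):
            return name
    return None
```

A prefix comparison like this is about as harmless as sniffing gets. Trouble starts when the code goes on to read lengths, offsets, or filenames from the header and trusts them.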
