0

I recently discovered the problem with Percent-encoding.

It makes perfect sense when we are dealing with this problem in a browser scenario. But I don't get why software like WinSCP can be affected by the same issue.

In my opinion the special character encoding/decoding shouldn't happen in normal SFTP software (as SFTP is a subsystem of SSH) or it could be easily bypassed.

I would like to know if this encoding/decoding could represent a security issue. Does this encoding/decoding happen in every SFTP client? Or is WinSCP the only client doing this, and if so, why?

Martin Prikryl
  • How did you arrive at the conclusion that it is a *security* issue? We have [discussed this topic already in the WinSCP forum](https://winscp.net/forum/viewtopic.php?t=31429). I believe I've explained that it's how URL syntax works (you even posted the link to the Percent-encoding Wikipedia article yourself). I've never mentioned any *security* reasons. – Martin Prikryl Sep 27 '21 at 20:12

2 Answers

3

This is because in the case you've provided, you're using a URL, and according to RFC 3986, which defines the generic URI syntax, certain characters must be escaped. WinSCP uses a URL, which requires escaping; some other clients, such as OpenSSH, do not use a URL here and so do not require escaping.

The escaping is required because certain characters are used to delimit components in a URL, and we need some unambiguous way to determine what is a delimiter and what is part of a delimited component. URLs also, for historical reasons, are plain ASCII, and as a result representing non-ASCII characters requires escaping every byte that has the 8th bit set.
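As a small illustration of both points, here is a sketch using Python's standard `urllib.parse` module (chosen only for demonstration; it is not what WinSCP or OpenSSH use internally). The sample values `user@domain` and `é` are made up for the example.

```python
from urllib.parse import quote, unquote

# With safe="", delimiters such as "@", ":" and "/" are escaped too,
# so they can no longer be mistaken for URL syntax.
encoded_user = quote("user@domain", safe="")

# Non-ASCII characters are encoded as UTF-8 first,
# then every resulting byte is percent-escaped.
encoded_name = quote("é", safe="")

print(encoded_user)           # user%40domain
print(encoded_name)           # %C3%A9
print(unquote(encoded_user))  # user@domain
```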

Requiring escaping in a URL isn't intrinsically a security problem. It is the case that if you generate a URL from components and some of those are controlled by the user and you fail to escape them properly, then the attacker could change the URL to point somewhere unexpected. It's also important to consider that escaping allows essentially arbitrary bytes, and care must be taken to avoid attacks where certain byte values or patterns (NUL bytes, invalid or overlong UTF-8, multiple encodings of the same path, etc.) can cause undesired or unexpected behavior.
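To illustrate the first risk, here is a minimal sketch in Python. The helper names `build_url_unsafe` and `build_url_safe`, and the hosts `files.example` and `evil.example`, are hypothetical; `urlsplit` stands in for whatever parser the consuming program uses.

```python
from urllib.parse import quote, urlsplit

def build_url_unsafe(username: str, host: str) -> str:
    # Hypothetical helper: pastes the username in without escaping.
    return f"sftp://{username}@{host}/"

def build_url_safe(username: str, host: str) -> str:
    # Hypothetical helper: escapes every reserved character in the username.
    return f"sftp://{quote(username, safe='')}@{host}/"

# Attacker-controlled "username" that smuggles in its own host.
malicious = "alice@evil.example/"

unsafe_host = urlsplit(build_url_unsafe(malicious, "files.example")).hostname
safe_host = urlsplit(build_url_safe(malicious, "files.example")).hostname

print(unsafe_host)  # evil.example
print(safe_host)    # files.example
```

Without escaping, the `/` in the input ends the authority part early, so the parser sees the attacker's host instead of the intended one.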

However, similar attacks can happen in almost any scenario where escaping is required or allowed and the code receives untrusted data and fails to account for the escaping.
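For the byte-level concerns (NUL bytes, invalid or overlong UTF-8), one possible defensive approach is to decode each component strictly and reject anything suspicious. This is only a sketch under the assumption that components are expected to be UTF-8 text; `decode_component_strict` is a hypothetical helper, not part of any SFTP client.

```python
from urllib.parse import unquote_to_bytes

def decode_component_strict(component: str) -> str:
    """Percent-decode one URL component, rejecting suspicious byte patterns."""
    raw = unquote_to_bytes(component)
    if b"\x00" in raw:
        raise ValueError("NUL byte in URL component")
    try:
        # Strict UTF-8 decoding rejects invalid and overlong sequences.
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError("invalid UTF-8 in URL component") from exc

print(decode_component_strict("caf%C3%A9"))  # café
```

Inputs such as `%00` or the overlong sequence `%C0%AF` raise `ValueError` here instead of being passed on.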

bk2204
  • 7,828
  • 16
  • 15
  • I like your answer @bk2204 . As far as I can tell WinSCP is the only software that uses such encoding, that's why it looked dodgy to me. I cannot name another software behaving this way. So strange. – Francesco Mantovani Sep 27 '21 at 12:11
  • Actually, even OpenSSH uses URL-encoding for real URLs. Like in `sftp sftp://user%40domain@host` (contrary to the more frequently used non-URL syntax `sftp user@domain@host`). Same for curl or wget. So it's not true that WinSCP is the only software that does that. As my answer below explains, it's not something that WinSCP does for fun. It's a necessity due to the syntax of URLs. As this answer correctly says, **certain characters must be escaped**. – Martin Prikryl Sep 27 '21 at 20:02
3

As I have already responded to your question on the WinSCP forum:
That's how URLs work. Even the Wikipedia article that you yourself point to (well, I've pointed you to that article) explains that. It does not matter if it is an http:// URL, an sftp:// URL or an ftp:// URL. It's still a URL.

For example, how else would you tell whether, in the following URL, the username is foo and the password is bar:blah, or the username is foo:bar and the password is blah?

sftp://foo:bar:blah@host

All SFTP/FTP software that I know of that uses URLs does the same. As you can see from the example above, it actually has to. That includes OpenSSH, curl and wget, to name a few.
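To make the ambiguity concrete, here is how one generic URL parser, Python's `urllib.parse.urlsplit`, resolves the URL from the question. This is only an illustration of the URL rules; it says nothing about how WinSCP's own parser is implemented.

```python
from urllib.parse import unquote, urlsplit

# Unescaped: the parser splits user info at the FIRST colon.
plain = urlsplit("sftp://foo:bar:blah@host")
print(plain.username, plain.password)  # foo bar:blah

# To mean username "foo:bar", the colon inside it must be escaped as %3A.
escaped = urlsplit("sftp://foo%3Abar:blah@host")
# urlsplit leaves components percent-encoded, so decode them explicitly.
escaped_user = unquote(escaped.username)
print(escaped_user, escaped.password)  # foo:bar blah
```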

OpenSSH does not use URL-encoding in the frequently used non-URL syntax user@host. Their situation is also easier because they do not support providing a password on the command line. Hostnames cannot contain many special characters, and usernames rarely contain them either. So even if the username contains an @ sign, the syntax user@domain@host can still be parsed: the host cannot contain @, so it's clear that user@domain is the username. But even in this limited use, you might face a situation in which the syntax is ambiguous. That happens when it contains at least two components that can contain special characters. With OpenSSH, that can happen if you include both a username and a path. For example:

sftp username/domain@host:/path@foo

How would you parse that? There's no unambiguous way to decide what is a username, what is a hostname, and what is a path to the file to download. That's when you will want to use proper URL syntax with percent encoding:

sftp sftp://username%2Fdomain@host/path%40foo
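As a sketch of how such a URL can be built and read back without ambiguity, again using Python's `urllib.parse` purely for illustration (the variable names are mine):

```python
from urllib.parse import quote, unquote, urlsplit

username = "username/domain"
path = "/path@foo"

# Escape everything in the username; keep "/" literal in the path.
url = f"sftp://{quote(username, safe='')}@host{quote(path, safe='/')}"
print(url)  # sftp://username%2Fdomain@host/path%40foo

parts = urlsplit(url)
decoded = (unquote(parts.username), parts.hostname, unquote(parts.path))
print(decoded)  # ('username/domain', 'host', '/path@foo')

# Without escaping, the first "/" ends the host part,
# so the parser takes "username" as the host.
wrong_host = urlsplit("sftp://username/domain@host/path@foo").hostname
print(wrong_host)  # username
```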

Also, WinSCP's URL syntax is incomparably more powerful than that of OpenSSH. A WinSCP URL can contain a username, password, path and even more. All of these can contain special characters. So WinSCP must be even more strict when parsing a URL.


As with OpenSSH, other software also supports an alternative, non-URL syntax for providing the connection info, if you do not want to mess with URL-encoding. That's true for wget, curl, and WinSCP. In WinSCP, there are -username and -password switches for that.

Martin Prikryl