16

I have a vulnerable test site up that runs PHP. How can an attacker identify that PHP is used? If I type .../add.php, the site gives back an error message, even though the file is add.php. If I type .../add, the site runs.

Maybe I can inject code to identify PHP? Or is it impossible to check for PHP (including version) if a site is well coded?

Here is the code for the test site: Elastic Beanstalk + PHP Demo App

Laurel
Jan Küfner
  • 4
    Related: [OWASP - Fingerprint Web Server](https://owasp.org/www-project-web-security-testing-guide/latest/4-Web_Application_Security_Testing/01-Information_Gathering/02-Fingerprint_Web_Server) – Andrew T. Jun 24 '21 at 08:39

5 Answers

28

There is no method that is guaranteed to work.

The way PHP works is that the HTTP server receives the HTTP request, identifies that it's meant to be PHP and relays the request to the PHP module. This could either be a module built into the web server or be a dedicated "PHP server". The server then checks which PHP code is meant to be executed with which parameters, then executes it, generates a result and relays that result back to the HTTP server, which returns it as HTTP response.

Whether or not this process occurs, or whether or not the result received stems from a static page or any number of processes, is unknown to the user.

However...

There are a number of possible ways PHP could "reveal" itself. The first and most obvious is the X-Powered-By HTTP response header. PHP likes to advertise itself, so some installations set the X-Powered-By header, which states both that the site is running PHP and which version.
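As a rough illustration of the header check, here is a minimal Python sketch. The function name `php_version_from_headers` and the sample header values are made up for this example; it only inspects a mapping of response headers that you have already collected from your own test site.

```python
import re
from typing import Mapping, Optional


def php_version_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the PHP version advertised in X-Powered-By, or None if absent."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "x-powered-by":
            match = re.search(r"PHP/([\w.\-+~]+)", value, re.IGNORECASE)
            if match:
                return match.group(1)
    return None


# Hypothetical response headers, for illustration only.
sample = {"Content-Type": "text/html", "X-Powered-By": "PHP/7.4.3"}
print(php_version_from_headers(sample))
```

A `None` result only means the header was not sent; it says nothing about whether PHP is actually running behind the server.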

There is also a very strange "easter egg" in PHP, which returns specific information such as credits to the development team or the PHP logo when a specific query string is sent. This behavior can be disabled in the configuration, so it isn't foolproof either. If it works, then it's overwhelmingly likely to be a PHP installation, but if it doesn't, you can't deduce that it's not a PHP installation. Absence of evidence isn't evidence of absence, after all.
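For reference, a small Python sketch that builds such probe URLs. The two query strings below are the values commonly documented for older PHP releases (as far as I recall, they were removed in PHP 5.5 and depended on `expose_php`), so treat them as assumptions to verify against your own test site. The helper name `easter_egg_urls` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Query strings commonly cited for the old PHP easter eggs (assumed values;
# newer PHP versions no longer respond to them).
EASTER_EGG_QUERIES = {
    "credits": "=PHPB8B5F2A0-3C92-11d3-A3A9-4C7B08C10000",
    "php_logo": "=PHPE9568F34-D428-11d2-A769-00AA001ACF42",
}


def easter_egg_urls(page_url: str) -> dict:
    """Return one probe URL per easter egg for the given page."""
    parts = urlsplit(page_url)
    return {
        label: urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))
        for label, query in EASTER_EGG_QUERIES.items()
    }


for label, url in easter_egg_urls("http://localhost/index.php").items():
    print(label, url)
```

If one of these URLs returns the credits page or a logo image, the site is almost certainly PHP; a normal page tells you nothing either way.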

Stack traces and other PHP errors, such as this beautiful masterpiece taken from this question, can be an indication as well:

[Image: Sick PHP skills on display]

Of course, all of these methods only work because of some misconfiguration. On a properly configured server, it is not possible to know for sure if PHP is used or not.

  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/126852/discussion-on-answer-by-mechmk1-how-can-an-attacker-identify-if-a-website-is-usi). – schroeder Jun 25 '21 at 12:07
20

Because PHP is built from its own code

Maybe by looking at how it parses specific querystrings (e.g. ?a[]=1&a[]=2&a[]=3), which regular expression syntax it supports, the distinctive arrangement of HTTP response headers it produces, the distinctive byte-level characteristics of results that PHP functions, extensions, and dependencies produce (e.g. the order and structure of bytes it sends through mail()), implementation-specific stuff that RFC doesn’t care about, measuring time it takes to process some specific requests that would call specific PHP functions, using its own vulnerabilities, or by whatever low-level quirk it might distinctively have that possibly identifies PHP.
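To make the query string point concrete, here is a hedged Python sketch. It contrasts a generic parser (`urllib.parse.parse_qs`) with a deliberately simplified emulation of how PHP fills `$_GET`: keys ending in `[]` are collected into an array, and a repeated plain key keeps only its last value. `php_style_parse` is a hypothetical helper covering just those two cases, not a reimplementation of PHP's parser.

```python
from urllib.parse import parse_qs, parse_qsl


def php_style_parse(query: str) -> dict:
    """Simplified emulation of PHP's handling of `name[]` and repeated keys."""
    result = {}
    for key, value in parse_qsl(query, keep_blank_values=True):
        if key.endswith("[]"):
            name = key[:-2]
            # PHP turns `a[]=...` into an array called `a`.
            if not isinstance(result.get(name), list):
                result[name] = []
            result[name].append(value)
        else:
            # A repeated plain key overwrites the earlier value.
            result[key] = value
    return result


query = "a[]=1&a[]=2&a[]=3"
print(parse_qs(query))         # generic parser keeps the literal key "a[]"
print(php_style_parse(query))  # PHP-style result groups values under "a"
```

An endpoint that echoes or otherwise reacts to its parameters can leak which of these two behaviors the backend follows, which is one of the many small hints described above.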

Do not assume that attackers can never know that you’re using PHP. However properly configured, there must be some quirks you’re not aware of but a sophisticated attacker is. You cannot configure the internal behavior of the PHP engine.

  • 3
    Exactly. Even after hiding everything possible, there are still ways like those stated above which cannot be easily masked because they are unknown to common developers/admins. This is similar to the [SuperUser question on router type detection](https://superuser.com/questions/620199/is-it-possible-to-detect-the-model-brand-of-a-remote-router), which leads to nmap, which even has a specialized library for OS detection. The assumption that a remote device or software is undetectable should never be relied upon. Just do the best you can to secure it, but keep fresh backups. – miroxlav Jun 24 '21 at 11:45
  • 2
    `However properly configured, there must be some quirks you’re not aware but a sophisticated attacker is.` This is a very bold claim without any supporting evidence. It's almost theological in nature. –  Jun 24 '21 at 13:11
  • 1
    @MechMK1 The point is that: just because a piece of software has its own code, its own specific implementation, there _ought_ to be its own way of doing something that makes it possible to distinguish it from other software and possibly opens it up to side-channel attacks. Furthermore, many things are left up to the implementation, as the RFC doesn’t specify every minuscule detail, and those implementation-specific quirks are what I’m talking about as well. I don’t think it’s a _claim_; I think it’s an inevitable nature of software. – Константин Ван Jun 24 '21 at 13:59
  • 1
    @MechMK1 I said “_However properly configured_” because I was not talking about what you can _configure_, rather, it was about the hard-coded characteristics of software, which you cannot alter. (You could rewrite the code to differ, but then that’s not PHP anymore.) – Константин Ван Jun 24 '21 at 14:19
  • 5
    @MechMK1> rather the opposite, it is very practical. There are thousands of parameters to inspect, from specific header values, to header ordering, to request timing. All of those exhibit properties that stem from the inner working of PHP engine (eg: "PHP always writes header foo after header bar"). Each of those is merely a hint, but there are thousands of those. You will never reach 100% certainty for sure but 99% confidence is totally doable. Which is more than enough for an attacker to decide they will spare the resources and try to break it as if it were PHP. – spectras Jun 24 '21 at 14:56
6

From a practical point of view, your question does not matter.

If you know that your server runs a vulnerable version of PHP, or anything else, it needs to be patched, not hidden.

Meanwhile, attackers often just run a variety of automated requests without prior detection: open the logs of any public web server and you'll find 404 errors for admin pages of phpmyadmin and other well-known tools, even if the server is 100% ASP.NET.
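If you want to see this in your own logs, a minimal Python sketch along these lines could count such probes. It assumes the common/combined access log layout, and the sample lines (addresses, paths, timestamps) are invented for illustration; `count_php_probes` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the request and status fields of a common/combined log line.
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+" (\d{3})')


def count_php_probes(log_lines):
    """Count 404 responses for .php paths or phpmyadmin-style URLs."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        path, status = match.group(1), match.group(2)
        path_only = path.split("?", 1)[0].lower()
        if status == "404" and (path_only.endswith(".php") or "phpmyadmin" in path_only):
            hits[path_only] += 1
    return hits


# Invented sample lines in combined log format.
SAMPLE_LOG = [
    '203.0.113.5 - - [25/Jun/2021:10:00:01 +0000] "GET /phpmyadmin/index.php HTTP/1.1" 404 153',
    '203.0.113.5 - - [25/Jun/2021:10:00:02 +0000] "GET /admin.php HTTP/1.1" 404 153',
    '198.51.100.7 - - [25/Jun/2021:10:00:03 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '203.0.113.5 - - [25/Jun/2021:10:00:04 +0000] "POST /admin.php?x=1 HTTP/1.1" 404 153',
]

for path, count in count_php_probes(SAMPLE_LOG).most_common():
    print(count, path)
```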

IMil
  • 2
    This really hits the nail on the head. It doesn't matter whether the site is PHP, ASP.NET, Spring, Struts2, Ruby or any other common platform. 9 times out of 10 an attacker is just going to go for low hanging fruit and spam every site they can find with a common set of flaws, then go deeper on those that appear to be fruitful. As such, having a vulnerability is just asking for problems. – Nzall Jun 25 '21 at 20:50
  • Hiding it could discourage attackers from performing zero-day attacks _that are relatively expensive to perform_ on every server in sight. There’s not much though. – Константин Ван Jun 26 '21 at 07:42
5

In some (but not all) cases, a site that uses PHP reveals this in its response headers. One way to view the response headers is to use curl with the -I option.

For example:

curl -I www.example.com

If the site is using PHP, you may see something like this in the Server response header:

Apache/2.4.25 (Debian) PHP/7.0.19-1 OpenSSL/1.0.2l
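
A Server value like the one above can be split into product/version pairs with a few lines of Python. This is only a sketch: `parse_server_header` is a hypothetical helper, and it assumes the usual `product/version (comment)` layout of the header.

```python
import re


def parse_server_header(value: str) -> dict:
    """Split a Server header into {product: version}, ignoring (comments)."""
    # Drop parenthesised comments such as "(Debian)".
    without_comments = re.sub(r"\([^)]*\)", " ", value)
    products = {}
    for token in without_comments.split():
        name, _, version = token.partition("/")
        products[name] = version or None
    return products


info = parse_server_header("Apache/2.4.25 (Debian) PHP/7.0.19-1 OpenSSL/1.0.2l")
print(info.get("PHP"))
```

Many servers send only a bare product name or no Server header at all, in which case the `PHP` key is simply missing.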
Patrick Mevzek
mti2935
0

The bulk of opportunistic attempts assume that you are, and in the bulk of cases they are correct. The drive-by scripting is fired at you regardless.

I gather we are talking about run-time PHP and not static content generated with PHP for just-in-time publication.

They should not know. Leaving phpinfo() for them to find can help them if you want that. But they are generally looking for older, unhardened targets with vulnerable "injection attack" entry fields.

So... they look for content that exists for apps that have a PHP dependency. 90+% of this is "Content Management System X" or "DBA tool Y". I should not have to name these; they will be in your logs.

Of course, leaving your backup tarball public will give them all the evidence they need.

The naming variations are used because they have historically worked; people have been foolish. This is why they are in the script. To hackers, this is panning for gold, but automated.

One of my sites is purely "cron/awk/sed", and it still gets hit 24x7 for .php targets. It can't be avoided. If you serve XML, JSON, or YAML, it will be assumed that these are pre-processed content, and rightly so, but they can only guess the pre-processor.

mckenzm