12

I have PHP code that uses the loadXML function (as well as other XML functions).

  1. Is the loadXML function vulnerable to XXE attacks? Namely, if the XML contains external entities, will they be interpreted?

  2. Is this function vulnerable to other XML-based attacks, e.g., the Billion Laughs DoS attack?

  3. Can you refer me to a list of functions that are vulnerable to XXE and to other XML-related attacks?

I know that XXE attacks can easily be blocked in PHP by changing the settings. Still, I would be glad to get answers to the questions above.

Anders
Gari BN

1 Answer

14

Is the loadXML function vulnerable to XXE attacks? Namely, if the XML contains external entities, will they be interpreted?

By default, no.

External entities are not parsed unless LIBXML_NOENT is set. If libxml_disable_entity_loader(true) has been called, not even LIBXML_NOENT will allow XXE.
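As a rough illustration of the difference between the default options and LIBXML_NOENT, here is a minimal sketch. The temporary file, its contents (`TOP-SECRET`), and the entity name `xxe` are made up for the demo, and it assumes a Unix-like path and a PHP build on libxml 2.9 or later; exact behavior may differ on other versions.

```php
<?php
// Hypothetical target file: name and contents exist only for this demo.
$secretFile = tempnam(sys_get_temp_dir(), 'xxe');
file_put_contents($secretFile, 'TOP-SECRET');

// Payload declaring an external entity that points at the local file.
$xml = <<<XML
<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY xxe SYSTEM "file://$secretFile">
]>
<root>&xxe;</root>
XML;

// Collect parser messages instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// 1) Default options: no entity substitution is requested.
$default = new DOMDocument();
$default->loadXML($xml);
$defaultText = $default->documentElement->textContent;

// 2) LIBXML_NOENT: entities are substituted while parsing.
$noent = new DOMDocument();
$noent->loadXML($xml, LIBXML_NOENT);
$noentText = $noent->documentElement->textContent;

// Report whether the file content leaked into the parsed document.
var_dump(strpos($defaultText, 'TOP-SECRET') !== false);
var_dump(strpos($noentText, 'TOP-SECRET') !== false);

unlink($secretFile);
```

If the two results differ, the flag alone decides whether attacker-controlled XML can read local files, so LIBXML_NOENT is best avoided on untrusted input.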

Is this function vulnerable to other XML-based attacks, e.g., the Billion Laughs DoS attack?

No, Billion Laughs will be caught with default settings:

DOMDocument::loadXML(): Detected an entity reference loop in Entity

What will not be caught is quadratic blowup. Note that the entities need to actually be substituted, which is done either via the rather poorly named LIBXML_NOENT or by simply accessing the node (e.g., via textContent).
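To make the structural difference between the two payload shapes concrete, here is a small sketch that builds a nested payload (Billion Laughs style) and a flat one (quadratic blowup style) and expands both with LIBXML_NOENT. The helper names `buildNestedPayload`, `buildFlatPayload`, and `expandedLength` are hypothetical, and the sizes are deliberately tiny so that nothing harmful happens when it runs.

```php
<?php
// Nested variant: each entity references the previous one $fanout times.
function buildNestedPayload(int $levels, int $fanout): string
{
    $decls = "<!ENTITY a0 \"ha\">\n";
    for ($i = 1; $i < $levels; $i++) {
        $refs = str_repeat('&a' . ($i - 1) . ';', $fanout);
        $decls .= sprintf("<!ENTITY a%d \"%s\">\n", $i, $refs);
    }
    return "<!DOCTYPE r [\n{$decls}]>\n<r>&a" . ($levels - 1) . ";</r>";
}

// Flat variant: one long entity, referenced many times, with no nesting.
function buildFlatPayload(int $entityLength, int $references): string
{
    $decl = '<!ENTITY a "' . str_repeat('x', $entityLength) . '">';
    return "<!DOCTYPE r [\n{$decl}\n]>\n<r>" . str_repeat('&a;', $references) . '</r>';
}

// Expand a payload with LIBXML_NOENT and return the length of the resulting text.
function expandedLength(string $xml): int
{
    $doc = new DOMDocument();
    $doc->loadXML($xml, LIBXML_NOENT);
    return strlen($doc->documentElement->textContent);
}

// Tiny sizes on purpose: a2 -> 2 x a1 -> 4 x "ha", and 3 x "xxxx".
$nested = buildNestedPayload(3, 2);
$flat   = buildFlatPayload(4, 3);

printf("nested: %d bytes in, %d chars out\n", strlen($nested), expandedLength($nested));
printf("flat:   %d bytes in, %d chars out\n", strlen($flat), expandedLength($flat));
```

The nested payload grows exponentially with the number of levels, which is the pattern the parser's loop detection targets. The flat payload grows only with entity length times reference count, so it contains nothing that looks like a loop.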

tim
  • 7
    The name of this flag is so horribly confusing, as is the complete lack of official documentation. – Polynomial Aug 15 '16 at 10:58
  • 21
    @Polynomial: Welcome to PHP. – Williham Totland Aug 15 '16 at 12:04
  • 5
    @WillihamTotland PHP is pretty awful regarding naming and documenting. But `noent` is already present in [libxml](http://xmlsoft.org/xmllint.html) (at least as parameter in their command line tool, and the other bindings such as ruby use the same flag as well). – tim Aug 15 '16 at 12:17
  • 9
    @tim A poorly named, poorly documented flag that _can't_ be blamed on PHP? Frankly, I never thought I'd see the day. ;) – Williham Totland Aug 15 '16 at 12:22
  • An example of quadratic blowup: [the billion lulz attack](https://en.wikipedia.org/wiki/Billion_laughs). – Naftuli Kay Aug 15 '16 at 17:38
  • @NaftuliTzviKay Sure, it's a question of definition, which isn't always 100% clear. You could, as Wikipedia does (and it does make sense), say that quadratic blowup is a variant of billion laughs. But often the specific attack which uses nested entities is called billion laughs, and the other variant is called quadratic blowup ([the source Wikipedia cites](https://docs.python.org/2/library/xml.html#xml-vulnerabilities) does this, for example). – tim Aug 15 '16 at 18:03
  • Wait... you say the "billion laughs" attack will be caught by the parser because it contains a circular reference, but then you say it's vulnerable to quadratic blowup. The "billion laughs" attack doesn't contain any circular references; it's a quadratic blowup issue. – Mason Wheeler Aug 15 '16 at 19:38
  • @MasonWheeler No, it doesn't contain circular references, but it does contain nested references, which are caught. Quadratic blowup doesn't contain nested references, just the same reference over and over, which isn't caught. – tim Aug 16 '16 at 10:14