tl;dr:
Encode at the point where the data is used in context, that is, when the code is composing HTML, JavaScript, etc. from untrusted data. That's true whether you are composing on the server or the client. At that point you know which parts are data and what the encoding context is. In Angular, leverage binding (ng-bind) for context-specific encoding. The AngularJS Sanitize function ($sanitize) is useful when you need to "sanitize" a hunk of untrusted HTML.
Details:
It's good to be careful when the term "sanitize" is used, because different libraries use it differently - it can mean validation, canonicalization, simple encoding, or parsing and encoding.
The AngularJS Sanitize function "sanitizes" inputs by parsing the HTML into tokens. All safe tokens (from a whitelist) are then serialized back to properly encoded values. In this way it behaves similarly to the OWASP HTML Sanitizer Project and the older OWASP AntiSamy project.
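To make the parse-then-whitelist idea concrete, here is a toy sketch in plain JavaScript. It is not how $sanitize is actually implemented and it is not safe for production use. The names `ALLOWED_TAGS`, `escapeText` and `sanitizeSketch` are made up for this illustration, and I'm assuming the simplest possible policy: allowed tags are re-emitted without any attributes, other tags are dropped, and all text is encoded.

```javascript
// Toy illustration only: NOT the AngularJS implementation and NOT production-safe.
// ALLOWED_TAGS, escapeText and sanitizeSketch are hypothetical names for this sketch.
const ALLOWED_TAGS = new Set(["b", "i", "em", "strong", "p"]);

// Encode the characters that are significant in HTML text and attribute values.
function escapeText(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

function sanitizeSketch(untrustedHtml) {
  // A token is a complete tag, a run of text, or a stray "<".
  const tokenPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>|[^<]+|</g;
  let output = "";
  for (const match of String(untrustedHtml).matchAll(tokenPattern)) {
    const [token, slash, tagName] = match;
    if (tagName === undefined) {
      // Text (or a stray "<"): always encode it.
      output += escapeText(token);
    } else if (ALLOWED_TAGS.has(tagName.toLowerCase())) {
      // Whitelisted tag: re-serialize it from scratch, dropping every attribute.
      output += "<" + slash + tagName.toLowerCase() + ">";
    }
    // Any other tag is silently dropped.
  }
  return output;
}
```

Note that `sanitizeSketch("&amp;")` returns `&amp;amp;`: the sketch assumes its input is not already encoded, which is exactly the kind of assumption a sanitizer has to make one way or the other.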
See https://odetocode.com/blogs/scott/archive/2014/09/10/a-journey-with-trusted-html-in-angularjs.aspx for more information on using the library.
This is a fine way to handle untrusted input if you don't have the luxury of composing the HTML, JavaScript, or URL from its parts. However:
- Parsing and removing tags is more prone to being bypassed than plain encoding - I don't know of any current issues
- It modifies the HTML to remove unrecognized tags, which can sometimes be fragile
- It has to make assumptions about whether the underlying content is encoded
- Tag support is limited to the whitelist - however, you can decorate the $sanitize service to change the list
So where you can:
- Encode at the time the data is being used in the context - for example by using binding.
- Leverage the Sanitize function when you are handed HTML that you didn't compose yourself
- Try to leverage CSP (Content Security Policy) as defense in depth
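As a rough illustration of the first bullet, here is what "encode for the context, at the time of use" looks like if you compose the output by hand. These helpers are hypothetical (they are not part of AngularJS, where binding does this job for you); the point is only that the same untrusted value needs a different encoding depending on where it lands.

```javascript
// Hypothetical helpers for illustration; not part of AngularJS.

// HTML text or attribute-value context.
function encodeForHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// URL query-parameter context.
function encodeForUrlComponent(value) {
  return encodeURIComponent(String(value));
}

// JavaScript string-literal context inside an inline script.
function encodeForJsString(value) {
  // JSON.stringify produces a quoted literal; "<" is escaped as well so the
  // value cannot close the surrounding script element.
  return JSON.stringify(String(value))
    .replace(/</g, "\\u003c")
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

// The same untrusted value, encoded differently at each point of use.
const untrusted = `"><script>alert(1)</script>`;
const html = "<p>Hello, " + encodeForHtml(untrusted) + "</p>";
const link = "/search?q=" + encodeForUrlComponent(untrusted);
const script = "var name = " + encodeForJsString(untrusted) + ";";
```

Encoding once up front and storing the result would not work here, because at storage time you don't yet know which of these contexts the value will end up in.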