
HTML Sanitization

HTML sanitization is currently performed by the third-party library [http://code.google.com/p/owaspantisamy/ AntiSamy]. The goal is to filter out potential threats such as [https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html cross-site scripting (XSS)], markup that breaks the layout, and phishing attempts.

The rules are based on a white list that contains all the allowed tags, attributes, and styles. Markup that is not on the list is not let through.
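The white-list principle can be illustrated with a minimal Java sketch. This is not AntiSamy's implementation or API; the class name, method names, and the listed tags and attributes are all hypothetical and chosen only for illustration. The real rules live in the policy file described below. The point is the default: a tag or attribute is accepted only when it is explicitly listed.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class AllowListSketch {
    // Hypothetical entries for illustration only; the real rules are in the policy file.
    // Each allowed tag maps to the set of attributes allowed on that tag.
    private static final Map<String, Set<String>> ALLOWED = Map.of(
        "b", Set.<String>of(),
        "p", Set.of("align"),
        "a", Set.of("href", "title"));

    /** A tag is accepted only if it is explicitly listed. */
    public static boolean isTagAllowed(String tag) {
        return ALLOWED.containsKey(tag.toLowerCase(Locale.ROOT));
    }

    /** An attribute is accepted only if its tag is listed and the attribute is listed for that tag. */
    public static boolean isAttributeAllowed(String tag, String attribute) {
        Set<String> attributes = ALLOWED.get(tag.toLowerCase(Locale.ROOT));
        return attributes != null && attributes.contains(attribute.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isTagAllowed("b"));                  // true
        System.out.println(isTagAllowed("script"));             // false
        System.out.println(isAttributeAllowed("a", "href"));    // true
        System.out.println(isAttributeAllowed("a", "onclick")); // false
    }
}
```

Because nothing is accepted by default, a newly invented tag or event-handler attribute is rejected until someone deliberately adds it to the list.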

Whitelisting Markup

The white list is stored in the enterprise repository in a file named “antisamy.xml”.

As browsers evolve, we may need to allow new tags, attributes, or styles. The current policy file is heavily commented and fairly easy to understand, but the [http://code.google.com/p/owaspantisamy/downloads/list developer's guide] may be useful if more in-depth information is required.

Building AntiSamy

To avoid an unnecessary dependency (commons-httpclient), we build a custom JAR. The source can be pulled from:

https://hg.nexj.com/oss.cgi/antisamy/

To build the project and generate the JAR file, use the targets clean, build, and jar. This creates the JAR file in a directory named “jar” in the root folder of the project.

Blocked Markup

Below is a list of intentionally blocked markup and the reasoning behind it.

JavaScript

For obvious reasons, all forms of JavaScript are filtered out. This includes, but is not limited to, JavaScript in