You can put raw XHTML content in many components, such as the Label and CustomLayout, as well as in tooltips and notifications. In such cases, if any part of the content can originate from user input, you must make sure that the input is thoroughly sanitized before it is displayed. Otherwise, a malicious user can easily mount a cross-site scripting attack by injecting malicious JavaScript code into such components.

Malicious code can easily be injected with <script> markup or through event attributes of tags, such as onLoad. Cross-site scripting vulnerabilities are browser dependent, because browsers differ in the situations in which they execute scripting markup.

There is no generic way to sanitize user input, as different applications can allow different kinds of input. Pruning (X)HTML tags out is relatively simple, but some applications may need to allow (X)HTML. It is therefore the responsibility of the application to sanitize the input.
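
For the simplest case, where an application does not need to allow any (X)HTML in user input, the markup-significant characters can be escaped before the text is displayed. The following is a minimal sketch under that assumption, not a complete sanitizer; the class name HtmlEscaper and its escape() method are hypothetical helpers, not part of any framework API.

```java
// Hypothetical helper: escapes the characters that are significant in
// (X)HTML so that user input is displayed as plain text, not as markup.
final class HtmlEscaper {
    private HtmlEscaper() {
    }

    static String escape(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                case '\'':
                    out.append("&#39;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

The escaped string can then be given to a component that renders XHTML. This approach is only suitable when no formatting at all should survive; applications that must allow some (X)HTML need a sanitizer that understands the markup.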

Character encoding can make sanitization more difficult, as malicious tags can be encoded so that a sanitizer does not recognize them. This can be done, for example, with HTML character entities and with variable-width encodings such as UTF-8 or various CJK encodings, by abusing multiple representations of a character. Most trivially, an attacker could enter < and > as &lt; and &gt;, respectively. The input could also be malformed, in which case the sanitizer must be able to interpret it exactly as the browser would. This is hard, because different browsers can interpret malformed HTML and variable-width character encodings differently.
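
The entity case can be illustrated with a small sketch. It assumes a hypothetical check that rejects input containing a tag-opening character, and shows that the check is bypassed by entity-encoded input unless the entities are decoded first. The class and method names are illustrative only, and the decoder is deliberately incomplete: it handles numeric entities and a handful of named ones, and it errs on the side of decoding too much rather than too little.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: a naive tag check versus one that first
// decodes a small subset of HTML character entities.
final class EntityAwareCheck {
    // Matches decimal (&#60;) and hexadecimal (&#x3C;) numeric entities;
    // the trailing semicolon is optional, as browsers tolerate its absence.
    private static final Pattern NUMERIC_ENTITY =
            Pattern.compile("&#([xX][0-9a-fA-F]+|[0-9]+);?");

    private EntityAwareCheck() {
    }

    // Naive check: only sees literal '<' characters.
    static boolean naiveContainsTag(String input) {
        return input.indexOf('<') >= 0;
    }

    // Decodes numeric entities and a few named ones (incomplete on purpose).
    static String decodeBasicEntities(String input) {
        Matcher m = NUMERIC_ENTITY.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String body = m.group(1);
            String replacement = m.group(); // keep as-is if not decodable
            try {
                boolean hex = body.charAt(0) == 'x' || body.charAt(0) == 'X';
                int cp = hex ? Integer.parseInt(body.substring(1), 16)
                             : Integer.parseInt(body);
                if (Character.isValidCodePoint(cp)) {
                    replacement = new String(Character.toChars(cp));
                }
            } catch (NumberFormatException e) {
                // Value too large to be a code point: leave the text unchanged.
            }
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        // "&amp;" is decoded last so that "&amp;lt;" becomes "&lt;", not "<".
        return sb.toString()
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&");
    }

    // Check that decodes the entities before looking for a tag opener.
    static boolean containsTagAfterDecoding(String input) {
        return naiveContainsTag(decodeBasicEntities(input));
    }
}
```

With this sketch, naiveContainsTag("&lt;script&gt;") returns false, while containsTagAfterDecoding("&lt;script&gt;") returns true. A real sanitizer would additionally have to deal with the full set of named entities, character encodings, and malformed input.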

Notice that the problem also applies to user input from a RichTextArea, which is transmitted as XHTML from the browser to the server side and is not sanitized. As the entire purpose of the RichTextArea component is to allow input of formatted text, you cannot just remove all HTML tags. Also, many attributes, such as style, should pass through the sanitization.