Docs

Documentation versions (currently viewingVaadin 8)

Vaadin 8 reached End of Life on February 21, 2022. Discover how to make your Vaadin 8 app futureproof →

Common Security Issues

Sanitizing User Input to Prevent Cross-Site Scripting

You can put raw HTML content in many components, such as the Label and CustomLayout, as well as in tooltips and notifications. In such cases, you should make sure that if the content has any possibility to come from user input, you must make sure that the content is safe before displaying it. Otherwise, a malicious user can easily make a cross-site scripting attack by injecting offensive JavaScript code in such components. See other sources for more information about cross-site scripting.

Offensive code can easily be injected with <script> markup or in tag attributes as events, such as onLoad.

Cross-site scripting vulnerabilities are browser dependent, depending on the situations in which different browsers execute scripting markup.

Therefore, if the content created by one user is shown to other users, the content must be sanitized. There is no generic way to sanitize user input, as different applications can allow different kinds of input. Pruning (X)HTML tags out is somewhat simple, but some applications may need to allow (X)HTML content. It is therefore the responsibility of the application to sanitize the input.

Character encoding can make sanitization more difficult, as offensive tags can be encoded so that they are not recognized by a sanitizer. This can be done, for example, with HTML character entities and with variable-width encodings such as UTF-8 or various CJK encodings, by abusing multiple representations of a character. Most trivially, you could input < and > with &lt; and &gt;, respectively. The input could also be malformed and the sanitizer must be able to interpret it exactly as the browser would, and different browsers can interpret malformed HTML and variable-width character encodings differently.

Notice that the problem applies also to user input from a RichTextArea is transmitted as HTML from the browser to server-side and is not sanitized. As the entire purpose of the RichTextArea component is to allow input of formatted text, you can not just remove all HTML tags. Also many attributes, such as style, should pass through the sanitization.