How to filter out text only from a RichTextArea?

I need to save the value of a RichTextArea in the database as a property of an entity bean.
How can I save only the text content, without the HTML tags?

Hi, I’d use Jsoup. It’s a Vaadin dependency, so you already have it on your classpath.
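A minimal sketch of the Jsoup approach, assuming the RichTextArea value is an HTML string. The class and method names here are hypothetical and only for illustration; `Jsoup.parse` and `text()` are the actual Jsoup calls.

```java
import org.jsoup.Jsoup;

// Hypothetical helper class; the name is illustrative only.
final class PlainTextExtractor {

    // Parses the HTML and returns only its text content:
    // tags are dropped and entities such as &amp; are decoded.
    static String toPlainText(String html) {
        if (html == null) {
            return "";
        }
        return Jsoup.parse(html).text();
    }

    public static void main(String[] args) {
        String html = "<p>Hello <b>world</b>!</p>";
        System.out.println(toPlainText(html)); // prints: Hello world!
    }
}
```

You would pass the RichTextArea value through such a helper before setting the entity property.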


Thank you. I’ll use that :)
EDIT: Actually, how do I import it into my project? I can see the jar in the folder MyProject/WEB-INF/lib
but I can’t import it in Eclipse.

Go to Java Build Path > Libraries tab and click on Add JARs.

Thanks. It was actually in the web project’s build path already, but not yet on the EJB side. Had to fix that :)

As Matti stated, Jsoup is a very good option. I have used it in several projects and it works great.

You can specify a custom Whitelist object (renamed Safelist in newer Jsoup versions) so that only the HTML/XML tags you list are allowed in the text.

Unclosed tags or non-HTML characters can also be cleaned up (the Jsoup.clean method).
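As a rough sketch of the cleaning described above: this assumes a Jsoup version of 1.14.2 or later, where the class is called `Safelist`; on older versions the same methods live on `Whitelist`. The class name `RichTextCleaner` and the chosen tags are made up for the example.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

// Hypothetical helper class; the name and the allowed tags are illustrative.
final class RichTextCleaner {

    // Custom list: only these tags survive cleaning.
    // On Jsoup versions before 1.14.2, use org.jsoup.safety.Whitelist instead.
    private static final Safelist ALLOWED = new Safelist().addTags("p", "b", "i");

    // Removes every tag that is not in ALLOWED and closes unclosed tags.
    static String clean(String html) {
        return Jsoup.clean(html, ALLOWED);
    }

    public static void main(String[] args) {
        String dirty = "<p>Hi <b>there</b> <a href='x'>link</a><script>alert(1)</script></p>";
        // The <a> and <script> tags are stripped; <p> and <b> are kept.
        System.out.println(clean(dirty));
        // The unclosed <b> gets its closing tag.
        System.out.println(clean("<b>bold"));
    }
}
```

Note that the result of `Jsoup.clean` is still HTML, so characters like `&` stay escaped. If you want plain text for the database, `Jsoup.parse(html).text()` is the closer fit.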

It also supports XML content if you specify the correct parser.
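For the XML case, a small sketch using `Parser.xmlParser()`. The sample document and the class name `XmlTextReader` are invented for the example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

// Hypothetical helper class; the name is illustrative only.
final class XmlTextReader {

    // Parses the input as XML (no HTML-specific tag handling)
    // and returns the text of the elements with the given tag name.
    static String textOf(String xml, String tagName) {
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
        return doc.select(tagName).text();
    }

    public static void main(String[] args) {
        String xml = "<item><name>Widget</name><price>10</price></item>";
        System.out.println(textOf(xml, "name")); // prints: Widget
    }
}
```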

Good luck