Vaadin data provider with Elasticsearch

Is there a good example or discussion of using a Vaadin data provider with lazy loading against Elasticsearch for large data sources?
Over the last couple of years and Vaadin versions, the spotlight has always been on lazy loading and scrolling through large amounts of data in a grid, or in any other component backed by a data provider. Lazily loading and scrolling through a large data source, e.g. application logs, can be very difficult and almost unusable when the data comes from a relational database.

This is why other data sources, such as NoSQL (document-oriented) databases, must be considered. One of the most widely used is Elasticsearch, and we are currently trying to move one of our largest production database (MySQL) tables to a new Elasticsearch index. But right at the start we ran into a problem with lazy loading a Vaadin grid with more than 10k rows.

The data provider sends an offset and a limit every time the user scrolls to a random position, but when offset plus limit goes beyond 10k (the default index.max_result_window), Elasticsearch throws an error on a basic SearchRequest (we are using the new RestHighLevelClient). There are alternatives such as the scroll API and the search_after parameter, but they are hard to fit into Vaadin data providers.
The alternative is to switch to a paged grid, but I wonder whether something like this is doable with simple lazy loading for a grid and a data provider.
It is very strange that nobody has dealt with this yet in a forum post or blog.
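
For illustration, here is a minimal sketch of one way the offset/limit calls of a data provider could be translated into search_after-style requests: remember a "checkpoint" cursor for offsets already reached, and walk forward from the nearest one. Everything here is an assumption made for the example: CursorBackend, FakeBackend and CheckpointFetcher are hypothetical names, the backend is an in-memory fake rather than a real Elasticsearch client, and the sort key is assumed to be unique. Checkpoints would also go stale if the index changes while the user scrolls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for an index sorted by a unique key (not a real client API). */
interface CursorBackend {
    /** Up to `limit` keys strictly greater than `after` (null = from the start), ascending. */
    List<Long> searchAfter(Long after, int limit);
}

/** In-memory fake so the sketch runs without a cluster. Keys are 0, 10, 20, ... */
class FakeBackend implements CursorBackend {
    private final int size;
    int calls = 0; // number of simulated requests

    FakeBackend(int size) { this.size = size; }

    @Override
    public List<Long> searchAfter(Long after, int limit) {
        calls++;
        long firstIndex = after == null ? 0 : after / 10 + 1;
        List<Long> out = new ArrayList<>();
        for (long i = firstIndex; i < size && out.size() < limit; i++) out.add(i * 10);
        return out;
    }
}

/** Serves offset/limit requests using only cursor-based calls. */
class CheckpointFetcher {
    private final CursorBackend backend;
    private final int chunk; // how many items to skip per request while walking forward
    // offset -> sort key of the item just before that offset
    private final TreeMap<Integer, Long> checkpoints = new TreeMap<>();

    CheckpointFetcher(CursorBackend backend, int chunk) {
        this.backend = backend;
        this.chunk = chunk;
    }

    List<Long> fetch(int offset, int limit) {
        // Start from the nearest known checkpoint at or below the requested offset.
        Map.Entry<Integer, Long> nearest = checkpoints.floorEntry(offset);
        int pos = nearest == null ? 0 : nearest.getKey();
        Long cursor = nearest == null ? null : nearest.getValue();

        // Walk forward in chunks until the cursor sits right before `offset`.
        while (pos < offset) {
            int step = Math.min(chunk, offset - pos);
            List<Long> skipped = backend.searchAfter(cursor, step);
            if (skipped.size() < step) return new ArrayList<>(); // offset is past the end
            pos += step;
            cursor = skipped.get(skipped.size() - 1);
            checkpoints.put(pos, cursor);
        }

        List<Long> page = backend.searchAfter(cursor, limit);
        if (!page.isEmpty()) {
            // Remember where this page ended so the next sequential fetch is one request.
            checkpoints.put(offset + page.size(), page.get(page.size() - 1));
        }
        return page;
    }
}
```

With this fake and a chunk of 1000, a first jump to offset 12000 costs 12 skip requests plus one page request, while the follow-up fetch at offset 12050 costs a single request. Sequential scrolling stays cheap, but random jumps far from any checkpoint remain expensive, so this is a trade-off rather than a full answer.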

Hi Nemanja,

We are currently working on a set of changes that should make things somewhat better in some cases, but I’m not sure whether they would help in this case.

The main change is to implement lazy data binding that doesn’t require you to report the size, since reporting it is impossible or very expensive with certain backends. In your case, using that version of the lazy data binding would effectively make it impossible to get those over-10k requests. When you scroll to the end, it always just extends the dataset, so one would need to scroll to the end around 200 times before getting that far.

I’m not sure whether it is a UX issue for your app or not. With datasets that big, you probably have decent search functionality in your UI, so I’d guess not.

The “cursor”/“reference item” style that Elasticsearch supports (search_after) has also been proposed by another customer recently, and that might be another improvement in the future. I don’t remember whether an enhancement issue has been created for it yet.

cheers,
matti

Alright, thanks Matti!

Probably, for now, it’s better to switch to a grid with a pager (unfortunately available only as an add-on in the Vaadin Directory?), which is doable with the search_after parameter in Elasticsearch.
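
A rough sketch of what such a pager could look like, assuming a query function that takes the last sort key of the current page and returns the next page. SearchAfterPager and the query function are hypothetical names for this example, not part of Vaadin or the Elasticsearch client, and the sort key is assumed to be unique. The pager only remembers the cursor that starts each visited page, so it never needs an offset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Next/previous pager driven by search_after-style cursors instead of offsets. */
class SearchAfterPager {
    // Hypothetical query: (cursor or null for the first page, page size) -> ascending sort keys.
    private final BiFunction<Long, Integer, List<Long>> query;
    private final int pageSize;
    // startCursors.get(n) is the cursor used to load page n; null loads the first page.
    private final List<Long> startCursors = new ArrayList<>();
    private int current = 0;
    private List<Long> items;

    SearchAfterPager(BiFunction<Long, Integer, List<Long>> query, int pageSize) {
        this.query = query;
        this.pageSize = pageSize;
        startCursors.add(null);
        items = query.apply(null, pageSize);
    }

    List<Long> items() { return items; }

    int pageNumber() { return current; }

    /** Moves one page forward; stays on the current page if there is nothing more. */
    List<Long> next() {
        if (items.size() < pageSize) return items; // a short page means it is the last one
        Long cursor = items.get(items.size() - 1);
        List<Long> nextItems = query.apply(cursor, pageSize);
        if (nextItems.isEmpty()) return items;
        current++;
        if (startCursors.size() == current) {
            startCursors.add(cursor);
        } else {
            startCursors.set(current, cursor); // refresh in case the data changed
        }
        items = nextItems;
        return items;
    }

    /** Moves one page back by replaying the cursor that started the previous page. */
    List<Long> previous() {
        if (current == 0) return items;
        current--;
        items = query.apply(startCursors.get(current), pageSize);
        return items;
    }
}
```

The pager buttons would simply call next() and previous() and hand items() to the grid. Jumping straight to an arbitrary page number is deliberately left out, since that is exactly the operation search_after does not support directly.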

All the best,
Nemanja

Matti Tahvonen:

We are currently working on a set of changes that should make things somewhat better in some cases…

Sorry, this is not related, but since you mentioned it: is there any chance that you are also working on the well-known issue with the data provider’s default fetch size of 50 items? We have a grid with somewhat taller rows and many template renderer columns. It would be enough to fetch about 10 items at the start and then 5 or 6 at most, but at startup it first fetches 10 and right after that another 40, which is unnecessary and causes slow rendering and a stuck UI. grid.setPageSize(10) is ignored.
I have posted that question many times already, but there is still no answer.

Thanks.