Grid loads more data than it should

Hi, I have a Grid with setPageSize(10).

This should pass 10 as the limit to the data provider, correct? That works in general, but the Grid loads 5 x 10 data sets from the database without any scrolling. Why is that, and is there a way to prevent it? The next 10 should only be loaded when they are needed (when the user scrolls, etc.).

My DataProvider

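```java
import java.util.Optional;
import java.util.stream.Stream;

import com.vaadin.flow.data.provider.AbstractBackEndDataProvider;
import com.vaadin.flow.data.provider.Query;

// The filter type is declared as CustomFilter (instead of Void) so that
// query.getFilter() matches the Optional<CustomFilter> used below.
public class OrderDataProvider extends AbstractBackEndDataProvider<OrderEntity, CustomFilter> {
    private final OrderRepository orderRepository;

    public OrderDataProvider(OrderRepository orderRepository) {
        this.orderRepository = orderRepository;
    }

    @Override
    public Stream<OrderEntity> fetchFromBackEnd(Query<OrderEntity, CustomFilter> query) {
        // Offset and limit are supplied by the Grid for each requested page.
        int offset = query.getOffset();
        int limit = query.getLimit();
        Optional<CustomFilter> filter = query.getFilter();
        return orderRepository.findOrdersPaginated(offset, limit, filter).stream();
    }

    @Override
    public int sizeInBackEnd(Query<OrderEntity, CustomFilter> query) {
        Optional<CustomFilter> filter = query.getFilter();
        return orderRepository.countOrders(filter);
    }
}
```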

How many rows are shown / visible when you open the view and see the 5 fetches? In short: it loads as many rows as it needs to show (and I believe some additional ones for a smoother scrolling experience, but I don't know how many are loaded in advance).

Grid always keeps some extra rows loaded to reduce the risk that the user has to wait while more data is being loaded.

Furthermore, when you initially add a Grid to a view, it doesn’t yet know how big the grid will be or how high each row will be, so it makes a guess that errs on the side of loading too much to reduce the risk that the user initially sees only a partially populated grid.
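If you want to see exactly what the Grid asks for, one option is to record every offset/limit pair that reaches the data provider. The sketch below is only an illustration: `FetchLog` and its methods are hypothetical names, not part of Vaadin, and the loop in `main` just replays the five page-sized fetches reported in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not Vaadin API): records each fetch request so the
// number of queries and the total number of requested rows can be inspected.
final class FetchLog {
    private final List<int[]> calls = new ArrayList<>();

    // Call this with query.getOffset() and query.getLimit() at the start of
    // fetchFromBackEnd.
    void record(int offset, int limit) {
        calls.add(new int[] {offset, limit});
        System.out.println("fetch offset=" + offset + " limit=" + limit);
    }

    // Number of fetch requests seen so far.
    int callCount() {
        return calls.size();
    }

    // Sum of all requested limits.
    int rowsRequested() {
        int total = 0;
        for (int[] call : calls) {
            total += call[1];
        }
        return total;
    }

    public static void main(String[] args) {
        FetchLog log = new FetchLog();
        // Replays the behaviour described in the thread: 5 fetches of 10 rows.
        for (int page = 0; page < 5; page++) {
            log.record(page * 10, 10);
        }
        System.out.println(log.callCount() + " fetches, "
                + log.rowsRequested() + " rows requested");
    }
}
```

In a real data provider you would keep one `FetchLog` instance as a field and call `record(query.getOffset(), query.getLimit())` before querying the repository.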

I use the Grid as a better alternative to VirtualList. Each row is rendered as a card, so there is 1 card per row in a single column.

I had problems with VirtualList and automatically fetching the next data sets; that's why I switched to using a Grid.

The initial view shows 5 cards, but 50 get loaded, and the page size is set to 10.

And why is this a problem?

It loads unnecessary data into memory, which I would like to prevent.

As Leif said, this is a usability optimization.
Usually it doesn’t matter whether you load 10 or 50 rows from the database, and the additional memory used is minimal.

How much memory it consumes depends on the data set. In this case I render a Div as a card per data set, which I think consumes more memory than the plain data in a usual grid.

I will try VirtualList again; maybe the error I had with it before is gone (I can't remember what it was).

But I think this usability optimization is not predictable. If I set pageSize to 10, I expect to get 10 items, not 60, which is way too much. I can't find this in the documentation. Is it documented anywhere? That would make the behaviour easier to understand.

But before I make changes, I will check how many users use this kind of view. Maybe it's really not worth changing.

Just to give you some numbers.

I use cards a lot, also in one ERP system. There we have a fetch size of 500. We did load tests, and this is absolutely not a problem.

As Donald Knuth said: “Premature optimization is the root of all evil”.

Well said ;)

How did you run the load tests? I tried it with k6, but the source server went down before the target server (it simulates x Chrome visitors, which consume a lot of RAM on the machine that runs the load tests). Do you have any suggestions for how to run load tests successfully and simulate lots of users?

I do crude manual tests to check actual memory consumption by adding this snippet to the application: Measure Vaadin UI memory consumption by opening lots of tabs · GitHub

Before running, I make some local changes to the application to bypass authentication and have the view I want to measure as the default route (i.e. @Route("")). I then run in production mode and follow the steps mentioned in the gist to get an idea of how much memory is used per user who has that view open.

I see that it can be confusing that the grid loads more than one page in the beginning.
On my end it was more of a SQL duration problem than a memory problem. When opening the grid, it also loaded 3 pages, which took quite some time on heavy tables. I increased the page size to 150, after which it loaded only once and the time improved.
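To make the trade-off concrete, here is a small sketch of the arithmetic. It assumes the Grid splits its initial request into page-sized queries, as observed in this thread; the class and method names are made up for illustration, and the 50 rows are the initial load reported by the original poster.

```java
// Hypothetical helper (not Vaadin API): estimates how many back-end queries
// are needed when a number of rows is requested in page-sized chunks.
final class InitialFetchEstimate {

    // Ceiling division: rowsWanted rows split into queries of pageSize rows.
    static int queriesNeeded(int rowsWanted, int pageSize) {
        if (rowsWanted < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("rowsWanted >= 0 and pageSize > 0 required");
        }
        return (rowsWanted + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        int rowsWanted = 50; // initial load reported in this thread
        for (int pageSize : new int[] {10, 50, 150}) {
            System.out.println("pageSize=" + pageSize + " -> "
                    + queriesNeeded(rowsWanted, pageSize) + " queries");
        }
    }
}
```

With 50 rows wanted, a page size of 10 gives 5 queries, while 50 or 150 gives a single query. The larger page size trades fewer round trips for more rows per query, so it helps most when the per-query overhead dominates.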

I also think this improves the UX: when I start to scroll, I don’t want to see an empty grid and wait for it to load. With about one page prefetched, there are already some rows to show.
What happens if the user uses a 4k or vertical monitor? It shows more rows → loads more pages → longer loading time.
I personally would prefer increasing the page size in favor of loading time.

We used an add-on from Johannes to turn TestBench tests into Gatling load tests.

I don’t know if this still works.

But you could use the Gatling recorder and then disable CSRF.

@Leif, I have used your gist with two different views, but I don't know if I am interpreting it correctly.

View1

View2

Does it include only HTML elements like DIVs, or are POJOs/entities and their data (Strings, ints, etc.) that are loaded in the view class part of the memory number as well? What about the Spring classes etc.?

What I don't understand is the memory/heap usage. Maybe someone can help me out; I am confused.

The app starts with -Xms4g and -Xmx4g, so not much GC is necessary. I use G1GC.
I have a heap floor of ~300 MB. The heap goes up to a maximum of ~2.8 GB and then drops back to ~300 MB, and this repeats all day. So far so good. The GC time is between 100 and 200 ms on each drop, which should be fine.

So what is the real memory usage now? Can I read from this data how many users/sessions/views the app can handle in parallel? I understand that there is a baseline memory usage (the floor of the heap?).

I would really love it if someone could explain it for dummies :smiley:

(OK, I think I have a small memory leak: when I restart the app the floor is ~100 MB, and after 3 days it is at ~300 MB, so it looks like something isn't being collected. But that's not part of this topic, and maybe it's some kind of warm-up. I will watch over the next few days whether the floor stays around 300 MB or keeps growing.)

The bottleneck is usually not the memory, but the thread pool and the database connection pool.
(That’s why Java introduced Virtual Threads)

Please provide some numbers:
How many users do you expect, and what will they do with your application?

The only way to find out if your app is capable of serving the expected load is to do load testing with a realistic scenario with realistic data.

This tells that for each additional user that has View1 open, you will need at least an additional 170 KB of memory. Correspondingly for View2, the number is 603 KB. That is the total memory consumption per user including all component instances and all data referenced by those component instances. Those are quite typical numbers for Vaadin views so there’s probably no reason to start optimizing things unless you’re really sure memory usage is a bottleneck.

In practice, you will need slightly more memory than that because there always needs to be some headroom for the garbage collector to function smoothly. I’m not an expert on that area but I’ve heard suggestions of 30% - 50% headroom. Applying that to your view sizes and rounding to even numbers gives around 250 KB for View1 and around 800 KB for View2.

There’s always also some baseline memory use for e.g. loaded classes, JVM internals, caches, and I/O buffers. In your minimal case that’s maybe around 100 MB but it might increase as more classes are loaded the first time each view is opened and so on. Taking a conservative estimate on that and including some headroom leaves you with 500 MB baseline usage, and thus around 3500 MB remaining for your maximum 4 GB heap size.

That leaves room for 14000 concurrent users on View1, 4375 concurrent users on View2, or something in between for a mix of both views.
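The same back-of-the-envelope calculation as a small sketch, in case you want to plug in other heap sizes or view sizes. The class and method names are made up for illustration, and it uses 1 MB = 1000 KB to match the round numbers above.

```java
// Hypothetical helper: rough capacity estimate from heap size, baseline
// usage and per-view memory (headroom already included in the inputs).
final class CapacityEstimate {

    // Uses 1 MB = 1000 KB, matching the rounded arithmetic in the thread.
    static long maxConcurrentViews(long heapMb, long baselineMb, long perViewKb) {
        if (perViewKb <= 0 || baselineMb < 0 || baselineMb > heapMb) {
            throw new IllegalArgumentException("invalid sizes");
        }
        return (heapMb - baselineMb) * 1000 / perViewKb;
    }

    public static void main(String[] args) {
        long heapMb = 4000;    // 4 GB heap, rounded as in the thread
        long baselineMb = 500; // conservative baseline including headroom
        System.out.println("View1: " + maxConcurrentViews(heapMb, baselineMb, 250));
        System.out.println("View2: " + maxConcurrentViews(heapMb, baselineMb, 800));
    }
}
```

This prints 14000 for View1 and 4375 for View2. It is only a memory ceiling; thread pools, database connections and CPU can limit you well before that.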

Thank you Simon & Leif, really appreciate your support!

The two views I used with the gist are real-world views. Usually the app loads a set of data from the DB and presents it to the user, who needs to enter some values, maybe upload PDFs etc., and save it again. It also has some REST endpoints, which are used to fetch the status of an invoice, for example, but those REST endpoints are of course not part of the UI state; they are plain Spring Boot without a view.

It's like the simplest ERP system you can imagine: opening offers, saving them, so basically nothing special. The only thing I had to deal with was showing images that are stored in the DB, but we now stream them instead of holding them in RAM, which was a good move I think.

Let me try to get some data (I have built an admin panel where I count registered views etc.):
Currently opened UIs: 57
Active sessions: 72, which means 29 idle sessions that are about to close in a few minutes (a 12 min timeout is set)
The current heap floor with those UIs is 295 MB (today at 7 am it was 350 MB after a GC). So the GC collected more garbage at 10 am than at 7 am, which is quite good I think. So it doesn't look like there is a huge memory leak; otherwise the heap floor would keep increasing over time.

@Leif 14,000 users for View1? That sounds crazy. I bet I could optimize View2 so its footprint is reduced as well.

So what I take from your answer: each UI holds the data it shows (for example an OrderDTO, which represents the database data and is a field in the view class), but Spring services, repositories etc. are not included in the gist values; those are part of the baseline floor.

So my main issue is that I can't imagine that this app can handle 1000 views in parallel. In the best (or worst) case, it must handle 1500-2500 parallel views. (It's possible that only 500 users are logged in but have 1000 views open; some users have 5 tabs open, others only one.)
This makes it quite hard to guess what this setup can really handle.

Bottlenecks like the database are another team’s topic if they become an issue, but I think there are enough resources there as well. Currently we are running with a pool of 10 connections and 2 idle.

(I have the option to increase the RAM to 16 GB or to add a second instance in Azure, which will handle sessions automatically via ARR affinity, i.e. sticky sessions that bind one session to one instance and automatically route the user to the correct instance.)

That’s right. Services and repositories are singletons so there’s only one instance of those no matter how many users are active.

What the snippet in my gist measures is really how the total post-GC memory consumption changes as the number of actually open view instances changes. This means that the calculation considers everything that was allocated due to a specific view instance. It ignores singletons and other things that stay the same regardless of the number of active users.
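For illustration, the general idea can be sketched with plain JDK calls. This is not the code from the gist, only a minimal stand-in under stated assumptions: the names are made up, the byte arrays stand in for view instances, and `System.gc()` is only a hint to the JVM, so the numbers are approximate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not the gist's code): estimates memory per instance
// from the difference in used heap after GC, before and after allocating.
final class PerInstanceMemory {

    // Used heap after asking for a GC; System.gc() is a hint, not a guarantee.
    static long usedHeapAfterGc() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        try {
            Thread.sleep(100); // give the collector a moment to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rt.totalMemory() - rt.freeMemory();
    }

    // Average number of bytes retained per instance.
    static long bytesPerInstance(long usedBefore, long usedAfter, int instances) {
        if (instances <= 0) {
            throw new IllegalArgumentException("instances must be positive");
        }
        return (usedAfter - usedBefore) / instances;
    }

    public static void main(String[] args) {
        long before = usedHeapAfterGc();
        List<byte[]> views = new ArrayList<>();
        for (int i = 0; i < 200; i++) {
            views.add(new byte[170_000]); // stand-in for one open view
        }
        long after = usedHeapAfterGc();
        // views is still referenced here, so the arrays cannot be collected.
        System.out.println(bytesPerInstance(before, after, views.size())
                + " bytes per instance (approximate)");
    }
}
```

Singletons and other shared state are present in both measurements, so they cancel out in the difference, which is why only per-instance memory shows up in the result.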

One thing I forgot to ask. I usually check the heap health like this:

If the baseline stays at 300 MB and the heap gets back to roughly 300 MB after a GC, it's healthy. I only need to start worrying if the floor increases after each GC and never gets below the previous floor value, right?

Or are there other memory measurements I should keep an eye on?

So the buffer between the floor and the peak (2.8 GB) is the memory that gets used and released. Do you have any suggestion for how much buffer space there should be between the floor and -Xmx?

The JVM tries to use most of the memory that is available to it if that allows it to postpone doing a full GC. High peak consumption in itself is not a problem as long as it gets back down to a reasonable level after each GC. You start having an actual memory problem only if GC has to run too frequently since that means lots of CPU capacity is used for running GC instead of running the actual application.
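One rough way to check this is to compare the accumulated GC time with the JVM uptime using the standard `java.lang.management` beans. The sketch below uses made-up class and method names; note that for concurrent collectors the reported collection time is elapsed collector time and not necessarily pause time, so treat the result as an approximation.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical sketch: approximate share of JVM uptime spent in GC.
final class GcOverhead {

    // Sum of accumulated collection time over all collectors, in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if not supported
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Fraction of the given uptime that was spent in GC (0.0 to 1.0).
    static double gcFraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        double fraction = gcFraction(totalGcMillis(), uptime);
        System.out.printf("GC overhead: %.2f%% of uptime%n", fraction * 100);
    }
}
```

As an example with assumed numbers: a 150 ms collection once per minute would be 150 / 60000 = 0.25% of the time, which is nothing to worry about.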