Session Replication in the World of Vaadin

[Figure: Comparison of sticky sessions and round robin load balancing]
We often get questions about how to use session replication together with Vaadin. Many of Vaadin's unique capabilities are based on the way the entire UI state is stored in a server-side user session. For this reason, our recommendation is to rely on session replication as little as possible. Depending on the use case, there are alternatives that don't rely on session replication or that reduce its use.

Update: Kubernetes Kit, introduced in Vaadin 23.3, contains all the low-level integration needed for using session replication in Vaadin Flow applications. The same functionality is also included in cloud provider kits such as Azure Cloud Kit.

What's the Problem with Session Replication?

Session replication is used when multiple servers run the same application, and they all need access to the same session data. For smaller deployments, the session is serialized after each request and immediately propagated to all other servers. For bigger deployments, the session is deserialized from a shared database at the beginning of the request and serialized back to the database at the end of the request handling.

The first problem here is that the entire UI state needs to be serializable for this to work. All the basic building blocks of Vaadin are already serializable, but application developers need to ensure that all their own classes are serializable as well. Furthermore, third-party classes that are directly or indirectly referenced from the UI state also need to be serializable.
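One way to catch non-serializable references early is to round-trip a representative object through standard Java serialization in a test. The following is only a minimal sketch: SerializationCheck, GoodState, BadState, Address and LegacyClient are hypothetical names made up for illustration, not Vaadin classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationCheck {

    /** Serializes the whole object graph reachable from root. */
    static byte[] serialize(Object root) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        }
        return bytes.toByteArray();
    }

    /** Restores an object graph from bytes produced by serialize(). */
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    /** Returns true if every object reachable from root can be serialized. */
    static boolean isSerializable(Object root) {
        try {
            serialize(root);
            return true;
        } catch (IOException e) {
            // NotSerializableException is an IOException and names the offending class
            return false;
        }
    }

    // Hypothetical application classes used only for this illustration
    static class Address implements Serializable {
        String city = "Turku";
    }

    static class LegacyClient {
        // stands in for a third-party class that is not Serializable
    }

    static class GoodState implements Serializable {
        Address address = new Address();
    }

    static class BadState implements Serializable {
        // an indirect reference to a non-serializable class breaks the whole graph
        LegacyClient client = new LegacyClient();
    }
}
```

Here isSerializable(new GoodState()) succeeds, while isSerializable(new BadState()) fails because of the single LegacyClient field. The length of the array returned by serialize() also gives a rough idea of how much data would have to be moved around for that object.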

Another problem is that the size of the UI state affects how long it takes to process the session data and transfer it between different servers. The size of a single session is typically measured in hundreds of kilobytes, or in some cases even megabytes, depending on how the application manages its data. Because the state is expressed as Java classes that might have arbitrary references to each other, there is no practical way of storing individual pieces separately and thus processing only the parts that have actually changed.

Finally, every HTTP request to a Vaadin application will lead to some changes in the session. Even in trivial cases where nothing in the application itself changes, there are still internal things such as timestamps and sequence numbers that are updated. This means that it's of utmost importance that multiple requests for the same session are never handled in parallel.

Vaadin uses a ReentrantLock instance that is stored in the HTTP session to ensure the UI state isn't accessed by multiple parties at the same time. The lock is automatically acquired for regular request handling and when using UI.access(). Relying on a ReentrantLock only works when everything related to one session is isolated to one JVM. For a session that is replicated between multiple servers, exclusive access needs to be ensured separately. In particular, a distributed lock for a specific session must be acquired before even deserializing the session data.
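The required ordering for a replicated session can be sketched roughly as follows. This is not how Vaadin or any particular replication product is implemented. SessionLockService and SessionStore are hypothetical interfaces, and the in-memory implementations only stand in for what would be a cluster-wide lock service and a shared session store.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.UnaryOperator;

class ReplicatedSessionHandler {

    /** Hypothetical abstraction over a cluster-wide lock, one lock per session id. */
    interface SessionLockService {
        void lock(String sessionId);
        void unlock(String sessionId);
    }

    /** Hypothetical shared storage for serialized sessions. */
    interface SessionStore {
        byte[] load(String sessionId);
        void save(String sessionId, byte[] data);
    }

    private final SessionLockService lockService;
    private final SessionStore store;

    ReplicatedSessionHandler(SessionLockService lockService, SessionStore store) {
        this.lockService = lockService;
        this.store = store;
    }

    /** Handles one request: lock first, then load, process, save, and finally unlock. */
    void handle(String sessionId, UnaryOperator<byte[]> requestLogic) {
        lockService.lock(sessionId); // acquired before the session data is even read
        try {
            byte[] state = store.load(sessionId);       // deserialization would happen here
            byte[] updated = requestLogic.apply(state); // regular request handling
            store.save(sessionId, updated);             // serialization would happen here
        } finally {
            lockService.unlock(sessionId);
        }
    }

    /** Single-JVM stand-in; a real deployment would need a lock shared by all servers. */
    static class InMemoryLocks implements SessionLockService {
        private final Map<String, ReentrantLock> locksById = new ConcurrentHashMap<>();

        public void lock(String sessionId) {
            locksById.computeIfAbsent(sessionId, id -> new ReentrantLock()).lock();
        }

        public void unlock(String sessionId) {
            locksById.get(sessionId).unlock();
        }
    }

    /** Single-JVM stand-in for a shared database or cache. */
    static class InMemoryStore implements SessionStore {
        private final Map<String, byte[]> data = new ConcurrentHashMap<>();

        public byte[] load(String sessionId) {
            return data.get(sessionId);
        }

        public void save(String sessionId, byte[] bytes) {
            data.put(sessionId, bytes);
        }
    }
}
```

The important detail is that the lock covers the load and save steps as well as the request handling itself. Locking only around the processing step would still let two servers read the same old state and overwrite each other's changes.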

All this means that generic "transparent" session replication solutions cannot be used. Instead, it would be necessary to build something specifically for Vaadin and also make changes to bypass the built-in use of ReentrantLock. Even with those changes, application development gets more complicated and the user faces increased latencies.

Scale Out Using Sticky Sessions

One common use of session replication is to deal with many concurrent users by running the application on multiple servers with a load balancer that distributes requests between the servers. The simplest approach on a conceptual level is that the load balancer assigns each request to any available server without considering the contents of the request. This in turn requires session replication.

A good alternative to session replication is to instead use sticky sessions. This means that the load balancer looks at the session id cookie in each request and uses that to determine which server should handle that request. In that way, all requests related to a specific session are handled by the same server, so there's no need to replicate sessions. The load balancer can still pick any available server for requests that do not yet have a session id cookie, or if the load balancer knows that the session in question has expired.
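The routing decision described here can be sketched in a few lines. Real load balancers implement this in their own ways, often by encoding the route in a cookie, so treat the following only as an illustration of the idea. StickyRouter and its methods are hypothetical names, and the server names are plain strings.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class StickyRouter {
    private final List<String> servers;
    private final Map<String, String> sessionToServer = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger();

    StickyRouter(String... servers) {
        this.servers = List.of(servers);
    }

    /**
     * Picks the server for a request. A null sessionId means that the request
     * has no session id cookie yet.
     */
    String route(String sessionId) {
        if (sessionId != null) {
            String server = sessionToServer.get(sessionId);
            if (server != null) {
                return server; // known session: always the same server
            }
        }
        return pickAny(); // no session yet, or session unknown or expired
    }

    /** Called when a server has created a session for a request routed to it. */
    void bind(String sessionId, String server) {
        sessionToServer.put(sessionId, server);
    }

    /** Called when the load balancer learns that a session has expired. */
    void sessionExpired(String sessionId) {
        sessionToServer.remove(sessionId);
    }

    /** Simple round robin over all servers. */
    private String pickAny() {
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}
```

Requests without a session are spread round robin, while every request that carries a known session id keeps going to the server that holds that session in memory.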

In some cases, an even simpler approach might be used. Vaadin can handle a large number of concurrent users as long as the business logic triggered by each user doesn't use too many server resources. So instead of load balancing the entire application, it might be easier to keep the UI state on only one server but distribute the application logic over multiple servers. In that way, the server running Vaadin still needs enough memory to handle all concurrent sessions, but CPU and I/O capacity for the business logic can be scaled out quite easily.

Sticky sessions are also helpful when using session replication. If the request goes to the same server that most recently used a specific session, then the server can directly use the previous session instance without having to fetch the latest version from somewhere else.

Use Parallel Deployment for Rolling Updates

Another reason for using session replication is to allow updating an application by starting the new version before shutting down the old version. Here, session replication would be used so that the server running the new version can continue from exactly the same UI state that the user had for the old version.

In addition to the regular session replication challenges, this also requires special consideration so that a newer version of the application can deserialize objects from the old version, even if the structure of the class has changed. Furthermore, the updated server-side logic must be implemented to still support requests from old client-side logic that stays loaded in the user's browser. Both of these challenges are very difficult to deal with in practice.

Instead of using session replication for this kind of scenario, it's recommended to use an approach similar to sticky sessions. The only difference is that requests without a session id are always directed to the server running the new version. The server with the old version can be undeployed as soon as all its sessions have expired. Common names for this functionality are "parallel deployment" or "versioned deployment".
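As a rough sketch of that routing rule, assuming one server for each version: VersionedRouter and its methods are hypothetical names, and an actual setup would rely on the parallel deployment support of the load balancer or application server in use.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class VersionedRouter {
    private final String oldServer;
    private final String newServer;
    private final Map<String, String> sessionToServer = new ConcurrentHashMap<>();

    VersionedRouter(String oldServer, String newServer) {
        this.oldServer = oldServer;
        this.newServer = newServer;
    }

    /** Registers which server holds a given session. */
    void bind(String sessionId, String server) {
        sessionToServer.put(sessionId, server);
    }

    /** Called when a session has expired or the user has reloaded into a new session. */
    void sessionExpired(String sessionId) {
        sessionToServer.remove(sessionId);
    }

    /**
     * Existing sessions stay on their server. Requests without a session id,
     * or with an unknown one, always go to the new version.
     */
    String route(String sessionId) {
        if (sessionId == null) {
            return newServer;
        }
        return sessionToServer.getOrDefault(sessionId, newServer);
    }

    /** The old version can be undeployed once none of its sessions remain. */
    boolean canUndeployOld() {
        return !sessionToServer.containsValue(oldServer);
    }
}
```

Compared to plain sticky sessions, the only change is that the fallback is fixed to the new version instead of being any available server.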

For users that are still on the old version, the application can show a notification that encourages the user to reload their application to get to the newest version. In this way, the old version can be undeployed more quickly than if waiting for all sessions to expire naturally. The user experience when reloading can be improved by using a single sign-on solution and by separately storing the user's current location and any unsaved changes.

High Availability Requires Highly Available Session Storage

The most complex use of session replication is for high availability purposes. This means that the user should be able to continue using the application without any disruption even if a server unexpectedly becomes unavailable. This scenario requires some kind of session replication since the session data must be distributed between multiple servers to ensure availability.

Rather than doing the full sequence of locking, deserializing, processing, and serializing for each request, it's better to keep using the sticky session approach for as long as a specific server is available. This means that the only overhead during happy path operations is that the session needs to be serialized and replicated after each request. With slightly relaxed availability requirements, the response can even be returned to the user immediately, instead of waiting until the latest UI state is safely persisted.

Locking and deserializing would then only be needed in cases when a server has become unavailable. In such situations, the locking can happen on the load balancer without making any changes to Vaadin's built-in locking scheme. It's instead the load balancer's responsibility to ensure that concurrent requests for the same session are never sent to different servers. In that way, the session replication scheme can be more lightweight than what would be needed if only relying on session replication without any help from the load balancer.
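The load balancer's part of this scheme could look something like the sketch below. It only illustrates the idea under simplifying assumptions: FailoverRouter is a hypothetical name, there is a single load balancer instance, and the chosen replacement server is expected to restore the session from the replicated copy on its own.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FailoverRouter {
    private final List<String> servers;
    private final Set<String> unavailable = new HashSet<>();
    private final Map<String, String> owner = new HashMap<>();

    FailoverRouter(String... servers) {
        this.servers = List.of(servers);
    }

    /** Called when health checks detect that a server is gone. */
    synchronized void markUnavailable(String server) {
        unavailable.add(server);
    }

    /**
     * Sticky routing for as long as the owning server is available. When it
     * is not, a replacement is chosen and recorded in the same synchronized
     * step, so concurrent requests for one session can never end up on
     * different servers.
     */
    synchronized String route(String sessionId) {
        String current = owner.get(sessionId);
        if (current != null && !unavailable.contains(current)) {
            return current; // happy path: no locking or deserializing needed
        }
        String replacement = firstAvailable();
        owner.put(sessionId, replacement); // the replacement restores the replicated session
        return replacement;
    }

    private String firstAvailable() {
        for (String server : servers) {
            if (!unavailable.contains(server)) {
                return server;
            }
        }
        throw new IllegalStateException("No server available");
    }
}
```

Because the reassignment happens in one place, the servers themselves can keep relying on the regular in-JVM session lock.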

Summary

As you can see, there are various alternatives to session replication that achieve the same goals while still letting you enjoy the benefits of keeping the UI state and logic on the server.
