Discover the Vaadin Kubernetes Kit
Vaadin Ultimate subscribers can access our Kubernetes Kit, which contains documentation, examples, and libraries to help you run a successful Kubernetes cluster. The Kit makes it easy to set up a modern environment that lets you deploy updates at any time of day, without interruptions.
What’s included in the kit?
- Documentation and examples on how to set up a Kubernetes cluster with tested best practices for Vaadin apps
- Zero-downtime deployments with blue-green and canary patterns for smooth updates and rollbacks
- A UI kit that informs users when a new version is available and lets them move to it in a controlled manner
What is a Kubernetes cluster?
Web applications are often hosted on a cluster of servers. This way you can achieve better scalability and performance, higher availability during updates, and resilience to server crashes. Your team structure or the architectural style, like microservices, might also imply clustering.
Vaadin is well suited for clustered environments, nowadays often orchestrated using Kubernetes. There are a couple of best practices and tips you should note, though.
Distributing different kinds of workloads
Perhaps the most common way to improve scalability is distributing the workload to multiple server nodes based on the task. In the most trivial case, the database is moved over to a different server, but it can also be a three-tier architecture or go all the way to microservices.
Scalability is not the only reason to opt for one of these options; separation of concerns between multiple teams and projects is another good one.
Vaadin's main responsibility is the UI of the application, but the application might also be running time-consuming computations, doing complex data-wrangling, or interacting with – sometimes slow – external systems. If the application is well architected, any task can be split out and run on separate server nodes as the need arises.
The Java ecosystem provides excellent native options to consume remote services or access REST services written on any technology.
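As an illustration, the JDK's built-in `java.net.http.HttpClient` (Java 11+) can consume such a REST service without any third-party dependencies. The base URL and the `/api/orders` path below are hypothetical placeholders, not a real API:

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

public class RemoteServiceClient {

    // Builds a GET request for a hypothetical REST endpoint; the base URL
    // and path are placeholders for your own backend service.
    static HttpRequest buildRequest(String baseUrl) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/orders"))
                .timeout(Duration.ofSeconds(5))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("https://example.com");
        System.out.println(request.method() + " " + request.uri());
        // To actually execute the call:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    }
}
```

For non-blocking calls, `HttpClient.sendAsync` returns a `CompletableFuture`, which fits well with pushing results back to a Vaadin UI from a background thread.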
Scale horizontally - for better performance and improved availability
Giving each task its own server node and beefing up the server specs (adding more CPUs and memory, a.k.a. vertical scaling) will only let you scale to a certain level.
The next level is horizontal scaling, which means distributing the same task to multiple nodes to scale your application for any level of load.
Resilience and availability can also be improved with multiple servers doing the same job. The effects of hardware and software failures can be mitigated, and upgrades and updates to the system can potentially be seamless.
Horizontal scaling can, for example, mean that you have multiple Java server nodes, or application servers, executing the same Vaadin UI code.
A load balancer (or front proxy) sits between the application servers and the internet, seamlessly distributing the incoming traffic to one URL between multiple physical application servers.
Alternatively, or in conjunction with front proxies, you can use DNS-based techniques like geographical load balancing.
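In Kubernetes terms, this setup can be sketched as a Deployment running several identical replicas behind a single Service that load-balances incoming traffic. The names, image, and ports below are placeholders, not a recommended production configuration:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vaadin-app            # placeholder name
spec:
  replicas: 3                 # three identical application server pods
  selector:
    matchLabels:
      app: vaadin-app
  template:
    metadata:
      labels:
        app: vaadin-app
    spec:
      containers:
        - name: vaadin-app
          image: example/vaadin-app:1.0   # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: vaadin-app
spec:
  type: LoadBalancer          # a single external address in front of the pods
  selector:
    app: vaadin-app
  ports:
    - port: 80
      targetPort: 8080
```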
Horizontal scaling is the part of clustering where some Vaadin know-how is most valuable.
Sticky sessions or serializable sessions - pick either or both
Many of the great things Vaadin Flow provides, such as abstraction, security, and simplicity, stem from the fact that UI logic and state live inside the JVM.
For horizontally scaled Vaadin Flow applications, the trade-off is that you either need to use session replication or session affinity to ensure access to the correct UI state.
The former, also called shared sessions, means the UI state is continuously available to all the server nodes. Generally, this solution incurs a latency penalty due to serialization and transfer.
The latter is also colloquially called sticky sessions and means that all the traffic from one user goes to one single server node where the session lives.
So, the architectural choice is to pick at least one:
- Sticky sessions
- Serializable sessions shared between server nodes
Sticky sessions are much easier for developers to implement and superior from a performance point of view, making them the recommended choice for most Vaadin applications. Session affinity is typically enabled by configuring the load balancer.
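For example, with the NGINX Ingress Controller in Kubernetes, cookie-based session affinity can be enabled through annotations. The service name and cookie name below are placeholders:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: vaadin-app
  annotations:
    # NGINX Ingress Controller: route each user to the same pod via a cookie
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "STICKY"
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: vaadin-app   # placeholder service name
                port:
                  number: 80
```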
Session replication still has its place, especially when aiming for as high availability as possible. For instance, replicated sessions can prevent data loss in certain situations when one of the UI servers fails. However, other factors affect availability much more than session replication in most cases.
Monitoring system health and ensuring smooth updates and minimal maintenance breaks can have a much more significant impact on uptime.
If you still decide to go with session replication, here are some things to consider:
- The application server's built-in session replication features often provide the best experience.
- Use tests to ensure your session does not grow too big and does not refer to any non-serializable objects.
- When using in-memory options, use a replication strategy that only synchronizes the session to one backup node, not actively to all nodes. When used in conjunction with sticky sessions, this minimizes I/O between the nodes and improves scalability.
- Spring Session Redis has a bug/optimization making it incompatible with Vaadin.
- Session locking may be problematic with multiple simultaneous requests.
- Server push and dynamic UI updates originating from non-UI threads, especially when using a WebSocket connection, are inherently incompatible with shared/serialized sessions.
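A lightweight way to guard against non-serializable or oversized session state (the second point above) is a test that round-trips your session attributes through Java serialization. This is a minimal sketch; the sample payload and the 1 MB budget are assumptions to adapt to your application:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;

public class SessionSerializationCheck {

    /** Serializes an object graph, failing fast on non-serializable members. */
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj); // throws NotSerializableException if any member is not serializable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        // Stand-in for real session state; in practice, serialize the
        // attributes your application actually stores in the session.
        HashMap<String, String> sessionState = new HashMap<>();
        sessionState.put("user", "alice");

        byte[] bytes = serialize(sessionState);
        // Fail if the session exceeds a chosen budget, here 1 MB
        if (bytes.length > 1_000_000) {
            throw new AssertionError("Session too large: " + bytes.length + " bytes");
        }
        System.out.println("Serialized session size: " + bytes.length + " bytes");
    }
}
```

Running such a check in CI catches a reference to a non-serializable object before it breaks session replication in production.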
Handling updates with zero downtime
Maintenance, scheduled or unscheduled and usually performed to deploy an update, is the most common cause of downtime for a well-behaved application. As the industry moves toward continuous delivery, the issues caused by software updates are becoming more relevant. Users are also increasingly accustomed to evergreen apps with seamless updates.
Ultimately, avoiding downtime during updates can be a more important reason to invest in a horizontally scaled cluster than preemptively preparing for a huge number of concurrent users or anticipating hardware failure.
Scheduled maintenance windows have traditionally been the go-to fix for service disruption caused by software updates. For certain types of updates, it might be the lesser evil among available options, but we do not suggest making a habit of this practice. From your users' point of view, it does not matter that the downtime is caused by an update; the application is still unavailable.
Furthermore, no schedule is perfect: engineers probably do not enjoy coming in on nights and weekends to do tightly scheduled maintenance work. And there is no low-traffic window of opportunity at all if your application is global.
When implementing your zero-downtime update mechanism, consider the following:
- Give users the option to choose when they are ready to move over to the new version (upgrade), at least within a given timeframe. This way they maintain a sense of control and predictability and are not surprised by sudden changes that could cause work to be lost.
- Preferably inform the user about what's new in this version and changes that might affect them; consider displaying a notification, message, or section in the help system.
- Empower your IT department to do updates whenever needed, even during rush hour, without causing significant UX issues for your users. New features and bug fixes can quickly be deployed to production while engineers can maintain control over their working hours.
- Have a way of testing (at least smoke test) the new version in the production environment before switching users over.
- Ensure you have a way to downgrade (rollback) seamlessly if something unexpected happens.
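Some of the considerations above map onto Kubernetes' built-in rolling-update mechanism: with `maxUnavailable: 0` and a readiness probe, a new version only receives traffic once it reports healthy, and a rollback is one command away. This is a sketch; the names, image tag, and health endpoint are placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vaadin-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never take a serving pod down before its replacement is ready
      maxSurge: 1         # bring up one extra pod at a time
  selector:
    matchLabels:
      app: vaadin-app
  template:
    metadata:
      labels:
        app: vaadin-app
    spec:
      containers:
        - name: vaadin-app
          image: example/vaadin-app:2.0   # placeholder for the new version
          readinessProbe:
            httpGet:
              path: /health               # placeholder health endpoint
              port: 8080
            initialDelaySeconds: 10
```

A failed update can be reverted with `kubectl rollout undo deployment/vaadin-app`. Note that a plain rolling update moves users over as soon as the new pods are ready; the user-controlled switchover described above requires blue-green style routing, such as the patterns the Kubernetes Kit provides.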