Continuous deployment and Vaadin

jorg.heymans · September 20, 2024, 8:41am

Anyone out there deploying Vaadin in a CD fashion ? When going through a period of active hotfixing or feature releases, a concern is that the “new version is available - please reload” notification offered by kubernetes-kit pops up too frequently and disturbs users. If we don’t use this feature we can pretend to do releases silently but users will still incur unexpected page reloads suddenly - this is also annoying. Can this be avoided at all ?

So far I have not found a good compromise: we want to release whenever necessary, at any point of the day, without disturbing the users, as you would with a stateless back-end. Blue-green deployment does not solve the issue.

SimonMartinelli · September 20, 2024, 9:59am

You are comparing a full-stack application with just a stateless backend.

If your UI changes, the app must also be reloaded. In an SPA setup this is done by the user, or automatically if the version of the API doesn’t match the UI.

jorg.heymans · September 23, 2024, 8:14am

Sure, but as long as there are no ui or api changes you can do a lot of work and redeploys on an spa without the user in the browser ever noticing a thing. With Vaadin, I cannot imagine doing 10 rollouts per day on a busy application.

Hence my question, how are people handling this ? Is the conclusion that Vaadin and CD are not friends, and a timed nightly release plan is better?

@vaadin-devs: is the page reload on redeploy something that can be mitigated in the framework (dare i say ‘fixed’) ?

Leif · September 23, 2024, 8:45am

In an SPA, you need to be careful to maintain forward compatibility for endpoint APIs. In a server-side UI with serialized sessions, you instead need to make sure UI state serialized with the old version can be deserialized by the new version. This is typically a relatively big effort since you need to manually deal with any case where an instance field is added or removed in any component class or when a lambda is updated to capture some additional value in its closure. This means that there are many more cases where it’s not safe to let active users start using a new version of the app without the reload that also causes the state to be re-initialized.

The VersionNotifier logic can certainly be updated but I don’t see how it could automatically detect what to do without input from application code. The big question is thus what logic you as an application developer would want to use to determine when it should be shown?
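The lambda case mentioned above is easy to overlook, so here is a minimal JDK-only sketch of it. It shows that a value captured by a serializable lambda becomes part of the serialized stream, which is why changing what a lambda captures changes the serialized session format. The class and method names (LambdaCaptureSketch, SerializableSupplier, streamContains) are made up for this illustration and are not Vaadin APIs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

final class LambdaCaptureSketch {

    /** Hypothetical helper type: a Supplier whose lambdas are serializable. */
    interface SerializableSupplier<T> extends Supplier<T>, Serializable {}

    /** A lambda that captures nothing from its surroundings. */
    static SerializableSupplier<String> withoutCapture() {
        return () -> "hello";
    }

    /** A lambda that captures 'name'; the captured value is serialized with it. */
    static SerializableSupplier<String> withCapture(String name) {
        return () -> "hello " + name;
    }

    /** Serializes obj and reports whether the raw stream contains the given ASCII text. */
    static boolean streamContains(Serializable obj, String needle) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        String raw = new String(bytes.toByteArray(), StandardCharsets.ISO_8859_1);
        return raw.contains(needle);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(streamContains(withCapture("captured-user"), "captured-user"));
        System.out.println(streamContains(withoutCapture(), "captured-user"));
    }
}
```

A session serialized by the old version of such a lambda carries the old set of captured values, so a new version that expects an additional captured value cannot simply pick it up.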

jorg.heymans · September 23, 2024, 10:57am

The VersionNotifier logic can certainly be updated but I don’t see how it could automatically detect what to do without input from application code. The big question is thus what logic you as an application developer would want to use to determine when it should be shown?

Indeed, only the application can know if a new version is session compatible with the old version. Right now Vaadin imposes the safe path and does a full page reload to avoid issues.

This means that there are many more cases where it’s not safe to let active users start using a new version of the app without the reload that also causes the state to be re-initialized.

Sure, but could the development team get a chance to decide this, along the lines of SessionSerializationCallback? Then at least some portion of the reloads can be avoided, am I correct? It’s quite a big toy to give, I admit.

Also, if developers would manage carefully serialVersionUID then deserialization would fail with an explicit InvalidClassException and you can still opt to do a full reload at that moment ?
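The fallback idea in the last paragraph can be sketched with plain JDK serialization. This is only an illustration under assumptions: ViewState stands in for whatever UI state is stored in the session, and restoreOrEmpty is a hypothetical helper, not a Vaadin or Kubernetes Kit API. An empty result means "the stored state is not compatible, do a full reload".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Optional;

final class SessionRestoreSketch {

    /** Hypothetical stand-in for serialized UI state with an explicit version id. */
    static final class ViewState implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;

        ViewState(String text) {
            this.text = text;
        }
    }

    /** Serializes the state the way a session store would receive it. */
    static byte[] serialize(Serializable state) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        }
        return bytes.toByteArray();
    }

    /**
     * Tries to restore the state. Returns empty when the stream was written by an
     * incompatible class version, so the caller can fall back to a full reload.
     */
    static Optional<Object> restoreOrEmpty(byte[] data) throws IOException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return Optional.of(in.readObject());
        } catch (InvalidClassException | ClassNotFoundException e) {
            // serialVersionUID mismatch or missing class: treat as incompatible
            return Optional.empty();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = serialize(new ViewState("hello"));
        System.out.println(restoreOrEmpty(data).isPresent());
    }
}
```

Note that this only catches the cases where the mismatch is detected by the serialization machinery. It relies on someone bumping serialVersionUID whenever a change is incompatible.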

Leif · September 23, 2024, 11:51am

Note that this is not a binary question but there are at least three potential options:

1. The update is fully compatible which means that users can be transparently directed to the new version. Since the load balancer just sees a cookie, there would either have to be a way of configuring it to do different things with different cookie values or alternatively, a way for the application to update the cookie the next time there’s a request from the user.
2. The update is incompatible but not urgent so existing users can keep using the old version for the time being while new users should be directed to the new version. Note that you still eventually want to shut down the old version so you want to eventually start showing the notification.
3. The update is incompatible and somewhat urgent so you want to notify users to make them update asap.
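As a rough sketch, the three options above could be expressed as a decision that application code makes per release. Everything here (RolloutPolicySketch, Action, decide) is hypothetical naming for illustration. Nothing like this exists in Kubernetes Kit as far as this thread shows.

```java
final class RolloutPolicySketch {

    /** The three outcomes listed above. */
    enum Action {
        SWITCH_TRANSPARENTLY, // compatible: move users to the new version silently
        KEEP_ON_OLD_VERSION,  // incompatible, not urgent: existing users stay for now
        NOTIFY_NOW            // incompatible and urgent: show the update notification
    }

    /**
     * Decides what to do with existing users for a given release.
     * Both inputs would have to come from the application team.
     */
    static Action decide(boolean sessionCompatible, boolean urgent) {
        if (sessionCompatible) {
            return Action.SWITCH_TRANSPARENTLY;
        }
        return urgent ? Action.NOTIFY_NOW : Action.KEEP_ON_OLD_VERSION;
    }

    public static void main(String[] args) {
        System.out.println(decide(false, false));
    }
}
```

In the second case the old version still has to be shut down at some point, so KEEP_ON_OLD_VERSION would eventually turn into NOTIFY_NOW.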

jorg.heymans · September 23, 2024, 12:34pm

1 → Why would the session cookie need to change ? In multi-instance there’s always an external session store configured.
2 → We avoid this scenario, newly connected users seeing different behaviour than already connected users is confusing.
3 → Yes, that’s the current assumption in Vaadin.

ollit.1 · September 23, 2024, 1:01pm

This sounds like something that’s bound to end up badly - as the number of changes ever increases, someone’s bound to forget to make such a small and unenforced change sooner or later, and the consequences can be bad.

Leif · September 23, 2024, 1:36pm

It’s not the session cookie. Kubernetes Kit sets a dedicated cookie to define which version the load balancer should direct requests to. When you click the notification to update to a new version, what actually happens is that the cookie is updated to direct the user to the new version and then the page is reloaded so that the cookie value is actually used.

You will have that anyways. In an SPA, the client already has the JS and thus also the template for the old implementation loaded and it will not change until they reload the page. With Flow, most parts of the component tree are stored in the session which means that the old component remains in use at least until the user navigates away from the view whereas global UI elements will not change until a new session is created.

jorg.heymans · September 24, 2024, 1:15pm

Leaving aside the fact that it’s not any better/worse with SPA, and the potential disasters that could happen if you don’t refresh, is it feasible technically to let the developer decide to forego on the full page refresh ?

Leif · September 24, 2024, 1:59pm

The notification is shown when receiving a request with a X-AppUpdate header value that is different from the version of the application that handles the request. This header is injected into all incoming requests by the load balancer based on the proxy_set_header line from the example configuration in the documentation. If you skip that part when you deploy a new version, then no users will be notified about that version.

If you then apply the config with affinity-mode: "persistent" for the new version and optionally also remove the old version, then requests for current users will be directed to the new version which might or might not work depending on how the session is deserialized.
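The header check described above boils down to a comparison like the following. This is a sketch of the described behaviour, not the actual Kubernetes Kit code. The class and method names are invented, and treating an empty header the same as a missing one is an assumption.

```java
final class VersionHeaderSketch {

    /**
     * Returns true when the X-AppUpdate header value names a version other than
     * the one handling the request. A missing header (for example because the
     * proxy_set_header line was skipped) never triggers a notification.
     */
    static boolean shouldNotify(String appUpdateHeader, String runningVersion) {
        if (appUpdateHeader == null || appUpdateHeader.isEmpty()) {
            return false; // assumption: empty is treated like absent
        }
        return !appUpdateHeader.equals(runningVersion);
    }

    public static void main(String[] args) {
        System.out.println(shouldNotify("2.0.0", "1.0.0"));
    }
}
```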

jorg.heymans · September 25, 2024, 8:40am

What we are seeing is a full page refresh each time a backend that the user was connected to (via sticky session) is undeployed and replaced with another one. This happens even without any code changes, just doing a rolling restart. We use kubernetes-kit with a Redis session store and don’t use the X-AppUpdate header. Note that our sticky-session setup uses its own routing mechanism, not the kubernetes-kit cookie.

Leif · October 4, 2024, 8:19am

Let’s start with a disclaimer. I don’t have practical experience specifically with Kubernetes Kit even though I’m intimately familiar with the overall architecture and everything related to Flow on its own. I don’t have a Kubernetes Kit test environment running on my machine and I haven’t tried to replicate the symptoms that you’re describing.

I was originally thinking that the issue here was related to changing to a different version but I’ve started suspecting that it would behave exactly the same for you if a user would be moved over to a different server running exactly the same version. In other words, this might be a problem with failover in general rather than something specific to version updates.

I suspect the underlying issue is related to the way the handover works when switching to a different server. What should happen is that a new HTTP session is started on the new server and then the component tree from the old session is deserialized and injected into the new session. This can cause problems if there are multiple concurrent requests at that moment because each request will then have a session cookie that the new server doesn’t recognize which makes the server create a new session for each request. The browser will then end up using the session id from the last received request in all future requests and that might not be the session to which the old component tree was loaded. This should typically not be a problem since the client only sends one request at a time but this is not necessarily the case if there’s also an open push connection.

This leads to two questions that might help understand what goes on here:

Do you use push?
Could you look at the session cookie headers in the browser’s network inspector for the first responses received from the newly deployed server, to see if the server does indeed create multiple sessions for the same user?
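The suspected handover race can be modelled with a toy server. This is only a simulation of the hypothesis above, not Flow, Tomcat, or Kubernetes Kit code, and all names (FailoverRaceSketch, ToyServer) are invented. The server creates a new session for every request whose cookie it does not recognize, and the old component tree is injected into only the first of them.

```java
import java.util.HashMap;
import java.util.Map;

final class FailoverRaceSketch {

    /** Toy model of the new server: session id -> restored state (null if none). */
    static final class ToyServer {
        private final Map<String, String> sessions = new HashMap<>();
        private String pendingState; // state from the old server, restored once
        private int counter = 0;

        ToyServer(String stateFromOldServer) {
            this.pendingState = stateFromOldServer;
        }

        /** Handles one request and returns the session id sent back to the browser. */
        String handle(String cookieSessionId) {
            if (sessions.containsKey(cookieSessionId)) {
                return cookieSessionId; // known session, nothing changes
            }
            String newId = "new-" + (++counter);
            sessions.put(newId, pendingState); // only the first new session gets the state
            pendingState = null;
            return newId;
        }

        String stateOf(String sessionId) {
            return sessions.get(sessionId);
        }
    }

    public static void main(String[] args) {
        ToyServer server = new ToyServer("component tree");
        String first = server.handle("old-id");  // e.g. a UIDL request
        String second = server.handle("old-id"); // e.g. a concurrent push request
        // The browser keeps the id from the last response it received.
        System.out.println(server.stateOf(second));
        System.out.println(server.stateOf(first));
    }
}
```

With two overlapping requests that both carry the old cookie, the browser ends up on the second session, which has no restored state. With strictly sequential requests the second request already carries the new id and the state is found.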

jorg.heymans · October 4, 2024, 10:26pm

Thanks for your thoughts.

The sequence is like this: the client gets a jsessionid from an instance when first accessing the application. The http session is stored by that instance in redis using jsessionid as a key. When that instance goes down, the next request of the user is routed by the load balancer to another instance. Note that the jsessionid has not changed, the client is unaware. That new instance now receives the request, loads the sessionid from redis and handles the request with the correct session state. After that, again the session is serialized to redis. Why would a fresh http session need to be started on the new server?

We don’t use push.

As far as the session cookies are concerned, I will check this. Probably yes, it is receiving a new session cookie, and hence a full page refresh is triggered somewhere in Flow.
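The expected sequence described above (same jsessionid, state loaded from a shared store by whichever instance gets the request) can be written down as a toy model. A plain map stands in for Redis here, and the names SharedSessionStoreSketch and Instance are made up for this sketch. It shows the expectation, not what Kubernetes Kit actually does.

```java
import java.util.HashMap;
import java.util.Map;

final class SharedSessionStoreSketch {

    /** Toy application instance that keeps no session state of its own. */
    static final class Instance {
        private final Map<String, String> store; // stand-in for the Redis session store

        Instance(Map<String, String> sharedStore) {
            this.store = sharedStore;
        }

        /**
         * Loads the session by id, applies the input if there is one, writes the
         * session back and returns the state after the request.
         */
        String handle(String jsessionId, String input) {
            String state = store.getOrDefault(jsessionId, "");
            if (input != null) {
                state = input;
            }
            store.put(jsessionId, state);
            return state;
        }
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        Instance a = new Instance(shared);
        a.handle("abc123", "typed text");
        // Instance a goes down; the load balancer routes the same cookie to b.
        Instance b = new Instance(shared);
        System.out.println(b.handle("abc123", null));
    }
}
```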

jorg.heymans · October 6, 2024, 11:56am

EDIT this only applies when starting in development mode, so not relevant for this discussion.

I have isolated an example, which i think demonstrates the essence of what i’m trying to get at.

Use the vaadin helloworld starter
In Application.java, add this to persist sessions across restarts

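  // Imports needed at the top of Application.java:
  import org.apache.catalina.session.FileStore;
  import org.apache.catalina.session.PersistentManager;
  import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
  import org.springframework.boot.web.server.WebServerFactoryCustomizer;
  import org.springframework.stereotype.Component;

  @Component
  public class CustomContainer
      implements WebServerFactoryCustomizer<TomcatServletWebServerFactory> {

    @Override
    public void customize(TomcatServletWebServerFactory factory) {
      factory.addContextCustomizers(
          context -> {
            // Store HTTP sessions on disk so they survive a server restart
            FileStore fileStore = new FileStore();
            // can be any directory
            fileStore.setDirectory(System.getProperty("java.io.tmpdir"));
            PersistentManager manager = new PersistentManager();
            manager.setStore(fileStore);
            context.setManager(manager);
          });
    }
  }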

Start the application. On the home screen fill out the text field and press the button
Stop the application. Verify that a .session file is written to java.io.tmpdir. The browser screen shows no errors.
Start the application again. Go back to the browser, press the button. The screen will do a full page refresh. The jsessionid did not change.

From my bird’s-eye view then, given that there were no code changes, that the session state was fully persisted, and that the session id did not change, why is there a full page refresh? Distributed session stores or kubernetes-kit have nothing to do with this.

marcoc_753 · October 7, 2024, 5:49am

Following the above steps, I cannot reproduce the behavior in production mode. When the server is restarted, I see a temporary “online” notification, but no page reloads. The value in the text field is preserved, and pressing the button shows the notification with the expected value.

Edit: I tested it first with a project created with start.vaadin.com and then with the hello-world starter. Both worked, I could not reproduce the issue running the app in production mode.

jorg.heymans · October 7, 2024, 8:12am

Indeed, when running in production mode this behaviour does not show. I was running in development mode. So, my minimal reproducible example does not apply, apologies. I will come up with a more advanced example then using 2 instances, a LB, and redis. Stay tuned.

jorg.heymans · October 8, 2024, 3:40pm

Alright second attempt at a setup that will allow us to talk more about this issue. The basis is the vaadin starter attached, which is the basic starter with AppLayout (seems needed to reliably reproduce)

unzip the repro project attached to the support issue (or let me know a better way to share zip files)
start redis: docker run --name redisvaadinrepro -p 6379:6379 -d redis
start haproxy, pointing to the absolute path of the config file provided in the zip: docker run -d --net=host --name my-running-haproxy -v /path/to/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro -p 8080:8080 haproxy
build the backend: mvn clean install -Pproduction
start the 2 instances on different ports, e.g. java -jar target/spring-skeleton-1.0-SNAPSHOT.jar --server.port=8089 and java -jar target/spring-skeleton-1.0-SNAPSHOT.jar --server.port=8088

Now the setup is complete with haproxy, round-robin balancing across the instances, and sticky sessions based on the jsessionid cookie.

Then the test scenario:

go to localhost:8080 in a private window, put some text in the inputfield and press the button, the text should appear below the button
observe in the log output which instance is handling the request, it will have c.v.k.s.s.SessionSerializer output.
ctrl-c that instance
go back to the browser and press the button, you should see a full page reload happening and the text below the button is gone indicating that the state has been lost.

marcoc_753 · October 23, 2024, 6:28am

Here’s a summary of the findings about this issue:

there’s a bug (and a fix) in Kubernetes Kit that postpones the creation of the cluster key cookie until after the HTTP session has been initialized. This causes concurrent requests to get different cluster keys, but only the first one is stored in the HTTP session and used by Kubernetes Kit. The other requests overwrite the cookie value, but not the one in the HTTP session, which prevents the correct session from being restored when a new server node handles client requests
Tomcat destroys all HTTP sessions during shutdown if it is not configured to persist them. This triggers the Kubernetes Kit listener, which tries to delete the distributed session. The fix for this is to set server.servlet.session.persistent=true in application.properties
The Kubernetes Kit SessionTrackerFilter must be executed before any other filter that requests HTTP session creation; otherwise the distributed session will not be restored. Make sure that your custom filters call request.getSession(false) to prevent HTTP session creation, or configure them so that they are executed after SessionTrackerFilter (which has @Order(HIGHEST_PRECEDENCE + 50))
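To make the ordering constraint in the last point concrete: Spring's Ordered.HIGHEST_PRECEDENCE is Integer.MIN_VALUE and lower order values run first, so a custom filter needs an order value strictly greater than HIGHEST_PRECEDENCE + 50 to run after SessionTrackerFilter. The following JDK-only sketch just does that arithmetic. FilterOrderSketch and its members are illustrative names, and the constant is copied here to avoid a Spring dependency.

```java
final class FilterOrderSketch {

    /** Same value as Spring's Ordered.HIGHEST_PRECEDENCE, copied for illustration. */
    static final int HIGHEST_PRECEDENCE = Integer.MIN_VALUE;

    /** Order of SessionTrackerFilter as stated in the summary above. */
    static final int SESSION_TRACKER_ORDER = HIGHEST_PRECEDENCE + 50;

    /**
     * Lower order values run first. An equal value gives no guaranteed order,
     * so only a strictly greater value counts as "after".
     */
    static boolean runsAfterSessionTracker(int customFilterOrder) {
        return customFilterOrder > SESSION_TRACKER_ORDER;
    }

    public static void main(String[] args) {
        System.out.println(runsAfterSessionTracker(HIGHEST_PRECEDENCE));
        System.out.println(runsAfterSessionTracker(0));
    }
}
```

A custom filter annotated with @Order(HIGHEST_PRECEDENCE) would therefore run before SessionTrackerFilter and must not create an HTTP session.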