Vaadin Push Lessons Learned

I upgraded my application from 7.0 to 7.1 over the last couple of days, and the actual upgrade was painless, as 7.1 appears to be completely backward compatible. Here’s what I learned in the process, including enabling push:

I switched from ProgressIndicator to ProgressBar because ProgressIndicator is now deprecated. This required me to call UI.setPollInterval to get server-side updates (before I enabled push, described below). One issue I ran into was that my different ProgressIndicators had different polling intervals: if I knew a server-side computation would be done in a second or two, I used a short poll interval; for longer computations, I used a long poll interval. That isn’t really an option now because there is a single poll interval per UI, so you’ll have to pick some compromise value.

I then switched all my manual UI locking in background threads to UI.access(). While access() is nice and less error prone than manual locking/unlocking, I found it tricky to write a utility class that lets me easily execute cancellable tasks, because you now have two stages of background execution: one doing the work, and one doing the UI update in the UI.access() call. I have some threading code that I believe does what I need, but I really had to think through the various race conditions. It would be nice if Vaadin provided a utility class, say a BackgroundUITask with a doWork and a doUpdate method, the former called in an unlocked thread and the latter called with the UI lock held. I could then do something like UI.workAndAccess(new MyBackgroundTask()) and it would handle running the two methods and support MyBackgroundTask.cancel() logically. I can share some code on how I did this if anyone is interested.
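To make the two-stage idea concrete, here is a simplified, framework-free sketch of the pattern (this is not a Vaadin API and not my exact production code; the `uiAccess` executor stands in for `UI.access()`, and all class and method names here are mine):

```java
import java.util.concurrent.Executor;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

/** Runs doWork() on a worker thread without the UI lock, then hands the
 *  result to doUpdate() through the "UI access" executor. cancel() flips a
 *  flag that is checked at each hand-off and interrupts the worker if it
 *  is still running. */
abstract class BackgroundUiTask<T> {
    private final AtomicBoolean cancelled = new AtomicBoolean();
    private final AtomicReference<Future<?>> active = new AtomicReference<>();

    protected abstract T doWork() throws Exception; // runs unlocked
    protected abstract void doUpdate(T result);     // runs under the UI lock

    public void execute(Executor worker, Executor uiAccess) {
        FutureTask<Void> task = new FutureTask<Void>(() -> {
            T result = doWork();
            if (!cancelled.get()) {
                // Re-check inside the UI thread: cancel() may race the hand-off.
                uiAccess.execute(() -> {
                    if (!cancelled.get()) {
                        doUpdate(result);
                    }
                });
            }
            return null;
        });
        active.set(task);
        worker.execute(task);
    }

    public void cancel() {
        cancelled.set(true);
        Future<?> f = active.get();
        if (f != null) {
            f.cancel(true); // interrupt doWork() if it is still running
        }
    }
}
```

In real code the worker would be a thread pool and `uiAccess` would delegate to the UI; the double-check of the cancelled flag is the part that took the most thought, because cancel() can arrive between doWork() finishing and doUpdate() starting.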

I enabled push with WebSockets and everything looked good with Jetty 8 on the server side and Firefox, Chrome, and Safari on the client side. I just added jetty-websockets and vaadin-push to my dependencies and then followed the configuration directions in the Vaadin Wiki.

I then tried HTTP streaming and everything broke. I spent more time than I’d like to admit tracking down the issue. In the end, I found that I had the GzipFilter configured in my web.xml for Jetty. This filter buffers responses until the response is closed (or based on other configuration parameters), so it wasn’t sending the partial Atmosphere JSON responses to the client. I was able to add a filter init parameter to make the filter ignore the Vaadin push path:


<filter>
  <filter-name>GzipFilter</filter-name>
  <filter-class>org.eclipse.jetty.servlets.GzipFilter</filter-class>
  <init-param>
    <param-name>mimeTypes</param-name>
    <param-value>text/html,text/plain,text/xml,application/json,application/xhtml+xml,text/css,application/javascript,image/svg+xml</param-value>
  </init-param>
  <init-param>
    <param-name>excludePathPatterns</param-name>
    <param-value>.*/PUSH/</param-value>
  </init-param>
</filter>

The next issue I ran into was that Websockets were working locally, but would break when I deployed to Amazon EC2. At first I thought this was an EC2 security group issue but I tracked it down to a custom HTTP proxy that I was using for my remote deployments. My proxy, similar to the ProxyServlet in Jetty, doesn’t proxy Websockets. The good news is that Vaadin falls back to HTTP Streaming and things work (aside from the Gzip issue above). The bad news is that I get this exception in my log:


[2013-08-01 12:06:52,303] [WARN ] [org.eclipse.jetty.servlet.ServletHandler] [qtp904581127-24]: /portalui/app/PUSH/
com.vaadin.server.ServiceException: java.lang.NullPointerException
        at com.vaadin.server.VaadinService.handleExceptionDuringRequest(VaadinService.java:1410)
        at com.vaadin.server.VaadinService.handleRequest(VaadinService.java:1364)
        at com.vaadin.server.VaadinServlet.service(VaadinServlet.java:238)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1448)
        at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)
        at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:256)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:370)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
        at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
        at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.NullPointerException
        at org.atmosphere.client.TrackMessageSizeInterceptor.inspect(TrackMessageSizeInterceptor.java:96)
        at org.atmosphere.cpr.AsynchronousProcessor.invokeInterceptors(AsynchronousProcessor.java:286)
        at org.atmosphere.cpr.AsynchronousProcessor.action(AsynchronousProcessor.java:243)
        at org.atmosphere.cpr.AsynchronousProcessor.suspended(AsynchronousProcessor.java:166)
        at org.atmosphere.container.Jetty7CometSupport.service(Jetty7CometSupport.java:96)
        at org.atmosphere.container.JettyAsyncSupportWithWebSocket.service(JettyAsyncSupportWithWebSocket.java:70)
        at org.atmosphere.cpr.AtmosphereFramework.doCometSupport(AtmosphereFramework.java:1448)
        at com.vaadin.server.communication.PushRequestHandler.handleRequest(PushRequestHandler.java:109)
        at com.vaadin.server.VaadinService.handleRequest(VaadinService.java:1352)
...

The issue appears to be that the proxy strips the “Upgrade: websocket” header when proxying the request, and this causes an NPE in Atmosphere. It seems like this should be handled more gracefully by Vaadin or Atmosphere to avoid the exception. A workaround I found was to modify my proxy to return a 200 (or anything other than 101) when it sees an “Upgrade: websocket” header, because it will never be able to upgrade the protocol.
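The proxy-side workaround boils down to a check like the following (a plain-Java sketch of the decision logic only, not my actual proxy code; the class name and the assumption that the proxy lower-cases header names are mine):

```java
import java.util.Map;

/** Decides how a proxy that cannot speak WebSockets should answer a request.
 *  Any status other than 101 Switching Protocols makes the client give up on
 *  the upgrade, so Vaadin/Atmosphere can fall back to another transport. */
final class UpgradeGate {
    private UpgradeGate() {}

    /** @param headers request headers, names assumed already lower-cased.
     *  @return 200 to reject an upgrade attempt, or -1 to proxy normally. */
    static int responseStatus(Map<String, String> headers) {
        String upgrade = headers.get("upgrade");
        if (upgrade != null && upgrade.equalsIgnoreCase("websocket")) {
            return 200; // refuse: this proxy can never complete the upgrade
        }
        return -1; // no upgrade requested: forward the request as usual
    }
}
```

Answering with a definite non-101 status is friendlier than forwarding a doomed request, because the client falls back immediately instead of triggering the server-side NPE.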

I use Shiro for all my authz/authc and I immediately ran into issues because of the way async servlets work with push. You can read my full solution here, but the gist of it is that Shiro’s SecurityUtils can’t be used.

In the end, I’ve found that the combination of UI.access() and push has made my application feel much more responsive. My next step is to test all this behind my real production load balancers, but so far everything is looking good.

One more item I forgot: be careful how frequently you generate server-side updates. I have a custom upload component that displays progress in a ProgressBar, and I was updating it on every call to uploadProgress from the Upload component. I found that this would crush the browser, because it would get a push update from the server for every one-thousandth of a percent as the bytes poured into the server. With polling this wasn’t an issue, because the browser would only grab changes every couple of seconds. I worked around it by only updating the progress bar when the percent delta was greater than 1 percent.
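The delta check is simple enough to pull into a small helper (a plain-Java sketch of the idea, independent of Vaadin; the class name and threshold handling are mine):

```java
/** Tracks upload progress and only reports changes of at least minDelta
 *  (e.g. 0.01f == one percent) to avoid flooding the client with pushes. */
final class ProgressThrottle {
    private final float minDelta;
    private float lastReported = 0f;

    ProgressThrottle(float minDelta) {
        this.minDelta = minDelta;
    }

    /** @param done bytes received so far; @param total total bytes expected.
     *  @return true if the UI should be updated with the new value. */
    boolean shouldUpdate(long done, long total) {
        float progress = total <= 0 ? 0f : (float) done / total;
        if (progress - lastReported >= minDelta || done >= total) {
            lastReported = progress;
            return true;  // big enough jump (or finished): push it
        }
        return false;     // change too small: skip this update
    }
}
```

The upload listener would then call progressBar.setValue() only when shouldUpdate() returns true, which caps the push rate at roughly one message per percent regardless of how fast the bytes arrive.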

How did you reliably set the polling interval and then turn it off (I want to be sure it’s off, since polling is not needed with push enabled) for uploading? Is turning it on in uploadStarted() and turning it off in uploadFinished() guaranteed to work?

Seems like I shouldn’t need polling set at all with push enabled, or is this a fallback for when push isn’t supported by the client browser?

Uploading will work just fine with a push or poll configuration. If you are displaying a progress bar to show the progress of the upload, you’ll need a way to make the bar update on the client side.

If using push, this will happen whenever you change the progress bar value on the server side; just be careful not to update it too frequently, as I mentioned above. So no poll interval setting is required.

If using polling, you’ll have to use UI.setPollInterval so the client sees the changes. You’ll probably want to set it to some reasonable default (5 seconds?) and leave it alone. It doesn’t seem like something you’d want to be constantly changing, but I don’t think there is any reason you can’t.

You should be using either push or polling, but not both (although I guess there is the case of falling back to polling if all push mechanisms fail), so I doubt you’ll ever want to set the poll interval if push is working reliably for you.

Thanks, Mike. What is interesting is that with push enabled and using the original ProgressIndicator (with polling set to 1000), I seemed to get more updates to the visible progress bar during the file upload than when I switched it to a ProgressBar. I even tried the UI.getCurrent().setPollInterval(1000) in my uploadStarted() callback and it still seems to show fewer updates before it’s done.

And I made no changes to my updateProgress() callback which does a progressBar.setValue() each time. I may have to run network monitoring on it to see what’s happening.

But it does sound like I should remove polling, since I have push enabled and it didn’t seem to make any difference in how the bar actually appeared during the file upload.

Hm, that’s not what I was seeing; however, I was running the server and client side on a single machine, so it’s possible the server was generating the load and the slow browser response was a symptom. Either way, it seems like updating the progress bar value on every updateProgress call generates a lot of load when push is enabled, which makes some sense.

I suppose you could run into some UI lock contention with both a polling interval and push enabled, because you have two threads (the client-side timer and the server-side push) generating state changes, but I wouldn’t think it would be noticeable with a poll interval greater than one second.

Thanks for the very comprehensive report, much appreciated!

You could always change the interval on the fly depending on your needs. Actually, it might be nice to have an Extension that adds polling support to any component, always ensuring that the shortest requested interval is used.

The access() method was originally synchronous, executing immediately in the calling thread. However, that quickly leads to deadlocks if you have to lock something else in addition to the session, as is often the case. If you know what you’re doing, you can use UI/VaadinSession.accessSynchronously().

Please do! As it happens, our very own Leif Åstrand recently wrote a prototype of something very similar. Expect to see something like that, if not in Vaadin 7.2, then in the Directory as an add-on.

Yeah, this is the problem with streaming. The SSE and long-polling transports might be worth trying out, even though we don’t currently officially support them as they aren’t extensively tested.

And this is the main problem with WebSockets :slight_smile: There’s practically no proxy software currently available that groks them. Oh well, that’s the price you pay for using state-of-the-art technologies.

Thanks! Yeah, the MessageSizeInterceptor NPE is a known issue; could you add a comment to the ticket as well?

We look forward to hearing more about push in a real-world environment :slight_smile:

I thought about this but it gets tricky as you add and remove components. For example, if I add a component A that sets the poll interval and then component B that also sets the poll interval, when I remove either A or B how do they know what to restore the interval to? Should they assume there is some other component that will clean it up or set it to 0 which might break the remaining component? It seems like some kind of list of polling components is needed and something needs to find the shortest requested interval and apply it. Or just use push and forget about all this!
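The bookkeeping I have in mind would look something like this (a plain-Java sketch of the "shortest requested interval wins" idea; the IntConsumer stands in for UI.setPollInterval, and the class and method names are mine, not a Vaadin API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntConsumer;

/** Tracks per-component poll interval requests and always applies the
 *  shortest one. When the last component releases its request, polling
 *  is disabled by setting the interval to -1. */
final class PollIntervalRegistry {
    private final Map<Object, Integer> requests = new HashMap<>();
    private final IntConsumer setPollInterval; // stand-in for UI.setPollInterval

    PollIntervalRegistry(IntConsumer setPollInterval) {
        this.setPollInterval = setPollInterval;
    }

    void request(Object component, int intervalMs) {
        requests.put(component, intervalMs);
        apply();
    }

    void release(Object component) {
        requests.remove(component);
        apply();
    }

    private void apply() {
        // Shortest requested interval wins; no requests means polling off.
        int interval = requests.values().stream()
                .mapToInt(Integer::intValue).min().orElse(-1);
        setPollInterval.accept(interval);
    }
}
```

With something like this, components never touch the UI-wide interval directly, so removing component A or B automatically restores whatever the remaining components still need.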

I posted the code for my BackgroundUiTask on GitHub. I think it is safe, but it could use another set of eyes or two. The basic idea is to implement the Future interface and delegate to either the internal “work” future or the real “update” future from the UI.access method, depending on which future is currently active.

I’m glad it is a known issue and not just me. I’ll add a comment to the issue. Thanks for your notes.

-mike

We finally pushed a build to our staging environment with push enabled and immediately ran into issues. We disabled WebSocket support because of the proxy issues described previously, which left us with only streaming support. What I’ve found is that something in the network chain is buffering responses, resulting in partial responses that never complete on the client side until the user interacts with the site, causing another server-side push. For example, after adding some debugging, I’ll see messages like this (long output trimmed by me):

[14:51:41.027] "Got partial message: 47058|for(;;);[{"changes" : [["change",{"pid":"389"},["4",{"id":"389"}]],["change", ...
[14:51:41.029] "Waiting for: 47058"
[14:51:41.031] "Got partial message: 47058|for(;;);[{"changes" : [["change",{"pid":"389"},["4",{"id":"389"}]],["change", ...
[14:51:41.031] "Waiting for: 47058"
[14:51:41.032] "Got partial message: 47058|for(;;);[{"changes" : [["change",{"pid":"389"},["4",{"id":"389"}]],["change", ...
[14:51:41.032] "Waiting for: 47058"
[14:58:43.139] "Thu Oct 17 14:58:43 GMT-400 2013 com.vaadin.client.VConsole INFO: Sending push message: 4fd2a715-f99f-4d52-8b94-9ec081bc4ecf[["5","v","v",["c",["i","6"]]],["153","v","v",["filter",["s",""]]],["153","v","v",["page",["i","-1"]]]]"
[14:58:43.163] "Got complete message: for(;;);[{"changes" : [["change",{"pid":"389"},["4",{"id":"389"}]],["change", ...
[14:58:43.164] "Got complete message: for(;;);[{"changes" : [["change",{"pid":"153"},["45",{"id":"153","noInput":true, ...

You can see that the client gets a couple of partial responses and then nothing. When the user interacts with an immediate component, another server-side response is generated, which pushes all the content down to the browser, completing the initial partial message.

Unfortunately, it seems like HTTP “streaming” is a bit of a hack, in that any component in the chain (server, load balancer, firewall, proxy, browser, etc.) can buffer the content while waiting for the entire reply. Obviously this breaks the user experience completely, because the user is waiting for some background process to finish but never sees the completion until they interact with some component or navigate away. We don’t see it much in our development environment, but it is very noticeable in our staging/QA environment. It seems to be even more noticeable with HTTPS, which I’m guessing is because the server (or the firewall, in our case) is trying to package up optimal response sizes to reduce encryption overhead.

At this point we’re going to have to turn off push completely and switch back to polling, which is unfortunate because push looked really promising. It would be interesting to try this again with long polling (if supported by Vaadin), because the connection would be closed on each response, which would always flush the response to the browser. WebSockets are the right solution, but there is so little load balancer/firewall support for them right now that I don’t see it happening anytime soon.

If anyone has suggestions on configuration changes I could try, I’d love to hear them. I’m also curious whether anyone else (including Vaadin’s own testing) is seeing something similar. In the meantime I might try messing with Jetty’s response buffers, but I’m not sure that will fix everything, because the issue could occur at any intermediary hop.

My current configuration is:
Java 1.7.0_21
Jetty 8
Vaadin 7.1.6
WinXP and OSX
Firefox 24 (and users have reported IE8 and Chrome as well)

-mike

Have a look at this bug ticket:
http://dev.vaadin.com/ticket/12529
It seems to be your bug as well; I’ve spent more than a week debugging that ticket :-/
Our workaround is setting a huge (32 KB) response buffer for Jetty, which is obviously a hack, but it makes the application usable with streaming.
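For reference, here is a sketch of what that buffer tweak might look like in a Jetty 8 jetty.xml (this is my reconstruction, not the poster’s actual configuration; the port value is illustrative, and it assumes the connector exposes a responseBufferSize setter as Jetty 7/8 connectors do):

```xml
<Call name="addConnector">
  <Arg>
    <New class="org.eclipse.jetty.server.nio.SelectChannelConnector">
      <Set name="port">8080</Set>
      <!-- Large enough that a whole push message fits and gets flushed as one write -->
      <Set name="responseBufferSize">32768</Set>
    </New>
  </Arg>
</Call>
```

As noted above, this is a hack: it only moves the buffering threshold, and any other hop in the chain can still hold back partial streaming responses.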

Following up on my background executor work, I posted a new UIExecutor framework that replaces my original BackgroundUITask. Refer to https://vaadin.com/forum#!/thread/4269500.

-mike

OMG, the same happens in my project: sometimes our application hangs on requests for large amounts of data. We receive a couple of push requests and then nothing, which is very tricky because it only happens on production servers.

Unfortunately, out-of-order messages are a frequent after-effect of a proxy/load balancer closing the connection after x minutes of inactivity. Please see https://mvysny.github.io/Vaadin8-push-issues/ for more details and possible solutions.