Stateless mode for Vaadin?

Even though statefulness is rarely a scalability issue, there are use cases where a stateless mode is desirable. The whole architecture of Vaadin is based on storing server-side state. This message outlines a quick proposal for how a stateless operation mode could be implemented in Vaadin.

All comments on the usefulness of such a feature, the shortcomings of this proposal, and enhancements to it are welcome.

The master copy of the proposal is maintained at http://dev.vaadin.com/ticket/6349. Below is a copy from the ticket.


= Proposal =

The proposal outlines a new operation mode for Vaadin applications where neither HttpSession nor any other server-side state is needed at all. This is an optional operation mode that could widen the range of applications where Vaadin could be used, not a default or recommended mode of operation.


== Enabling ==

Set parameter stateless = true in web.xml for the ApplicationServlet.

== Implementation ==

After each HTTP request, the application state is serialized and removed from the HttpSession. The serialized application state is sent to the client along with the UIDL. The client stores the application state in a JavaScript variable. Whenever the client contacts the server, it also sends the application state along with the request parameters.
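A minimal sketch of the serialize-and-return round trip, using plain Java serialization. The class and method names (StateCodec, toClientToken, fromClientToken) are hypothetical and not part of Vaadin, and the URL-safe Base64 encoding is an assumption about how the bytes could travel as a request parameter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of Vaadin.
final class StateCodec {

    /** Serializes the state and encodes it as text that fits in a request parameter. */
    static String toClientToken(Serializable state) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes.toByteArray());
    }

    /**
     * Restores the state sent back by the client. A real implementation must
     * verify the integrity of the token BEFORE deserializing it.
     */
    static Object fromClientToken(String token) {
        byte[] raw = Base64.getUrlDecoder().decode(token);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Unknown class in serialized state", e);
        }
    }
}
```

The server would call toClientToken at the end of each request and fromClientToken at the start of the next one, so nothing needs to stay in the HttpSession in between.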


== Security ==

The implementation must guarantee that the client cannot modify the application state. This is done by keeping a random salt on the server and appending a SHA-2 checksum, calculated from the serialized state + salt, to the end of the state.
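As a rough sketch of the checksum scheme described above, the following uses SHA-256 (one member of the SHA-2 family) from the JDK. The class name StateSigner and its methods are hypothetical, and how the salt is generated and stored is left open, as in the proposal.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Vaadin.
final class StateSigner {
    private static final int DIGEST_LENGTH = 32; // SHA-256 produces 32 bytes

    /** Computes SHA-256 over the state followed by the server-side salt. */
    private static byte[] checksum(byte[] state, byte[] salt) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            sha256.update(state);
            sha256.update(salt);
            return sha256.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** Returns the state with the checksum appended to its end. */
    static byte[] sign(byte[] state, byte[] salt) {
        byte[] digest = checksum(state, salt);
        byte[] signed = Arrays.copyOf(state, state.length + DIGEST_LENGTH);
        System.arraycopy(digest, 0, signed, state.length, DIGEST_LENGTH);
        return signed;
    }

    /** Returns the original state, or throws if the checksum does not match. */
    static byte[] verify(byte[] signed, byte[] salt) {
        if (signed.length < DIGEST_LENGTH) {
            throw new SecurityException("State is too short to carry a checksum");
        }
        byte[] state = Arrays.copyOf(signed, signed.length - DIGEST_LENGTH);
        byte[] received = Arrays.copyOfRange(signed, state.length, signed.length);
        // MessageDigest.isEqual compares in constant time.
        if (!MessageDigest.isEqual(received, checksum(state, salt))) {
            throw new SecurityException("State was modified by the client");
        }
        return state;
    }
}
```

A keyed construction such as HMAC (javax.crypto.Mac with "HmacSHA256") would be the more conventional choice for this job; the sketch sticks to the state + salt checksum only because that is what the proposal describes.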

For many applications, the state might not contain any secrets that cannot be revealed to the client. Still, in some cases the developer chooses to store server-side secrets (such as the password for a DB connection) in the state. In order to keep the state secret, it is encrypted using AES-256 with a random key stored on the server.

Open question: how are the key and salt stored and shared between the servers in clustered configurations?


== Optimizations ==

The serialized state can grow quite large. Thus the state should be compressed with GZIP before sending it to the client.
The effect of encryption on performance should be measured, and if the effect is noticeable, a parameter for turning encryption off should be added to web.xml. Still, encryption must be on by default.

Quick tests show that for a really small application the state can be around 8 kB, but it can be compressed to 3.5 kB with GZIP. Each additional TextField with a random caption (“foo” + Math.random()) increases the size of the serialized session by 134 bytes. When GZIP is used, the state size per TextField goes down to 13 bytes, which is just the size of the caption.
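The GZIP step could look roughly like this with the java.util.zip classes from the JDK. StateGzip is a hypothetical helper name; the sizes quoted above come from the original tests, not from this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration only; not part of Vaadin.
final class StateGzip {

    /** Compresses the serialized state before it is sent to the client. */
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Restores the serialized state received from the client. */
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

Serialized component trees repeat the same class and field names many times, which is why the per-TextField cost drops so sharply once compression is applied.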

Research if it is possible to seed GZIP (or some other text-compression algorithm) with a serialized copy of a just initialized application. If it is possible, this might cut the size of the state tremendously.


== Use cases ==

  • Use in large public-facing web sites where per-user sessions are not allowed. The use would mostly be small embedded mini-applications.
  • Clustering environments where sticky sessions are not available. (While Google App Engine is such an environment, the current memcached-based server-side session sharing is probably a better option for most applications.)
  • Embedding small mini-applications with the new embedding mechanism into any web page without the burden of session management for the server.


== Feasibility ==

Stateless mode could be useful and practical for really small applications with only a bit of state. A practical limit for the (compressed) state size could be some tens of kilobytes.

Serialization time for an application with some 1000 UI components (text fields), resulting in 152 kB of (uncompressed, unencrypted, unsigned) state, is 3 ms on a 2.4 GHz Core Duo MacBook. By adding more components, we measured the serialization time to be proportional to the number of components in the application. A guesstimate for the total overhead, including serialization, deserialization, compression, encryption and signing, is around 0.01 ms per component in the application.

Simple applications embedded in public-facing web sites are the most likely users of the stateless mode. These applications include registration forms, small calendars, insurance calculators, questionnaires and similar mini-applications. For these applications the number of components is typically somewhere between 10 and 100. If we guesstimate the average to be 30 components, we may estimate that the overhead for such an application is 0.3 ms of server CPU time and 10 kB of combined in+out data transfer per HTTP request. Let's assume that a user causes on average 15 UIDL Ajax requests per day. A public-facing system with 1 million daily users would then add just 5% overhead to one server CPU for serialization. A more visible cost would be the additional 150 GB of combined in+out bandwidth needed per day (on Amazon EC2 this would cost about 15 USD / day).
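The arithmetic behind these figures can be checked with a few lines of Java. All inputs are the guesstimates from the paragraph above; the class and method names are made up for illustration, and 1 GB is taken as 1,000,000 kB.

```java
// Hypothetical back-of-the-envelope calculation; inputs are the guesstimates from the text.
final class StatelessOverheadEstimate {
    static final double MS_PER_COMPONENT = 0.01;  // total overhead per component
    static final int COMPONENTS = 30;             // assumed average application size
    static final double KB_PER_REQUEST = 10;      // combined in+out transfer per request
    static final long DAILY_USERS = 1_000_000;
    static final int REQUESTS_PER_USER = 15;
    static final int SECONDS_PER_DAY = 86_400;

    static long requestsPerDay() {
        return DAILY_USERS * REQUESTS_PER_USER;
    }

    /** CPU seconds spent per day on state handling. */
    static double cpuSecondsPerDay() {
        return requestsPerDay() * MS_PER_COMPONENT * COMPONENTS / 1000.0;
    }

    /** Fraction of one CPU kept busy around the clock. */
    static double cpuShare() {
        return cpuSecondsPerDay() / SECONDS_PER_DAY;
    }

    /** Extra transfer per day in gigabytes. */
    static double transferGigabytesPerDay() {
        return requestsPerDay() * KB_PER_REQUEST / 1_000_000.0;
    }
}
```

With these inputs, 15 million requests at 0.3 ms each come to about 4500 CPU seconds per day, which is roughly 5% of the 86,400 seconds one CPU has available, and the transfer comes to about 150 GB per day.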

In my opinion the name “Stateless mode” is quite misleading, as the difference from the normal mode is that the state is saved on the client side instead of the server side. The app is still pretty much stateful. This might still be a good idea if your app has a user base counted in the millions and you would not be able to add the needed memory for some reason. I would like to hear more about what the real benefits of turning this mode on in your application would be.

I just thought to bring up the discussion of truly stateless applications in the form of making it possible to implement a RESTful architecture. This would pretty much take away the need of storing the state of the app anywhere, be it on the server or the client. The mini-applications mentioned in the proposal don’t seem to need the state at all. A RESTful web service along with the possibility to pass some data (not states) in cookies/POST/etc. would be in my opinion even more efficient.

For example, the Directory has a quite RESTful approach in use right now, but it is something glued onto a stateful app rather than something designed into the fundamentals. This leads to blinking pages when you change the current “page” with a URL change, for example when using the back button. Also, a state is stored when we would not need one at all (not even on the client side). I’d love to see changes in Vaadin that would fix these issues and thus lead to a better user experience, with the side dish of lower memory consumption and CPU load on the servers.

Hi Joonas,

An interesting proposal from a technical point of view, but not - from our point of view - very useful for any of our intended projects.

Putting aside our lack of use for such a solution, why would one want to use Vaadin for applications like these? Aren’t there plenty of other frameworks that would be more useful in those scenarios? Isn’t it more a case of Maslow’s Hammer: “it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail”?

I personally see Vaadin’s strength in being a server-side RIA targeted at medium-to-large sized business/form oriented applications. There are fairly valid concerns being raised on the forums about forms/validation and (to a lesser extent IMHO, but still) layouts - which I would think fall more into Vaadin’s core areas. I would be slightly worried if any significant engineering/development effort were to go into extending Vaadin’s breadth without perhaps addressing the issues with Vaadin’s (current) target audience.

As I say - it sounds like a good proposal for solving the problem raised, but I question whether addressing those use-cases should be a particularly high priority.

Cheers,

Charles

All web applications have a state. This is stored either on both the client and the server (as in Vaadin) or only on the client. In most cases these “client-side state” applications are called stateless. The reasoning there is that the state is lost if the page is reloaded. One might even argue that having a state is what separates web applications from web pages.

Statelessness in the above sense is what I propose. This is actually not far at all from a static form, a form with some jQuery on top, or a full-blown GWT app. The real difference here is that the Vaadin app in stateless mode would do most of the processing on the server side and thus would have to move more client-side state information to the server for processing. The architecture below the UI can be RESTful.

The use cases for stateless applications (with the above definition) are listed in the proposal. I am sure that there are more. And OTOH, some of the above use cases can be argued not to be valid, as it might be wiser to just convince the IT department to turn on sticky sessions (for example).

It is quite a different issue. The urifragmentutil is a hack that must (and will) be moved into the Window API so that the fragment can be checked before any rendering. Currently the Directory first renders the default view, including the urifragmentutil that checks the fragment. Only after that rendering do we change to the correct view.

Good point. If the proposal were complex to implement, required a lot of effort to maintain, or required a major architectural revision, there would be no point in even considering it. But as the proposal looks like a quite small and independent feature, it is in my opinion worthwhile to consider. ROI seems to be good, so to speak :)

One of the use cases was embedding. This might be a game-changer for this feature. It is something we have been researching for a while with Tampere Univ. of Tech and hope to announce in the near future. (Short spoiler: embedding will allow you to add a Vaadin app to any web page by just adding a one-line HTML snippet, just like Google Maps.)

If so, why not publish it as an add-on?

As for signing and encryption, a well selected encryption approach with the encryption key on the server should take care of “signing” as well without an extra step. As compression should be performed before encryption, the whole content will be encrypted.
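One way to read “a well selected encryption approach” is an authenticated mode such as AES-GCM, which rejects any ciphertext that has been modified. The sketch below is an assumption on my part: the thread only says AES-256 and does not name a mode, and the StateCipher class and its methods are hypothetical. The input is expected to be the already compressed state.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper for illustration only; AES-GCM is an assumed choice of mode.
final class StateCipher {
    private static final int IV_LENGTH = 12;  // bytes, the usual nonce size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Creates a random 256-bit AES key that stays on the server. */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts the (already compressed) state; output is IV followed by ciphertext and tag. */
    static byte[] encrypt(byte[] compressedState, SecretKey key) {
        try {
            byte[] iv = new byte[IV_LENGTH];
            RANDOM.nextBytes(iv); // a fresh nonce for every message
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] sealed = cipher.doFinal(compressedState);
            byte[] message = Arrays.copyOf(iv, IV_LENGTH + sealed.length);
            System.arraycopy(sealed, 0, message, IV_LENGTH, sealed.length);
            return message;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Decrypts the state; throws SecurityException if it was modified or the key is wrong. */
    static byte[] decrypt(byte[] message, SecretKey key) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, message, 0, IV_LENGTH));
            return cipher.doFinal(message, IV_LENGTH, message.length - IV_LENGTH);
        } catch (GeneralSecurityException e) {
            throw new SecurityException("State rejected", e);
        }
    }
}
```

With a mode like this, the separate checksum step becomes unnecessary, since tampering is detected during decryption.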

While a prototype could possibly be done as an add-on with minor changes to core, maintaining the feature in core would probably be easier.

Sure. The real question is where the random key(s) are stored. A trivial way of doing this would be to type a shared secret directly into the server/servlet configuration or even store the key in the application code, but that would be quite ugly. It could also open up security vulnerabilities, especially for open source projects.

Does anyone know whether it is possible to seed gzip (or a similar fast text compression algorithm with a freely available Java implementation) with data that is not included in the compression result?

Note to self: one could use Deflater.setDictionary(byte[]), see
http://download.oracle.com/javase/1.5.0/docs/api/java/util/zip/Deflater.html#setDictionary(byte[])

I measured the effect of dictionary pre-seeding with the simple address book tutorial app (but with only one row of data in the table). When editing an address in the form, the serialized data is 21050 bytes. It compresses down to 6758 bytes with the default java.util.zip.Deflater. If another application state snapshot is created just after init and used for dictionary seeding, the compressed size goes down to 3071 bytes.
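For reference, dictionary seeding with java.util.zip could be sketched as follows. SeededCompression is a hypothetical class name, and the dictionary stands for the state snapshot taken just after init; both sides must use exactly the same dictionary bytes.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for illustration only; not part of Vaadin.
final class SeededCompression {

    /** Compresses data with a preset dictionary that is not stored in the output. */
    static byte[] compress(byte[] data, byte[] dictionary) {
        Deflater deflater = new Deflater();
        try {
            deflater.setDictionary(dictionary); // must be set before any input is deflated
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk);
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    /** Decompresses data, supplying the same dictionary when the stream asks for it. */
    static byte[] decompress(byte[] compressed, byte[] dictionary) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0) {
                    if (inflater.needsDictionary()) {
                        inflater.setDictionary(dictionary);
                    } else if (inflater.needsInput()) {
                        throw new IllegalArgumentException("Truncated compressed state");
                    }
                }
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Corrupt compressed state", e);
        } finally {
            inflater.end();
        }
    }
}
```

Deflate can only refer back about 32 KB, so with a larger snapshot only its tail would help as a dictionary.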

Total time for serialization (including serialization, compression and dictionary seeding) is 3ms on my 2.4GHz core 2 duo MacBook. The compression is actually quite slow as the serialization was just 0.5ms with the same setup.

I also added AES-256 encryption. It adds just 0.2ms on top of the compression.

I would love to use the elegant programming model provided by Vaadin to build stateless websites. Right now, Vaadin is not feasible for this kind of development. Yes, there are alternatives like MVC frameworks, but you lose all of the amazing object-oriented components that Vaadin offers.