Upload component uses high amounts of CPU

I am currently using the Upload component to allow uploading files to my server, which is a 64-bit Redhat machine. I have stripped down the class I am using to remove all of the succeeded/finished/progress listeners, so the only method implemented is receiveUpload, which returns a FileOutputStream, basically the same as the example in the Book of Vaadin. The issue I’m having is that while the upload is taking place, my CPU is pegged at 99.9%. When I run the same web application on my local Windows development machine, I don’t see anywhere near that amount of CPU usage. Is this expected behavior?

After some in-depth remote debugging, I found that the high CPU usage starts in the while loop in AbstractCommunicationManager.streamToReceiver. My listenProgress and isInterrupted are always false here, so I never get into the two if statements inside the loop, and basically all that gets called is the read and the write. If I pause my debugger before this while loop, the CPU usage is minimal, but as soon as I hit continue, the CPU jumps to 99.9% and stays there pretty much the entire time until the upload has completed.

I feel like this is likely some sort of bug, but I wanted to post on the forum first to see if there would be any feedback on this issue. Any help/suggestions would be greatly appreciated.

If this happens, my guess would be that the reads and writes work on just one or a few bytes at a time, performing a lot of small reads and writes, whereas on Windows you get bigger chunks of data at a time. Note that the input stream might be wrapped in a SimpleMultiPartInputStream, but normally that should not cause such big problems if the underlying stream is buffered.

First make sure that writes to your output stream are buffered.
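As a rough illustration of what buffering the output could look like, here is a minimal sketch. The class name BufferedUploadReceiver, the target directory handling, and the 64 KB buffer size are all assumptions made for the example. The method only mirrors the shape of Upload.Receiver.receiveUpload(String, String) and does not depend on Vaadin, so adapt it to your own receiver class.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical receiver: mirrors the shape of Vaadin's receiveUpload method
// but is plain Java so that it can be tried out on its own.
class BufferedUploadReceiver {
    // Assumed buffer size; 64 KB is an arbitrary choice, not a Vaadin default.
    static final int BUFFER_SIZE = 64 * 1024;

    private final File targetDir;

    BufferedUploadReceiver(File targetDir) {
        this.targetDir = targetDir;
    }

    public OutputStream receiveUpload(String filename, String mimeType) {
        try {
            // Keep only the last path segment of the client-supplied name.
            File target = new File(targetDir, new File(filename).getName());
            // The wrapper collects small write() calls in memory and hands
            // them to the FileOutputStream in larger blocks.
            return new BufferedOutputStream(new FileOutputStream(target), BUFFER_SIZE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Whether this helps depends on how large the individual writes already are. If they arrive in blocks of several KB, the extra buffer will probably not change much.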

An strace could help see what kind of OS-level reads and writes are being performed, whereas a remote debugger could tell about the streams used etc.

Of course, it could be that byte level operations on your 64-bit machine are slow, but any possible impact of that should be closer to 1% than 99%.

After some more research this week, I’ve stepped through the code in AbstractCommunicationManager and determined that during the upload, both machines read the bytes in 4 KB blocks. The “in” InputStream in the streamToReceiver function is of type AbstractCommunicationManager$SimpleMultiPartInputStream. On the Windows machine (which has a quad-core processor), CPU usage during the upload never exceeded 5%. The Redhat machine I was using when I found this has only a single core, but I tried the same thing on a quad-core box and it also ended up using over 75% of the CPU, so I’m not sure the core count is causing any difference.

I didn’t do an strace because, with a whole Java application running, I figured that even if I set it up to trace only IO reads/writes, it would be too difficult to narrow the output down to just the reads/writes coming from the FileOutputStream.

Are there any suggestions you can provide for trying to work around this? Can we somehow have the Upload use a custom implementation of the streamToReceiver function where we read the stream in larger chunks at a time? I’m not sure that will entirely solve the problem, but it might be somewhere to start.
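To make the question concrete, this is the kind of loop I have in mind. It is a standalone sketch, not the actual streamToReceiver code, and the class name ChunkedCopy and the chunkSize parameter are made up for the example. I have not found out whether Upload lets me plug something like this in.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: copies a stream using a caller-chosen chunk size.
class ChunkedCopy {
    static long copy(InputStream in, OutputStream out, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        int read;
        // read() may return fewer bytes than chunkSize if the underlying
        // stream delivers less, so a bigger buffer is only an upper bound.
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }
}
```

For example, ChunkedCopy.copy(in, out, 64 * 1024) would request up to 64 KB per read instead of 4 KB.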