Adventures in X Programming
I have reached the end of a more than three-week exploration into whether Firefox page load times on Linux could gain a nominal speed improvement. Vlad asked me to look into using the Xlib Shared Memory Extension (XShm) for image rendering. I have created an extension to Cairo that can use XShm and plugged it into Firefox’s image library, but now that it is working, the performance results are mixed and I am unsure what to do with it.
Let me explain exactly what XShm would be useful for in Firefox: faster rendering of images while they are decoding. Some readers of my last post seemed to be under the impression that static images were being drawn using XPutImage and that the plan was to use XShmPutImage instead. Given the way I wrote the article (namely, badly), I can’t really blame them. Let me rectify that.

In Firefox, while an image is being decoded, the partial image is repeatedly drawn to the screen. Once the image is complete, it is of course optimized into a pixmap on the server, but until then, every redraw sends the image data to the server via XPutImage. This is where XShm could conceivably help.
However, there is one hitch in this process. Due to the asynchronous nature of X, setting up the XShm image can fail under a certain chain of events, so the application has to synchronize with the server (by calling XSync) before destroying the image. That round trip throws a major wrench in application speed if these images are being created all the time.
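To make that trade-off concrete, here is a toy cost model. It is only a sketch, not a measurement: the function names and the three cost parameters are hypothetical, and it assumes each image pays one fixed XSync cost on top of its per-draw cost.

```python
def per_image_cost(draws, cost_per_draw, sync_cost=0.0):
    """Total cost of showing one image while it decodes.

    draws         -- how many times the partial image is drawn
    cost_per_draw -- cost of one draw (XPutImage or XShmPutImage)
    sync_cost     -- one-off XSync cost per image (0 on the XPutImage path)
    All units are arbitrary; these are assumptions, not measured values.
    """
    return draws * cost_per_draw + sync_cost


def break_even_draws(put_cost, shm_cost, sync_cost):
    """Number of partial draws at which the XShm path costs the same.

    Returns None when the XShm draw is not cheaper than the XPutImage
    draw, in which case the XSync cost can never be recovered.
    """
    saving = put_cost - shm_cost
    if saving <= 0:
        return None
    return sync_cost / saving
```

Under this model, an image that is drawn only once or twice before it finishes decoding pays for the XSync without earning it back, while an image that is redrawn many times comes out ahead. That would be consistent with some pages getting faster and others slower.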
When I ran Talos on a Firefox that used XShm for images, timings were all over the place. Some web pages showed a noticeable improvement, while others showed a noticeable regression. Fiddling with the code, I was able to get it to the point where there was a sizable overall performance improvement on my machine, but I really can’t be sure that will translate into the same thing on most Linux machines.
So I am now in a quandary about what the heck to do with all this work. It seems so close to being something very useful, yet I can’t seem to draw consistent results out of it. Certain machines or web sites would most likely see regressions. I suppose I’ll let it sit around for a while until I decide how I can improve it.
Tags: Cairo, Firefox, Gecko, images, performance, X11, Xlib, Xlib Shm, XShm
July 5th, 2008 at 10:54 pm
Eric, my X fu is weak, so bear with me for a bit here. Is it possible to keep doing what we do now (the XPutImage) until we get the callback from X (or whatever method it uses to communicate this information) indicating that the XShm call succeeded, and then switch to the shm thing? This means more memory usage between the Shm call and us knowing that it succeeded than we use now, but that might be ok…
In other words, we’d have three stages in the life of an image:
1) Image is not all decoded, needs to be drawn, not in shared memory, use XPutImage.
2) Image is not all decoded, in shared memory, use XShm.
3) Image decoded, optimize.
Or would we never hit stage 2 in practice?
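The three stages above can be sketched as a small state machine. This is a hypothetical illustration only: the `Stage` names, `next_stage`, and the two boolean inputs are invented here and are not Firefox, Cairo, or Xlib APIs. `shm_confirmed` stands in for "we have learned that the XShm setup succeeded", however that is actually reported.

```python
from enum import Enum


class Stage(Enum):
    PUT_IMAGE = 1   # partially decoded, not in shared memory: draw with XPutImage
    SHM = 2         # partially decoded, shared memory confirmed: draw with XShm
    OPTIMIZED = 3   # fully decoded and optimized into a server-side pixmap


def next_stage(stage, shm_confirmed, decode_complete):
    """Return the stage an image moves to after one decoding step."""
    # Once decoding finishes the image is optimized, whatever stage it was in.
    if stage is Stage.OPTIMIZED or decode_complete:
        return Stage.OPTIMIZED
    # Switch to shared memory only after the setup is known to have succeeded.
    if shm_confirmed:
        return Stage.SHM
    # Otherwise keep drawing the way we already were.
    return stage
```

In this sketch, an image whose decode finishes before the confirmation arrives goes straight from stage 1 to stage 3, which is exactly the "would we never hit stage 2" case.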
Another interesting question here is how often we in fact end up having to draw an image that is only partially decoded and whether we can reasonably reduce the number of times this happens (in other words, try hard to only draw images that have been optimized to pixmaps). Would that have similar performance benefits without the obvious pain of the XSync hit?
I should also note that the XSync hit will always be worse (a LOT worse) on a non-local X, so it’s something we should, imo, either avoid or make optional (e.g. disable the Shm thing based on a pref) if possible.
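One way to picture the "disable it based on a pref" idea is a guard like the one below. It is a rough sketch under stated assumptions: both function names are hypothetical, the pref is just a boolean passed in, and the check on the display string is a conservative heuristic rather than anything Xlib provides. Real code would still need to ask the server whether the extension is available (for example with XShmQueryExtension) and cope with the setup failing.

```python
def display_looks_local(display):
    """Guess whether an X display string names a local server.

    Heuristic (an assumption, not an Xlib rule): ":0" and "unix:0" style
    strings are treated as local; anything with another host part,
    including "localhost:10.0" from a forwarded connection, is treated
    as remote.
    """
    host, sep, _ = display.rpartition(":")
    if not sep:
        return False  # no ":" at all, so not a usable display string
    return host in ("", "unix")


def should_use_shm(pref_enabled, display):
    """Use the shared memory path only if the pref allows it and the
    display looks local."""
    return pref_enabled and display_looks_local(display)
```

For example, `should_use_shm(True, ":0")` allows the shared memory path, while `should_use_shm(True, "remotehost:0")` and `should_use_shm(False, ":0")` both fall back to plain XPutImage.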
July 6th, 2008 at 8:23 am
I think it would be good to have this integrated into Cairo and let the application decide whether it wants to use shm or not. Depending on what the application is, this may or may not be useful, but you should leave the option open for cases where it does make a significant difference.