A bit of a long-winded title, but I wanted to get it all in there.

EDIT: The long title broke my new design, so I shortened it :)

Why?

  • I have a (highly optimised) report that takes way too long to generate, up to around 30 seconds
  • Too many variables to prime caches for every possible combination
  • Jakob Nielsen of usability fame says:

    10 seconds is about the limit for keeping the user's attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially important if the response time is likely to be highly variable, since users will then not know what to expect.

  • Personally, I don't think the browser's built-in progress bar is enough feedback for today's web users

How?

First off, I configured the report to cache the generated data in memcached for a day. That's great: now if the user requests that particular report (with the exact same combination of filters etc.) twice in a day, the second request will be really snappy.

What I now need is a way of priming the cache while letting the user know what's going on. This is where Gearman comes in handy. I've used message queues in the past, but they don't have the notion of progress that a job queue does.

For example purposes, I've written a little script that fetches URLs and titles from a list of RSS feeds. Here's how it works:

  1. When memcached tells my application it can't find the droids it's looking for, my application now adds a background job to the Gearman queue.

screenshot1

  2. The application then forwards the user to a job status page, complete with a progress bar courtesy of the fantastic Dojo Toolkit. The user sits and watches in eager anticipation.

screenshot2

  3. A Gearman worker script is kept running on the server by supervisord. It comes along, picks the job from the queue and sets about fetching the data. As it progresses, it pings the Gearman server with its current status.

screenshot3

  4. The page the user is glued to repeatedly pings the web server to check on the progress of the job. As the job server gets updates from the worker, the web server passes them on to the user via the dijit.ProgressBar.

screenshot3

  5. Once the job is complete, the JavaScript forwards the user back to the original URL. This time the data is served from the cache, and we're in business.

screenshot4
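To make the flow above a bit more concrete, here is a minimal in-memory sketch of it. It is written in Python rather than the PHP of the sample scripts, and it is only an illustration: the dictionaries and the deque stand in for memcached and the Gearman server, and every name in it (`cache_key`, `request_report`, `job_status`, `work_one`, the `/status/` URL) is made up for this example. It also skips the one-day cache expiry. Progress is tracked as a "done out of total" pair, which is roughly how Gearman status updates are expressed (a numerator and a denominator).

```python
import hashlib
import json
from collections import deque

cache = {}        # stand-in for memcached (no expiry in this sketch)
jobs = {}         # stand-in for the job server's status table
queue = deque()   # stand-in for the Gearman job queue


def cache_key(report, filters):
    """One key per report + filter combination, independent of filter order."""
    payload = json.dumps({"report": report, "filters": filters}, sort_keys=True)
    return "report:" + hashlib.sha1(payload.encode("utf-8")).hexdigest()


def request_report(report, filters):
    """Web request: render from cache, or queue a job and redirect to a status page."""
    key = cache_key(report, filters)
    if key in cache:
        return {"action": "render", "data": cache[key]}
    if key not in jobs:  # don't queue the same report twice
        jobs[key] = {"done": 0, "total": 0, "finished": False}
        queue.append((key, filters))
    return {"action": "redirect", "location": "/status/" + key}


def job_status(key):
    """Polled by the status page; returns a percentage for the progress bar."""
    job = jobs[key]
    percent = 100 * job["done"] // job["total"] if job["total"] else 0
    return {"percent": percent, "finished": job["finished"]}


def work_one(fetch):
    """Worker: take one job, fetch each feed, and record progress after each one."""
    key, filters = queue.popleft()
    feeds = filters["feeds"]
    jobs[key]["total"] = len(feeds)
    results = []
    for done, url in enumerate(feeds, start=1):
        results.append(fetch(url))
        jobs[key]["done"] = done
    cache[key] = results  # prime the cache for the redirect back
    jobs[key]["finished"] = True
```

In this sketch the first `request_report` call returns a redirect to the status page, the page polls `job_status` until `finished` is true, and a second `request_report` with the same filters is then answered straight from the cache.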

Get to the code already

You can browse the sample scripts at GitHub. They're not amazing; they're just meant to show the method I've described here. As a side note, I wondered if anyone had come up with a Sinatra-style framework for PHP. It turns out there were quite a few. I chose the Slim Framework because I could see from its homepage that it took PHP 5.3 lambdas; the others didn't make that too clear.

Has anyone else done anything similar? Am I doing something stupid? Comments are welcome.