Robert Eisele
Systems Engineer, Architect and DBA

Dynamic thumbnail generation on the static server

It's quite natural to use the project's main programming language to generate user thumbnails or thumbnails of other images. Let's assume that we generate three sizes of every picture. With a rigid system, you would generate every size up front and store the files temporarily on the webserver until they have been uploaded to the static server. That means a lot of write operations on the webserver to generate the small pictures, plus further writes to upload each of them to the static server, and that is before counting the CPU cycles.

I thought about the problem for a while and concluded that the best and fastest way is to generate the thumbnails on the static server at the moment they are needed. Not every thumbnail is needed in every size, and with this approach we can also add new sizes without changing the application logic.

However, this requires a strong caching mechanism on both the client and the server. We assume that each picture is uploaded to the server into a directory with an appropriate layout. I decided to use the following format for every image URI:

http:// {domain} / {prefix} / {random} [ -{size} ] . {extension}

Static webservers get their own domain so that requests are cookie-free, which reduces the request header size. The prefix is a simple token that indicates the image type, for example profile, album or group pictures, or whatever pictures you have on your site. The extension is the file extension, which is forced to be .jpg here. The more interesting part of the URI is the optional size indicator. A small user profile picture could look as follows:

http://www.example.com/user/4/f/d/6/0/2/3/1/9/0a2f9c6258-s.jpg
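
To make the URI format concrete, here is a minimal sketch of how such a path could be split into its parts. The function name parse_image_path and its return values are assumptions made for illustration and are not part of the script shown later; the size letters "sml" are passed in as a parameter.

```lua
-- Hypothetical helper (not part of the original script): splits a request
-- path of the form /{prefix}/{random}[-{size}].jpg into
-- prefix, size (nil if absent) and the path of the original image.
local function parse_image_path(path, sizes)
	-- the prefix is the first path segment, e.g. "user"
	local prefix = path:match("^/(%a+)/")
	if not prefix then
		return nil
	end

	-- try the sized form first: ...-s.jpg, ...-m.jpg, ...
	local base, size = path:match("^(.*)%-([" .. sizes .. "])%.jpg$")
	if base then
		return prefix, size, base .. ".jpg"
	end

	-- no size indicator: the path already names the original picture
	if path:match("%.jpg$") then
		return prefix, nil, path
	end

	return nil
end

local prefix, size, original = parse_image_path("/user/4/f/d/6/0/2/3/1/9/0a2f9c6258-s.jpg", "sml")
print(prefix, size, original)
```

The sketch assumes that the random part of the name never ends in a dash followed by one of the size letters; otherwise an original would be mistaken for a thumbnail.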

If you do not specify the size indicator, you'll get the original picture as uploaded by the user. Now let's come to the interesting part of the whole thing: the dynamic generation of the various sizes. Sure, we could use PHP or any other scripting language on the static webserver as well, but my concern is to keep the footprint on the static webservers small. So I use a patched lighttpd as the webserver, which lets you build nice gadgets using mod_magnet. I've written the whole thing in Lua, which calls ImageMagick through os.execute() for every image that is not yet cached. After the image has been generated, it is sent to the user with a few caching hints for the user agent.

If you did not mount the partition with noatime, you can now write a garbage collector that removes unused cache files based on their last access time, in case you need to save disk space. Okay, what's missing? The most important part, the source code:

-- jpeg quality
quality = "85"

-- docroot of the images
docroot = "/var/www"

-- list of all sizes used
map_sizes = "sml"

-- map of prefixes : sizes to width/height - if the size doesn't exist, an error will be thrown
map = {
	["user:s"]  = {60,  60},
	["user:m"]  = {150, 0},

	["album:s"] = {60,  60},
	["album:m"] = {160, 120},
	["album:l"] = {480, 0}
}

-- sets the caching headers; returns 304 if the client's copy is still valid, 0 otherwise
function check_header(st)

	-- the leading "!" makes os.date() format the time in UTC, as the GMT suffix requires
	local last_modified = os.date("!%a, %d %b %Y %H:%M:%S GMT", st["st_mtime"])

	lighty.header["Last-Modified"] = last_modified
	lighty.header["ETag"] = st["etag"]

	-- cache lifetime: 30 days, in seconds
	local max_age = 60 * 60 * 24 * 30

	-- HTTP 1.0
	lighty.header["Expires"] = os.date("!%a, %d %b %Y %H:%M:%S GMT", lighty.env["request.time"] + max_age)

	-- HTTP 1.1: max-age is a lifetime in seconds, not an absolute timestamp
	lighty.header["Cache-Control"] = "max-age=" .. max_age

	local if_none_match = lighty.request["If-None-Match"]
	local if_modified_since = lighty.request["If-Modified-Since"]

	if nil ~= if_none_match and if_none_match == st["etag"] then
		return 304
	end

	if nil ~= if_modified_since and if_modified_since == last_modified then
		return 304
	end

	return 0
end

st = lighty.stat(lighty.env["physical.path"])

if nil ~= st then

	x = check_header(st)

	if x ~= 0 then
		return x
	end
else

	-- "%." matches a literal dot in a Lua pattern ("\." is not a valid string escape)
	original = string.gsub(lighty.env["physical.path"], "([%w/]+)-[" .. map_sizes .. "]%.", "%1.")

	if original == lighty.env["physical.path"] or nil == lighty.stat(original) then
		return 404
	else
		key = string.gsub(lighty.env["physical.path"], "^" .. docroot .. "/(%a+)/.*-([" .. map_sizes .. "])%.jpg$", "%1:%2")

		if nil == map[key] then
			return 404
		end

		-- Generate the thumbnail using imagemagick
		if map[key][2] > 0 then
			os.execute("convert " .. original .. "[0] -quality " .. quality .. " -resize x" .. (map[key][2] * 2) .. " -resize '" .. (map[key][1] * 2) .. "x<' -resize 50% -gravity center -crop " .. map[key][1] .. "x" .. map[key][2] .. "+0+0 +repage " .. lighty.env["physical.path"])
		else
			os.execute("convert " .. original .. "[0] -quality " .. quality .. " -resize " .. map[key][1] .. "x " .. lighty.env["physical.path"])
		end

		st = lighty.stat(lighty.env["physical.path"])

		-- the thumbnail could not be generated: report an internal server error
		if nil == st then
			return 500
		end

		x = check_header(st)
	end
end
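
The script above passes file paths to the shell unquoted. As a hedged sketch of how the command strings could be hardened, the following helper wraps each argument in single quotes before it reaches os.execute(). The names shell_quote and resize_command are assumptions made for illustration and do not appear in the script; only the width-only resize variant is shown.

```lua
-- Hypothetical helper: wraps a string in single quotes for a POSIX shell
-- and escapes embedded single quotes as '\''.
local function shell_quote(s)
	-- the parentheses drop gsub's second return value (the match count)
	return "'" .. (s:gsub("'", "'\\''")) .. "'"
end

-- Hypothetical helper: builds the width-only convert command with
-- quoted source and target paths.
local function resize_command(original, target, width)
	return "convert " .. shell_quote(original .. "[0]")
		.. " -resize " .. width .. "x "
		.. shell_quote(target)
end

print(resize_command("/var/www/user/a/b/0a2f9c6258.jpg", "/var/www/user/a/b/0a2f9c6258-m.jpg", 150))
```

The result of resize_command() would then be handed to os.execute() in place of the concatenated string.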

To get the script working, you have to add the following line to your lighttpd.conf, assuming that the lighttpd configuration directory is /etc/lighttpd:

magnet.attract-physical-path-to = ( "/etc/lighttpd/image.lua" )

Note: The patch mentioned above, which I pushed to lighttpd's bug tracker, is not strictly required here. I only use the lighty.env["request.time"] variable to save the time() syscall inside Lua.
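
The garbage collector mentioned before the listing is not part of the script. A minimal sketch, assuming a find(1) that supports -atime and -delete (GNU and BSD find both do) and a hypothetical function name gc_command, could build the cleanup command like this:

```lua
-- Hypothetical sketch: builds a find(1) command that deletes generated
-- thumbnails (files ending in -s.jpg, -m.jpg, ...) which have not been
-- accessed for more than max_age_days days. Originals are left alone.
local function gc_command(docroot, sizes, max_age_days)
	return string.format(
		"find '%s' -type f -name '*-[%s].jpg' -atime +%d -delete",
		docroot, sizes, max_age_days)
end

local cmd = gc_command("/var/www", "sml", 30)
print(cmd)

-- to actually run it, for example from a cron job:
-- os.execute(cmd)
```

How precise the access times are depends on the mount options of the partition, so the threshold of 30 days is only an example value.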


2 Comments on „Dynamic thumbnail generation on the static server”

Robert

Hi nitrox,
this is a good point. The script blocks at os.execute(), but I have had no problems in production so far, because the cache hit rate is around 99.7% and each thumbnail generation takes < 0.2 seconds on the SAN we use. There are also several servers handling the traffic, so I reduced the problem statistically and with more expensive hardware.

But in general you're right that lighty will block during thumbnail generation.

nitrox

And you're sure lighty isn't blocking?

 
