
In early 2007, a nice trick made the rounds: reducing the number of parallel connections by bundling CSS and JavaScript files into one larger file. I immediately started looking into how to optimize this approach. But why should you combine static files at all?

The browser needs styling and scripting information to finish rendering a page. When a visitor opens your site for the first time, the browser has to load all static files one after another, with only limited parallelization. Every further click should be faster, even on an unoptimized site, because those files are then cached. But as long as not all data has been delivered to the client, the user sees the previous page or, even worse, a blank page. The problem stems from the fact that browsers are allowed to open only a small number of parallel connections to one domain. Newer browsers have increased this limit, but the main problem remains the same: too many connections.

I started with a simple PHP script that did exactly what is described on SitePoint. But when I started hacking together some Lua scripts to handle special requests directly inside lighttpd using mod_magnet (for example, my already published approach to dynamic thumbnail generation), I also added the combining script, with a bit more intelligence: beyond the actual combining, it optimizes and caches the data on several levels. If you do not want to use lighttpd as your web server, you can use mod_concat for Apache or a PHP script like the one by Niels Leenheer and adapt it to your needs.

Okay, enough preface. Let's dive a bit deeper. Assume we have the following <link>/<script> tags to include the corresponding files in the current document:

<script src="/scripts/jquery-min.js" type="text/javascript"></script>
<script src="/scripts/home.js" type="text/javascript"></script>
<script src="/scripts/calculation.js" type="text/javascript"></script>
<script src="/scripts/helpers.js" type="text/javascript"></script>

<link href="/style/reset.css" type="text/css" rel="stylesheet" />
<link href="/style/general.css" type="text/css" rel="stylesheet" />
<link href="/style/home.css" type="text/css" rel="stylesheet" />
<link href="/style/box.css" type="text/css" rel="stylesheet" />

Our goal is to reduce the number of requests and, moreover, to achieve a high cache hit rate. Thus, we should not stack too many individual files into one global file (okay, you could stack everything into one file, but remember: Internet Explorer truncates the content after a certain size). Files that are used together on certain groups of subpages, however, are good candidates for one group. My proposal for the example above would be:

<script src="/scripts/jquery-min.js,/helpers.js" type="text/javascript"></script>
<script src="/scripts/home.js,/calculation.js" type="text/javascript"></script>

<link href="/style/reset.css,/box.css,/general.css" type="text/css" rel="stylesheet" />
<link href="/style/home.css" type="text/css" rel="stylesheet" />
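
To make the combined addresses concrete: the first path segment (/scripts or /style) is a fixed prefix, and every comma-separated entry after it is relative to that prefix. Here is a minimal standalone sketch in plain Lua of how such an address could be split back into individual files. The name expand is a hypothetical helper, and the pattern assumes that file names contain no commas:

```lua
-- Hypothetical helper, not part of lighttpd: split a combined address
-- into the individual file paths it stands for.
local extensions = { scripts = "js", style = "css" }

local function expand(path)
  -- The first path segment selects the group ("scripts" or "style")
  local group, rest = string.match(path, "^/(%w+)(/.*)$")
  local ext = group and extensions[group]
  if not ext then
    return nil -- unknown group, nothing to combine
  end

  local files = {}
  -- Every entry ends with the group's extension; entries are comma-separated
  for name in string.gmatch(rest, "(.-[.]" .. ext .. "),?") do
    files[#files + 1] = "/" .. group .. name
  end
  return files
end

local files = expand("/scripts/jquery-min.js,/helpers.js")
print(table.concat(files, "\n"))
```

For the first script tag above this prints /scripts/jquery-min.js and /scripts/helpers.js, while an address with an unknown first segment yields nil.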

Grouping like this is a good approach for real-world scenarios, because the cache hit rate is high for returning users and loading is still faster than fetching each file separately. A better way would be a package format, like XPI files, that stacks all files into one compressed archive, as I already mentioned in my article about the future needs of web browsers. If you looked closely at the snippet above, you may have noticed that I use a comma-separated format similar to the other implementations out there, with the difference that I use a fixed prefix (/scripts and /style) and all other files in the list are relative to that path.

Two last remarks about my implementation. First, the cache is really persistent: the only way around it is to add GET parameters, which make the address unique on the server. Second, I patched mod_magnet to expose lighttpd's simple hashme() function to the Lua scope. The function is used to generate ETags inside lighttpd and is as simple as it is fast. If you don't want to patch mod_magnet, look for another hashing method such as MD5 and include it in Lua. To optimize the combined code I use csstidy for CSS and the YUI Compressor for JavaScript. But now the code:

etag = hashme(lighty.env["request.uri"])

-- Answer conditional requests early; the client sends the ETag back in quotes
if lighty.request["If-None-Match"] == "\"" .. etag .. "\"" then
	return 304
end

-- Initialize some variables
prefix = "/var/www/"
cache = "/var/cache/" .. etag
types = { scripts = "text/javascript", style = "text/css"}
extensions = { scripts = "js", style = "css" }

-- Set Etag Header (this approach is also portable over a cluster of servers)
lighty.header["Etag"] = "\"" .. etag .. "\""

-- Check gzip version
gzip = false
if nil ~= lighty.request["Accept-Encoding"] then
	gzip = nil ~= string.find(lighty.request["Accept-Encoding"], "gzip")
end

-- Retrieve type and send the appropriate header
type = string.match(lighty.env["request.uri"], "^/(%w+)/.*")
lighty.header["Content-Type"] = types[type]


-- If file does not exist
if nil == lighty.stat(cache) then
	exec = ""
	files = string.gsub(lighty.env["uri.path"], "^/" .. type, "")
	for a in string.gmatch(files, "(.-[.]" .. extensions[type] .. "),?") do

		if nil == lighty.stat(prefix .. type .. a) then
			print("NOT FOUND: " .. prefix .. type .. a)
			return 404
		end
		exec = exec .. prefix .. type .. a .. " "
	end

	if "" == exec then
		print("NO VALID FILE: " .. lighty.env["uri.path"])
		return 404
	end

	-- Use cat to merge files (fastest method so far)
	os.execute("/bin/cat " .. exec .. ">" .. cache)

	-- Optimize with third party applications
	if type == "style" then
		os.execute("/usr/bin/csstidy " .. cache .. " --remove_last_\\;=true --silent=true --template=high " .. cache)
	else
		os.execute("/usr/local/java/bin/java -jar /usr/local/java/yuicompressor.jar --charset utf-8 --type js " .. cache .. " -o " .. cache)
	end

	-- Generate gzip version
	os.execute("/bin/gzip -c " .. cache .. " > " .. cache .. ".z")
end

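-- Get request time using magnet patch (or use the general Lua way, os.time(); this is optimized to save the time(NULL) call)
lifetime = 60 * 60 * 24 * 30
ts = lighty.env["request.time"] + lifetime

-- HTTP 1.0 (the leading "!" makes os.date format the time as UTC)
lighty.header["Expires"] = os.date("!%a, %d %b %Y %H:%M:%S GMT", ts)
-- HTTP 1.1 (max-age is a lifetime in seconds, not an absolute timestamp)
lighty.header["Cache-Control"] = "max-age=" .. lifetime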

-- Decide what version we send to the client
if gzip then
	lighty.header["Content-Encoding"] = "gzip"
	lighty.env["physical.path"] = cache .. ".z"
else
	lighty.env["physical.path"] = cache
end
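
The two caching headers near the end of the script work differently: Expires takes an absolute date in GMT, while max-age takes a lifetime in seconds. Here is a small standalone sketch of that distinction, runnable outside lighttpd. The name cache_headers is a hypothetical helper, and os.time() stands in for lighty.env["request.time"]:

```lua
-- Hypothetical helper, not part of lighttpd: build both caching headers
-- from a request time and a lifetime, both given in seconds.
local function cache_headers(now, lifetime)
  return {
    -- Absolute expiry date; the leading "!" makes os.date use UTC
    ["Expires"] = os.date("!%a, %d %b %Y %H:%M:%S GMT", now + lifetime),
    -- Relative lifetime in seconds
    ["Cache-Control"] = "max-age=" .. lifetime,
  }
end

local thirty_days = 60 * 60 * 24 * 30
-- Assumption: os.time() replaces the patched lighty.env["request.time"]
local headers = cache_headers(os.time(), thirty_days)
print(headers["Expires"])
print(headers["Cache-Control"])
```

The second line printed is max-age=2592000, no matter when the request arrives. Only the Expires date moves with the clock.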

To get the script working, add the following line to your lighttpd.conf, assuming that your lighttpd configuration directory is /etc/lighttpd:

magnet.attract-physical-path-to = ( "/etc/lighttpd/combine.lua" )
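
If you would rather not patch mod_magnet, any stable string hash written in plain Lua can take the place of hashme() for naming cache files and building ETags. The following is only a sketch under that assumption: simple_hash is a hypothetical name, and the djb2-style algorithm is a stand-in, not what hashme() does internally.

```lua
-- Assumption: a djb2-style string hash as a stand-in for the patched hashme().
-- It is NOT lighttpd's own algorithm; it only needs to be stable per address.
local function simple_hash(str)
  local h = 5381
  for i = 1, #str do
    -- Multiply by 33, add the next byte, and keep the value within 32 bits
    h = (h * 33 + string.byte(str, i)) % 4294967296
  end
  return string.format("%d", h)
end

local uri = "/scripts/jquery-min.js,/helpers.js"
print(simple_hash(uri))
-- A GET parameter changes the address and therefore the cache key
print(simple_hash(uri .. "?v=2"))
```

The same address always hashes to the same value, so the ETag stays consistent across a cluster of servers. Appending a GET parameter produces a new key and thereby a fresh cache entry.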