
CloudFront, nginx and gzip: not that simple

Published on Oct 3 2011

Amazon CloudFront's requests to origin servers are marked as HTTP/1.0. By default, nginx does not send a gzipped response to an HTTP/1.0 request. To fix this, set gzip_http_version to 1.0 and it will work correctly.

Enabling gzip compression is key to a high-performance website and is part of Google's Web Performance Best Practices. It can reduce the size of HTML, CSS, JavaScript and other textual files by up to 80%, resulting in shorter transfer times and improved overall page load times.

nginx is a highly popular and efficient event-driven web server for serving static files and/or acting as a frontend or spoonfeeder for its relatively bloated counterparts like Apache.

How does gzip work?

When browsers make a request to a server, they send an Accept-Encoding header. For IE8 this is Accept-Encoding: gzip, deflate. The server then knows that this browser accepts data compressed using gzip or deflate.

An HTTP request from IE8 may look like this:

GET / HTTP/1.1
Accept: image/gif, image/jpeg, image/pjpeg, image/pjpeg, application/x-ms-application, application/x-ms-xbap, application/, application/xaml+xml, */*
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.2; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Accept-Encoding: gzip, deflate
Connection: Keep-Alive

Now, the server sees Accept-Encoding: gzip, deflate, sends the response compressed as gzip and marks it with the response header Content-Encoding: gzip. The server can also optionally send another header, Vary: Accept-Encoding. This tells proxies to vary the object in the proxy cache based on the Accept-Encoding header. The result is that the proxy will have a compressed and an uncompressed version of the file in cache (and maybe even three: uncompressed, gzip compressed, deflate compressed). Failing to provide the Vary header may result in the wrong encoding going to an incompatible browser. The Vary header was introduced in HTTP/1.1.

A cache finds a cache entry by using a key value in a lookup algorithm. The simplistic caching model in HTTP/1.0 uses just the requested URL as the cache key. However, the content negotiation mechanism (described in Section 10) breaks this model, because the response may vary not only based on the URL, but also based on one or more request-headers (such as Accept-Language and Accept-Charset).
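
On the nginx side, the negotiation described in this section comes down to a few directives. The following is a minimal sketch rather than a complete configuration; the MIME type list is an assumption about what a typical site serves, so adjust it to your own content.

```nginx
# Valid in the http, server or location context.
gzip       on;   # compress responses for clients that send Accept-Encoding: gzip
gzip_vary  on;   # add "Vary: Accept-Encoding" so caches key on the encoding

# text/html is always compressed once gzip is on; other types must be listed.
# This list is illustrative only. Check which MIME type your mime.types file
# assigns to .js files and list that one.
gzip_types text/css text/plain application/javascript application/json;
```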

Requests from CloudFront

Requests made from CloudFront to nginx are tagged as HTTP/1.0. To verify this, check your access logs for requests made by the user-agent Amazon CloudFront: their request lines will end in HTTP/1.0.
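
One way to make this easier to see is to log the protocol and the Accept-Encoding header explicitly. This is only a sketch: the format name cf_debug and the log path are made up for illustration, while $server_protocol, $http_user_agent, $http_accept_encoding and $gzip_ratio are standard nginx variables.

```nginx
# In the http context. "cf_debug" is a hypothetical format name.
log_format cf_debug '$remote_addr "$request" proto=$server_protocol '
                    'ua="$http_user_agent" ae="$http_accept_encoding" '
                    'gzip_ratio=$gzip_ratio';

# In the server context. The path is an assumption; use your own log directory.
access_log /var/log/nginx/cf_debug.log cf_debug;
```

With a format like this, a request that was not compressed should show an empty gzip_ratio (logged as "-"), which makes it straightforward to spot HTTP/1.0 requests that asked for gzip but did not get it.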

Trying to be helpful, nginx assumes that an HTTP/1.0 client will not understand the Vary header (which was only introduced in HTTP/1.1), so by default it does not send such a client a gzipped response.

I have met people who tell me "CloudFront does not support gzip". This is not true. It just takes more tinkering than with other content delivery networks, but it works just the same.

Getting gzip to work on nginx with CloudFront

The fix is simple: set gzip_http_version to 1.0 (the default is 1.1). This enables nginx to serve gzipped content to HTTP/1.0 clients as well, ensuring that everyone who asks for gzipped content gets gzipped content. If you have just applied this fix, you need to invalidate the affected files in your CloudFront distribution so that it fetches fresh gzipped copies from your server.
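
As a rough sketch, the relevant part of the configuration might look like the following. Only gzip_http_version is the fix described here. The gzip_proxied line is an additional assumption on my part: nginx treats any request carrying a Via header as proxied and, with the default of gzip_proxied off, does not compress it, so if your logs show a Via header on requests from CloudFront you may need it as well.

```nginx
# Valid in the http, server or location context.
gzip               on;
gzip_http_version  1.0;   # default is 1.1; 1.0 lets HTTP/1.0 requests get gzip
gzip_vary          on;    # send "Vary: Accept-Encoding" alongside gzipped responses

# Assumption, not part of the original fix: only needed if requests from the
# CDN arrive with a Via header. Verify against your own access logs first.
gzip_proxied       any;
```

After editing, checking the configuration with nginx -t and reloading nginx applies the change.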

If you are using nginx as the origin server for CloudFront, it is highly recommended that you test your site with tools like Firebug or the developer tools in Chrome/Safari to confirm that the files are now being sent compressed and that everything works fine.