The HBO sitcom Silicon Valley hilariously followed Pied Piper, a team of developers with startup dreams to create a compression algorithm so powerful that high-quality streaming and file storage concerns would become a thing of the past.
In the show, Google is portrayed by the fictional company Hooli, which is after Pied Piper’s intellectual property. The funny thing is that, while being far from a startup, Google does indeed have a powerful compression engine in real life called Brotli.
This article is about my experience using Brotli at production scale. Despite being really expensive and a truly unfeasible method for on-the-fly compression, Brotli is actually very economical and saves cost on many fronts, especially when compared with gzip or lower compression levels of Brotli (which we’ll get into).
At that time, I was working as a freelance website performance consultant. I was really excited for the 20-26% improvement Brotli promised over Zopfli. Zopfli in itself is a dense implementation of the deflate compressor compared with zlib’s standard implementation, so the claim of up to 26% was quite impressive. And what’s zlib? It’s essentially the same as gzip.
So what we’re looking at is the next generation of Zopfli, which is an offshoot of zlib, which is essentially gzip.
A story of disappointment
It took a few months for major CDN players to support Brotli, but meanwhile it was seeing widespread adoption in tools, services, browsers and servers. However, the 26% dense compression that Brotli promised was never reflected in production. Some CDNs set a lower compression level internally while others supported Brotli at origin so that they only support it if it was enabled manually at the origin.
Server support for Brotli was pretty good, but to achieve high compression levels, it required rolling your own pre-compression code or using a server module to do it for you — which is not always an option, especially in the case of shared hosting services.
This was really disappointing for me. I wanted to compress every last possible byte for my clients’ websites in a drive to make them faster, but using pre-compression and allowing clients to update files on demand simultaneously was not always easy.
Taking matters into my own hands
I started building my own performance optimization service for my clients.
I had several tricks that could significantly speed up websites. The service categorized all the optimizations in three groups consisting of several “Content,” “Delivery,” and “Cache” optimizations. I had Brotli in mind for the content optimization part of the service for compressible resources.
Like other compression formats, Brotli comes in different levels of power. Brotli’s max level is exactly like the max volume of the guitar amps in This is Spinal Tap: it goes to 11.
Brotli:11, or Brotli compression level 11, can offer significant reduction in the size of compressible files, but has a substantial trade-off: it is painfully slow and not feasible for on demand compression the same way gzip is capable of doing it. It costs significantly more in terms of CPU time.
In my benchmarks, Brotli:11 takes several hundred milliseconds to compress a single minified jQuery file. So, the only way to offer Brotli:11 to my clients was to use it for pre-compression, leaving me to figure out a way to cache files at the server level. Luckily we already had that in place. The only problem was the fear that Brotli could kill all our processing resources.
I put my fears aside and built Brotli:11 as a configurable server option. This way, clients could decide whether enabling it was worth the computing cost.
It’s slow, but gradually pays off
Among several other optimizations, the service for my clients also offers geographic content delivery; in other words, it has a built-in CDN.
Of the several tricks I tried when taking matters into my own hands, one was to combine public CDN (or open-source CDN) and private CDN on a single host so that websites can enjoy the benefits of shared browser cache of public resources without incurring separate DNS lookup and connection cost for that public host. I wanted to avoid this extra connection cost because it has significant impact for mobile users. Also, combining more and more resources on a single host can help get the most of HTTP/2 features, like multiplexing.
I was happy. Clients were happy. But I didn’t have numbers. I started analyzing the impact of enabling this high density compression on public resources. For this, I recorded file transfer sizes of several popular libraries — including jQuery, Bootstrap, React, and other frameworks — that used common compression methods implemented by other CDNs and found that Brotli:11 compression was saving around 21% compared to other compression formats.
It’s important to note that some of the other public CDNs I compared were already using Brotli, but at lower compression levels. So, the 21% extra compression was really satisfying for me. This number is based on a very small subset of libraries but is not incorrect by a big margin as I was seeing this much gain on all of the websites that I tested.
Here is a graphical representation of the savings.
|Library||Original||Avg. of Common Compression (A)||Brotli:11 (B)||(A) / (B) – 1|
|Ant Design||1,938.99 KB||438.24 KB||362.82 KB||20.79%|
|Bootstrap||152.11 KB||24.20 KB||17.30 KB||39.88%|
|Bulma||186.13 KB||23.40 KB||19.30 KB||21.24%|
|D3.js||236.82 KB||74.51 KB||65.75 KB||13.32%|
|Font Awesome||1,104.04 KB||422.56 KB||331.12 KB||27.62%|
|jQuery||86.08 KB||30.31 KB||27.65 KB||9.62%|
|React||105.47 KB||33.33 KB||30.28 KB||10.07%|
|Semantic UI||613.78 KB||91.93 KB||78.25 KB||17.48%|
|three.js||562.75 KB||134.01 KB||114.44 KB||17.10%|
|Vue.js||91.48 KB||33.17 KB||30.58 KB||8.47%|
The results are great, which is what I expected. But what about the overall impact of using Brotli:11 at scale? Turns out that using Brotli:11 for all public resources reduces cost all around:
- The smaller file sizes are expected to result in lower TLS overhead. That said, it is not easily measurable, nor is it significant for my service because modern CPUs are very fast at encryption. Still, I believe there is some tiny and repeated saving on account of encryption for every request as smaller files encrypt faster.
- It reduces the bandwidth cost. The 21% savings I got across the board is the case in point. And, remember, savings are not a one-time thing. Each request counts as cost, so the 21% savings is repeated time and again, creating a snowball savings for the cost of bandwidth.
- We only cache hot files in memory at edge servers. Due to the widespread browser support for Brotli, these hot files are mostly encoded by Brotli and their small size lets us fit more of them in available memory.
- Visitors, especially those on mobile devices, enjoy reduced data transfer. This results in less battery use and savings on data charges. That’s a huge win that gets passed on to the users of our clients!
This is all so good. The cost we save per request is not significant, but considering we have a near zero cache miss rate for public resources, we can easily amortize the initial high cost of compression in next several hundred requests. After that, we’re looking at a lifetime benefit of reduced overhead.
It doesn’t end there
This all is done smoothly and seamlessly using our integration tools. The added benefit of this approach for clients is that the bandwidth on the public CDN is totally free with unprecedented performance levels.
Try it yourself!
Additionally, you can use our search tool to immediately find a corresponding resource on the public CDN by supplying a URL of a resource on your website. If none of these tools work for you, then you can check the relevant library page and pick the URLs you want.
Looking toward the future
We started by hosting only the most popular libraries in order to prevent malware spread. However, things are changing rapidly and we add new libraries as our users suggest them to us. You are welcome to suggest your favorite ones, too. If you still want to link to a public or private Github repo that is not yet available on our public CDN, you can use our private CDN to connect to a repo and import all new releases as they appear on GitHub and then apply your own aggressive optimizations before delivery.
What do you think?
Everything we covered here is solely based on my personal experience working with Brotli compression at CDN scale. It just happens to be an introduction to my public CDN as well. We are still a small service and our client websites are only in the hundreds. Still, at this scale the aggressive compression seems to pay off.
I achieved high quality results for my clients and now you can use this free service for your websites as well. And, if you like it, please leave feedback at my email and recommend it to others.