Wednesday, January 16, 2008

Memcached: When You Absolutely Positively Have to Get It To Scale the Next Day

Memcached has long been the answer to most questions containing the word “scale”. There are some spectacular memcached installations out there. Facebook is said to run 200 servers with 3TB of memory solely for servicing memcached; Shopify, Twitter, Digg, Slashdot and just about every other public-facing application depends on it. Facebook’s installation is said to deliver a 99% cache hit rate while servicing tens of thousands of requests a second.

Suppose your team have just finished working feverishly to implement “Virus-a-Go-Go”, your new Facebook widget that is guaranteed to soar to the top of the charts. You launch it, and sure enough you were right. Zillions of hits are suddenly raining down on your new widget.

Translation: they want a lot of time to make things faster, time you don’t have. What to do, what to do?

With apologies to Federal Express, the point of my title is that memcached may be one of the fastest things you can retrofit to your software to make it scale. Memcached, when properly used, has the potential to increase performance by hundreds or sometimes even thousands of times. I don’t know if you can quite manage it overnight, but desperate times call for desperate measures. Hopefully, the next time you are headed for trouble, you’ll start out with a memcached architecture in advance and buy yourself more time before you hit a scaling crunch. Meanwhile, let me tell you more about this wonder drug, memcached.

What is memcached?

Simply put, memcached sits between your database and whatever is generating too many queries on it and attempts to avoid repetitious queries by caching the answer in memory. If you ask it for something it already knows, it retrieves it very quickly from memory. If you ask it for something it doesn’t know, your code gets it from the database, copies it into the cache for future reference and hands it over.

The beautiful thing about memcached is that it can usually be added to your software without huge structural changes being necessary. It sits as a relatively transparent layer that does the same thing your software has always done, but just a whole lot faster. Most of the big sites use memcached to good effect. Facebook, for example, uses 200 quad core machines that each have 16GB of RAM to create a 3 Terabyte memcached that apparently has a 99% hit rate.


Here’s another beautiful thought: memcached gives you a way to leverage lots of machines easily instead of rewriting your software to eliminate the scalability bottleneck. You can run it on every spare machine you can lay hands on, and it soaks up available memory on those machines, so the cache holds more of your data and more of your requests are answered from memory. Cool! Think of it as a short-term Band-Aid to help you overcome your own personal Multicore Crisis.
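Spreading one cache over many machines usually comes down to the client deciding which server owns each key. Below is a minimal sketch of that idea using plain modulo hashing. The host names are hypothetical, and real client libraries often use consistent hashing instead, so that adding or removing a server does not reshuffle most keys.

```python
import hashlib

# Hypothetical memcached hosts; substitute your own spare machines.
SERVERS = ["cache1:11211", "cache2:11211", "cache3:11211"]


def pick_server(key, servers=SERVERS):
    """Map a key to one server by hashing it and taking the result modulo the pool size."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


# The same key always lands on the same server, so every client that
# shares this server list agrees on where to look.
placement = {key: pick_server(key) for key in ("user:1", "user:2", "user:3")}
```

Because every client computes the same mapping, the servers never need to talk to each other, which is a big part of why adding machines is so painless.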


Cache Performance Comparison

- see results section
- see section: So what my recommendations would be about using these caches for your application ?
- look at lots of user comments at the end of the post to learn more
