Amazon S3 = The Holy Grail
I should have posted this a few weeks ago, but better late than never. We now use Amazon S3 for a significant part of our storage solution. We're absolutely in love with it - and our customers are too (even if they don't know it).
As you probably know, SmugMug has been profitable since our first year, with no investment capital. We've had a great track record for keeping our customers' priceless photos safe and secure using only the profits we've accrued to purchase our storage (yes, I said purchase. We have no debt - we own all of our storage, we don't lease). And every SmugMug customer gets unlimited storage - so that's no mean feat. (Currently, unlimited means ~300TB of storage and nearly 500,000,000 images. To put that into perspective, that's more than 65,000 DVDs or 480,000 CDs).
But Amazon's S3 takes our storage architecture to the next level:
- Your priceless photos are stored in multiple datacenters, in multiple states, and at multiple companies. They're orders of magnitude more safe and secure.
- We'd already built a custom, low-cost, redundant, scalable storage infrastructure on commodity hardware. Nonetheless, it's significantly cheaper to use S3 than to use our own - especially when you factor multiple states & datacenters into the equation.
- Perhaps even more importantly, our cash-flow situation is vastly improved. Instead of paying $25,000 for a handful of terabytes of redundant storage up-front, even before they're used, we now pay $0.15/GB/month as we use it.
- When we have some sort of internal outage with storage, it doesn't matter - Amazon's always on. They eat their own dogfood - S3 is in production use on dozens of Amazon products. We've had storage-related internal outages a few times already, and our customers haven't been able to tell. Unfortunately, we'll still have rare outages on our site (everyone does), but storage is now vastly less likely to be part of the cause.
- I started writing our S3 interface on a Monday, and by that Friday, we were live and in production. It really is that simple to pick up and use, and it was basically a drop-in addition to our existing storage.
- It's fast. I don't mean 15K-SCSI-RAID0-fast, but I do mean internet-latency-fast. It's basically as fast as our internal local storage + the roundtrip speed of light to Amazon. I can measure the difference with computer timing, but in blind tests, humans haven't been able to tell the difference. Everything we serve from Amazon feels fast.
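To make the pay-as-you-go point concrete, here is a rough sketch of the arithmetic. The only figure taken from this post is the $0.15/GB/month storage price; the growth numbers, the function names, and the choice of 1024 GB per TB are assumptions made purely for illustration, and transfer charges are ignored.

```python
from typing import Sequence

# 2006 S3 storage price quoted above, in USD per GB per month.
STORAGE_USD_PER_GB_MONTH = 0.15

# Assumption: binary terabytes (1 TB = 1024 GB).
GB_PER_TB = 1024


def monthly_storage_cost(stored_tb: float) -> float:
    """Storage bill for one month, in USD, for the amount actually stored."""
    return stored_tb * GB_PER_TB * STORAGE_USD_PER_GB_MONTH


def cumulative_cost(stored_tb_by_month: Sequence[float]) -> float:
    """Total paid over several months as usage grows."""
    return sum(monthly_storage_cost(tb) for tb in stored_tb_by_month)


if __name__ == "__main__":
    # Hypothetical growth: start at 1 TB and add 1 TB each month.
    growth = [1, 2, 3, 4, 5, 6]
    for month, tb in enumerate(growth, start=1):
        print(f"month {month}: {tb} TB stored -> ${monthly_storage_cost(tb):,.2f}")
    print(f"six-month total: ${cumulative_cost(growth):,.2f}")
```

The point of the sketch is only that the bill tracks what is stored each month, rather than being paid up-front for capacity that sits empty.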
I hate to admit this, but Amazon has built a playing-field leveler. It's now much much easier for a competitor of ours to spring fully-formed from two guys in a garage than it was. Anyone who doesn't get on board with Amazon S3 (or the inevitable S3 competitors) may get left behind. I'm glad we're first, but I doubt it'll last.
Tim O'Reilly, technology visionary extraordinaire, recently said of Sun's new 'Thumper', the Sun Fire X4500: "This is the Web 2.0 server." While I think Tim has perhaps the clearest vision in the industry, and the Thumper does truly look awesome, this time I think he may have missed the mark. The Web 2.0 server is *any* cheap Linux box coupled with utility storage like S3.
Initially this post had a lot of technical detail (I am the 'Chief Geek', after all), but I removed it since it was probably getting boring. So this is the quick-and-dirty 'Business Case for Amazon S3 and How it Helps our Customers' post. If there's enough interest, I can write up a detailed post about exactly how we use S3, how it works in conjunction with our own local distributed filesystem, and post our S3 library (which was derived from someone else's). Post in the comments if that's of interest.
Also, we'll be presenting at a storage conference in Florida in late October (I'm sorry, I don't have the name of the con with me, but I'll update this post when I do), and a few other people have requested conference talks on the subject. Comment if that's of interest, too, so we know where to go speak.
Finally, one last geek thought: Anyone using the SmugMug API is now actually using multiple APIs through ours (depending on what you're doing, you may be using Google and/or Yahoo, but you're almost certainly using Amazon). The stack continues to grow.
UPDATE #1: In response to a comment below, I don't feel like we "bet the company" on S3 - every photo our customers entrust us with, we keep local copies in our existing distributed storage infrastructure. We use S3 as redundant secondary storage for use in cases of outages, data loss, or other catastrophe.
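The general shape of "local copies first, a second store as redundant backup" can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the actual SmugMug code or Amazon's API: `DictStore`, `StorageUnavailable`, `store_photo`, and `fetch_photo` are hypothetical names, and both backends are in-memory stand-ins.

```python
from typing import Dict


class StorageUnavailable(Exception):
    """Raised when a backend cannot serve a request right now."""


class DictStore:
    """In-memory stand-in for a storage backend (hypothetical)."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}
        self.online = True  # flip to False to simulate an outage

    def put(self, key: str, data: bytes) -> None:
        if not self.online:
            raise StorageUnavailable(key)
        self.blobs[key] = data

    def get(self, key: str) -> bytes:
        if not self.online or key not in self.blobs:
            raise StorageUnavailable(key)
        return self.blobs[key]


def store_photo(key: str, data: bytes, primary: DictStore, secondary: DictStore) -> None:
    """Keep a copy in the primary store and a redundant copy in the secondary."""
    primary.put(key, data)
    secondary.put(key, data)


def fetch_photo(key: str, primary: DictStore, secondary: DictStore) -> bytes:
    """Read from the primary store; fall back to the secondary on failure."""
    try:
        return primary.get(key)
    except StorageUnavailable:
        return secondary.get(key)
```

In this sketch, an outage or data loss in the primary store is invisible to the caller as long as the secondary still holds the copy. A real system would also need to handle failed secondary writes and re-sync the primary after recovery, which is left out here.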





August 13th, 2006 at 2:39 am
I’d love to hear more about how you use S3. It sounds fascinating. I’ve got a client who pays big monthly bucks for some dedicated commerce servers and doesn’t want to have to add more servers just because they’re starting to serve a lot more static content.
August 13th, 2006 at 6:22 am
As a geek, the more info, the better! I never get bored of reading stuff like this. I would, as Mark Draughn said, like to hear more. S3 is an amazing piece of (software) engineering, and it’s pretty impressive at the price they are selling it for. The other question I would have to ask is in relation to the data being served: do you download from Amazon and then serve to your customers, or does Amazon do the serving of content?
–Tiernan
August 13th, 2006 at 4:38 pm
[...] Don MacAskill, CEO and Chief Geek of SmugMug, one very impressive photo sharing site, has a post on his blog today about Amazon S3. There is a fair amount of info in it, but he says he had to remove some because he thought it might be boring! As a geek, this stuff can never be boring! So if you’re interested, have a look, and leave a comment asking for more info! [...]
August 14th, 2006 at 1:31 am
The moment I saw Amazon S3 I knew it was something special - the first thing I did was make sure that support for it got built into Cardbox (http://www.cardbox.com).
In a more reflective mood I’ve just started a two-week series of blog postings about S3 - the way it can transform business models and (in the second week) a careful look at the risks of using it. Before you bet the company on S3, you should check whether any of those risks apply to you! The first posting is at http://cardbox.wordpress.com/2006/08/14/s3-in-business-1/
August 14th, 2006 at 3:05 am
I have a small site and am interested in saving some $$. The technical details will be quite helpful. There is not a lot of info out there.
Keep up the good work.
August 14th, 2006 at 6:56 am
I’d love to hear more of the internal details.
August 14th, 2006 at 5:15 pm
This is really fascinating. It’s great that you’ve been able to leapfrog the competition by offering unlimited storage as part of your standard plans (and that the plans are differentiated by *traffic* limits, not storage limits).
It’s a bit of a gamble–what if suddenly lots and lots of users started pumping hundreds of gigs more per day into the system? But I suspect that’ll never happen. “Unlimited” is one of those things that appeals to everyone, but is taken advantage of by only a few. Probably a standard power-law here where 20% of the users cause 80% of your storage costs.
August 15th, 2006 at 2:55 am
I would like some of that boring technical geek detail please. Especially now with your added comment that you use S3 as a backup rather than as your live storage provider.
August 15th, 2006 at 10:49 am
I’d love to hear more about the inner workings of porting SmugMug to S3.
Is it really only used as a backup service? Do you not “trust” the reliability of it to serve your users directly?
-Dan
August 16th, 2006 at 9:31 am
+1 on the details of your S3 setup…. I’m currently assessing using their service in a similar manner that you are and would be interested to see how you’re using it.
August 18th, 2006 at 2:37 am
What about unfair competition? A competitor could download terabytes of data just to run up your costs. How do you protect against such attacks?
August 18th, 2006 at 5:15 am
[...] S3 is fantastic but I see some serious challenges in monetizing it directly. Using it as SmugMug does is grand but with TuneSafe and similar apps there is a problem. I hope some smart business chap comes along and shows us the way. [...]
August 18th, 2006 at 9:43 am
Another +1 on the details of your S3 use and library, etc. We are using it now as well and, like you, find it a great solution.
August 18th, 2006 at 1:08 pm
Please post the details on how exactly you are using S3. I am very interested to know how you guys did this.
August 22nd, 2006 at 2:33 pm
What made you decide not to use S3 for your primary data storage?
August 23rd, 2006 at 12:36 pm
This is pretty funny. I just finished publishing a blog entry on our site entitled “Amazon s3: the holy grail of bandwidth problems?” While the scale of our success is trivial compared to your current bandwidth, the magnitude of the effect was the same - s3 went a long way to saving us money, time, and letting us grow.
http://blog.pairwise.net/2006/08/22/amazon-s3-the-holy-grail-of-bandwidth-problems/
August 28th, 2006 at 5:52 pm
[...] In the meantime, you can watch a video of me describing a portion of our storage infrastructure here, both our physical local storage, and little bit about Amazon’s S3. [...]
August 30th, 2006 at 7:13 am
[...] Amazon’s S3 is so fast, cheap, and reliable, it’s changed SmugMug’s 300TB storage infrastructure. It levels the playing field so anyone can build the next Flickr, YouTube, or SmugMug without tons of cash for disks. [...]
September 18th, 2006 at 9:30 pm
[...] smugblog: blog » Amazon S3 = The Holy Grail There’s something very compelling about Amazon S3 and EC2 (ie storage and servers on demand). The costs are lower or on par what you’d pay for your own boxes, and you don’t have to commit upfront cash. Smugmug are storing 300TB..read about it (tags: smugmug amazon s3 storage) [...]
October 7th, 2006 at 2:28 pm
[...] Amazon Web Services is offering two relatively new services: ubiquitous storage via S3 and ‘elastic’ computing via EC2. The Cardbox folks have an in-depth analysis of S3 online, and Smugmug uses an S3 backend. This is very interesting. If it wasn’t for all the problems with the S3 contract, I would consider using this. [...]
October 10th, 2006 at 9:16 am
[...] Not too long ago, Amazon released their Simple Storage Service (or “S3” for short). It provides a hosted storage platform which developers can build all sorts of applications on top of. Smugmug, a popular photo sharing web site, is using it to store and host pictures. [...]
October 18th, 2006 at 3:37 pm
+1 on the inner workings of how you use S3. I’d be very interested in finding out more details.
October 18th, 2006 at 9:55 pm
Wait a minnit… Amazon’s website says that it’s:
$0.15 per GB-Month of storage used.
$0.20 per GB of data transferred.
300TB = 307200GB
307200 * $0.15 = $46,080.00/mo
Wow — you guys spend upwards of $46K/mo for secondary storage? And that doesn’t include the cost to upload all that data there (307200 * $0.20 = $61,440.00, but let’s be generous and say that’s a “one time” cost.)
October 19th, 2006 at 9:58 am
Yeah, and $46K/mo is *cheap* once you factor in TCO of doing your own storage.
We pay $46K and love it.
October 19th, 2006 at 8:56 pm
I’m surprised most posters are drinking in this Amazon propaganda so naively. Come on, was it a marketing guy at Amazon who wrote the initial post? But the real reason is probably that Amazon is paying smugmug for a glowing report and so Catskill willingly delivered. Darryl just begins to shed light on the issues involved. I think S3 is great for a startup to get going quickly but not as viable for an ongoing concern. Would you really trust your crown jewels so willingly to the fate of another company’s infrastructure?
October 20th, 2006 at 12:32 am
Vald is clearly trolling, but I’ll take the bait anyway:
We’re Amazon’s customers, not the other way around. As anyone can tell, we’re a serious business with serious competitors and looking for any advantage we can get. We found it in Amazon.
We’re paying them a hefty chunk of money every month, not the other way around, but the best part is that we’re actually saving money by doing so.
November 4th, 2006 at 2:37 am
[...] We’re in the article, since we’re a big believer in this “new” vision of Amazon’s. Amazon calls the stuff they’re exposing the “muck” of doing business online, and I think it’s a perfect term. Some people see this as some radical departure from Amazon’s core business, but I don’t at all. Just like much of Amazon’s business, it’s an evolution. They began as a bookstore online (no-one remembers this, but they weren’t the first. BookStacks was relatively huge when Amazon launched), and eventually evolved by adding more and more products. They sell nearly everything, including groceries, now. [...]
November 7th, 2006 at 4:48 pm
[...] OmniDrive is an online storage aggregator. This is something that’s been on my mind a lot since S3 launched, since I’d love to have most of my storage “in the cloud” where I can get to it from anywhere. A buddy of mine wrote Jungle Disk, which is very cool, using S3. I wouldn’t be surprised if OmniDrive is also using S3, but they haven’t said. The big idea is that your stuff, whether it’s your documents, photos, videos, music, whatever is normally “stuck” on your PC where you can’t get at it if you’re somewhere else. It’s also prone to loss, since your PC could crash or get infected with a virus or something. Using OmniDrive, you can get that data from anywhere. OmniDrive seems like most of the other online storage providers that have been around for awhile, and plenty more are cropping up now. I’m not sure what’s unique about their offering - their pitch sounded very similar to existing services. Nonetheless, it’s something that everyone wants, whether they realize it or not. Personally, I’m afraid things like Time Machine in Leopard are going to obsolete stuff like this, if Apple (and Microsoft) start linking to things like dot Mac, LiveDrive, or S3 for storage. [...]
November 8th, 2006 at 10:24 am
[...] Jeff Bezos just gave a great presentation and had an interesting chat with Tim O’Reilly here at the Web 2.0 Summit. I’ve written about Amazon’s web services a few times, including the BusinessWeek cover story this week. [...]
November 10th, 2006 at 6:39 am
[...] Amazon S3: The Holy Grail [...]
January 23rd, 2007 at 7:22 am
Can you please provide more details about S3 and how I can implement it? If you can provide some sample code, that would be great.
January 25th, 2007 at 1:57 pm
[...] I still haven’t posted the in-depth technical details and code samples I promised about our use of Amazon S3, but fear not - I’m actively working on it and will post it as soon as it’s done. [...]
January 30th, 2007 at 4:30 pm
[...] Amazon S3 = The Holy Grail [...]
February 6th, 2007 at 10:34 am
[...] Look no further than Amazon S3. Just check it out… It’s sweet, and allows for full customization and integration with your favorite open source backup solution. [...]
March 5th, 2007 at 7:56 am
Thanks for the help. I’ve got a lot of customers looking for off-site storage who would be interested in these services.
Thanks
May 3rd, 2007 at 6:31 pm
[...] Update: Don wrote in the comments SmugMug also uses over 200 TB of Amazon’s S3, which he talks about here. I’ve got to spend some more time reading his blog! [...]
August 14th, 2007 at 10:27 am
[...] Wow, more money… Actually, 500GB of data transfer is a lot if you’re not planning to host big pictures and videos. As proof, codinghorror (a developer blog that is regularly slashdotted/dugg) peaked at 9GB in one day on one of its most memorable posts. If you’re able to use 500GB of data transfer, I think you can easily cover the extra bandwidth cost with AdSense or another advertising tool. I would say that S3 bandwidth pays for itself (again, if you’re not running a flickr/youtube clone). Who knows? Maybe Amazon started EC2 when they noticed all the people moving their heavy content to S3. [...]
August 14th, 2007 at 10:40 am
[...] Wow, more money… In fact, 500GB of transfer is a lot if you’re not planning to host high-resolution photos and videos. As proof, codinghorror (a developer blog that is regularly linked by Slashdot/Digg) hit a peak of 9GB on one of its most memorable posts. If you’re able to use 500GB of transfer, I think you can easily get the money for the extra bandwidth with AdSense or another advertising tool. I’d say S3 bandwidth pays for itself (again, if you’re not running a flickr/youtube clone). Who knows? Maybe Amazon started EC2 when they noticed all the people moving their heavy content to S3. [...]
December 15th, 2007 at 4:42 pm
[...] One of the poster children of AWS is SmugMug. Don MacAskill is one of my favorite writers. He has a post “Amazon S3 = The Holy Grail“. At the very end is this little nugget of information - In response to a comment below, I don’t feel like we “bet the company” on S3 - every photo our customers entrust us with, we keep local copies in our existing distributed storage infrastructure. We use S3 as redundant secondary storage for use in cases of outages, data loss, or other catastrophe. [...]
January 29th, 2008 at 6:22 pm
I think it’s lame that the details were taken out. This article is fluff. One rule about blogs is LISTEN to your readers. Almost every posting here is requesting the information. yet no report?
January 29th, 2008 at 6:25 pm
@JoeJoe: I have *lots* of S3 details up. Just search for Amazon S3 on the sidebar.
February 5th, 2008 at 6:26 pm
[...] SmugBlog: Don MacAskill » Blog Archive » Amazon S3 = The Holy Grail But Amazon’s S3 takes our storage architecture to the next level: Your priceless photos are stored in multiple datacenters, in multiple states, and at multiple companies. They’re orders of magnitude more safe and secure. (tags: Backup Business MySQL hosting server Startup) [...]
February 20th, 2008 at 10:33 pm
[...] your system relies on AWS, always have a backup solution in place so that you can keep your basic service [...]
June 5th, 2008 at 8:07 pm
[...] uses Amazon Web Services(S3 and EC2) and very happy with [...]
July 8th, 2008 at 12:48 am
I strongly think that every new company should create an Amazon S3 account and start using it now, instead of spending 10K on storage.
Amazon S3 is a real blessing for many companies.