Amazon S3 = The Holy Grail

I should have posted this a few weeks ago, but better late than never. We now use Amazon S3 for a significant part of our storage solution. We're absolutely in love with it - and our customers are too (even if they don't know it).

As you probably know, SmugMug has been profitable since our first year, with no investment capital. We've had a great track record for keeping our customers' priceless photos safe and secure using only the profits we've accrued to purchase our storage (yes, I said purchase. We have no debt - we own all of our storage, we don't lease). And every SmugMug customer gets unlimited storage - so that's no mean feat. (Currently, unlimited means ~300TB of storage and nearly 500,000,000 images. To put that into perspective, that's more than 65,000 DVDs or 480,000 CDs).

But Amazon's S3 takes our storage architecture to the next level:

  • Your priceless photos are stored in multiple datacenters, in multiple states, and at multiple companies. They're orders of magnitude more safe and secure.
  • We'd already built a custom, low-cost commodity-hardware redundant scalable storage infrastructure. Nonetheless, it's significantly cheaper to use S3 than to use our own - especially when you factor multiple states & datacenters into the equation.
  • Perhaps even more importantly, our cash-flow situation is vastly improved. Instead of paying $25,000 for a handful of terabytes of redundant storage up-front, even before they're used, we now pay $0.15/GB/month as we use it.
  • When we have some sort of internal outage with storage, it doesn't matter - Amazon's always on. They eat their own dogfood - S3 is in production use on dozens of Amazon products. We've had storage-related internal outages a few times already, and our customers haven't been able to tell. Unfortunately, we'll still have rare outages on our site (everyone does), but storage is now vastly less likely to be part of the cause.
  • I started writing our S3 interface on a Monday, and by that Friday, we were live and in production. It really is that simple to pick up and use, and it was basically a drop-in addition to our existing storage.
  • It's fast. I don't mean 15K-SCSI-RAID0-fast, but I do mean internet-latency-fast. It's basically as fast as our internal local storage + the roundtrip speed of light to Amazon. I can measure the difference with computer timing, but in blind tests, humans haven't been able to tell the difference. Everything we serve from Amazon feels fast.
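
To make the cash-flow point concrete, here's a back-of-the-envelope sketch in Python. The $0.15/GB/month price is the figure quoted above; the 1,024 GB-per-TB conversion, the function names, and the growth numbers are illustrative assumptions, not real billing data. It only counts storage, not transfer.

```python
from typing import List

# Price quoted in the post: $0.15 per GB per month, held in cents to avoid float rounding.
S3_CENTS_PER_GB_MONTH = 15
# Assumption: binary terabytes (1 TB = 1,024 GB).
GB_PER_TB = 1024


def s3_monthly_cost_cents(stored_tb: int) -> int:
    """Monthly S3 storage bill, in cents, for `stored_tb` terabytes (storage only)."""
    return stored_tb * GB_PER_TB * S3_CENTS_PER_GB_MONTH


def cumulative_spend_cents(tb_by_month: List[int]) -> List[int]:
    """Running total of S3 spend when you pay each month only for what is stored."""
    total = 0
    running = []
    for stored_tb in tb_by_month:
        total += s3_monthly_cost_cents(stored_tb)
        running.append(total)
    return running


if __name__ == "__main__":
    # Hypothetical growth curve: one more terabyte stored each month.
    growth = [1, 2, 3, 4, 5]
    for month, cents in enumerate(cumulative_spend_cents(growth), start=1):
        print(f"month {month}: ${cents / 100:,.2f} spent so far")
```

With these assumptions, `s3_monthly_cost_cents(300)` comes to 4,608,000 cents, or $46,080 a month for 300TB of storage alone. The point of the second function is that spend tracks usage month by month instead of landing as one up-front purchase before a single byte is stored.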

I hate to admit this, but Amazon has built a playing-field leveler. It's now much, much easier for a competitor of ours to spring fully-formed from two guys in a garage than it was. Anyone who doesn't get on board with Amazon S3 (or the inevitable S3 competitors) may get left behind. I'm glad we're first, but I doubt it'll last.

Tim O'Reilly, technology visionary extraordinaire, recently said of Sun's new 'Thumper', the Sun Fire X4500: "This is the Web 2.0 server." While I think Tim has perhaps the clearest vision in the industry, and the Thumper does truly look awesome, this time I think he may have missed the mark. The Web 2.0 server is *any* cheap Linux box coupled with utility storage like S3.

Initially this post had a lot of technical detail (I am the 'Chief Geek', after all), but I removed it since it was probably getting boring. So this is the quick-and-dirty 'Business Case for Amazon S3 and How it Helps our Customers' post. If there's enough interest, I can write up a detailed post about exactly how we use S3, how it works in conjunction with our own local distributed filesystem, and post our S3 library (which was derived from someone else's). Post in the comments if that's of interest.

Also, we'll be presenting at a storage conference in Florida in late October (I'm sorry, I don't have the name of the con with me, but I'll update this post when I do), and a few other people have requested conference talks on the subject. Comment if that's of interest, too, so we know where to go speak.

Finally, one last geek thought: Anyone using the SmugMug API is now actually using multiple APIs through ours (depending on what you're doing, you may be using Google and/or Yahoo, but you're almost certainly using Amazon). The stack continues to grow.

UPDATE #1: In response to a comment below, I don't feel like we "bet the company" on S3 - every photo our customers entrust us with, we keep local copies in our existing distributed storage infrastructure. We use S3 as redundant secondary storage for use in cases of outages, data loss, or other catastrophe.
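
The arrangement described in the update (local storage as the primary copy, S3 as a redundant secondary) can be sketched roughly as follows. This is a minimal illustration, not the actual SmugMug library: `BlobStore`, `InMemoryStore`, `RedundantStore`, and `StoreUnavailable` are hypothetical names, and the in-memory store stands in for both the local filesystem and a real S3 client so the example runs on its own.

```python
from typing import Dict, Protocol


class StoreUnavailable(Exception):
    """Raised when a store cannot serve a request (outage or missing object)."""


class BlobStore(Protocol):
    """Minimal interface both the local store and the S3-backed store would share."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Hypothetical stand-in for a real backend; `online` lets us simulate an outage."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}
        self.online = True

    def put(self, key: str, data: bytes) -> None:
        if not self.online:
            raise StoreUnavailable(key)
        self.blobs[key] = data

    def get(self, key: str) -> bytes:
        if not self.online or key not in self.blobs:
            raise StoreUnavailable(key)
        return self.blobs[key]


class RedundantStore:
    """Writes every object to both stores; reads from primary, falls back to secondary."""

    def __init__(self, primary: BlobStore, secondary: BlobStore) -> None:
        self.primary = primary
        self.secondary = secondary

    def put(self, key: str, data: bytes) -> None:
        self.primary.put(key, data)
        self.secondary.put(key, data)

    def get(self, key: str) -> bytes:
        try:
            return self.primary.get(key)
        except StoreUnavailable:
            # Primary is down or lost the object: serve the redundant copy instead.
            return self.secondary.get(key)


if __name__ == "__main__":
    local, s3 = InMemoryStore(), InMemoryStore()
    store = RedundantStore(primary=local, secondary=s3)
    store.put("photo.jpg", b"pixels")
    local.online = False  # simulate an internal storage outage
    print(store.get("photo.jpg"))
```

Because callers only see `put` and `get`, the secondary store can be added behind an existing storage layer without touching the code that serves photos, which is one plausible reading of "drop-in addition".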


47 Responses to “Amazon S3 = The Holy Grail”

  1. Mark Draughn Says:

    I’d love to hear more about how you use S3. It sounds fascinating. I’ve got a client who pays big monthly bucks for some dedicated commerce servers and doesn’t want to have to add more servers just because they’re starting to serve a lot more static content.

  2. Tiernan OToole Says:

    As a geek, the more info, the better! I never get bored of reading stuff like this. I would, as Mark Draughn said, like to hear more. S3 is an amazing piece of (software) engineering, and it’s pretty impressive at the price they are selling it for. The other question I would have to ask is in relation to the data being served: do you download from Amazon and then serve to your customers, or does Amazon do the serving of content?

    –Tiernan

  3. Tiernans Comms Closet : Amazon S3 = the holy grail Says:

    [...] Don MacAskill, CEO and Chief Geek of Smugmug, one very impressive photo sharing site, has a post on his blog today about Amazon S3. There is a fair amount of info in it, but he says he had to remove some because he thought it might be boring! As a geek, this stuff can never be boring! So if you’re interested, have a look, and leave a comment asking for more info! [...]

  4. Martin Kochanski Says:

    The moment I saw Amazon S3 I knew it was something special - the first thing I did was make sure that support for it got built into Cardbox (http://www.cardbox.com).

    In a more reflective mood I’ve just started a two-week series of blog postings about S3 - the way it can transform business models and (in the second week) a careful look at the risks of using it. Before you bet the company on S3, you should check whether any of those risks apply to you! The first posting is at http://cardbox.wordpress.com/2006/08/14/s3-in-business-1/

  5. John Smith Says:

    I have a small site and am interested in saving some $$. The technical details will be quite helpful. There is not a lot of info out there.

    Keep up the good work.

  6. Richard Steele Says:

    I’d love to hear more of the internal details.

  7. Phil Says:

    This is really fascinating. It’s great that you’ve been able to leapfrog the competition by offering unlimited storage as part of your standard plans (and that the plans are differentiated by *traffic* limits, not storage limits).

    It’s a bit of a gamble–what if suddenly lots and lots of users started pumping hundreds of gigs more per day into the system? But I suspect that’ll never happen. “Unlimited” is one of those things that appeals to everyone, but is taken advantage of by only a few. Probably a standard power-law here where 20% of the users cause 80% of your storage costs.

  8. Paul Watson Says:

    I would like some of that boring technical geek detail please. Especially now with your added comment that you use S3 as a backup rather than as your live storage provider.

  9. Dan Says:

    I’d love to hear more about the inner workings of porting SmugMug to S3.

    Is it really only used as a backup service? Do you not “trust” the reliability of it to serve your users directly?

    -Dan

  10. Stephen Caudill Says:

    +1 on the details of your S3 setup…. I’m currently assessing their service for use in much the same way you are, and would be interested to see how you’re using it.

  11. Przemek Piotrowski Says:

    What about unfair competition? A competitor could download terabytes of data just to run up your costs. How do you protect against such attacks?

  12. Life is grand » A golden bucket Says:

    [...] S3 is fantastic but I see some serious challenges in monetizing it directly. Using it as SmugMug does is grand but with TuneSafe and similar apps there is a problem. I hope some smart business chap comes along and shows us the way. [...]

  13. Chris Bailey Says:

    Another +1 on the details of your S3 use and library, etc. We are using it now as well and, like you, find it a great solution.

  14. Shyamala Says:

    Please post the details on how exactly you are using S3. I am very interested to know how you guys did this.

  15. Martin Kochanski Says:

    What made you decide not to use S3 for your primary data storage?

  16. Bryan Kennedy Says:

    This is pretty funny. I just finished publishing a blog entry on our site entitled “Amazon s3: the holy grail of bandwidth problems?” While the scale of our success is trivial compared to your current bandwidth, the magnitude of the effect was the same - s3 went a long way to saving us money, time, and letting us grow.

    http://blog.pairwise.net/2006/08/22/amazon-s3-the-holy-grail-of-bandwidth-problems/

  17. smugblog: Don MacAskill » Blog Archive » Take a peek inside our datacenters Says:

    [...] In the meantime, you can watch a video of me describing a portion of our storage infrastructure here, both our physical local storage, and little bit about Amazon’s S3. [...]

  18. EveryDigg » Blog Archive » Amazon S3 = The Holy Grail Says:

    [...] Amazon’s S3 is so fast, cheap, and reliable, it’s changed SmugMug’s 300TB storage infrastructure. It levels the playing field so anyone can build the next Flickr, YouTube, or SmugMug without tons of cash for disks. [...]

  19. scale|free » links for 2006-09-19 Says:

    [...] smugblog: blog » Amazon S3 = The Holy Grail There’s something very compelling about Amazon S3 and EC2 (ie storage and servers on demand). The costs are lower or on par what you’d pay for your own boxes, and you don’t have to commit upfront cash. Smugmug are storing 300TB..read about it (tags: smugmug amazon s3 storage) [...]

  20. Off you go… into the purple yonder! » Amazon’s S3 and EC2 services Says:

    [...] Amazon Web Services is offering two relatively new services: ubiquitous storage via S3 and ‘elastic’ computing via EC2. The Cardbox folks have an in-depth analysis of S3 online, and Smugmug uses an S3 backend. This is very interesting. If it wasn’t for all the problems with the S3 contract, I would consider using this. [...]

  21. Dreamhost versus Amazon S3 - Backup solutions « Drew’s Blog Says:

    [...] Not too long ago, Amazon released their Simple Storage Service (or “S3″ for short). It provides a hosted storage platform which developers can build all sorts of applications on top of. Smugmug, a popular photo sharing web site, is using it to store and host pictures. [...]

  22. Brenton Says:

    +1 on the inner workings of how you use S3. I’d be very interested in finding out more details.

  23. Darryl Says:

    Wait a minnit… Amazon’s website says that it’s:
    $0.15 per GB-Month of storage used.
    $0.20 per GB of data transferred.

    300TB = 307200GB
    307200 * $0.15 = $46,080.00/mo

    Wow — you guys spend upwards of $46K/mo for secondary storage? And that doesn’t include the cost to upload all that data there (307200 * $0.20 = $61,440.00, but let’s be generous and say that’s a “one time” cost.)

  24. onethumb Says:

    Yeah, and $46K/mo is *cheap* once you factor in TCO of doing your own storage. :)

    We pay $46K and love it.

  25. Vald Says:

    I’m surprised most posters are drinking in this Amazon propaganda so naively. Come on, was it a marketing guy at Amazon who wrote the initial post? But the real reason is probably that Amazon is paying smugmug for a glowing report and so Catskill willingly delivered. Darryl just begins to shed light on the issues involved. I think S3 is great for a startup to get going quickly but not as viable for an ongoing concern. Would you really trust your crown jewels so willingly to the fate of another company’s infrastructure?

  26. onethumb Says:

    Vald is clearly trolling, but I’ll take the bait anyway:

    We’re Amazon’s customers, not the other way around. As anyone can tell, we’re a serious business with serious competitors and looking for any advantage we can get. We found it in Amazon.

    We’re paying them a hefty chunk of money every month, not the other way around, but the best part is that we’re actually saving money by doing so.

  27. SmugBlog: Don MacAskill » Blog Archive » Amazon + Two Guys in a Dorm + $0 = The Next YouTube Says:

    [...] We’re in the article, since we’re a big believer in this “new” vision of Amazon’s. Amazon calls the stuff they’re exposing the “muck” of doing business online, and I think it’s a perfect term. Some people see this as some radical departure from Amazon’s core business, but I don’t at all. Just like much of Amazon’s business, it’s an evolution. They began as a bookstore online (no-one remembers this, but they weren’t the first. BookStacks was relatively huge when Amazon launched), and eventually evolved by adding more and more products. They sell nearly everything, including groceries, now. [...]

  28. SmugBlog: Don MacAskill » Blog Archive » Web 2.0 Summit: Launch Pad Says:

    [...] OmniDrive is an online storage aggregator. This is something that’s been on my mind a lot since S3 launched, since I’d love to have most of my storage “in the cloud” where I can get to it from anywhere. A buddy of mine wrote Jungle Disk, which is very cool, using S3. I wouldn’t be surprised if OmniDrive is also using S3, but they haven’t said. The big idea is that your stuff, whether it’s your documents, photos, videos, music, whatever is normally “stuck” on your PC where you can’t get at it if you’re somewhere else. It’s also prone to loss, since your PC could crash or get infected with a virus or something. Using OmniDrive, you can get that data from anywhere. OmniDrive seems like most of the other online storage providers that have been around for awhile, and plenty more are cropping up now. I’m not sure what’s unique about their offering - their pitch sounded very similar to existing services. Nonetheless, it’s something that everyone wants, whether they realize it or not. Personally, I’m afraid things like Time Machine in Leopard are going to obsolete stuff like this, if Apple (and Microsoft) start linking to things like dot Mac, LiveDrive, or S3 for storage. [...]

  29. SmugBlog: Don MacAskill » Blog Archive » Web 2.0 Summit: Jeff Bezos Says:

    [...] Jeff Bezos just gave a great presentation and had an interesting chat with Tim O’Reilly here at the Web 2.0 Summit. I’ve written about Amazon’s web services a few times, including the BusinessWeek cover story this week. [...]

  30. SmugBlog: Don MacAskill » Blog Archive » Amazon S3: Show me the money Says:

    [...] Amazon S3: The Holy Grail [...]

  31. Nirav Says:

    Can you please provide me with more details about S3 and how I can implement it? If you can provide me with some sample code, that would be great.

  32. SmugBlog: Don MacAskill » Blog Archive » Quickies: Hack Day, Sun T1000, Amazon S3 Says:

    [...] I still haven’t posted the in-depth technical details and code samples I promised about our use of Amazon S3, but fear not - I’m actively working on it and will post it as soon as it’s done. [...]

  33. SmugBlog: Don MacAskill » Blog Archive » Amazon S3: Outages, slowdowns, and problems Says:

    [...] Amazon S3 = The Holy Grail [...]

  34. Online Backup Solution Says:

    [...] Look no further than Amazon S3. Just check it out… It’s sweet, and allows for full customization and integration with your favorite open source backup solution. [...]

  35. Tired of all the Windows or Mac Only Backup Solutions? Says:

    [...] Look no further than Amazon S3. Just check it out… It’s sweet, and allows for full customization and integration with your favorite open source backup solution. [...]

  36. Ed Kearns Says:

    Thanks for the help. I’ve got a lot of customers looking for off-site storage who would be interested in these services.

    Thanks

  37. StorageMojo » What a web business wants from storage Says:

    [...] Update: Don wrote in the comments SmugMug also uses over 200 TB of Amazon’s S3, which he talks about here. I’ve got to spend some more time reading his blog! [...]

  39. Geek Synapse » Blog Archives » Amazon Web Services: EC2 Says:

    [...] Wow, more money… Actually, 500GB of data transfer is a lot if you’re not planning to host big pictures and videos. As proof, codinghorror (a developer blog that is usually slashdotted/digged) peaked at 9GB in one day on one of its most memorable posts. If you’re able to use 500GB of data transfer, I think you can get the extra bandwidth money easily with adsense or another advertising tool. I would say that the S3 bandwidth pays for itself (again, if you’re not running a flickr/youtube clone). Who knows? Maybe Amazon just started EC2 when they realized how many people were moving their heavy content to S3. [...]

  40. Geek Synapse » Blog Archives » Amazon Web Services: EC2 Says:

    [...] Wow, more money… In fact, 500GB of transfer is a lot if you’re not planning to host high-resolution photos and videos. As proof, codinghorror (a developer blog that is regularly linked by slashdot/digg) peaked at 9GB on one of its most memorable posts. If you’re able to use 500GB of transfer, I think you can easily get the money for the extra bandwidth with adsense or another advertising tool. I would say that S3 bandwidth pays for itself (again, if you’re not running a flickr/youtube clone). Who knows? Maybe Amazon started EC2 when they realized how many people were moving their heavy content to S3. [...]

  41. Amazon Web Services - the backbone of your new startup? « Justin Rudd’s Drivel Says:

    [...] One of the poster children of AWS is SmugMug.  Don MacAskill is one of my favorite writers.  He has a post “Amazon S3 = The Holy Grail“.  At the very end is this little nugget of information - In response to a comment below, I don’t feel like we “bet the company” on S3 - every photo our customers entrust us with, we keep local copies in our existing distributed storage infrastructure. We use S3 as redundant secondary storage for use in cases of outages, data loss, or other catastrophe. [...]

  42. JoeJoe Says:

    I think it’s lame that the details were taken out. This article is fluff. One rule about blogs is LISTEN to your readers. Almost every posting here is requesting the information. yet no report?

  43. Don MacAskill Says:

    @JoeJoe: I have *lots* of S3 details up. Just search for Amazon S3 on the sidebar.

  44. links for 2008-02-06 oggin.net Says:

    [...] SmugBlog: Don MacAskill » Blog Archive » Amazon S3 = The Holy Grail But Amazon’s S3 takes our storage architecture to the next level: Your priceless photos are stored in multiple datacenters, in multiple states, and at multiple companies. They’re orders of magnitude more safe and secure. (tags: Backup Business MySQL hosting server Startup) [...]

  45. Amazon AWS at TNC :: Inside Out Says:

    [...] your system relies on AWS, always have a backup solution in place so that you can keep your basic service [...]

  46. SmugMug - Photo Sharing Site « Rudimentary Art of Programming & Development Says:

    [...] uses Amazon Web Services(S3 and EC2) and very happy with [...]

  47. Free S3 Analytics Says:

    I strongly think that every new company needs to create a new account in amazon s3
    and start using it now without spending 10K on storage.

    Amazon S3 is a real blessing for many companies.
