Archive for the ‘amazon’ Category

Quickies: Hack Day, Sun T1000, Amazon S3

Thursday, October 5th, 2006

Really quick…

Yahoo! Hack Day

SmugMug was in the house at Hack Day 2006, and we had a great time! Many thanks to Yahoo for putting on such a great event - we learned a lot about Yahoo technologies and put together a great demo. Any time they want to throw another one, we’ll be there. Fantastic group of people over at Yahoo.

The best part is that our demo will shortly be a shipping product that our customers will love and that will generate extra revenue for our company. Oh, and BigWebGuy got his official hazing there at Hack Day - he coded for 36 hours straight (no sleep!) during his first week on the job, even though he was sick! Welcome to the family, Lee!

Sun T1000

The Sun T1000 is very much still on our radar. I don’t want to do an in-depth update until we’re absolutely sure about what’s going on, but here’s a short summary of where we are.

A few days after our initial results were posted, I spent 5 hours over at Sun with some very intelligent people. They were as perplexed by the results as I was, and were determined to get to the bottom of it. The good news is we now have a T1000 running Solaris side-by-side with a T1000 running Ubuntu, which is side-by-side with our dual dual-core Opteron running Red Hat. The bad news is the Sun guys weren’t able to coax any more performance (yet!) out of the T1000.

We have a theory that we might be saturating the GigE port with the raw number of interrupts per second, so traffic is getting throttled there and starving the CPUs. So we have a game plan for what to attack next - I’ve just been too swamped to deal with it for the last few weeks. We’ll get to it, though, I promise, and I’ll share all the details.
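To get a feel for the interrupt rates that theory implies, here is a rough sketch of the maximum frame rate a gigabit Ethernet link can carry. It assumes standard Ethernet framing (8 bytes of preamble and 12 bytes of inter-frame gap per frame) and, as a worst case, one interrupt per frame with no interrupt coalescing. Neither assumption comes from the tests described here, and the function name is made up for illustration.

```python
# Upper bound on frames per second over gigabit Ethernet.
# ASSUMPTION: standard Ethernet framing; one interrupt per frame (no coalescing).
LINK_BITS_PER_SEC = 1_000_000_000
WIRE_OVERHEAD_BYTES = 8 + 12  # preamble + start delimiter, inter-frame gap


def max_frames_per_sec(frame_bytes):
    """frame_bytes is the Ethernet frame size including header and checksum."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return LINK_BITS_PER_SEC / bits_on_wire


for label, size in (("minimum-size frames (64 B)", 64),
                    ("full-size frames (1518 B)", 1518)):
    print(f"{label}: {max_frames_per_sec(size):,.0f} frames/s")
```

The gap between the two cases is roughly 18x, so a workload dominated by small packets could hit a per-interrupt limit long before it fills the link in bytes per second.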

Amazon S3

I still haven’t posted the in-depth technical details and code samples I promised about our use of Amazon S3, but fear not - I’m actively working on them and will post them as soon as they’re done.

Just wanted you to know I hadn’t forgotten about you. :)

Incidentally, Jeremy Zawodny is playing around with using it for his personal backup storage. Sounds sweet!

Amazon S3 = The Holy Grail

Saturday, August 12th, 2006

I should have posted this a few weeks ago, but better late than never. We now use Amazon S3 for a significant part of our storage solution. We're absolutely in love with it - and our customers are too (even if they don't know it).

As you probably know, SmugMug has been profitable since our first year, with no investment capital. We've had a great track record for keeping our customers' priceless photos safe and secure using only the profits we've accrued to purchase our storage (yes, I said purchase. We have no debt - we own all of our storage, we don't lease). And every SmugMug customer gets unlimited storage - so that's no mean feat. (Currently, unlimited means ~300TB of storage and nearly 500,000,000 images. To put that into perspective, that's more than 65,000 DVDs or 480,000 CDs).
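The disc comparison can be checked with a few lines of arithmetic. The capacities below are assumptions, since the post doesn't state them: 4.7 GB per single-layer DVD, 650 MB per CD, and binary multiples (1 TB = 1024 GB). They were picked because they land close to the figures quoted above.

```python
# Back-of-the-envelope check of the DVD/CD comparison.
# ASSUMPTIONS (not stated in the post): 4.7 GB per DVD, 650 MB per CD,
# 1 TB = 1024 GB and 1 GB = 1024 MB.
STORAGE_TB = 300   # from the post
DVD_GB = 4.7
CD_MB = 650

storage_gb = STORAGE_TB * 1024
storage_mb = storage_gb * 1024

dvds = storage_gb / DVD_GB
cds = storage_mb / CD_MB

print(f"DVDs needed: {dvds:,.0f}")
print(f"CDs needed:  {cds:,.0f}")
```

With 700 MB CDs or decimal terabytes the counts come out somewhat lower, so treat these as order-of-magnitude figures.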

But Amazon's S3 takes our storage architecture to the next level:

  • Your priceless photos are stored in multiple datacenters, in multiple states, and at multiple companies. They're orders of magnitude more safe and secure.
  • We'd already built a custom, low-cost, redundant, scalable storage infrastructure on commodity hardware. Nonetheless, it's significantly cheaper to use S3 than to use our own - especially when you factor multiple states & datacenters into the equation.
  • Perhaps even more importantly, our cash-flow situation is vastly improved. Instead of paying $25,000 for a handful of terabytes of redundant storage up-front, even before they're used, we now pay $0.15/GB/month as we use it.
  • When we have some sort of internal outage with storage, it doesn't matter - Amazon's always on. They eat their own dogfood - S3 is in production use on dozens of Amazon products. We've had storage-related internal outages a few times already, and our customers haven't been able to tell. Unfortunately, we'll still have rare outages on our site (everyone does), but storage is now vastly less likely to be part of the cause.
  • I started writing our S3 interface on a Monday, and by that Friday, we were live and in production. It really is that simple to pick up and use, and it was basically a drop-in addition to our existing storage.
  • It's fast. I don't mean 15K-SCSI-RAID0-fast, but I do mean internet-latency-fast. It's basically as fast as our internal local storage plus the round-trip speed-of-light delay to Amazon. I can measure the difference with computer timing, but in blind tests, humans haven't been able to tell the difference. Everything we serve from Amazon feels fast.
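The cash-flow point in the list above can be sketched with some simple arithmetic. Only the $25,000 up-front figure and the $0.15/GB/month price come from the post. The 5 TB capacity (standing in for "a handful of terabytes") and the 512 GB/month growth rate are made-up assumptions, and the sketch ignores transfer fees and the cost of running your own hardware.

```python
# Cash-flow comparison: buy storage up-front vs. pay S3 as it fills.
S3_PRICE_PER_GB_MONTH = 0.15     # from the post
UPFRONT_COST = 25_000            # from the post
UPFRONT_CAPACITY_GB = 5 * 1024   # ASSUMPTION: "a handful of terabytes" = 5 TB
MONTHLY_GROWTH_GB = 512          # ASSUMPTION: data grows by 512 GB per month


def s3_cumulative_cost(months):
    """Total paid to S3 after `months`, billing each month for data stored so far."""
    total = 0.0
    stored_gb = 0
    for _ in range(months):
        # Stop growing once we reach the capacity the up-front purchase would give.
        stored_gb = min(stored_gb + MONTHLY_GROWTH_GB, UPFRONT_CAPACITY_GB)
        total += stored_gb * S3_PRICE_PER_GB_MONTH
    return total


for month in (1, 6, 12, 24):
    print(f"month {month:2d}: up-front ${UPFRONT_COST:,} "
          f"vs. S3 so far ${s3_cumulative_cost(month):,.2f}")
```

Under these assumptions the pay-as-you-go total stays well below the up-front purchase for the first two years, which is the cash-flow benefit described above. It says nothing about which option is cheaper over the full life of the hardware.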

I hate to admit this, but Amazon has built a playing-field leveler. It's now much, much easier for a competitor of ours to spring fully-formed from two guys in a garage than it was. Anyone who doesn't get on board with Amazon S3 (or the inevitable S3 competitors) may get left behind. I'm glad we're first, but I doubt it'll last.

Tim O'Reilly, technology visionary extraordinaire, recently said of Sun's new 'Thumper', the Sun Fire X4500: "This is the Web 2.0 server." While I think Tim has perhaps the clearest vision in the industry, and the Thumper does truly look awesome, this time I think he may have missed the mark. The Web 2.0 server is *any* cheap Linux box coupled with utility storage like S3.

Initially this post had a lot of technical detail (I am the 'Chief Geek', after all), but I removed it since it was probably getting boring. So this is the quick-and-dirty 'Business Case for Amazon S3 and How it Helps our Customers' post. If there's enough interest, I can write up a detailed post about exactly how we use S3 and how it works in conjunction with our own local distributed filesystem, and post our S3 library (which was derived from someone else's). Post in the comments if that's of interest.

Also, we'll be presenting at a storage conference in Florida in late October (I'm sorry, I don't have the name of the con with me, but I'll update this post when I do), and a few other people have requested conference talks on the subject. Comment if that's of interest, too, so we know where to go speak.

Finally, one last geek thought: Anyone using the SmugMug API is now actually using multiple APIs through ours (depending on what you're doing, you may be using Google and/or Yahoo, but you're almost certainly using Amazon). The stack continues to grow.

UPDATE #1: In response to a comment below, I don't feel like we "bet the company" on S3 - we keep a local copy of every photo our customers entrust us with in our existing distributed storage infrastructure. We use S3 as redundant secondary storage in case of outages, data loss, or other catastrophe.
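The arrangement described in that update (local copies as primary, S3 as a redundant secondary) can be sketched roughly as below. This is not SmugMug's code, which the post doesn't show. `DictStore` and `RedundantStore` are hypothetical names, and the in-memory dictionaries stand in for both the local filesystem and S3, so no real S3 calls are made.

```python
class DictStore:
    """Hypothetical stand-in for a storage backend, backed by a dict."""

    def __init__(self, name):
        self.name = name
        self.blobs = {}
        self.available = True  # flip to False to simulate an outage

    def put(self, key, data):
        if not self.available:
            raise IOError(f"{self.name} is unavailable")
        self.blobs[key] = data

    def get(self, key):
        if not self.available:
            raise IOError(f"{self.name} is unavailable")
        return self.blobs[key]  # raises KeyError if the blob is missing


class RedundantStore:
    """Writes to both backends; reads from primary, falls back to secondary."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def put(self, key, data):
        self.primary.put(key, data)
        self.secondary.put(key, data)

    def get(self, key):
        try:
            return self.primary.get(key)
        except (IOError, KeyError):
            # Primary is down or lost the blob: serve the redundant copy.
            return self.secondary.get(key)


local = DictStore("local storage")
s3 = DictStore("S3 (simulated)")
store = RedundantStore(local, s3)

store.put("photo-1.jpg", b"jpeg bytes")
local.available = False  # simulate an internal storage outage
print(store.get("photo-1.jpg"))  # still served, from the secondary copy
```

A real version would also need to decide what happens when a write to the secondary fails, for example queueing it for retry. The sketch leaves that out.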