Effortless scalability
Effortless scalability. I want it. I’ll pay for it.
Planning the smooth scale-out of a web application can be a nightmare. You are never certain how popular the application will be. You can’t tell when users will actually use the service. When your site hits the front page of Digg, you feel elation and terror at the same time. Even worse, it is hard to estimate how your users will use the site: how many videos they will watch and how many database updates they will trigger.
It is easy to set up a single-server web application, or even to separate the major tiers onto separate servers. But when it comes to scaling out to more servers, particularly more database servers, life gets difficult. You face data concurrency issues, latency challenges, failover needs, and backup requirements. Solving each of these is complex, time-consuming, and often expensive.
None of this is conducive to building the next big Internet startup. Solving the scaling issue is a necessary evil… for now. In my opinion, whoever fixes it will make a heap of money. It’s a commodity problem. Someone ought to fix it.
Cloud computing services (such as AWS and now Google App Engine) are the first steps towards effortless scalability, but they leave a lot to be desired. With a combination of EC2 and S3, you can build a very credible Linux (or even Windows with emulation) solution that can scale with reasonable ease and cost efficiency. However, you still have to build your own solution for load balancing, database clustering, and server provisioning. Amazon’s SimpleDB is not a great database alternative (it’s not really a database in my opinion), and it has limitations (eventual consistency, limited data types) that require a substantial layer of custom code to serve the needs of most applications.
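To give a feel for the kind of custom code eventual consistency pushes onto the application, here is a minimal Python sketch. It does not use SimpleDB or any real API: `ToyEventuallyConsistentStore` and `read_your_write` are hypothetical names, and the store is an in-memory stand-in whose reads lag behind writes until `replicate()` is called. The helper simply retries a read until it reflects the write the caller just made.

```python
import itertools


class ToyEventuallyConsistentStore:
    """Hypothetical in-memory stand-in for an eventually consistent datastore.

    Writes land on a primary copy at once, but reads are served from a
    replica that only catches up when replicate() is called.
    """

    def __init__(self):
        self._primary = {}
        self._replica = {}
        self._versions = itertools.count(1)

    def put(self, key, value):
        # Tag every write with an increasing version number.
        version = next(self._versions)
        self._primary[key] = (version, value)
        return version

    def get(self, key):
        # Reads may be stale: they only see what has been replicated.
        return self._replica.get(key)

    def replicate(self):
        # Simulate replication catching up with the primary copy.
        self._replica = dict(self._primary)


def read_your_write(store, key, min_version, attempts=5, wait=None):
    """Retry a read until it reflects at least min_version, or give up."""
    for _ in range(attempts):
        found = store.get(key)
        if found is not None and found[0] >= min_version:
            return found[1]
        if wait is not None:
            wait()  # in real code: sleep and back off between attempts
    raise TimeoutError(f"{key!r} not consistent after {attempts} attempts")


store = ToyEventuallyConsistentStore()
version = store.put("profile:42", "new bio")
stale = store.get("profile:42")  # the replica has not caught up yet
fresh = read_your_write(store, "profile:42", version, wait=store.replicate)
```

Even this toy version needs version tracking, retries, and a give-up policy, and every application built on such a store ends up writing something like it.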
I want a better solution. I suppose the market is heading for it, but just in case, I’m going to whine until I get it. I want effortless scalability.
What is effortless scalability? I build my application on my local machine. Once it’s working, I upload it to the cloud, assign it a URL, and I’m done. I get 100 visitors, I pay a bit. I get 10 million visitors, no problem (although I pay a lot more). Some key requirements:
- No server requisitioning… computing resources are allocated automatically as needed
- Built-in database scaling, without the limitations of SimpleDB
- Automated load balancing, with configurable settings for picky developers
- Automated data storage that looks like a file system to me (think cPanel like you would see in a shared hosting environment)
- Flexible stack support (check out CohesiveFT for an example of how that might work)
- Automatic failover
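As a rough illustration of the first requirement, automatic allocation can be thought of as a rule that turns current load into a server count, with the bill following that count. The Python sketch below is an assumption-laden toy: the function names, the per-server capacity, and the price are all made up for illustration and do not describe any real provider.

```python
import math


def servers_needed(requests_per_second, capacity_per_server, min_servers=1):
    """Return how many servers the current load calls for."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    wanted = math.ceil(requests_per_second / capacity_per_server)
    # Never scale below the minimum, even when traffic is near zero.
    return max(min_servers, wanted)


def hourly_cost_cents(requests_per_second, capacity_per_server, cents_per_server_hour):
    """Pay only for the servers the load actually requires."""
    count = servers_needed(requests_per_second, capacity_per_server)
    return count * cents_per_server_hour


# Hypothetical figures: each server handles 200 requests/s and costs 10 cents/hour.
quiet_day = hourly_cost_cents(5, 200, 10)       # 1 server
digg_day = hourly_cost_cents(12_000, 200, 10)   # 60 servers
```

The point of the wish list is that the developer never writes or runs this rule; the platform does, and the developer only sees the bill change.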
Amazon Web Services is a good start. A highly scalable relational database in the cloud would be a great next step. Or at least a cloud datastore that more closely approximates the capabilities of a relational database, and abstracts away the complexities from the developer.
Some will argue that one of the things that makes Amazon Web Services so powerful is its very simplicity. Complexity abstracted away from the developer will inevitably accrue to the cloud. That is likely true, but from a resource allocation standpoint, it makes more sense for the cloud provider to solve the problem once than for every developer to solve the same problem over and over.
Solutions such as Media Temple only chip away at the problem, for example by offering dedicated MySQL containers as their database scaling solution. How can that support a heavy-duty site?
Anyway, someone please turn application scaling into a click of a button. I’ll be happy to pay for it.
[Disclosure: I have a relationship with CohesiveFT mentioned in the post through my role at OCA Ventures, a venture firm that has an investment in the company.]
April 16th, 2008 at 11:24 pm
One of the biggest challenges, and you touched on it, is the developer’s ability to customize. Scalability is not “free”, i.e., there are inherent tradeoffs that are made when building a system to scale. If scalability is effortless, it implies that a choice for each tradeoff was made. For many applications, this may be fine. But I can also think of a lot where this may be problematic, especially high performance applications. Therefore, the ability to customize, which I’ll rephrase as an ability to control the tradeoffs that are made, will I believe be crucial. Unfortunately, I think this makes the solution hard to architect.
A smaller example is Java. Java abstracts away memory allocation and cleanup from the developer. A nice perk, but the developer forfeits control over when the garbage collector runs, which means that resources are consumed and freed on a whim at run time rather than at the behest of the developer. Again, this is fine for a lot of applications, but not for all.
j
April 16th, 2008 at 11:37 pm
J -
I very much agree that scalability is not free, and tradeoffs have to be made. Then again, many web applications have similar needs, and therefore (within reason) can probably tolerate some generic choices. Similarly, Java works for most developers’ needs, and if not, they can choose a language with fewer abstractions.
But more importantly, your comment brings up an excellent point… assuming the market does head toward effortless scalability, there will probably be several types of cloud systems. For example, financial transactions have specific requirements such as low latency, and therefore I can see a sort of “financial scaling cloud” in the market. Meanwhile, social media sites have different needs such as robust scaling for media delivery, and might have a different scaling cloud. I think a few types of clouds might cover enough of the market. Of course, there would always be some applications that require fully custom development for one reason or another.
April 17th, 2008 at 1:13 am
Hey Joe. It’s great that you posted on this. I actually talked a little about this yesterday on my blog. I really think that this movement is groundbreaking and the move value these bigcos provide, the more the Joe Dwyers of the world can focus on solving real problems and furthering innovation.
April 17th, 2008 at 1:14 am
should be “more value” not “move value”