Cloud: Elasticity is more important than Scalability June 5, 2009
Posted by inukonda in Uncategorized.
There is a lot of buzz around “cloud computing” in the tech space these days. Industry leaders have generally come to agree that cloud computing is a computing model where instantly (and hugely) scalable IT capabilities (software, hardware, services) are delivered in the form of a service over the internet.
The emphasis seems to be that cloud platforms have to be “scalable” but I want to argue that they have to be “elastic” as well. While scalability is key, without elasticity a platform cannot be truly considered a cloud platform.
Wikipedia defines scalability as:
In telecommunications and software engineering, scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner, or to be readily enlarged. For example, it can refer to the capability of a system to increase total throughput under an increased load.
The definition for elasticity is:
(In Physics) Elasticity refers to continuum mechanics of bodies which deform reversibly under stress. Elastic refers to a reversible deformation of a material. In essence, Elasticity is the ability to grow and contract as needed.
So, while scalability ensures that the cloud platform can handle an increased load of users working on a ‘cloud’ application, elasticity ensures that the platform scales up or down based on need without interrupting the business. Without elasticity, the economics of moving a business or application to the cloud do not make sense. In a typical enterprise, resources are only ever scaled up: when you want to handle more users, you buy more resources, which then sit idle at times of normal or low user load. This model gets expensive over time and is the main reason enterprises want to move to the cloud. A user scenario: target.com, which is hosted in the cloud (EC2, I believe), can request more resources during the holiday season when more people visit the site, and release them at the end of the season when they are no longer needed, all without changing the website or application. The elasticity of the cloud platform is what makes this possible.
In fact, even Amazon EC2 is not truly elastic, as the smallest unit of expansion there is a slice of compute power. In my view, “scale” indicates the size of a cloud computing provider while “elastic” indicates its agility. A small provider can be elastic and offer good cloud services as well. So, when considering cloud platforms, don’t just look at “scalability”; review how “elastic” the platform is as well.
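To put rough numbers on the "scale up only" versus "scale up and down" argument, here is a minimal sketch. The monthly load figures, the per-server capacity and the price are all made-up assumptions for illustration, not data from any real site or provider.

```python
import math

def servers_needed(load, capacity_per_server):
    """Servers required to serve `load` users (always at least one)."""
    return max(1, math.ceil(load / capacity_per_server))

def fixed_cost(loads, capacity_per_server, price_per_server_month):
    """Enterprise model: buy for the peak and keep it all year."""
    peak = max(servers_needed(l, capacity_per_server) for l in loads)
    return peak * len(loads) * price_per_server_month

def elastic_cost(loads, capacity_per_server, price_per_server_month):
    """Elastic model: pay each month only for what that month needs."""
    total = sum(servers_needed(l, capacity_per_server) for l in loads)
    return total * price_per_server_month

# Hypothetical year: steady traffic, then a holiday spike in Nov/Dec.
monthly_load = [100] * 10 + [400, 500]

print(fixed_cost(monthly_load, capacity_per_server=100, price_per_server_month=1))
print(elastic_cost(monthly_load, capacity_per_server=100, price_per_server_month=1))
```

Under these assumptions the fixed model pays for five servers all twelve months, while the elastic model pays for one server most of the year and the extra four or five only during the spike.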
Understanding infrastructure in the cloud June 4, 2009
Posted by inukonda in Uncategorized.
There are several definitions of what makes up “Cloud Computing”, but everyone agrees that at the bottom of the stack, someone needs to provide the core hardware resources. This layer has come to be known as “Infrastructure as a Service” (IaaS). The Cloud Security Alliance has provided a great definition for IaaS:
The capability provided to the consumer (by IaaS) is to rent processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, deployed applications, and possibly select networking components (e.g., firewalls, load balancers).
Note: in the above definition, consumer is typically an independent software vendor – one who is developing software applications or services.
Taking it one step further, IaaS is in turn made up of several different architectural models: (Note: when I use “buy” below – I refer to “buy” in the sense of pay-per-use of that resource)
1. Shared platform with virtualized resources: In this model, the consumer ends up buying and sharing resources with other consumers. The hardware resources are virtualized using platforms such as VMware, Xen etc., and thin slices of compute power are offered to the consumers. Each slice maps to a certain processing capacity (CPU cores) and memory. You pay on a per-slice basis. While the model is very flexible, the downside is that it is not truly elastic, in the sense that if you need more resources, you have to get another slice. So, the slice is the minimum denomination. Examples: Amazon EC2, ServePath GoGrid.
2. Non-shared platform with non-virtualized resources: In this model, the consumer buys dedicated servers, but the servers are located in the cloud. Typically the servers are not virtualized. This is very similar to hosting environments, but the main (and huge) difference is that if you need more capacity, you can get an additional server in a very short amount of time. The provider offers APIs to provision new servers instantaneously. This gives greater control but is also more expensive. Examples: IronScale, AppNexus.
3. Non-shared platform with virtualized resources: In this model, the consumer buys a bunch of hardware resources, virtualizes them using platforms such as VMware, and then uses the server slices as needed. The hardware resources are not shared with other consumers. This is also an expensive model, but new resources can be provisioned and virtualized instantaneously. Examples: GridLayer.
4. Hosted server: This is the traditional model, and until recently the only one available. In this model, the consumer specifies the number and types of servers needed and buys server space. The consumer gets full access to the server via remote login and typically has to manage the server themselves. If they need additional capacity, they have to go back to the vendor and get more servers, which might take some time. Examples: RackSpace (though RackSpace has cloud offerings as well).
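The "slice is the minimum denomination" point in model 1 can be sketched in a few lines. The slice size below (2 cores, 4 GB) is a hypothetical example, not the actual unit sold by EC2, GoGrid or any other provider.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Slice:
    """A hypothetical unit of sale: fixed CPU cores and memory."""
    cores: int
    memory_gb: int

def slices_needed(required_cores, required_memory_gb, unit):
    """Whole slices needed to cover both the CPU and memory demand."""
    by_cores = math.ceil(required_cores / unit.cores)
    by_memory = math.ceil(required_memory_gb / unit.memory_gb)
    return max(by_cores, by_memory, 1)

def unused_capacity(required_cores, required_memory_gb, unit):
    """Cores and GB paid for but not needed, due to slice granularity."""
    n = slices_needed(required_cores, required_memory_gb, unit)
    return (n * unit.cores - required_cores,
            n * unit.memory_gb - required_memory_gb)

unit = Slice(cores=2, memory_gb=4)
print(slices_needed(5, 6, unit))
print(unused_capacity(5, 6, unit))
```

Because you can only buy whole slices, a workload that needs 5 cores and 6 GB ends up paying for capacity it does not use. The smaller the slice, the closer the platform gets to being truly elastic.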
The bottom of the stack is evolving nicely, and I believe we have made significant progress in giving administrators the ‘warm and fuzzy’ feeling about migrating their applications to the cloud. However, I think SMBs will lead the pack in adopting the cloud, and large enterprises will move very slowly over a long period of time. The next phase of evolution will be the hybridization of internal and external data centers and how the IaaS layer interacts with on-premise resources.
Bing Travel June 4, 2009
Posted by inukonda in Uncategorized.
Well done, MS. Love the new Bing Travel site. It is based on technology that MS acquired from Farecast. As I mentioned in a previous post, MS needs to focus on “topical search” and differentiate itself. I am getting closer and closer to ditching Google. I recommend you do the same!
With its massive war chest of cash, I think MS might be looking to acquire other vertical search engines. The good thing is that the Live division can integrate new acquisitions pretty quickly, unlike the rest of MS. Finally, some real competition to Google.
I am going to speculate and say that Kayak will be acquired pretty soon (maybe by GOOG). Remember, you heard it here first.
Mining twitter data June 3, 2009
Posted by inukonda in Uncategorized.
Everyone is talking about “real-time” search and data, and about Twitter being a good medium for that. Fred Wilson even posted an article on the potential of Twitter as a substitute for set-top box data. While I love Twitter and use it religiously, I am going to go out on a limb and say that we are far from taking meaningful revenue-generating actions, from an enterprise/business standpoint, based on data mined from Twitter streams. Here are a few reasons for my premise:
1. Based on a recent HBS study, about 10% of the population on Twitter is responsible for 90% of the content. This indicates that Twitter can excel as a collective real-time polling mechanism for that 10% of the population. The accuracy of the results depends on how closely this group resembles the broader population, and I suspect it does not. It is almost like a prediction market, and the reasons prediction markets fail are pretty similar.
2. There is no measurement. Without a meaningful way to measure, the data is not going to be highly reliable. CPC is useful because we can measure the clicks. Measuring consumption is critical. Similar to click fraud, will we start seeing tweet fraud?
3. I cannot measure many-to-many engagement using Twitter. I can measure one-to-many engagement. But if I were able to measure the many-to-many, this would be killer data to have.
4. From a BI perspective, the data in Twitter is pretty unstructured. How do I take the unstructured data and convert it into a structured form that can be interpreted? How do you measure emotions and sentiments?
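To make the question in point 4 concrete, here is a minimal sketch of turning a raw tweet into a structured record. The word lists and the `TweetRecord` fields are assumptions made up for illustration. A keyword count like this is a deliberately naive stand-in for real sentiment analysis, which is exactly the hard part.

```python
import re
from dataclasses import dataclass

# Tiny, hypothetical sentiment lexicons; a real system would need far more.
POSITIVE = {"love", "great", "good"}
NEGATIVE = {"hate", "bad", "slow"}

@dataclass
class TweetRecord:
    user: str
    mentions: list
    hashtags: list
    sentiment: int  # positive word count minus negative word count

def structure_tweet(user, text):
    """Extract mentions, hashtags and a crude sentiment score from free text."""
    mentions = re.findall(r"@(\w+)", text)
    hashtags = re.findall(r"#(\w+)", text)
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return TweetRecord(user, mentions, hashtags, score)

record = structure_tweet("alice", "Love the new #bing travel site, great work @msft")
print(record)
```

Even this toy version shows the gap: mentions and hashtags are easy to pull out, but the sentiment score knows nothing about sarcasm, context or who is actually reading the tweet.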
Bottom line: I think mining Twitter data is going to be a major thing going forward. But in the short term, it is going to be just one of the many sources that companies use for BI/MI-type activities, and not a major one. I do believe that down the road, Twitter data will be a force to reckon with in the BI space.
What would make me bing June 2, 2009
Posted by inukonda in Uncategorized.
Since everyone seems to be talking about Bing, MS’ new search engine, I figured I’d throw in my two cents as well. To start with, MS has released a pretty good v1 product for the first time in a long time. The focus seems to be Bing vs. Google. I don’t think that is the right argument; the right conversation is whether Bing and Google can co-exist. I believe the answer is yes. However, I would like to see a few more things from Bing before I make the switch.
1. It is the user interface, not the search results: I don’t think the accuracy of the search results is going to determine whether people use Bing or Google. At the end of the day, search results are search results. Give me something different! Give me a better user interface. For example, if I search for something, give me a visual depiction based on categories. Say I search for “semantic web”: give me results categorized by news articles, blogs, research reports etc. in a nice visual form.
2. Topical search: Recognize the search category and do a deep search. For example, if I search for ‘french restaurants in beacon hill’, I’d like Bing to give me a list of French restaurants in Beacon Hill, a list of reviews from Yelp, and a list of available reservation times from OpenTable. Same thing for a flight search: give me a Kayak-like experience.
3. More is not always better: Limit the search results. Google gives me something like 20,000 results for every search, but I never go beyond page 2. So what’s the point? Bing should limit the search results to, say, 20, but give me confidence that those 20 are what I need.
4. Store my search: Right now, if I find something important after a search, I bookmark it or store it in an .xls file for that topic or something like that, and I annotate each bookmark so I can refer to it later. Get rid of this behavior for me completely. Store the searches I did and give me a visual history so I can go back and easily reference what I did. I never want to bookmark or tag ever again.
5. Improve local search: An easy way for Bing to win users is to focus on local search. Google already does this, but it is not good enough. What I’d like to see from Bing is pages for local businesses that the business owners can personalize. So when I search for “coffee, harvard square”, I should get 3 or 4 results, and each result should be the webpage of the coffee store (if it has one); if it does not have one, there should be a Bing-created landing page for that business with more information. In effect, Bing will become the one-stop shop for all local businesses. This will change the game. (Fred Wilson proposed something similar at some point, but I cannot find the link to that post.)
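Points 1 and 3 above (categorized results, capped at a small number) can be sketched together. This is only my illustration of the idea, not how Bing or Google actually work. The result fields (`title`, `category`, `score`) and the limits are hypothetical.

```python
from collections import defaultdict

def categorize(results, per_category=5, total_limit=20):
    """Keep only the top-scoring results, then group them by category."""
    # Cap the overall result count first ("more is not always better").
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)[:total_limit]
    grouped = defaultdict(list)
    for r in ranked:
        # Within each category, keep at most `per_category` titles.
        if len(grouped[r["category"]]) < per_category:
            grouped[r["category"]].append(r["title"])
    return dict(grouped)

# Hypothetical results for a "semantic web" query.
results = [
    {"title": "Semantic web in the news", "category": "news", "score": 0.9},
    {"title": "Why I like RDF", "category": "blogs", "score": 0.8},
    {"title": "Linked data report", "category": "research", "score": 0.7},
    {"title": "Old news item", "category": "news", "score": 0.2},
]
print(categorize(results, per_category=1, total_limit=3))
```

The grouping is the easy part. The confidence that the 20 results kept are the right 20 has to come from the ranking behind the `score` field, which this sketch simply assumes.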
Bottom line: I think Bing and Google can and should co-exist. Competition is always good for innovation! Looking forward to the search wars!