From the first day I arrived at Microsoft, my academic colleagues have been asking me about Microsoft's strategy for cloud computing and when (or if) there would be public announcements. Those questions rose to a crescendo as academic groups prepared responses to the NSF eXtreme Digital (XD) TeraGrid solicitation. All I could say was that we were working on a plan, and it would become clear soon.
I don't normally pitch Microsoft products in the blog, preferring to discuss science policy, technology research and development and global competitiveness. However, something big just happened at Microsoft, something I think will affect all of us. Moreover, as I write this, the Pacific Northwest sky is clear and azure blue, and that doesn't happen often this time of year. An omen, perhaps?
Microsoft Azure Cloud Services
At our Professional Developers Conference (PDC), Microsoft announced Azure, our cloud computing platform, with on-demand compute and storage to host, scale and manage Internet or cloud applications. The press release has additional business perspective and a link to the presentation. Azure is one element of the vision Ray Ozzie (See "Mind to Mind: Building Innovation") described in his 2005 Internet Services Disruption memorandum.
The simplest description of Azure is that the initial release allows you to develop hosted Windows applications using .NET Services, though future releases will support unmanaged code and open source tools as well (Eclipse, Ruby, PHP, and Python). Within Azure, a fabric controller manages application instances and access to storage via SQL Data Services (SDS), and it hosts applications atop virtualized multicore hardware. Finally, Microsoft's Live Services offerings will be layered atop the Azure framework.
You can read the white paper for details on the Azure design and usage approach. The software development kit (SDK) is also available for download. Alongside the Azure SDK itself, there are SDKs for Visual Studio, .NET Services and SDS, as well as Java and Ruby SDKs for .NET Services. This is a Community Technology Preview (CTP), meaning Microsoft welcomes feedback on these early capabilities and will continue to expand the capabilities of Azure over the coming months.
Science and Technology Implications
Earlier in the year, I wrote both on my blog and in HPCWire ("Dan's Cloudy Crystal Ball") about the possibility of outsourcing research computing services and infrastructure to the cloud. I noted then that the explosive growth of computing as an enabler of scientific discovery had strained university capabilities and Federal research budgets. Given our current economic crisis, university operating budgets and Federal research expenditures will be under even greater strain, and there will be increased scrutiny of the need for each investment.
In a world of (at best) modest research budget increases, we must ask hard questions about the best use of limited funds. Cloud computing offers a potential mechanism to increase the efficiency of current research, ensure continuity of critical data and enable new kinds of research not now feasible.
In this model, researchers focus on the higher levels of the software stack -- applications and innovation, not low-level infrastructure. University and Federal research agency administrators, in turn, procure services from the providers based on capabilities and pricing. Finally, the cloud service providers deliver economies of scale and capabilities driven by a large market base and energy-efficient infrastructure. Remember, computing infrastructure exists to enable discovery, not to stand as a monument to technological prowess.
In addition to efficiency, the scalability of cloud services and infrastructure opens new research possibilities. Not only is it possible to federate multidisciplinary research data at far larger scales than is possible in a university environment (think tens to hundreds of petabytes of low-latency storage), but we can also escape the pernicious cycle of transitory research infrastructure.
How often have we created data repositories as part of research projects, only to find few mechanisms to ensure their long-term sustainability and access by the broader research community? How often have we faced a miasma of distributed data sources with unknown provenance and incompatible metadata, each supported pro bono on a best-effort basis? (See my recent comments on digital document preservation.) Instead, imagine multidisciplinary data fusion and mining, where students can pose queries against integrated but diverse data sources using robust tools.
Finally, by leveraging "pay as you go" models, we can trade time and scale on a continuous basis. Imagine applying 50,000 processors for one hour at the same cost as 50 processors for one thousand hours. In the cloud, the integral under the curve is the same and the costs are comparable, but the research effects are qualitatively different.
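The arithmetic behind that trade can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the flat rate of 10 cents per processor-hour is hypothetical (no actual price is given here), the function names are made up for the example, and the application is assumed to scale perfectly across processors.

```python
"""Trading time for scale at a flat price per processor-hour (illustrative only)."""

# Hypothetical flat rate in US cents per processor-hour; an assumption for
# illustration, not a published price for Azure or any other cloud service.
RATE_CENTS_PER_PROCESSOR_HOUR = 10


def processor_hours(processors: int, hours: int) -> int:
    """Total work purchased: the area under the processors-versus-time curve."""
    return processors * hours


def cost_cents(processors: int, hours: int,
               rate_cents: int = RATE_CENTS_PER_PROCESSOR_HOUR) -> int:
    """Cost of a run, assuming price is linear in processor-hours."""
    return processor_hours(processors, hours) * rate_cents


if __name__ == "__main__":
    wide = (50_000, 1)      # 50,000 processors for one hour
    narrow = (50, 1_000)    # 50 processors for one thousand hours

    for name, (procs, hours) in (("wide", wide), ("narrow", narrow)):
        print(f"{name}: {processor_hours(procs, hours):,} processor-hours, "
              f"{cost_cents(procs, hours) / 100:,.2f} dollars, "
              f"{hours / 24:.1f} days of wall-clock time")
```

Under these assumptions both runs buy the same 50,000 processor-hours at the same cost; what changes is the wait, which shrinks from roughly six weeks to a single hour.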
The Standard Questions
New approaches to computing always raise the same questions, and cloud services and data storage are no exception.
- Is it reliable and will my data persist?
- Is it safe, private and secure?
- Will I be captured and become captive?
- What does it cost and what if I can't continue paying?
We tend to forget that there are complementary issues about local infrastructure because we have already internalized and accepted the implications and risks. Moreover, local failures are rarely publicized.
- What happens if my disks crash?
- What if I can't pay for backups or maintenance or physical plant or …?
- What if my network is penetrated?
These are the standard cost/benefit/risk tradeoffs. One must make them based on statistics, economics and practical constraints. Remember that we debated the same issues when we shifted research computing from vendor-backed HPC designs to predominantly commodity components.
Let's Reason Together
I welcome discussion of how we can exploit cloud services and infrastructure effectively – all cloud infrastructure, not just Microsoft's Azure. To do this, the cloud service providers, hardware vendors, universities and Federal government must work together to outline an agenda, conduct experiments at scale and speak with a united voice on the opportunities.
It's a sunny day, but my head is in the clouds.