Disclaimer

  • The postings on this site are my own and don’t necessarily represent Microsoft's positions, strategies or opinions.

    Software

    October 28, 2008

    Beyond The Azure Blue

    From the first day I arrived at Microsoft, my academic colleagues have been asking me about Microsoft's strategy for cloud computing and when (or if) there would be public announcements. Those questions rose to a crescendo as academic groups prepared responses to the NSF eXtreme Digital (XD) TeraGrid solicitation. All I could say was that we were working on a plan, and it would become clear soon.

    I don't normally pitch Microsoft products in the blog, preferring to discuss science policy, technology research and development and global competitiveness. However, something big just happened at Microsoft, something I think will affect all of us. Moreover, as I write this, the Pacific Northwest sky is clear and azure blue, and that doesn't happen often this time of year. An omen, perhaps?

    Microsoft Azure Cloud Services

    At our Professional Developers Conference (PDC), Microsoft announced Azure, our cloud computing platform, with on-demand compute and storage to host, scale and manage Internet or cloud applications. The press release has additional business perspective and a link to the presentation. Azure is one element of the vision Ray Ozzie (see "Mind to Mind: Building Innovation") described in his 2005 Internet Services Disruption memorandum.

    The simplest description of Azure is that the initial release allows you to develop hosted Windows applications using .NET Services, though future releases will support unmanaged code and open source tools as well (Eclipse, Ruby, PHP, and Python). Within Azure, a fabric controller manages application instances and access to storage via SQL Data Services (SDS), and it hosts applications atop virtualized multicore hardware. Finally, Microsoft's Live Services offerings will be layered atop the Azure framework.

    You can read the white paper for details on the Azure design and usage approach. The software development kit (SDK) is also available for download. Besides the Azure SDK itself, there are SDKs for Visual Studio, .NET and SDS Services, as well as Java and Ruby SDKs for .NET Services. This is a Community Technology Preview (CTP), meaning Microsoft welcomes feedback on these early capabilities and will continue to expand the capabilities of Azure over the coming months.

    Science and Technology Implications

    Earlier in the year, I wrote both on my blog and in HPCWire ("Dan's Cloudy Crystal Ball") about the possibility of outsourcing research computing services and infrastructure to the cloud. I noted then that the explosive growth of computing as an enabler of scientific discovery had strained university capabilities and Federal research budgets. Given our current economic crisis, university operating budgets and Federal research expenditures will be under even greater strain, and there will be increased scrutiny of the need for each investment.

    In a world of (at best) modest research budget increases, we must ask hard questions about the best use of limited funds. Cloud computing offers a potential mechanism to increase the efficiency of current research, ensure continuity of critical data and enable new kinds of research not now feasible.

    In this model, researchers focus on the higher levels of the software stack -- applications and innovation, not low-level infrastructure. University and Federal research agency administrators, in turn, procure services from the providers based on capabilities and pricing. Finally, the cloud service providers deliver economies of scale and capabilities driven by a large market base and energy efficient infrastructure. Remember, computing infrastructure exists to enable discovery, not as monuments to technological prowess.

    In addition to efficiency, the scalability of cloud services and infrastructure opens new research possibilities. Not only is it possible to federate multidisciplinary research data at far larger scales than possible in a university environment (think tens to hundreds of petabytes of low latency storage), but we can also escape the pernicious cycle of transitory research infrastructure.

    How often have we created data repositories as part of research projects, only to find few mechanisms to ensure their long-term sustainability and access by the broader research community? How often have we faced a miasma of distributed data sources with unknown provenance and non-compatible metadata, each supported pro bono on a best effort basis? (See my recent comments on digital document preservation.) Instead, imagine multidisciplinary data fusion and mining, where students can pose queries against integrated but diverse data sources using robust tools.

    Finally, by leveraging "pay as you go" models, we can trade time and scale on a continuous basis. Imagine applying 50,000 processors for one hour at the same cost as 50 processors for one thousand hours. In the cloud, the integral under the curve is the same and the costs are comparable, but the research effects are qualitatively different.
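    The arithmetic behind that trade can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the flat rate of 10 cents per processor-hour is hypothetical (it is not an Azure price), the function names are invented for this example, and it assumes the work parallelizes perfectly, which real applications rarely do.

```python
# Illustrative sketch only: RATE_CENTS is a hypothetical flat price,
# not a published rate from any cloud provider.
RATE_CENTS = 10  # assumed cost of one processor for one hour, in cents


def processor_hours(processors: int, hours: int) -> int:
    """Total resource consumed: the 'integral under the curve'."""
    return processors * hours


def cost_cents(processors: int, hours: int, rate_cents: int = RATE_CENTS) -> int:
    """Pay-as-you-go cost under a flat per-processor-hour rate."""
    return processor_hours(processors, hours) * rate_cents


# The two scenarios from the text.
wide = cost_cents(processors=50_000, hours=1)
narrow = cost_cents(processors=50, hours=1_000)

print("wide and short :", processor_hours(50_000, 1), "processor-hours,", wide, "cents")
print("narrow and long:", processor_hours(50, 1_000), "processor-hours,", narrow, "cents")
print("same cost?", wide == narrow)

# The wall-clock wait is what differs: one hour versus 1,000 hours.
print("narrow run takes", round(1_000 / 24, 1), "days of wall-clock time")
```

    Under this flat-rate assumption the bill is identical, but the wide run returns an answer within the hour while the narrow run takes well over a month, which is the qualitative difference in research effect.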

    The Standard Questions

    The same questions always arise about new approaches to computing, and cloud services and data storage inevitably raise them again.

    • Is it reliable and will my data persist?
    • Is it safe, private and secure?
    • Will I be captured and become captive?
    • What does it cost and what if I can't continue paying?

    We tend to forget that there are complementary issues about local infrastructure because we have already internalized and accepted the implications and risks. Moreover, local failures are rarely publicized.

    • What happens if my disks crash?
    • What if I can't pay for backups or maintenance or physical plant or …?
    • What if my network is penetrated?

    These are the standard cost/benefit/risk tradeoffs. One must make them based on statistics, economics and practical constraints. Remember that we debated the same issues when we shifted research computing from vendor-backed HPC designs to predominantly commodity components.

    Let's Reason Together

    I welcome discussion of how we can exploit cloud services and infrastructure effectively – all cloud infrastructure, not just Microsoft's Azure. To do this, the cloud service providers, hardware vendors, universities and Federal government must work together to outline an agenda, conduct experiments at scale and speak with a united voice on the opportunities.

    It's a sunny day, but my head is in the clouds.

    October 15, 2008

    Preserving the Past: Educating the Future

    A recent front page article in the New York Times, entitled In the Digital Age, Federal Files Slip into Oblivion, really caught my attention. The article described a problem with which I am painfully and intimately familiar, namely the struggle to preserve the electronic record of government processes and deliberations. Quoting from the article,

    Many federal officials admit to a haphazard approach to preserving e-mail and other electronic records of their work. Indeed, many say they are unsure what materials they are supposed to preserve.

    This confusion is causing alarm among historians, archivists, librarians, Congressional investigators and watchdog groups that want to trace the decision-making process and hold federal officials accountable. With the imminent change in administrations, the concern about lost records has become more acute.

    Even with an army of government clerks, there is a limit to how many pieces of paper the federal government could produce. However, the explosive growth of digital communications and document preparation has far outstripped the processes and technology available to the Library of Congress and the National Archives and Records Administration (NARA). The problem is not just the volume of digital data; it is also the diversity of electronic formats and the myriad physical devices on which the data is stored.

    Imagine receiving a truck filled with PC disk drives and being expected to identify, curate and manage the data contained on them. Sound daunting and farfetched? It isn't. This is precisely what the Clinton White House delivered to the National Archives for preservation, though it included a mere 32 million e-mail messages. (Remember that the White House did not have Internet access until DARPA and Randy Katz wired it in the 1990s.)

    Given the growth of electronic communication since the early 1990s, the Bush administration will undoubtedly have generated hundreds of millions of e-mail messages that must be preserved, along with a plethora of electronic documents in a dizzying array of file formats. In addition to the standard challenges of document identification, extraction and preservation, the Archives of course must deal with national security and classification issues, further exacerbating the challenge.

    I have seen this struggle first hand, as a member of the Advisory Committee for the Electronic Records Archive (ACERA), the digital document preservation project of the National Archives. The National Archives are building a web accessible, indexed repository that will eventually host at least a portion of the torrent of digital data pouring from the federal government. It is an arduous and difficult journey, with more work ahead.

    October 07, 2008

    Three Views on Multicore

     

    Andrew Chien, Dave Patterson and I have each written articles on the challenges and opportunities inherent in multicore hardware and software for the Computing Community Consortium (CCC) blog. My recent article, on the challenge of software, is now posted. In the article, I argued that we must re-envision parallel computing and a new generation of applications that explicitly exploit the scale and heterogeneity of multicore. You can read the article on the CCC blog or below.

    Multicore: It's The Software

    For over thirty years, we have watched the great cycle of innovation defined by the commodity hardware/software ecosystem – faster processors enable software with new features and capabilities that in turn require faster processors, which beget new software. The great wheel has turned, but it turns no more, as power constraints and device physics now limit the performance achievable with single microprocessors.

    Multicore chips – those with multiple, lower power processors per chip – are now the norm. Moreover, current multicore chips (those with 4-8 cores/chip) are but the beginning. We can expect hundreds of cores per chip in the future, with diverse functionality (graphics, packet protocol processing, DSP, cryptography and other features).

    The software research challenge is clear – developing effective programming abstractions and tools that hide the diversity of multicore chips and features while exploiting their performance for important applications. Hence, we need a vibrant community of researchers exploring diverse approaches to parallel programming – languages, libraries, compilers, tools – and their applicability to multiple application domains.

    Microsoft researchers are investigating all of these approaches, from coordination languages for robots and distributed systems to mobile phones to desktops and data center clouds. To engage the academic community, Microsoft funds multicore research projects at many sites, and we have partnered with Intel to fund the Universal Parallel Computing Research Centers (UPCRCs) at the University of California at Berkeley and the University of Illinois at Urbana-Champaign.

    As Richard Hamming famously noted, "The purpose of computing is insight, not numbers." In that spirit, I believe our research challenge is to break free from the limitations of the desktop metaphor and exploit the ever greater performance of multicore chips to create new human-computer interaction metaphors that are more natural and intuitive. This will require new approaches to parallel computing education and increased collaboration with researchers in application domains.

    As an example, consider one possible future – "spatial computing" – where real-time vision and speech processing, coupled with knowledge bases, distributed sensors and responsive objects, enhance human activities in contextually relevant ways while remaining otherwise unobtrusive. Such an infosphere would adapt to its user's needs and behavior and move seamlessly across home, work and play.

    Multicore brings enormously interesting intellectual challenges and the opportunity to rethink much of how we approach computing. Let's embrace the opportunity!

    July 13, 2008

    Showing Up and Two Corollaries

    "Eighty to ninety percent of life is showing up." The line has been variously attributed to Yogi Berra, Woody Allen or even an anonymous wag. It's wise, though obvious, advice – showing up and doing the expected generally allows one to avoid a host of problems. Appearing for jury duty keeps you from being held in contempt of court, and you can't fly if you don't show up at the airport on time. I was reflecting on the implications of "showing up" while at a recent meeting in Italy.

    Show Up and See What Happens

    My friend, Dave Turek, IBM's Vice President for Deep Computing, once explained IBM's open source and Linux strategy by saying that IBM had a deeply considered, two phase strategy for Linux and clusters for HPC, "Show up and see what happens." As he once remarked at an NCSA Private Sector Partners (PSP) meeting, "We've showed up. Now, we are waiting to see what happens."

    At NCSA, we partnered with IBM in 2001 to deploy two of the first large-scale commodity clusters for open scientific use: two 1 teraflop systems based on Intel Pentium III and Itanium processors. At the time, this was a radical, almost heretical idea – deploying commodity PC clusters as production HPC platforms. Of course, such commodity clusters now dominate the Top500 list.

    In a reprise of this experience, Microsoft and NCSA recently partnered to deploy Windows HPC Cluster 2008 on the latest incarnation of commodity cluster hardware. (The customer story has the technical details.) I don't generally evangelize for Microsoft products in this blog, but I was very impressed that Windows HPC Cluster achieved substantially higher performance on the same hardware than did Linux. Microsoft, in the form of Kyril Faenov's HPC team, has definitely "showed up" in this space in a big way, and I think there are great opportunities to offer not only Windows compute clusters but also backend acceleration for desktop applications. Of course, all of this is ultimately connected to the ferment in cloud computing.

    Avoid the Obviously Wrong

    At the recent Cetraro meeting on High-Performance Computing and Grids, Miron Livny extended the "show up and see what happens" maxim by offering a corollary, "Show up and avoid doing something stupid." His observation was that evolutionarily, human success was defined by avoiding being trampled by a woolly mammoth, eaten by a hungry Bengal tiger or falling into a crevasse.

    The computing implication of Livny's corollary is that one should do reasonable things when presented with opportunities. In terms of research infrastructure, this means avoiding our academic tendency to delight in second system syndrome – building complex systems that embody all of our personally favorite features without determining if they are either needed or useful.

    At Cetraro, we debated the impact of the multicore revolution, the similarities and differences between Grids and clouds, and the commonalities between future exascale systems and the architecture of megascale data centers. (By the way, if you have not read the Department of Energy's exascale computing study, I highly recommend it.)

    There are deep technical challenges in all of these areas. However, we must avoid being trampled by the woolly mammoths; this domain is fraught with academic, government and industrial politics. I believe we need a wider dynamic range (time horizon, risk/reward and fiscal scale) of research and development projects if we are to solve these problems.

    I have made this point many times, most recently as part of the PCAST report on the U.S. NITRD program. I am scheduled to testify about this again to the House Science and Technology Committee on July 31. I will report on the hearing in August.

    Do Simple Things Quickly

    At the same Cetraro meeting, I opined that there was a second corollary, "Do the obvious, simple things quickly." I think this is the key lesson to be drawn from Web 2.0 mashups and the rapid evolution of commercial clouds. The simplicity of the APIs and hosted infrastructure encourages external groups to innovate rapidly. We have seen clear evidence of this in the explosive growth in social networking sites and in the hosted services that have appeared.

    By contrast, I think this is one of the places we have struggled with academic Grids. The software has often been too complex, and this complexity has been engendered by the distributed nature of the participating organizations, requiring "glue code" to integrate disparate policies and infrastructure across virtual organizations. Mashups and cloud services, on the other hand, can be deployed quickly (by academic standards) using very simple APIs and service level agreements (SLAs). It will be interesting to see how the Grid/Cloud mashup evolves.