Disclaimer

  • The postings on this site are my own and don’t necessarily represent Microsoft's positions, strategies or opinions.


    Innovation

    June 20, 2009

    The Fallacy of Rankings

    N.B. I also write for the Communications of the ACM (CACM). The following essay recently appeared on the CACM blog.

    The world's tallest mountain (Everest), the biggest desert (Sahara), the richest person (Bill Gates), the fastest airplane (SR-71) and even the world's champion hotdog eater (59 hotdogs) – we are fascinated by records and rankings. The good folks who operate the Guinness World Records do a brisk business chronicling our interests in the sometimes unusual aspects of human endeavors and their rankings.

    At this juncture, you might be questioning the relationship between hotdog eating contests and the nominal topic of these essays: high-performance computing (HPC). Or, you may simply be, as I am, dumfounded and awed that any human could eat 59 hotdogs. This is a prodigious feat of perhaps questionable value, but I digress.

    Top 500 Ranking

    The latest, semi-annual Top 500 ranking of the world's fastest supercomputers will be revealed at the International Supercomputing Conference (ISC) in June. Last year, two systems broke the petascale barrier using a GPU/game accelerator processor cluster (LANL's Roadrunner) and a cluster of commodity microprocessors (ORNL's Jaguar). As always, we can expect the latest announcement to garner interest among the technological community, receive coverage in the popular press, and secure bragging rights for the organizations, vendors and countries involved. I await it eagerly myself.

    However, there are many figures of merit for high-performance computing systems, including suitability for target workloads, total cost of ownership (TCO), energy consumption and efficiency, reliability, productivity and ease of use, the richness of available software tools, extensibility and replication across markets, funding models and market viability. Many of these are difficult to quantify, and any multivariate ranking based on these may well differ from that derived from performance on a single technical computing benchmark.

    Partial Orders Matter

    Though rankings (total orders) are intuitive and easily explained – a valuable attribute in today's attention-constrained society – they rarely capture the true complexity of multidimensional comparison. Mathematically, this is simply an argument for order theory and partially ordered sets (posets), recognizing that in a multivariate (multidimensional) comparison, some elements may well be unordered or equivalent.

    Is an inexpensive, energy efficient computer system superior or inferior to an expensive, high-performance system? The answer, of course, depends on the intended use. A smart phone and a supercomputer both have value, but one is a poor substitute for the other.
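    As a rough illustration of why a vector of metrics yields a partial order rather than a ranking, here is a minimal sketch. The `System` class, the three metrics, and every number in it are hypothetical, chosen only to show the idea; "higher is better" is assumed for each metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    """A hypothetical system described by a vector of metrics (higher is better)."""
    name: str
    performance: float
    energy_efficiency: float
    affordability: float

    def metrics(self):
        return (self.performance, self.energy_efficiency, self.affordability)


def dominates(a, b):
    """True if a is at least as good as b on every metric and strictly better on one."""
    pairs = list(zip(a.metrics(), b.metrics()))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


def compare(a, b):
    """Compare two systems under the componentwise partial order."""
    if dominates(a, b):
        return "superior"
    if dominates(b, a):
        return "inferior"
    if a.metrics() == b.metrics():
        return "equivalent"
    return "incomparable"


def maximal_elements(systems):
    """Systems that no other system dominates (the extremal points of the poset)."""
    return [s for s in systems if not any(dominates(t, s) for t in systems)]


# Illustrative, made-up values; not measurements of any real machine.
supercomputer = System("supercomputer", 1000.0, 1.0, 1.0)
smart_phone = System("smart phone", 1.0, 10.0, 1000.0)
older_cluster = System("older cluster", 500.0, 0.5, 0.5)

print(compare(supercomputer, smart_phone))    # incomparable
print(compare(supercomputer, older_cluster))  # superior
print([s.name for s in maximal_elements([supercomputer, smart_phone, older_cluster])])
```

    With these made-up numbers, the supercomputer and the smart phone are simply unordered, and both are maximal. Forcing them into a single ranking would require choosing weights for the metrics, which is exactly where the intended use enters.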

    As valuable as ranking the world's fastest machines is, I believe we would benefit even more from publishing a vector of metrics regarding each HPC system, and focusing less on the extremal points of the poset. Many years ago, the Perfect Club (PERFormance Evaluation by Cost-effective Transformations) benchmarks were created to facilitate one variant of such a multivariate analysis. The SPEC benchmarks are another example from the commercial space.

    These multivariate analyses are not easy. They require much more work than univariate rankings, some of the data is not easily obtained, and some information is viewed as competitive. That does not mean we should not try again to define more diverse evaluation criteria.

    Sixty Four Hotdogs

    Despite fascination with ultrafast computing systems, the mind cannot help but return to hotdog eating. Incredibly, not one, but two individuals ate 59 hotdogs in the allotted time at this year's contest. This necessitated a 5 hotdog "eat off" to determine a winner. As a computer scientist, I recognize sixty-four as an interesting power of two. Hotdogs and supercomputers: both are driven by human competitiveness.

    May 12, 2009

    NITRD Reauthorization: Enabling the Future

    As a computing researcher, as chair of the Computing Research Association (CRA), and as a former member of the President's IT Advisory Committee and the President's Council of Advisors on Science and Technology (PCAST), I have spoken and written repeatedly about the state of computing research in the United States, the importance of long-term, strategic investment and the critical need for strategic, interagency planning.

    Today, the U.S. House of Representatives passed H.R. 2020, the Networking and Information Technology Research and Development Act of 2009, which embodies many of those recommendations. As a press release from the House Committee on Science and Technology notes, this reauthorization of the Networking and Information Technology R&D (NITRD) program

    … strengthens interagency planning, coordination, and prioritization for NITRD by requiring the development and periodic update of a strategic plan informed by both industry and academia. This plan is meant to create a vision for networking and information technology R&D across the federal government, and provide specific metrics for measuring progress toward that vision.

    The Road to the Present

    As chair of the Computing Research Association (CRA), I was pleased to endorse H.R. 2020. Simply put, H.R. 2020 is the culmination of many years of background work, reports, discussions and Congressional hearings by diverse groups.

    In July 2008, I testified before Rep. Gordon and the House Science and Technology Committee ("NITRD: Come, Let Us Reason Together"), summarizing the 2007 recommendations of the President's Council of Advisors on Science and Technology (PCAST) report, Leadership Under Challenge: Information Technology R&D in a Competitive World, whose production I had the privilege to co-chair. The report included the following recommendations ("PCAST, NITRD and the Future"), emphasizing the contributions of information technology to our continued prosperity and well being, something especially timely given current circumstances:

    • Address the demand for skilled IT professionals by revamping curricula, increasing fellowships, and simplifying visa processes.
    • Emphasize larger-scale, longer-term, multidisciplinary IT R&D and innovative, higher-risk research
    • Give priority to R&D in IT systems connected with the physical world, software, digital data, and networking
    • Develop and implement strategic and technical plans for the NITRD program

    Thanks to the hard work of many people, all of these were addressed in the NITRD reauthorization bill. Specifically, the reauthorization includes creation of a five-year strategic plan, to be updated every three years and assessed by an independent committee whose co-chairs are members of PCAST. The reauthorization also emphasizes the importance of long-term, multidisciplinary research and identifies cyberphysical systems as a critical element of the research agenda.

    These are sufficiently noteworthy that I feel compelled to quote from H.R. 2020, regarding the strategic plan:

    (A) foster the transfer of research and development results into new technologies and applications for the benefit of society, including through cooperation and collaborations with networking and information technology research, development, and technology transition initiatives supported by the States;

    (B) encourage and support mechanisms for interdisciplinary research and development in high-performance computing, including through collaborations across agencies, across Program Component Areas, with industry, with Federal laboratories (as defined in section 4 of the Stevenson-Wydler Technology Innovation Act of 1980 (15 U.S.C. 3703)), and with international organizations;

    (C) address long-term challenges of national importance for which solutions require large-scale, long-term, interdisciplinary research and development;

    (D) place emphasis on innovative and high-risk projects having the potential for substantial societal returns on the research investment; and

    (E) strengthen all levels of networking and information technology education and training programs to ensure an adequate, well-trained workforce.

    Yes, high-performance computing is identified explicitly, as a collaborative activity across government, industry and academia and with international partners.

    Cyberphysical Systems

    As some of you may recall, cyberphysical systems (i.e., computing systems that interact with the physical world) emerged as the top research priority from the PCAST assessment of NITRD needs. Today, our critical national and international infrastructure (financial systems, telecommunications, transportation, and utility grid), national security, and our personal lives (communications, biomedical devices, household appliances, automobiles and entertainment systems) are all computer enhanced and mediated.

    Computing is an inseparable part of our culture and our prosperity, and ensuring the reliable, correct and secure operation of this cyberphysical infrastructure is central to our future. Hence, I am especially delighted that the reauthorization calls for a joint university/industry task force to develop a research and development agenda for cyberphysical systems, together with defining roles and responsibilities, suggesting funding mechanisms and discussing intellectual property (IP) mechanisms. I am especially pleased that IP mechanisms were identified explicitly, as I believe we need to rethink how our public-private sector partnerships are best organized for mutual benefit.

    The Road Ahead

    The work by the House Science and Technology Committee and the passage of H.R. 2020 by the full House pave the way for the future. We have defined a scaffold for the future. Now we must erect the enabling infrastructure for a knowledge-centric society. I am confident the new incarnation of PCAST, which includes my Microsoft colleague, Craig Mundie, will continue to watch the progress of the NITRD program as our ever-changing field helps shape our future.

    April 19, 2009

    Escaping from Flatland

    In 1884 (no, that's not a typographical error), Edwin Abbott wrote a satire about Victorian England and its social hierarchy, in the guise of a mathematical story about life in a two-dimensional world, whence the whimsical title, Flatland: A Romance of Many Dimensions. If one can look beyond Abbott's misogyny to the crux of the mathematical story, it is an illuminating introduction to geometry in one, two and three dimensions, with some generalizing hints about higher dimensional geometries. (You can read a copy at the Internet Archive, or purchase a hardcopy.)

    The story is told from the perspective of a square, a resident of Flatland, who receives a visit from an inhabitant of a three-dimensional world – Spaceland. In Flatland, of course, the sphere is perceived only as a circle whose diameter varies based on the sphere's orientation to the plane. The sphere preaches the existence of higher dimensions, but the Flatlander leaders attempt to suppress all such information.

    Denizens of Waferland

    In fine recursive fashion, the Flatland story is itself a metaphor for our tenacious embrace of our two-dimensional world of semiconductors, particularly as it relates to memory technologies. We are deeply entangled in the angst, ennui, despair and perhaps even the clinical depression related to our encounter with the limits of instruction level parallelism (ILP), sequential execution semantics and the microprocessor power wall.

    As a community, we have grudgingly and guardedly recognized the need for multicore processors (See Three Views on Multicore and Manycore: Able Was I Ere I Saw Elba.) However, we are still clinging tenaciously to our dual in-line memory module (DIMM), two-dimensional packaging and double data rate (DDR) memory designs. We need a visitor from the third dimension, preaching the gospel of chip stacking to the denizens of chip waferland.

    Chip Stacking: Beyond DDR

    We are approaching scaling limits for our pin-based interfaces. Each generation, we have dropped voltages, increased clock rates and doubled the number of words transferred. DDR memory first operated at 2.5V and up to 400 Mb/s, dropped to 1.8V and increased to 800 Mb/s for DDR2, and is at 1.5V and 1600 Mb/s for DDR3. We can see DDR4 on the horizon, perhaps in 2012 at ~1V.
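    To put those per-pin rates in perspective, here is a small back-of-the-envelope sketch. It assumes a 64-bit data bus per module, which is typical of standard DIMMs but is not stated above; the function name is purely illustrative.

```python
# Assumption: a 64-bit-wide data bus per memory module (not stated in the text).
BUS_WIDTH_BITS = 64

# (generation, supply voltage in volts, peak per-pin rate in Mb/s), as quoted above.
GENERATIONS = [
    ("DDR", 2.5, 400),
    ("DDR2", 1.8, 800),
    ("DDR3", 1.5, 1600),
]


def peak_bandwidth_mb_per_s(rate_mbit_per_pin, bus_width_bits=BUS_WIDTH_BITS):
    """Peak module bandwidth in MB/s: per-pin rate times bus width, divided by 8 bits per byte."""
    return rate_mbit_per_pin * bus_width_bits // 8


for name, volts, rate in GENERATIONS:
    print(f"{name}: {volts} V, {rate} Mb/s per pin, "
          f"{peak_bandwidth_mb_per_s(rate)} MB/s peak per module")
```

    Under that assumption, each generation doubles peak module bandwidth (3200, 6400 and 12800 MB/s), while the number of data pins stays fixed. That fixed perimeter is the constraint that stacking is meant to relax.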

    It is time – long past time – for us to move to the third dimension and stack our chips. With chip (die) stacking, we need not be constrained by connections to the perimeters of our chips, but can exploit connectivity across a larger fraction of their area. IBM, Intel, Samsung and others are exploring variations of this idea, as this smattering of press releases and articles illustrates.

    With lower-power, multicore designs, through-silicon vias (TSVs), and wafer thinning for heat dissipation, we can crack the memory wall that has plagued us for so long. More to the point, those of us in the big iron/fast iron camp could learn a few things from our compatriots in the embedded systems world, where innovative packaging is a fundamental market driver.

    Make no mistake; this will not be easy, as it requires new approaches to via fabrication, as well as changing our ecosystem of chipsets and interface standardization processes. We may not have tachyon-based hyperspatial communication, but surely we can escape from Flatland.

    February 24, 2009

    Seeding The Clouds

    Since I joined Microsoft in late 2007, I have written about science policy, Federal government interactions, and national competitiveness studies, in my role as a member of PCAST and chair of the Computing Research Association (CRA). Throughout, I have emphasized the need for strategic investment in long-term, basic research, especially as part of the economic stimulus package.

    I have also discussed the rise of multicore computing, the consequent software crisis and the need for innovation in both architecture and software, including Microsoft's support for the Microsoft/Intel-funded Universal Parallel Computing Research Centers (UPCRCs) at Illinois and UC-Berkeley. I have also mused on the future of high-performance computing and its role as an enabler of scientific discovery. I have even written about my family, my rural childhood and my life experiences.

    What I have not done is write about why I came to Microsoft and what I am doing – until now. Yes, my team manages the UPCRCs in partnership with Intel. Yes, I devote time and energy to research policy, both for the community and on behalf of Microsoft. Yes, I am involved in the future of high-performance computing, both politically and technically. However, that's not the entire story.

    It's time to talk infrastructure so large it makes petascale systems seem small. It's time to talk about why I can't remember the last time I had this much fun. It's time to pull back the curtain and talk about the future of clouds. No, I'm not talking about weather forecasting, though I really enjoyed my past collaboration with the LEAD partnership.

    I came to Microsoft to lead a new research initiative in cloud computing, one that complements our production data center infrastructure and our nascent Azure cloud software platform. You can read the press release and the web site for the official story. What follows is my personal perspective.

    The Infrastructure of Our Lives

    We all know the cloud premise – Internet delivery of software and services to distributed clients, from mobile devices to desktops. We tend not to think about how dependent we now are on those delivered services, though we are, just as we depend on the telephone and our water and electrical utilities.

    Imagine a day without the web, without search engines, without social networks, without online games, without electronic commerce, without streaming audio and video. Our world has changed, and government, business, education, recreation and social interaction are now critically dependent on reliable Internet services and the hardware and software infrastructure behind them. However, more research and technology evaluation are needed to make them as trustworthy as the telephone network.

    Building Internet services infrastructure using standard, off-the-shelf technology made sense during the 1990s Internet boom. (And yes, I remember how cool Mosaic was, when I first saw it at Illinois.) The facilities were small by today's standards, and the infrastructure could be deployed quickly. Today, however, the scale is vastly larger, our social and economic dependence is much greater and the consequences of failure are profound. Web service outages are now international news, and a cyberattack is considered an act of war.

    For background on some of the challenges and problems in scaling, you might want to follow the Data Center Knowledge and High Scalability web sites. If you are new to this space, they and other reading will redefine your notions of large and reliable. You might not think 100 megawatts could be a data center design constraint, but it is. More importantly, you should fear – yea, verily, be absolutely terrified by – the wrath of 100 million unhappy customers should your Internet service fail. Every nightmare that has ever awakened a CIO in a cold sweat at 2am is real, but magnified a thousand fold. If it were easy, though, it would neither be exciting nor fun.

    Cloud Infrastructure Challenges

    Microsoft's business, like that of other cloud service providers – Amazon, Google, Yahoo and others – depends on an ever-expanding network of massive data centers: hundreds of thousands of servers, many, many petabytes of data, hundreds of megawatts of power, and billions of dollars in capital and operational expenses. This enormous scale – far larger than even the largest high-performance computing facilities – brings new design, deployment and management challenges, including energy efficiency, rapid deployment, resilience, geo-distribution, composability, and graceful recovery.

    I have been a "big iron" guy for a long time, and Internet and cloud services infrastructures do have analogs with petascale and exascale computing, but the workloads and optimization axes are different. Like today's HPC systems, cloud computing facilities are being built with hardware and software technologies not originally designed for deployment at such massive scale. Consequently, they are less efficient and less flexible than they either can or should be. If we built utility power plants the same way we build cloud infrastructure, we would start by visiting The Home Depot and buying millions of gasoline-powered generators. This must change.

    Imagine a world where heterogeneous multicore processors are designed and optimized for diverse workloads, where solid state storage changes our historical notions of latency and bandwidth, where on-chip optics, system interconnects and LAN/WAN networking simplify data movement, where scalable systems are resilient to component failures, where programming abstractions facilitate functional dispersion across devices and facilities, where new applications are developed more quickly and efficiently. This can be.

    Cloud Computing Futures

    Over the past fourteen months, I have been quietly building the Cloud Computing Futures (CCF) team, starting with a key concept. We must treat cloud service infrastructure as an integrated system—a holistic entity—and optimize all aspects of hardware and software. I have recruited hardware and software researchers, software developers and industry partners to pursue this vision. It's been a blast.

    The CCF agenda spans next-generation storage devices and memories, new processors and processor architectures, networks, system packaging, programming models and software tools. We are a research and technology transfer team, whose roles are to explore radical new alternatives – "blank sheet of paper" approaches to cloud hardware and software infrastructure – and to drive those ideas into implementation and practice.

    Effective research in this space requires changes to both hardware and software, and the resulting prototypes must be constructed and tested at a scale difficult for small teams. This type of research and technology transfer is difficult in academia, because the efforts often cross many research disciplines.

    For this reason, the CCF team is taking an integrated approach, drawing insights and lessons from Microsoft's production services and data center operations, and partnering with researchers, vendors and product teams worldwide. Our work builds on technical partnerships and collaborations across Microsoft, including Microsoft Research, Debra Chrapaty's Global Foundation Services (GFS) data center construction, operations and delivery team, and Ray Ozzie's Azure cloud services group. We are also partnering with an array of hardware-technology providers and companies as we build prototypes.

    Now You Know

    For me, CCF has been an opportunity to apply research experiences and ideas gleaned over the past twenty-five years of my academic career. Equally importantly, it is a chance to build prototypes at scale to test those ideas, and then help drive the promising technologies into practice. The past year has been great fun, and I have been privileged to attract some wonderful people to this adventure and to partner with them, including Jim Larus and Dennis Gannon.

    Now you know why I came to Microsoft. It was a chance to practice what I've been preaching. It was a chance to help design the biggest of big iron. It was a chance to help invent the future. It's a pretty cool gig for a balding old geezer like me!

    February 08, 2009

    A Few Thoughts on the Stimulus Package

    The political maneuvering and theater are well underway as the U.S. Congress debates the merits of various proposals to stimulate the economy. The U.S. House of Representatives has passed H.R. 1, the American Recovery and Reinvestment Act of 2009, and the Nelson/Collins (Senators Ben Nelson and Susan Collins) adjustments to S. 336 will likely come to the floor of the U.S. Senate for a vote in a few days. If the modified bill is approved by the Senate, we will await the negotiations that follow in conference.

    Support for scientific research is a small fraction of the stimulus plan, and the House and Senate plans differ in some marked ways. ASTRA has a handy comparison of the two proposals with respect to research investment.

    If you haven't seen legislative sausage made before, it is important to understand the process. After each legislative branch passes its version of a bill, a conference committee reconciles the differences, and the compromise must then be approved (again) by both branches. It is a competitive and often messy rugby scrum. Hence, we do not yet know what may emerge in support of scientific research and development.

    Steve Ballmer on Science

    Microsoft's CEO, Steve Ballmer, recently spoke to the U.S. House Democratic Caucus Retreat. Although you can read the complete speech, I would like to highlight a few excerpts that emphasize Microsoft's strong support for innovation and the importance of continued investment in basic research. In his speech, Steve noted

    … America really has to return to growth that's built on innovation and productivity, rather than leverage and private debt.  That must happen.

    He went on to say,

    We need to pursue breakthroughs over the coming years in green technology, alternative energy, bioengineering, parallel computing, quantum computing.  Without greater government investment in the basic research, there is a danger that important advances will happen in other countries.  This is truly I think not only an issue of competitiveness, but also in a sense of national security.  Companies like ours and others can do our fair share in terms of funding of basic research, but government needs to take the lead.

    I could not agree more wholeheartedly.

    Microsoft Policy Blog

    On the subject of Microsoft and policy, the company recently launched a policy blog (Microsoft on the Issues), including support for research. A few weeks ago, I penned an entry for the Microsoft policy blog on the federal stimulus plan and scientific innovation. In addition to noting the critical importance of innovation to fuel the economy, I observed that we should treat the current crisis and any new research funds as an opportunity to rethink the way we approach university research and public/private partnerships:

    Beyond critically needed funding, the bill gives government, academia and industry a chance to rethink research partnerships and policies in ways that will harness the benefits of scientific innovation for the good of the entire nation.   …

    We now have the opportunity to further streamline our nation's research infrastructure, particularly in U.S. research universities.  …

    By rethinking public-private sector partnerships, and refining processes for acquiring and deploying information technology, we can increase research efficiency and catalyze new discoveries while reducing costs for both universities and the federal government.

    The potential influx of research funds from the stimulus package creates a great opportunity for research innovation. However, these are perilous times, and we should not (by default) assume that "business as usual" is the best approach to accelerating research. It may indeed be the best approach, but we should face the issues squarely and thoughtfully.

    What is the best way to apply information technology to science and engineering research? How can we best advance computing research itself? How can we retain our research strengths while also addressing the rising cost of higher education? What can we learn from new and effective approaches elsewhere? How can we continue to compete effectively and efficiently? As Spiderman says, "With great power, comes great responsibility."

    As always, I welcome your thoughts and ideas.