Disclaimer

  • The postings on this site are my own and don’t necessarily represent Microsoft's positions, strategies or opinions.

    Multicore Architecture

    October 07, 2008

    Three Views on Multicore

     

    Andrew Chien, Dave Patterson and I have each written articles on the challenges and opportunities inherent in multicore hardware and software for the Computing Community Consortium (CCC) blog. My recent article, on the challenge of software, is now posted. In the article, I argued that we must re-envision parallel computing and a new generation of applications that explicitly exploit the scale and heterogeneity of multicore. You can read the article on the CCC blog or below.

    Multicore: It's The Software

    For over thirty years, we have watched the great cycle of innovation defined by the commodity hardware/software ecosystem – faster processors enable software with new features and capabilities that in turn require faster processors, which beget new software. The great wheel has turned, but it turns no more, as power constraints and device physics now limit the performance achievable with single microprocessors.

    Multicore chips – those with multiple, lower power processors per chip – are now the norm. Moreover, current multicore chips (those with 4-8 cores/chip) are but the beginning. We can expect hundreds of cores per chip in the future, with diverse functionality (graphics, packet protocol processing, DSP, cryptography and other features).

    The software research challenge is clear – developing effective programming abstractions and tools that hide the diversity of multicore chips and features while exploiting their performance for important applications. Hence, we need a vibrant community of researchers exploring diverse approaches to parallel programming – languages, libraries, compilers, tools – and their applicability to multiple application domains.

    Microsoft researchers are investigating all of these approaches, from coordination languages for robots and distributed systems to mobile phones, desktops and data center clouds. To engage the academic community, Microsoft funds multicore research projects at many sites, and we have partnered with Intel to fund the Universal Parallel Computing Research Centers (UPCRCs) at the University of California at Berkeley and the University of Illinois at Urbana-Champaign.

    As Richard Hamming famously noted, "The purpose of computing is insight, not numbers." In that spirit, I believe our research challenge is to break free from the limitations of the desktop metaphor and exploit the ever greater performance of multicore chips to create new human-computer interaction metaphors that are more natural and intuitive. This will require new approaches to parallel computing education and increased collaboration with researchers in application domains.

    As an example, consider one possible future – "spatial computing" – where real-time vision and speech processing, coupled with knowledge bases, distributed sensors and responsive objects, enhance human activities in contextually relevant ways while remaining otherwise unobtrusive. Such an infosphere would adapt to its user's needs and behavior and move seamlessly across home, work and play.

    Multicore brings enormously interesting intellectual challenges and the opportunity to rethink much of how we approach computing. Let's embrace the opportunity!

    September 08, 2008

    ManyCore: Able Was I Ere I Saw Elba



    As I write this, I am attending the ETH LASER summer school on concurrency, which is being held on the island of Elba. The island sits off the coast of Tuscany, a few miles from Pisa. It is perhaps best known as the place where Napoleon was exiled after his forced abdication and where he spent the interregnum before his final defeat at Waterloo. (Let me express my thanks to Bertrand Meyer for the invitation to speak at the summer school.)

    As I prepare to deliver six lectures on multicore and cloud computing here on Elba, the geographic irony of grand ambition, hubris and ignominious defeat is not lost on me. We have been struggling for the past forty years to find elegant and efficient parallel and distributed programming paradigms, with modest success. To continue my 19th century metaphor, we remain, as Matthew Arnold sadly put it, "Swept with confused alarms of struggle and flight, where ignorant armies clash by night."

    The Virtuous Cycle

    Metaphors aside, our struggle is real and extraordinarily important. The virtuous cycle that has long driven the computing industry is in flux, and if it is broken, we will struggle to restart it – for deep economic reasons. The desire for new functionality leads to richer, more complex software, which imposes greater demands on extant hardware, with concomitant performance constraints. In turn, this stimulates demand for faster processors, and the cycle of innovation turns.

    One interesting corollary of this cycle is that we demand new, faster processors at the same price, rather than the same performance at a lower price. This consumer demand generates the revenue needed to fuel commercial software development, power new chip designs and fund semiconductor fabrication line construction. These are multibillion dollar (U.S.) investments, ones only repaid if tens to hundreds of millions of units are sold. In turn, this creates deep partnerships among companies such as Microsoft, Intel, AMD and the PC vendors. A similar virtuous cycle exists in the mobile telephone market.

    ManyCore Directions

    This ecosystem of software and hardware innovation is challenged by consumer parallelism, in the form of large-scale multicore (manycore) chips. No longer can we expect dramatic increases in single core performance, due to power and heat dissipation constraints on consumer devices. Perhaps more tellingly, all of us wonder what the next "killer app" will be that excites and incents consumers to buy new, manycore systems. Personally, I believe it will be some combination of graphics-intensive massively multiplayer games (MPGs) and contextually-adaptive, situationally-aware information spheres. (One can think of the latter as Vannevar Bush's Memex reborn.)

    Of course, as James Thornton (CDC) said many years ago, "Anyone who says he knows how computers should be built should have his head examined! The man who says it is either inexperienced or really mad." I'm too old to be inexperienced, so perhaps I am really mad after all!

    The fundamental question is how large multicore (manycore) chips and development software will evolve. I see at least four architectural directions, at least two of which are already commercially prevalent. The first is the "cookie cutter" homogeneous multicore design, exemplified by today's Intel and AMD flagship x86 offerings, along with similar homogeneous multicore designs from Sun (Niagara) and IBM (Power5-7). Tilera's TILE64 and Intel's Larrabee are other examples of this approach, combining standard cores (MIPS-derived and x86, respectively) with a regular interconnect (mesh for Tilera and ring for Larrabee).

    The second is ISA-compatible homogeneous, but performance heterogeneous multicore. In this case, one combines, for example, a smaller number of complex, out-of-order cores with a larger number of simpler, in-order cores. The motivations for this approach are simple – the implications of Amdahl's Law and the need to execute legacy code efficiently while still delivering some of the performance and power advantages of low-power multicore. In this same spirit, Mark Hill has recently written a great paper about the importance of performance heterogeneity in multicore design.
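
    To make the Amdahl's Law argument concrete, here is a minimal sketch in Python. It compares a chip of identical simple cores with a chip that spends part of the same area budget on one larger core. The function names, the area budget and the square-root rule relating a core's size to its serial performance are all illustrative assumptions, not figures from any particular chip or paper.

```python
import math


def amdahl_speedup(parallel_fraction, cores):
    """Classic Amdahl's Law: speedup over one core when using `cores` identical cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def heterogeneous_speedup(parallel_fraction, budget, big_core_size):
    """Speedup for one big core plus (budget - big_core_size) simple cores.

    `budget` and `big_core_size` are measured in units of one simple core.
    ASSUMPTION (illustrative only): a core built from r simple-core units
    runs serial code sqrt(r) times faster than a single simple core.
    """
    big_perf = math.sqrt(big_core_size)
    small_cores = budget - big_core_size
    # The serial part runs on the big core alone.
    serial_time = (1.0 - parallel_fraction) / big_perf
    # The parallel part uses the big core and every simple core together.
    parallel_time = parallel_fraction / (big_perf + small_cores)
    return 1.0 / (serial_time + parallel_time)


if __name__ == "__main__":
    f = 0.9       # hypothetical fraction of the work that parallelizes
    budget = 64   # hypothetical chip area, in simple-core units
    print("64 simple cores:      ", amdahl_speedup(f, budget))
    print("1 big + 48 simple:    ", heterogeneous_speedup(f, budget, 16))
```

    Under these assumptions the mixed design comes out well ahead, because the serial tenth of the work no longer crawls along on a simple core. With a big core of size 1, the second function reduces to the first, which is a useful sanity check.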

    The third is functional heterogeneity, currently exemplified by chips such as the IBM Cell, AMD's announced Fusion chip and a host of embedded and domain-specific chips. I believe there are many opportunities for architectural innovation in this space, combining graphics, DSP, packet processing, cryptographic functions, SDR and a host of other functions with novel interconnects and memory sharing approaches.

    The fourth is what I call non-traditional architectures that embody more radical alternatives. One great example of this class is Doug Burger's TRIPS system, based on data graph execution. Doug recently joined Microsoft Research from the University of Texas at Austin, and I am excited about the collaboration possibilities.

    Back to the War

    I believe we are at an inflection point in parallel computing, with the economic impetus of consumer parallelism now driving us. Let us hope that we fare better than the warriors in another 19th century war, the Battle of Balaclava during the Crimean War. As Tennyson wrote so well, the Light Brigade charged into the mouth of hell. With cannons to the right (multicore architecture), cannons to the left (programming models) and cannons in front (next-generation applications), we ride into an unknown future, one fraught with peril but also with opportunity.

    July 13, 2008

    Showing Up and Two Corollaries

    "Eighty to ninety percent of life is showing up." The line has been variously attributed to Yogi Berra, Woody Allen or even an anonymous wag. It's wise, though obvious, advice – showing up and doing the expected generally allows one to avoid a host of problems. Appearing for jury duty keeps one from being held in contempt of court, and you can't fly if you don't show up at the airport on time. I was reflecting on the implications of "showing up" while at a recent meeting in Italy.

    Show Up and See What Happens

    My friend, Dave Turek, IBM's Vice President for Deep Computing, once explained IBM's open source and Linux strategy by saying that IBM had a deeply considered, two phase strategy for Linux and clusters for HPC, "Show up and see what happens." As he once remarked at an NCSA Private Sector Partners (PSP) meeting, "We've showed up. Now, we are waiting to see what happens."

    At NCSA, we partnered with IBM in 2001 to deploy two of the first large-scale commodity clusters for open scientific use: two 1-teraflop systems based on Intel Pentium III and Itanium processors. At the time, this was a radical, almost heretical idea – deploying commodity PC clusters as production HPC platforms. Of course, such commodity clusters now dominate the Top500 list.

    In a reprise of this experience, Microsoft and NCSA recently partnered to deploy Windows HPC Cluster 2008 on the latest incarnation of commodity cluster hardware. (The customer story has the technical details.) I don't generally evangelize for Microsoft products in this blog, but I was very impressed that Windows HPC Cluster achieved substantially higher performance on the same hardware than did Linux. Microsoft, in the form of Kyril Faenov's HPC team, has definitely "showed up" in this space in a big way, and I think there are great opportunities to offer not only Windows compute clusters but also backend acceleration for desktop applications. Of course, all of this is ultimately connected to the ferment in cloud computing.

    Avoid the Obviously Wrong

    At the recent Cetraro meeting on High-Performance Computing and Grids, Miron Livny extended the "show up and see what happens" maxim by offering a corollary, "Show up and avoid doing something stupid." His observation was that evolutionarily, human success was defined by avoiding being trampled by a woolly mammoth, eaten by a hungry Bengal tiger or falling into a crevasse.

    The computing implication of Livny's corollary is that one should do reasonable things when presented with opportunities. In terms of research infrastructure, this means avoiding our academic tendency to delight in second system syndrome – building complex systems that embody all of our personally favorite features without determining if they are either needed or useful.

    At Cetraro, we debated the impact of the multicore revolution, the similarities and differences between Grids and clouds, and the commonalities between future exascale systems and the architecture of megascale data centers. (By the way, if you have not read the Department of Energy's exascale computing study, I highly recommend it.)

    There are deep technical challenges in all of these areas. However, we must avoid being trampled by the woolly mammoths; this domain is fraught with academic, government and industrial politics. I believe we need a wider dynamic range (time horizon, risk/reward and fiscal scale) of research and development projects if we are to solve these problems.

    I have made this point many times, most recently as part of the PCAST report on the U.S. NITRD program. I am scheduled to testify about this again to the House Science and Technology Committee on July 31. I will report on the hearing in August.

    Do Simple Things Quickly

    At the same Cetraro meeting, I opined that there was a second corollary, "Do the obvious, simple things quickly." I think this is the key lesson to be drawn from Web 2.0 mashups and the rapid evolution of commercial clouds. The simplicity of the APIs and hosted infrastructure encourages external groups to innovate rapidly. We have seen clear evidence of this in the explosive growth in social networking sites and in the hosted services that have appeared.

    By contrast, I think this is one of the places we have struggled with academic Grids. The software has often been too complex, and this complexity has been engendered by the distributed nature of the participating organizations, requiring "glue code" to integrate disparate policies and infrastructure across virtual organizations. In contrast, mashups and cloud services can be deployed quickly (by academic standards) using very simple APIs and service level agreements (SLAs). It will be interesting to see how the Grid/Cloud mashup evolves.

    May 10, 2008

    HPC and Climate Change: Senate Hearing

    On Thursday, May 8, I testified to the U.S. Senate Committee on Commerce, Science and Transportation. The full committee hearing was on improving the "Capacity of U.S. Climate Modeling for Decision Makers and End-Users." The other members of the hearing panel were

    Jim Hack and I represented the computing and computational science issues, and the other four focused on the climate aspects. Within a few weeks, our written testimony will be posted on the Committee's hearing page, and in due time (many months), our oral testimony will appear in the Congressional Record.


    April 23, 2008

    Random Musings

    Like many of you, I give lots of public (and not so public) presentations, on a variety of topics. A couple of those were recently captured and placed on the web. The first was one of the opening talks at the Big Data Symposium, recently held at Yahoo!, as part of the Computing Community Consortium (CCC) big ideas series. If you are struggling to sleep, you can catch the video here.

    The second was an open-ended discussion on Microsoft's Channel9. I rambled on about multicore processors, big data center design challenges and a bit of web history. If the first video didn't put you to sleep, you can get the double feature effect here.