A famous financial quote, "A billion here, a billion there, and pretty soon you're talking real money," has long been attributed to the late Everett Dirksen, who served in the U.S. Senate from 1951 to 1969. In a world where headlines highlight the potential for trillion-dollar deficits and the possibility of debt default by sovereign nations, the notion that a billion dollars is a lot of money now seems rather quaint. Time, inflation and rising GDP transform the small into the large. The same is true of computing, albeit in different ways.
Kilobyte, Megabyte, Gigabyte, Petabyte, Exabyte
Not that long ago, a megabyte was a lot of storage, whether primary or secondary. In 1987, I co-authored a book (with Richard Fujimoto) using a PC with dual (!) 1.4 MB floppy disks. A few years later, I was thrilled to have a PC with a 40 MB drive. Today, I carry a handful of multi-gigabyte USB keys for sharing documents at meetings, consumer terabyte disks can be purchased for less than $100, and petascale storage systems are competitive enablers.
Although capacity increases are a testament to advances in both flash and magnetic disk storage technology, rises in storage capacity have brought new challenges and new opportunities. Simply put, bigger is not just bigger; bigger is different.
At scale, indexing and content analysis are not just luxuries; they are necessities, for disk capacities have risen far more rapidly than have disk transfer rates. Thus, the only thing worse than being unable to store desired content due to limited storage is being unable to find an item in a debilitating blizzard of diverse content on high-latency storage systems.
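The gap between capacity and transfer rate can be made concrete with a bit of arithmetic: the time to scan an entire disk is its capacity divided by its sustained transfer rate, and that quotient has grown by orders of magnitude. The sketch below uses rough, illustrative ballpark figures (assumptions, not measurements) for three drive generations:

```python
# Illustrative sketch: full-disk scan time = capacity / sustained transfer rate.
# The capacities and rates below are rough, assumed ballpark figures,
# chosen only to show the trend, not to describe any specific product.

drives = [
    # (era, capacity in MB, sustained transfer rate in MB/s) -- assumed values
    ("early-1990s 40 MB drive", 40, 1),
    ("2000s 40 GB drive", 40_000, 40),
    ("2010s 1 TB drive", 1_000_000, 100),
]

for era, capacity_mb, rate_mb_per_s in drives:
    seconds = capacity_mb / rate_mb_per_s
    print(f"{era}: full sequential scan ~ {seconds / 60:.1f} minutes")
```

Even with generous assumptions about sequential bandwidth, the full-scan time grows from under a minute to hours, which is why brute-force search loses to indexing at scale.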
Perhaps most importantly, though, scale brings new possibilities, for statistics becomes the data miner's friend. We have seen this most clearly in supply chain and logistics management, where buying trends and consumer preferences can be gleaned from point-of-sale data. We have also seen it in the increased accuracy of machine translation, where the corpus of multilingual data vastly exceeds anything previously available.
As one of my colleagues noted, however, free storage is like free puppies. The true cost is not in the acquisition, it is in the use, the maintenance and the mess.
Megaflop, Gigaflop, Petaflop, Exaflop
Not long ago, supercomputers were defined by the number of megaflops they achieved. The Cray-1, after all, was a sub-gigaflop machine, despite its vector architecture. Today, we carry smartphones with megaflop performance, and I still have t-shirts in my closet from my DARPA-funded performance analysis project with the slogan, "When you gotta have more gigaflops." Meanwhile, a desktop PC with a GPU can deliver hundreds of gigaflops, and the world's highest performance petascale systems combine commodity microprocessors and GPUs in cluster configurations.
Like storage capacity, computing performance brings new challenges and new opportunities, for bigger and faster is not just bigger and faster; it is different. The availability of inexpensive, high-performance computing systems means higher fidelity models can become the province of everyday users – if we fully solve the productivity and programming problems (see HPC and The Excluded Middle). It also means we can illuminate the interplay and interdependence of processes on ever more widely varying temporal and spatial scales – if we can manage the reliability, complexity and scaling of trans-petascale systems.
Bigger is not just bigger, bigger is different. Quantitative change begets qualitative change.