We all remember some variant of the fairy tale of Goldilocks and the Three Bears. It is a tale of the search for a good meal and a nap, though it also involves breaking and entering, culinary theft and accidental vandalism. If, perchance, you feel a worrisome gap in your erudition and erstwhile encyclopedic knowledge of this element of the Western cultural lexicon, fear not, for I will summarize the key elements of the tale. And please bear with me; this does relate to science! (Pun intended.)
Porridge and Bears
The young girl, Goldilocks, visits the home of the bears and samples the porridge of Papa, Mama and Baby Bear, pronouncing each, in turn, too hot, too cold and just right. Sated with porridge, she then seeks a place to sit, finding the chairs of Papa and Mama Bear too large. As she settles comfortably into Baby Bear's chair, it collapses. She then explores each of the beds before falling asleep in Baby Bear's bed, where she is found by the returning bear family. On discovery, she flees.
This tale, of course, raises worrisome questions. Do bears really like porridge? Could they sit in chairs and would they sleep in beds? More tellingly, are they underwater on their cottage mortgage? Though these questions are all worthy of whimsical rumination, let us defer literal scrutiny of the fairy tale's purported ursine facts and the anthropomorphized bears and focus on the porridge bowls and their metaphorical implications for science.
Science: Matching Need and Availability
Like porridge bowls, science comes in many sizes, from research conducted by a single investigator (e.g., a theoretical computer scientist studying computational complexity) to experimental projects that involve thousands of people (e.g., the ATLAS and CMS detectors at CERN). More to the point, there is no "right size" for science, other than matching the resources to the nature of the problem and the approach, albeit with the usual assessment of investment versus potential payoff. Nor is there, a priori, differential value between large and small science. Groundbreaking discoveries with broad and transformative effects have occurred across the entire spectrum of project sizes.
Scientific culture does vary widely with discipline and scale, however. Just as the ethos and metrics of biology differ markedly from those in physics, so too do the approaches and politics of small and large science. Large-scale experimental science typically involves long-term planning, geopolitical negotiations, and infrastructure construction that may span multiple years before data can be captured and analyzed.
Big Infrastructure for Small Science
Historically, these two worlds – small-scale science (both theoretical and experimental) and large-scale science – have largely depended on separate and distinct infrastructure. That separation is now disappearing, driven by the tsunami of data being produced by new generations of scientific instruments and computational models. (See The Zeros Matter: Bigger Is Different and Language Shapes Behavior: Our Poor Cousin, Data.)
In the past, the researcher who had unique experimental infrastructure also maintained a competitive advantage, for he or she could conduct experiments and capture data unavailable to others. With the rise of large-scale, shared instrumentation in a host of disciplines, most notably in biology, astronomy and physics, and open access to the resulting research data, advantage instead accrues to the researcher who can ask and answer more interesting questions. This is a profound shift in scientific culture with deep implications.
It is now incumbent upon us to rethink how we facilitate discovery and innovation in this brave new world of large data, for practitioners of both small and large science. Simply put, we must reconsider how we fund, construct, manage and operate scientific data repositories.
As I have learned at Microsoft, there are lessons that can be drawn from the construction of cloud data centers, web search engines and tools for analyzing ill-structured data that could both accelerate and simplify scientific discovery. These tools and practices are applicable not only to individual scientific domains but also, and especially, to the problems that lie at the intersection of multiple disciplines, where scientific cultural divides and divergent terminologies often inhibit collaboration and exploration. We will also need new models of public/private partnership to realize this vision, something we are pursuing with Microsoft's worldwide engagements on client plus cloud infrastructure.
Remember the porridge, the bears and Goldilocks. We can match the needs of all scientists, working together. Whether you like it hot or cold, large or small, it can be "just right" for everyone.