An iconic cartoon by Peter Steiner, which appeared in The New Yorker in 1993, captured the nature of the nascent Internet. It shows a dog seated at a computer, remarking to a second dog on the floor, "On the Internet, nobody knows you're a dog."
Although Internet anonymity was then the norm, that is no longer the case. Not only does the Internet now know you are a dog, it knows the location of your doghouse, your dry and canned dog food preferences, and your predilection for chasing cars. A web search will also undoubtedly reveal embarrassing pictures of you wearing that necktie the kids thought was so cute. Perhaps most disconcerting, there are those videos of you rolling in mud puddles that time you had fleas.
Humor aside, the Internet has changed dramatically from the early days when the odds were high you personally knew a non-trivial fraction of the people with an ARPANET email address. When the Mosaic web browser appeared at Illinois, I remember fondly how the NCSA Mosaic home page listed the birth announcements of new web sites, created by proud organizational parents. Shortly thereafter, my students (Will Scullin and Stephen Lamm) and I launched an early web analytics project to mine and visualize this data. (See the graphic in this post.)
Today, our web surfing behavior is tracked by cookies and processed by web analytics to deliver targeted results. In our social compacts, we trade information and behavior for free access to email, consumer cloud storage, web search engines and social networks. However, the information, text, images and video we share enter an ecosystem where we, as their producers, often have little or no control over their long-term usage or disposition.
In a recent New York Times article entitled "The Web Means the End of Forgetting," Jeffrey Rosen discussed the longevity of web materials, their unexpected and long-term consequences and some possible solutions. This article has stimulated considerable discussion and debate.
In that same spirit, I believe it is important to consider technical and social mechanisms for user-configurable control over personal information dissemination, information transitivity and lifetimes. These controls, together with informed choice, would create clearer social, legal and business compacts among all involved.
There are at least three specification axes of interest. The first, and the one with which we are most familiar, is that of access groups: who (and what) can see and examine what you post. The complex, changing and often confusing rules for social network sharing are the most obvious example of this. After all, a "friend" is not a binary concept, for it ranges from a deep personal or family relationship of many years through professional colleagues and casual acquaintances to those you might know only by reputation.
Likewise, context determines the desirability of information sharing. I might want my colleagues to know I am speaking at an upcoming conference, but I might not want to bother my family with such technical detail. We need more formal and more user-friendly mechanisms for defining the characteristics of access groups.
Limited transitivity is the second concept. After information has been shared with others, what rights do they have and what responsibilities do they bear regarding further sharing or use of that information? For example, I may happily share a photograph of a family picnic with friends, but not want them to post that same photograph on a public site. I would be pleased, though, if professional colleagues shared a link to this blog entry. We need to define transitivity options, along with mechanisms to allow the original creator to manage those options.
Finally, there is the notion of finite lifetime. How do we associate lifetimes with personal electronic objects, allowing (or forcing) them to disappear when their "shelf life" has been exceeded? There is a delicate balance here between the value of the web as a historical repository and the persistence of personal information.
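To make the three axes concrete, here is a minimal sketch of how access groups, limited transitivity and finite lifetime might be attached to a single shared object. It is only an illustration under stated assumptions: the `SharingPolicy` record, its field names, the group labels and the idea of counting "reshare hops" are all hypothetical, not a description of any existing system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import AbstractSet, FrozenSet, Optional


@dataclass(frozen=True)
class SharingPolicy:
    """Hypothetical per-object policy covering the three axes."""
    access_groups: FrozenSet[str]    # axis 1: who may see the object
    max_reshare_hops: int            # axis 2: how far it may be passed along
    expires_at: Optional[datetime]   # axis 3: shelf life (None = no expiry)

    def is_expired(self, now: datetime) -> bool:
        return self.expires_at is not None and now >= self.expires_at

    def can_view(self, viewer_groups: AbstractSet[str], now: datetime) -> bool:
        # Visible only before expiry, and only to a viewer who belongs
        # to at least one of the permitted groups.
        return not self.is_expired(now) and bool(self.access_groups & viewer_groups)

    def can_reshare(self, hops_so_far: int, now: datetime) -> bool:
        # hops_so_far is 0 for someone who received it from the creator.
        return not self.is_expired(now) and hops_so_far < self.max_reshare_hops


posted = datetime(2010, 8, 1, tzinfo=timezone.utc)  # arbitrary example date

# The family picnic photograph: friends only, no onward sharing, one-year life.
picnic_photo = SharingPolicy(
    access_groups=frozenset({"friends"}),
    max_reshare_hops=0,
    expires_at=posted + timedelta(days=365),
)

# The blog entry: open to colleagues and the public, reshareable, no expiry.
# The hop limit of 3 is an arbitrary illustrative value.
blog_entry = SharingPolicy(
    access_groups=frozenset({"colleagues", "public"}),
    max_reshare_hops=3,
    expires_at=None,
)
```

A record like this captures only the specification. Enforcing it once a copy has left the creator's hands is the harder problem, and one that likely needs social and legal mechanisms as well as technical ones.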
The provenance, access, transitivity, lifetime and ownership of personal information are in social, economic and legal flux. We in computing can and should offer technical solutions and stimulate informed debate.