February 28, 2009
I am going to begin by asking a philosophical question about the Internet. But I can hear some of you saying, “Philosophy? What does that have to do with the Internet? Maybe I will have a siesta.” Well, before you close your eyes, let me assure you that the question is deeply important to some recent debates about the future of the Internet.
The question is: what is the purpose of the Internet? What is the Internet good for? Perhaps you had never thought that something as vast and diverse as the Internet might have a single purpose. In fact, I am going to argue that it has at least two main purposes.
To begin with, think about what the Internet is: a giant global information network. To ask what the Internet is for is about the same as asking what makes information valuable to us, and what basic reasons there might be for networking computers and their information together.
The two purposes of the Internet: communication and information
I think the Internet has at least two main purposes: first, communication and socialization, and second, finding the information we need in order to learn and to live our daily lives. In short, the Internet is for both communication and information.
Let me explain this in a simple way. On the one hand, we use the Internet for e-mail, for online forum discussions, for putting our personalities out there on social networking sites, and for sharing our personal creativity. These are all ways we have of communicating and socializing with others.
On the other hand, we are constantly looking things up on the Internet. We might check a news website, look up the meaning of a word in an online dictionary, or do some background reading on a topic in Wikipedia. These are all ways of finding information.
I want to explain an important difference between communication and information. Communication is, we might say, creator-oriented. It’s all about you, your personal needs and circumstances, and your need for engagement and recognition. So communication is essentially about the people who are doing the communicating. If we have no interest in some people, we probably have no interest in their communications. This is why, for example, I have zero interest in most MySpace pages. Almost nobody I know uses MySpace. MySpace is mainly about communication and socialization, and since I’m not actually communicating or socializing with anybody on that website, I don’t care about it.
Information, on the other hand, is not about the person giving the information but about the contents of the information. In a certain way, it really does not matter who gives the information; all that matters is that the information is valid and is of interest to me. And the same information might be just as interesting to another person. So, we might say, communication is essentially personal, and information is essentially impersonal.
I say, then, that the Internet’s purposes are communication and information. In fact, the Internet has famously revolutionized both.
The Internet is addictive largely because it gives us so many more people to talk to, and we can talk to them so efficiently. It allows us to compare our opinions with others’, to get feedback about our own thinking and creative work. In some ways, the Internet does this more efficiently than face-to-face conversation. If we are interested in a specific topic, we do not need to find a friend or a colleague who is interested in the topic; we just join a group online that has huge numbers of people already interested, and ready to talk about the topic endlessly.
Online discussions of serious topics are often a simplistic review of research, with a lot of confused amateur speculation thrown in. We could, if we wanted to, simply read the research—go to the source material. But often we don’t. We often prefer to debate about our own opinions, even when we have the modesty to admit that our opinions aren’t worth very much. Many people simply prefer active discussion over passive absorption. Who can blame them? You can’t talk back to a scientific paper, and a scientific paper can’t respond intelligently to your own thoughts. The testing or evaluation of our own beliefs is ultimately what interests us, and this is what we human beings use conversation to do.
But the Internet is also wonderfully efficient at delivering impersonal information. Search engines like Google make information findable with an efficiency we have never seen before. You can now get fairly trustworthy answers to trivial factual questions in seconds. With a little more time and skilled digging, you can get at least plausible answers to many more complex questions online. The Internet has become one of the greatest tools for both research and education that has ever been devised by human beings.
So far I doubt I have told you anything you didn’t already know. But I am not here to say how great the Internet is. I wanted simply to illustrate that the Internet does have these two purposes, and that the purposes are different—they are distinguishable.
How the Internet confuses communication and information
Next, let me introduce a certain problem. It might sound at first like a purely conceptual, abstract, philosophical problem, but let me assure you that it is actually a practical problem.
The problem is that, as purposes, communication and information are inherently confusable. They are very easy to mix up. In fact, I am sure some of you were confused earlier, when I was saying that there are these two purposes, communication and information. Aren’t those just the same thing, or two aspects of the same thing? After all, when people record information, they obviously intend to communicate something to other people. And when people communicate, they must convey some information. So information and communication go hand-in-hand.
Well, that is true, they do. But that doesn’t mean that one can’t draw a useful distinction fairly clearly. Here’s a way to think about the distinction. In 1950, a researcher would walk into a library and read volumes of information. If you wanted to communicate with someone, you might walk up to a librarian and ask a question. These actions—reading and talking—were very different. Information was something formal, edited, static, and contained in books. Communication was informal, unmediated, dynamic, and occurred in face-to-face conversation.
Still, I have to agree that communication and information are indeed very easy to confuse. And the Internet in particular confuses them deeply. What gives rise to the confusion is this. On the Internet, if you have a conversation, your communication becomes information for others. It is often saved indefinitely, and made searchable, so that others can benefit from it. What was for you a personal transaction becomes, for others, an information resource. This happens on mailing lists and Web forums. I myself have searched through the public archives of some mailing lists for answers to very specialized questions. I was using other people’s discussions as an information resource. So, should we say that a mailing list archive is communication, or is it information? Well, it is both.
This illustrates how the Internet confuses communication and information, but many other examples can be given. The Blogosphere has confused journalism, which used to be strictly an information function, with sharing with friends, which is a communication function. When you write a commentary about the news, or when you report about something you saw at a conference, you’re behaving like a journalist. You invite anyone and everyone to benefit from your news and opinion. Perhaps you don’t initially care who your readers are. But when you write about other blog posts, other people write about yours, and you invite comments on your blog, you’re communicating. Personalities then begin to matter, and who is talking can become more important to us than what is said. Information, as it were, begins to take a back seat.
Moreover, when news websites allow commenting on stories, this transforms what was once a relatively impersonal information resource into a lively discussion, full of colorful personalities. And, of course, online newspapers have added blogs of their own. I have often wondered whether there is a meaningful difference between a newspaper story, a blog by a journalist, and a well-written blog written by a non-journalist. That precisely illustrates what I mean. The Internet breaks down the distinction between information and communication—in this case, the distinction between journalism and conversation.
Why is the distinction between communication and information important?
I’ll explore more examples later, but now I want to return to my main argument. I say that the communication and information purposes of the Internet have become mixed up.
But—you might wonder—why is it so important that we distinguish communication and information, and treat them differently, as I’m suggesting? Is having a conversation about free trade, for example, really all that different from reading a news article online about free trade? To anyone who writes about the topic online, they certainly feel similar. The journalist seems like just another participant in a big conversation, and you are receiving his communication, and you could reply online if you wanted to.
I think the difference between information and communication is important because they have different purposes and therefore different standards of value. When we communicate, we want to interface with other living, active minds and dynamic personalities. The aim of communication, whatever else we might say about it, is genuine, beneficial engagement with other human beings. Communication in this sense is essential to such good things as socialization, friendship, romance, and business. That, of course, is why it is so popular.
Consider this: successful communication doesn’t have to be particularly informative. I can just use a smiley face or say “I totally agree!” and I might have added something to a conversation. By contrast, finding good information does not mean a significant communication between individuals has taken place. When we seek information, we are not trying to build a relationship. Rather, we want knowledge. The aim of information-seeking is reliable, relevant knowledge. This is associated with learning, scholarship, and simply keeping up with the latest developments in the news or in your field.
Good communication is very different from good information. Online communication is free and easy. There are rarely any editors to check every word you write, before you post it. That is not necessary, because these websites are not about creating information, they are about friendly, or at least interesting, communication. No editors are needed for that.
These communities, and blogs, and much else online, produce a huge amount of searchable content. But a lot of this content isn’t very useful as information. Indeed, it is very popular to complain about the low quality of information on the Internet. The Internet is full of junk, we say. But to say that the Internet is full of junk is to say that most conversations are completely useless to most other people. That’s obviously true, but it is irrelevant. Those who complain that the Internet is full of junk are ignoring the fact that the purpose of the Internet is as much communication as it is information.
Personally, I have no objection whatsoever to the communicative function of the Internet. In fact, it is one of my favorite things about the Internet. I have had fascinating conversations with people from around the world, made online friendships, and cultivated interests I share with others, and I could not possibly have done all this without the communicative medium that is the Internet.
But, as I will argue next, in making communication so convenient, we have made the Internet much less convenient as an information resource.
Communicative signal is informational noise
You are probably familiar with how the concept of the signal-to-noise ratio has been used to talk about the quality of online information and communication. A clear radio transmission is one that has high signal and low noise. Well, I’d like to propose that the Internet’s two purposes are like two signals: the communication signal and the information signal. The problem is that the two signals are sharing the same channel. So I now come to perhaps the most important point of this paper, which I will sum up in a slogan: communicative signal is informational noise. That is at least often the case.
Let me explain. The Internet’s two purposes are not merely confusable. In fact, we might say that the communicative function of the Internet has deeply changed and interfered with the informative function of the Internet. The Internet has become so vigorously communicative that it has become more difficult to get reliable and relevant information on the Internet.
I must admit that this claim is still very vague, and it might seem implausible, so let me clarify and support the claim further.
The basic idea is that what works well as communication does not work so well as information. What might seem to be weird and frustrating as information starts to make perfect sense when we think of it as communication.
Let me take a few examples—to begin with, Digg.com. In case you’re not familiar with it, it’s a website in which people submit links for everyone else in the community to rate by a simple “thumbs up” or “thumbs down.” This description makes it look like a straightforward information resource: here are Internet pages that many people find interesting, useful, amusing, or whatever. Anyone can create an account, and all votes are worth the same. It’s the wisdom of the crowd at work. That, I assume, is the methodology behind the website.
But only the most naïve would actually say that the news item that gets the most “Diggs” is the most important, most interesting, or most worthwhile. Being at the top of Digg.com means only one thing: popularity among Digg participants. I am sure most Digg users know that the front page of Digg.com is little more than the outcome of an elaborate game. It can be interesting, to be sure. But the point is that Digg is essentially a tool for communication and socialization masquerading as an information resource.
YouTube is another example. On its face, it looks like a broadcast medium. By allowing anyone to have a YouTube account, carefully recording the number of video views and giving everyone an equal vote, it looks like the wisdom of the crowd is harnessed. But the fact of the matter is that YouTube is mainly a communication medium. Its ratings represent little more than popularity, or the ability to play the YouTube game. When people make their own videos (as opposed to copying stuff from DVDs), they’re frequently conversational videos. They are trying to provoke thought, or get a laugh, or earn praise for their latest song. They want others to respond, and others do respond, by watching videos, rating videos, and leaving comments. I suspect that YouTube contributors are not interested, first and foremost, in building a useful resource for the world in general. They are glad, I am sure, that they are doing that too. But what YouTube contributors want above all is to be highly watched and highly rated, and in short a success within the YouTube community. This is evidence that they have been heard and understood—in short, that they have communicated successfully.
I could add examples, but I think you probably already believe that most of the best-known Web 2.0 websites are set up as media of communication and socialization—not primarily as impersonal information sources.
But what about Wikipedia and Google Search? These are two of the most-used websites online, and they seem to be more strictly information resources.
Well, yes and no. Even Wikipedia breaks down the difference between a communication medium and an information resource. There has been a debate, going back to the very first year of Wikipedia, about whether Wikipedia is first and foremost a content-production project or a community. You might want to say that it is both, of course. That is true, but the relevant question is whether Wikipedia’s requirements as a community are actually more or less important than its requirements as a project. For example, one might look at many Wikipedia articles and say, “These badly need the attention of a professional editor.” One might look at Wikipedia’s many libel scandals and say, “This community needs real people, not anonymous administrators, to take responsibility so that rules can be enforced.” Wikipedia’s answer to that is to say, “We are all editors. No expert or professional is going to be given any special rights. That is the nature of our community, and we are not going to change it.” The needs of Wikipedia’s community outweigh the common-sense requirements of Wikipedia as an information resource.
Please don’t misunderstand. I am not saying that Wikipedia is useless as an information resource. Of course it is extremely useful as an information resource. I am also not saying that it is merely a medium of collaborative communication. It clearly is very informational, and it is intended to be, as well.
Indeed, most users treat Wikipedia first and foremost as an information resource. But, and this is my point, for the Wikipedians themselves, it is much more than that: it is their collaborative communication, which has become extremely personal for them, and this is communication they care passionately about. The personal requirements of the Wikipedians have dampened much of the support for policy changes that would make Wikipedia much more valuable as an information resource.
Why do we settle for so much informational noise?
Let me step back and try to understand what is going on here. I say that Web 2.0 communities masquerade as information resources, but they are really little more than tools for communication and socialization. Or, in the case of Wikipedia, the community’s requirements overrule common-sense informational requirements. So, why do we allow this to happen?
Well, that’s very simple. People deeply enjoy and appreciate the fact that they can share their thoughts and productions without the intermediation of editors or anything else that might make their resources more useful as information resources. And why is it so important to so many people that there be no editors? Because editors are irrelevant and get in the way of communication.
The fact that Web 2.0 communities are set up for communication, more than as information resources, explains why they have adopted a certain set of policies. Consider some policies that Wikipedia, YouTube, MySpace, and the many smaller Web 2.0 websites have in common.
First, on these websites, anyone can participate anonymously. Not only that, but you can make as many accounts as you want. Second, when submissions are rated, anyone can vote, and votes are (at least initially, and in many systems always) counted equally. Third, if there is any authority or special rights in the system, it is always internally determined. Your authority to do something or other never depends on some external credentials or qualification. University degrees, for example, are worth nothing on YouTube.
The result is that, on a website like Wikipedia, a person is associated with one or more accounts, and the performance of the accounts against all other accounts is all that the system really cares about.
To Internet community participants, this seems very rational. A person is judged based on his words and creations alone, and on his behavior within the system. This seems meritocratic. People also sometimes persuade themselves, based on a misinterpretation of James Surowiecki’s book The Wisdom of Crowds, that ratings are an excellent indicator of quality.
But these systems are not especially meritocratic. It is not quality, but instead popularity and the ability to game the system that wins success in Web 2.0 communities. High ratings and high watch counts are obviously not excellent indicators of quality, for the simple reason that so much garbage rises to the top. There is no mystery why there is so much time-wasting content on the front page of YouTube, Digg.com, and many of the rest: it’s because the content is amusing, titillating, or outrageous. Being amusing, titillating, and outrageous is not a standard of good information, but it can be a sign of successful communication.
The less naïve participants, and of course the owners of these websites, know that Internet community ratings are largely a popularity contest or measure the ability to play the game. They don’t especially care that the websites do not highlight or highly rank the most important, relevant, or reliable information. The reason for this is perfectly clear: the purpose of these websites is, first and foremost, communication, socialization, and community-building. Building an information resource is just a very attractive side-benefit, but still only a side-benefit, of the main event of playing the game.
The attraction, in fact, is very similar to that of American Idol—I understand you have something similar called “Latin American Idol,” is that correct? Well, I have been known to watch American Idol. It is a television competition in which ordinary people compete to become the next Idol, who earns a record contract, not to mention the attention of tens of millions of television viewers. The singing on American Idol, especially in the early weeks, is often quite bad. But that is part of its entertainment value. We do not watch the program to be entertained with great singing—that is, of course, nice when it happens. Instead, we watch the program mainly because the drama of the competition is fascinating. Even though the quality of the singing is supposed to be what the program is about, in fact quality is secondary. The program’s attraction stems from the human element—from the fact that real people are putting themselves in front of a mass audience, and the audience can respond by voting for their favorites. The whole game is quite addictive, in a way not unlike the way Internet communities are addictive.
But let’s get back to the Internet. I want to suggest that the information resource most used online, Google Search itself, is also a popularity contest. Google’s PageRank technology is reputed to be very complex, and its details are secret. But the baseline methodology is well-known: Google ranks a web page more highly if it is linked to by other pages, which are themselves linked to by popular pages, and so forth. The assumption behind this ranking algorithm is somewhat plausible: the more that popular websites link to a given website, the more relevant and high-quality the website probably is. The fact that Google is as useful and dominant as it is shows that there is some validity to this assumption.
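To make that baseline idea concrete, here is a minimal sketch in Python of the textbook link-ranking iteration usually published under the name PageRank. It is an illustration under stated assumptions, not Google's production algorithm, whose details are secret. The damping value of 0.85 is simply a commonly used choice, and the tiny "web" of five pages, with all of its page names, is hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    Returns a dict mapping each page to a score; scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)

    # Start with rank spread evenly over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split equally among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical five-page web: nobody links to the expert page.
web = {
    "portal": ["gossip"],
    "gossip": ["portal"],
    "blog_a": ["gossip", "portal"],
    "blog_b": ["gossip", "portal"],
    "expert_page": ["portal"],
}

ranks = pagerank(web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:12s} {score:.3f}")
```

In this toy web, the two pages that attract links from everyone end up far ahead, while the expert page, which no one links to, is left with only the baseline share. The score measures who is linked to, not what the page says.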
All that admitted, I want to make a simple point. Google Search is essentially a popularity contest, and frequently, the best and most relevant page is not even close to being a popular page. That is a straightforward failure. But just as annoying, perhaps, is the prevalence of false positives. I mean the pages that rank not because they are relevant or high-quality, but because they are popular or (even worse) because someone knows how to game the Google system.
Does this sound familiar? It should. I do not claim that Google is a medium of communication. Clearly, it is an information resource. But I want to point out that Google follows the same policies of anonymity, egalitarianism, and merit determined internally through links and algorithms that machines can process. As far as we know, Google does not seed its rankings with data from experts. Its data is rarely edited at all. Google dutifully spiders all content without any prejudice of any sort, applies its algorithm, and delivers the results to us very efficiently.
I speculate—I can only speculate here—that Google does not edit its results much, for two reasons. First, I am sure that Google is deeply devoted to the same values, values that favor a fair playing field for the communication games that many Web 2.0 websites play. But, you might say, this is a little puzzling. Why doesn’t Google seek out ways to include the services of editors and experts, and improve its results? An even better idea, actually, would be to allow everyone to rate whatever websites they want, then publish their web ratings according to a standard syndication format, and then Google might use ratings from millions of people creatively to seed its results. In fairness to Google, it may do just this with the Google SearchWiki, which was launched last November. But as far as I know, SearchWiki does not aggregate search results; each individual can edit only the results that are displayed to that user.
So there is, I think, a second and more obvious reason that Google does not adjust its results with the help of editors or by aggregating syndicated ratings. Namely, its current, apparently impersonal search algorithm seems fair, and it is easy to sell it as fair. However much Google might be criticized because its results are not always the best, or because the results are gamable or influenced by blogs, at least it has the reputation of indeed being mostly fair, largely because PageRank is determined by features internal to the Internet itself—in other words, link data.
Google’s reputation for fairness is one of its most important assets. But why is such a reputation so important? Here I can finally return to the thread of my argument. Fairness is important to us because we want communication to be fair. In a certain way, the entire Internet is a communicative game. Eyeballs are the prize, and Google plays a sort of moderator or referee of the game. If that’s right, then we certainly want the referee to be fair, not to prefer one website over another simply because, for example, some expert happens to say the one is better. When it comes to conversations, fairness means equal consideration, equal time, an equal shot at impressing everyone in the room, so to speak. Communication per se is not the sort of thing over which editors should have any control, except sometimes to keep people polite.
The fact that Google has an impersonal search algorithm really means that it conceives of itself as a fair moderator of communication, not as a careful chooser of relevant, reliable content. And a lot of people are perfectly happy with this state of affairs.
In this paper I have developed an argument, and I hope I haven’t taken too long to explain it. I have argued that the Internet is devoted both to communication and information. I went on to say that communication and information are easily confused, and the Internet makes it even easier to confuse them, since what serves as mere communication for one person can be viewed later as useful information for another person. But what makes matters difficult is that we expect communication, and the websites that support online communication, to be as unconstrained and egalitarian as possible. As a result, however, the Internet serves rather well as a communication medium, as a means to socialize and build communities, but not nearly as well as an information resource.
I can imagine a reply to this, which would say: this is all a good thing. Information is about control. Communication is about freedom. Viva communication! Should our alleged betters—professors, top-ranked journalists, research foundations, and the like—enjoy more control over what we all see online, than the average person? The fact is that in the past, they have enjoyed such control. But the egalitarian policies of the Internet have largely removed their control. In the past, what those experts and editors have happened to say enjoyed a sort of status as impersonal information. But all information is personal. The Internet merely recognizes this fact when it treats allegedly impersonal information as personal communication.
This is the common analysis. But I think it is completely wrong. First, the elites still exert control in many ways, and there is little reason to think the Internet will change this. Second, the radical egalitarianism of Internet policies does not disempower the elites so much as it disempowers intelligence, and empowers those with the time on their hands to create and enjoy popular opinion, and also those who care enough to game the system.
If more people were to emphasize the informative purpose of the Internet, this would not empower elites; it would, rather, empower everyone who uses the Internet to learn and do research. We would have to spend less time sorting through the by-products of online communication, and could spend more time getting solid knowledge.
In fact, I think most people enjoy the Internet greatly as an information resource—at least as much as they enjoy it as a communication medium. But most of the people who create websites and Internet standards—the many people responsible for today’s Internet—have not had this distinction in mind. Still, I think it is a very fruitful and interesting way to think about the Internet and its purposes, and—who knows?—perhaps it will inspire someone to think about how to improve the informational features of the Internet.
In fact, if my fondest hope for this paper were to come true, it would be that those building the Internet would begin to think of it a little bit more as a serious information resource, and a little bit less as just a fun medium of communication.
As I have argued in a recent paper, “The Future of Expertise after Wikipedia,” Episteme (2009).