Wikipedia's ancient history unearthed

Wikipedia programmer Tim Starling has discovered some ancient backup files from the earliest months of Wikipedia. The files themselves (which I haven't downloaded yet, and may never) are here (8.4 MB) and cover some of the earliest history of Wikipedia.  This should be very interesting indeed, if anyone decides to study what happened in those, um, interesting few months!


Are child development experts getting it wrong?

I just came across this Psychology Today blog by Richard Gentry, author of Raising Confident Readers: How to Teach Your Child to Read and Write -- from Baby to Age 7.  He poses the question, "Are Commercial-Product Claims that Babies Can Read Overblown?"  He goes on:

Or are too many child development experts from prestigious universities getting it wrong?

There is a controversy brewing over the definition of reading and whether babies and toddlers can learn to read. Driven by negative reaction to some of the commercial products that claim to teach babies and toddlers to read, print media and major news reports on television have recently quoted child development experts who state emphatically that "the baby's brain is not developed enough to read." WAIT A MINUTE! Sit back and take a deep breath. It may be a very good thing for a pre-school age child to learn to read words and phrases before age three and it may be a bad thing to equate this remarkable accomplishment with "the brain of a parrot." Show me a parrot that reads scores of flash cards with words and phrases through paired associate learning or operant conditioning! Reading word cards is not something trivial. When child development experts were asked if babies who pronounced the words or demonstrated actions to word cards such as "clap" or "arms up" were reading, many were emphatic: "No! The babies memorize cue cards. That's not reading." But automatic recognition of words, repetition, and memory are all aspects of proficient reading at any level. Joyful parent-child interaction helping the baby learn to read word cards is a good thing!

Read the whole thing, including the "What Does the Research Say?" section.  It's nice to know that there are some experts who are willing to buck the establishment on this.


Could you teach your baby to read?

Is your reaction, "If it sounds too good to be true, it probably is"?  I claim that you can teach your baby, toddler, or preschooler to read--probably.  What do you say to that?

I was thinking about how my essay on baby reading hardly made a ripple on its first day out in the world, despite being announced pretty far and wide.  There was no negative reaction; but there was hardly any positive reaction.  There was essentially no reaction.  I'm not sure why that might be, but my best guess is that people don't believe that there's anything to it, not enough to investigate it much.  To be sure, a 140-page essay is a bit much to expect an instant reaction to, but what about the video, the flash cards, and the presentations?  Nothing!  My explanation is that people simply don't believe that there's enough "to" claims like "your baby can read" to warrant much caring, much less investigation.

Let me make several claims, each of which I can back up with a lot of argumentation:

1. It's not just me.  Lots of people have done this.  You didn't know that?  Read my essay, especially Part 2, and you'll see.

2. It's really reading.  By age 2 or 3, lots and lots of kids who start out with Your Baby Can Read (YBCR) and the Glenn Doman method and similar methods are able to sound out new words, and understand age-appropriate books.  By the time they enter first grade, those kids read well above grade level.

3. And no, it's not because they're geniuses.  I'm not a genius, and I'm sure my little boy isn't either.  Lots of more or less average people have taught their little kids to read, and long before I found out about it.

4. I didn't pressure my little boy into reading.  If you think that's the only way to teach a tiny tot how to read, you're just mistaken!

5. It's not impossibly difficult or expensive.  Yes, I work from home and have some free time to help teach my little boy, but with the free materials out there now, and as the price of YBCR has come down, basically, you just have to spend some time doing this.  With the videos, or with looking at some PowerPoint presentations or my flash cards...well, sure, it takes some time, and probably some money...but it's not a full-time job or anything.  Think of it as a side-hobby.  You could get deeply into it, the way I have, but you could have a nicely positive effect without doing so, I'm very sure.

OK, folks, what else can I say that will make you take this whole opportunity seriously?


Essay on Baby Reading

I started teaching my little boy to read beginning at 22 months, and by age four, he was decoding text (reading, in that sense) quite fluently at the sixth grade level, or above.

I've discovered that there isn't a lot written about the subject of baby reading.  So I have written a 45,000-word essay on the subject:

How and Why I Taught My Toddler to Read
PDF | DOC | HTML
(the PDF is best)

I've worked on this for two years, off and on.  It is formatted as a 140-page book, which I'm presenting to the public free, under a Creative Commons (CC-by-nc-nd) license.  Here is a video of my boy reading to me when he was two, then three, then four.  At age 3 years, 10 months, he read the First Amendment of the Constitution (in the video at 2:47):

How'd we do it? We used a variety of methods: I read many books to him while pointing to the words, I showed him over 1,000 home-made flashcards (careful: 122 MB zip file) arranged in phonetic groupings, we watched the Your Baby Can Read videos, we used these (150+) PowerPoint presentations I made for him (here's an enormous 862 MB zip file), and we did many other literacy-building activities.  All of this was done in a completely pressure-free way; I taught him to say “that’s enough” and immediately stopped when, if not before, he got tired of any activity. (UPDATE: these flashcards are in the process of being converted into a high-quality digital version at ReadingBear.org.)

I hope that by publicizing our case, we will raise awareness of the methods available that can, in fact, teach very small children to read with about as much ease as they can learn spoken language or sign language.

Working on early childhood educational content and issues is now my full-time job; among other things, I'm planning a new tool that will emulate the best aspects of Your Baby Can Read, but it will be free.  I've passed off leadership of WatchKnow.org to a new CEO, the very capable Dr. Joe Thomas.  Expect to see regular updates on this blog about my work, and I'll be asking for your feedback about my various plans and ideas.

Please use this page to comment on both the essay and the video.

UPDATE: if you want a copy of the essay on your handheld device (and can't figure out how to put the PDF on your device), you can buy it for $2.99 from the Amazon Store.  Someone asked for this, and I obliged!

UPDATE 2 (Oct. 3, 2011): my son is now five years old. He is now reading daily on his own, and has read himself a couple dozen chapter books, including The Story of the World, Vol. 1: The Ancient World (314 pgs.).

UPDATE 3 (Dec. 16, 2012): at six, my son switches between "serious" literature which he reads with a dictionary app, including Treasure Island, Tom Sawyer, and The Secret Garden, and easier literature including Beverly Cleary books, the Hardy Boys, and Encyclopedia Brown. If his answers to regular comprehension questions are any indication, he's understanding what he reads pretty well.

UPDATE 4 (Mar. 26, 2013): I'm delighted to report that my second son, following methods similar to those I used with my first, is now 2.5 years old and reading at a first grade level.

UPDATE 5 (Aug. 25, 2014): my second is following in his brother's footsteps, reading a version of the Odyssey (he's crazy about Greek mythology—go figure) at age 3.5:


A comment on Wikileaks

Over the weekend, I wrote a series of Tweets inspired by Wikileaks' then-upcoming release of U.S. diplomatic communiqués.  This caused quite an uproar, with people insulting me vociferously and demanding that I explain myself.  (A few people were supportive, and thanks to them.)  I am not going to write a whole essay in defense of my views; I don't have the time either to write one or to deal with the inevitable aftermath of such an essay.  Actually, I wish I didn't have to do even the following, because I'm busy with various new educational projects, and I have no desire to make myself into a political pundit.  But I suppose at this point it is my duty to post at least the following; I think I'm in a position where I could do some good, so I had better, if I want to follow my own advice.

Rather than write a long essay, I will put down just a few paragraphs explaining my views a little better.  This is obviously not, nor is it intended to be, a complete defense of the position I'll briefly describe.  That I leave to the policy wonks.

Here are the "offending" Tweets (from Nov. 25-26):

I'll go ahead and say the obvious: Wikileaks is an enemy of the U.S.—and not just the government. Deal with them accordingly.

How does Wikileaks repeatedly get massive troves of classified material?

Did a person or group in the U.S. govt have access to ALL these docs & leak them to Wikileaks? If so, that person or group is traitorous.

@wikileaks Speaking as Wikipedia's co-founder, I consider you enemies of the U.S.—not just the government, but the people.

@wikileaks What you've been doing to us is breathtakingly irresponsible & can't be excused with pieties of free speech and openness.

First, let me say that my main complaint is against releasing secret diplomatic communiqués, not against Wikileaks' other work, which is less important for purposes of this discussion.  Also, when I said I was "speaking as Wikipedia's co-founder," I was distinguishing wikis generally from Wikileaks, which is not a wiki.  I was not, and am not, speaking for Wikipedia, but only for myself.  To those who said that they'd stop contributing to Wikipedia, you might not know that I left Wikipedia a little over a year after I got it started, and have since founded a competitor.  I'm no longer even the editor-in-chief of this competitor; I'm now working on brand new things.

My argument is quite simple and commonsensical.  It goes something like this. (A policy wonk would be able to explain this better than I could, but I'm in the hot seat so I'll have a go.) Diplomatic communiqués are secret precisely because they contain information that it would be dangerous, or stupid, to make public. They disclose names and quotations that, for reasons either obvious or quite impossible for us to know, might get people killed. They also contain reports of actions that might lead to serious repercussions. They might even pinpoint locations of secret installations that might come under attack. They recount discussions of important plans and personalities—information that, if known to the wrong people, might lead to various military excursions, including war.

Does that sound acceptable to you?  Let's put it this way.  Wikileaks' actions, by releasing so much consequential, incendiary information, could easily lead to the deaths of people all around the world, and not just Americans. It could destabilize foreign relations that it benefits no one to have destabilized. It could—probably will not, but given that these are secret diplomatic communiqués in a very complex world, could—lead to war.

I find it incomprehensible that Wikileaks and its defenders are not given pause by such obvious considerations. I find it sad that so many people are not able to grasp such arguments intuitively.  Perhaps they ignore them, or perhaps they only pretend that such considerations do not exist.

Now, let's talk about three common fallacies about Wikileaks' latest disastrous actions. Again, this is going to have to be brief.

Fallacy: we can already see (less than 24 hours after release) that the leaks have no damaging information, and the information in the first leaks (about Iraq and Afghanistan) did not lead to any deaths. Well—not yet they didn't, not as far as we know.  But there is a big difference between the Iraq and Afghanistan leaks and the latest leak.  Since the latest leak contains huge numbers of secret diplomatic communiqués, it does, of course, concern intelligence.  Wikileaks' defenders seem not to realize the cumulative nature of intelligence.  Intelligence-gathering is like detective work.  In a detective story, often it is one tidbit of information that sheds light on a case and blows it wide open.  Similarly, a communiqué that looks to the uninformed to be completely innocuous might turn out to be exactly the tidbit needed for enemies of the U.S.—and others—to inflict death and serious destruction.  It amazes me that otherwise intelligent people, including journalists, think that they can make such judgments, let alone promote their obviously amateur judgments online.  This does not speak well for the judgment of the New York Times' editors.  To their credit, others, such as the Washington Post, would not make deals with Wikileaks.

Fallacy: the United States is an "empire" and needs to be reined in. Exposing the inner workings of this government's foreign policy is a good thing. It's not a bad thing that the leaks damage U.S. interests, because U.S. interests are contrary to the interests of a lot of the rest of the world. This argument is made by two different groups of people who are best addressed separately.

On the one hand, people on the radical left are of course deeply opposed to the American system of government. I am not one of these people—though occasionally, as an open-minded philosopher, I have considered some such people as my personal friends. Anyway, these people naturally regard the U.S. government, the main defender of this much-hated system, as enemy #1 in world politics.  I don't.  Obviously, radical leftists will be among Wikileaks' most vociferous supporters in the latest leaks, precisely because they want the U.S. undermined.  As a patriotic, loyal American citizen, I do not want my country undermined, and I'm not ashamed to say so.  Since I take this openly pro-U.S. stance, radical leftists cannot be expected to treat me nicely.  Fortunately, I couldn't care less about what they think, when they use playground insults and attempt to bait me into stupid exchanges of sentiments.  I'm not about to enter an exchange with such people about the merits of the American system and hence the defensibility of undermining it.

On the other hand, there are plenty of liberals, libertarians, and social democrats who support Wikileaks. My views are closer to theirs.  I agree with them that, as a rough generality, leakage of government documents is a good thing for open government, free speech, and democracy.  This is why, when Wikileaks first appeared, I was cautiously supportive.  But it is perfectly consistent for liberals, libertarians, and social democrats—and conservatives too, of course—to draw the distinction between positive leaks that improve government and irresponsible leaks that do nothing but cause all sorts of harm and pointless chaos.  If you are an anarchist, you might celebrate all leaks, but most of us aren't anarchists and are capable of making intelligent distinctions between good and bad leaks.

Let me put this another way.  There are a lot of things that the U.S. State Department does that democracy-loving people across the political landscape can agree are positive, or at least supportable.  But some of those things have to be done in secret.  That is the nature of diplomacy, espionage, and foreign policy in the real world, which is a dangerous, complex world.  To leak three million communiqués potentially undermines everything positive that the U.S. can do in the world.  Come on, folks—can't you see that?  It should be obvious, and it's very disappointing that it isn't more so to liberals.  Unless you count yourself as one of the aforementioned radical leftists, who want to see the U.S. lose, period, then you cannot support Wikileaks' action.  It is completely unsupportable.

Fallacy: Wikileaks is a force for openness and transparency.  Openness is good.  (Oh, how can a founder of Wikipedia fail to realize this?  The horror!) There are some people who think that all of government should be conducted "in the open," always.  Such people remind me of my radical libertarian friends: their theories sound nice, beautiful even, but they quite stubbornly refuse to take seriously the reasons for the things they criticize.  The fact is that some, only some, of democratic government has always been conducted without public exposure.  In this brief comment, I cannot elaborate the reasons for occasional government secrecy, but I'll give you a hint: it has to do with privacy, public safety, and national defense.  I disagree with those people who want government to be so "open"—open far beyond anything any government has ever experienced, open far beyond anything widely thought to be required—that they are perfectly willing to undermine privacy, public safety, and national defense in order to secure that openness.  Such people are ideologues, and they are fun for other ideologues to argue with, and occasionally for philosophers too, but they can be safely ignored by more sane, grounded people and those with little time on their hands for philosophy.

Finally, Julian Assange is no hero.  He is a twit.  He should not be made into a liberal icon.  He gives hackers a bad name.  He and his organization are indeed enemies of the U.S. government and the people represented by that government; they should be stopped, and they richly deserve to be punished for this latest leak.  And that goes double for the person or people in the U.S. government who leaked the documents in the first place.  None of these people deserve your support any longer.

Discussion of this comment


More replies about Wikimedia and the fallout of my report to the FBI

Background: on April 7, I posted the text of a report I made to the FBI to the EDTECH mailing list, in which I stated that, in my opinion, the Wikimedia Foundation may knowingly have posted "child pornography," by which I meant one common usage of the term, namely, "obscene visual representations of the sexual abuse of children."  In short, the Wikimedia Commons "Category:Pedophilia" page hosted images with realistic and disturbing drawings of child molestation. The Register reported on this and it snowballed from there.  Among other venues, it was discussed on Slashdot, and I reproduced my reply to Slashdot on this site.

Then on April 27, FoxNews.com covered the story. This elicited howls of protest from the Internet geek community, and some support from the broader online community. In addition, there was a reply from the Wikimedia Foundation both in their blog and in a widely-circulated AFP wire story, and Erik Moeller also posted a reply. Lastly there was some (rather misleading) coverage by techdirt and The Inquirer.


There was a huge spate of commentary--to put it nicely--following the FoxNews.com coverage of my report to the FBI. The sheer amount of error and misinformation spread about the situation is predictable, but that won't stop me from, now, setting the record straight and offering replies on several points:

1. The images were (maybe still are) on Wikimedia Commons, not Wikipedia. The page I reported was Category:Pedophilia on Wikimedia Commons, not a Wikipedia article.  This matters because some people evidently went to Wikipedia, looked at the pictures there, and concluded that my report was frivolous.  Indeed, if I had been talking about the pictures on Wikipedia's "Pedophilia" article, the report would have been frivolous, but it wasn't.

2. It isn't only photographs of real children that are against the law. Obscene visual representations of the sexual abuse of children are against the law as well, and in fact, that's exactly what 18 USC §1466A addresses. You may disagree with it, but that is the law of the land at present. Quite a few commentators presented the fact that the images were not of real children as some sort of "gotcha." But I clarified perfectly well in my original FBI report that I was talking about drawings. I made sure that the statute I cited applied to the images I had stumbled upon.

3. The statute required me to make the FBI report. Many people commenting on my report don't seem to realize that the statute specifically states that violations must be "reported...to a law enforcement agency." This is, in fact, a necessary part of the affirmative defense stated by the statute itself. "But," you say (agreeing with techdirt), "you didn't need to make the report public."  Read on:

4. I posted the FBI report publicly for good reason. I discussed my motives earlier, and I encourage everyone who has glibly dismissed my motives as self-interested to read that. But I want to add something here. This nest of issues badly needed to be made public. A public debate about the mere morality of hosting either drawings of child molestation or pornography in general would have done no good. Knowing them as I do, I doubt the technology press would have found my public complaints worth reporting on. As to the Wikipedians, for one thing, Wikipedia allows pornography by policy, and they're not going to get rid of it, or care too much about reliably labelling adult content, just because the much-maligned Larry Sanger has some objections to it. I have enough experience with that crowd to know that the dialogue would go nowhere. The only way to get any response, I thought--and all the sniping and derision after my report have made me only more sure of this--would have been to get the broader public to look at the problem carefully. As Justice Brandeis said, sunlight is the best disinfectant. So, gritting my teeth, I made the report public. I knew I was burning bridges, but I felt that my obligation to standards and the law is higher than my own standing in the online/tech community.

5. Isn't this low priority for the FBI? Some have suggested that my report asked the FBI to focus on something that is lower-priority than catching actual purveyors of child porn. But in fact, I admitted something like that in my note to the FBI. I do not know, or pretend to have interesting opinions, about FBI priorities. I told them that of course I was leaving such decisions in their hands and will respect any decision they come to. But saying that violations of the statute by Commons (if the FBI agrees that they are violations) are low priority in this case surely implies that 18 USC §1466A should not be enforced. Besides, Wikimedia projects are now very high-profile. Long gone are the days when Wikipedia and sister projects were sites known only to geeks. If the FBI ignores violations of a statute on Wikimedia servers, that sends a message to the many others who collectively would be much harder to regulate.

6. The age of the images is irrelevant. Some of Wikimedia's defenders sniffed that some of the images in Category:Pedophilia are "historic." They supply no evidence or reasoning that the images are of historic importance; considering the truly outrageous nature of the smut they contain, I would be inclined to conclude that they are merely old, not "historic." The courts and FBI may disagree with me, but I'm fairly sure that the fact that an image is old does not mean that the statute does not apply. At first I too did not think that the images were that bad, until I actually clicked on the thumbnails and looked at them. Despite their age, they are gross and perverted, and just the sort of thing, I imagine, that the statute was designed to cover. Or will we say that images that are clearly covered by the statute today will be "historic" and therefore just fine in 100 years?

7. My concern about using pornography hosted by Wikimedia in schools is sincere and well justified. My main complaint was about the images of child abuse I saw on Wikimedia Commons, but I also discussed pornography generally, and this found its way into some reporting and discussions. Now, I support the Wikimedia Foundation's right to host (legal) pornography. I'm guessing that many of Wikipedia's most active supporters do not have school-aged children. So it is easy for them to ignore or make light of my claim that Wikipedia makes pornography accessible in schools where Wikipedia is not filtered. I think most teachers, and parents of school-aged children, are by contrast keenly aware of the difficulties of guiding their students' Internet use. They think it is important, and rightly so.

So, for those who lightly dismiss all issues about children's use of pornography on Wikipedia, let's get a few things straight. First, if there is any question about the amount and explicitness of the pornography on Wikipedia, have a look at this article and search on the page for "Clearly, it is hard to know". There you'll find some graphic descriptions of photos (not the images themselves). This is just a small amount of what can be found on Wikipedia and Wikimedia Commons. If you don't mind looking at the pictures themselves, you might start your own search from Wikipedia's "sex positions" article and category. As to Wikimedia Commons' offerings, have a look at this list of its most popular pages. This (along with the above-mentioned pornography policy) clearly puts the lie to the Wikimedia Foundation spokesman's claim that "Our community abhors issues around pornography and pedophilia and they don't want to provide opportunities for these things to take place." If they don't want to provide opportunities for pornography to take place, then why do they specifically allow it and have so much of it?

Second, while some school districts block access to Wikipedia, quite a few do not. Any curious student, armed with a few ideas of things to search for, can use computers in those school districts to view images that most people would call porn, and with just a few mouseclicks. Believe it or not, some district filter managers apparently did not know this. This actually surprised me quite a bit.

Finally, I know very well that most students who want to find porn online will be able to find holes in fallible filters and via their own connectivity. But placing vast amounts of it on a supposed reference site, and giving students access to it through school system computers, is greatly frowned upon by most school officials, not to mention parents, in the U.S. You might not like that, but it's a fact and it is not likely to change. To be a good citizen, Wikipedia should label its mature content reliably and so that it can be blocked by school district filters.  Then, there would be more school districts and families using a kid-friendly version of Wikipedia, and fewer students doing their background reading on the unexpurgated version.

8. May be, not is, and other minor corrections of the FoxNews.com article. First let me say that I appreciate the work--and dare I say the courage--of the FoxNews.com journalist, and I think she did a good job overall. But she reported a few small things about me incorrectly, I'm afraid. I don't want people to get the wrong idea about what I believe, so I'm going to correct the record. (1) I did not say that the Wikimedia Foundation is knowingly distributing child pornography; I said they may be. I used this wording only because, not being any sort of legal expert, I didn't know if it really was in fact covered by the statute I cited. As they say online, "I am not a lawyer." I'm trying to be modest. But for what it's worth, I have been told by two different journalists that, although they tried to find some legal experts who would deny that the statute in question applied to the images in question, they could not. (2) I did not implore the FBI to investigate. If you read my original message to the FBI, you'll see that I concede that they may have higher priorities, and that of course I left it up to their expert judgment. (3) I am sure I did not say that Commons was rife with renderings of children performing sexual acts. I think "rife" implies a huge or widespread number.  I didn't claim that. (4) Category:Lolicon, when I saw it, had only one image that I thought violated the statute (this is described in the article; I am not going to repeat the description here). Other than that one, as far as I recall, it did not provide cartoons "similar in detail and depiction."

9. The AFP's presentation of my position falsely implied that I had backpedaled on my claim. The AFP wire story (which, by the way, they wrote without interviewing me) opened its coverage with the Wikimedia Foundation's rejection of the allegations of hosting child pornography.  When they got around to stating what the allegations were, in the fourth paragraph, they began by quoting my clarification that the term "child pornography" was misleading to some, because some people think it means only photographs. That's not my view, by the way. I think drawings of sexual abuse of children count as child porn, and many people agree with me. But AFP did not even bother to state the main point, which was not at all affected by my small semantic concession: that the two Wikimedia Commons categories I reported to the FBI did seem to be in violation of 18 USC §1466A.

10. How dare I suggest the law be enforced? I really think what has a lot of people hot and bothered is that I have had the outrageous gall to suggest that the law and common, reasonable societal standards be enforced against a site that has, for Wikipedia's many true believers, been a model of self-regulation. The very opposite has been the case. The notion that the government might be called in to make the Wikimedia Foundation and its community play by the (legal) rules goes completely against the idealistic, anarchistic Wikipedia spirit. To put it another way, the real world threatens to interrupt the whole insulated Wikipedia game, which is a sort of collective delusion. My failure to believe in the game, and my willingness to denounce its results publicly, is what really crosses the line for Wikipedia's true believers. (That, and the fact that I'm speaking nearly alone against a whole mass movement; this makes me especially easy to demonize.) But I think Wikipedia must become more consistent with the somewhat higher standards of the world it is a part of, and I would think of myself as lacking courage if I did not say so. I hope others will join me.

By the way, I would like to thank the people who have sent me notes of appreciation by email, Facebook, and Twitter. Your support means a lot to me.


Reply to Slashdot about my report to the FBI

On April 7, I posted the text of a report I made to the FBI to the EDTECH mailing list, in which I stated that, in my opinion, the Wikimedia Foundation may knowingly have posted "child pornography," by which I meant "obscene visual representations of the sexual abuse of children."  In short, the Wikimedia Commons "Category:Pedophilia" page hosted images with realistic and disturbing drawings of child molestation. The Register reported on this and it snowballed from there.  Among other venues, it was discussed on Slashdot, where I posted the following reply.  I decided to repost it here permanently.

I have added a more recent reply.

Larry Sanger here--let me clarify a few things.

First of all, what very few of the commenters (at least the first commenters) noticed was that the statute I cited, 18 U.S.C. §1466A, has the following title: "Obscene visual representations of the sexual abuse of children."  It specifically states: "Any person who, in a circumstance described in subsection (d), knowingly produces, distributes, receives, or possesses with intent to distribute, a visual depiction of any kind, including a drawing, cartoon, sculpture, or painting, that..."

That's drawings, cartoons, sculptures, and paintings.  "Visual depictions of any kind."  Many people who criticized my message to the FBI really seem to have a problem with the law, which I find interesting.

Anyway, I now realize with regret that "child pornography" was probably the wrong word to use.  I didn't realize that it would be so misleading.  I thought that "obscene visual representations of the sexual abuse of children" (the title of the statute) was just what we mean when we say "child pornography."  It didn't occur to me until afterward that many people restrict "child pornography" to mean photographs of real children.  If I had realized this sooner, I would have used "depictions of child sexual abuse" instead.

So, why did I report Wikimedia to the FBI?  First some background.  I am broadly a libertarian, but I am also a sincere moralist (as opposed to a cynical amoralist).  Libertarianism and moralism are not--of course--contradictory.  Being a libertarian, I think we have the right to do a lot of things, including a lot of things that broadly coarsen society; that's the price we pay for freedom.  But, just as the law provides for, I do draw one line when it comes to photographs, or even merely realistic depictions, of child sexual abuse.  Most sane libertarians recognize that some speech should be restricted by the force of law--the hackneyed examples are shouting "fire" in a crowded theater, perjury, and libel.  But for me, depictions of child sexual abuse are another.  I respect the opinion of those who have a principled disagreement with me when it comes to depictions of child sexual abuse.  But pretending that it's just obvious, even for libertarians, that we have a right to publish such depictions is simply wrong, in my opinion.

Regarding my motives, yes, I thought I was doing my civic duty, one that I didn't really want to do, but which I felt I ought to do.  Partly this was because the statute in question required me to make the report if I thought the statute applied (and it seems to me it does--those drawings sure look like obscene visual representations of the sexual abuse of children to me).  But partly also it was because I think that this sort of thing--including some pictures of children being out-and-out raped--is completely wrong, and should not be allowed in a civilized society.  Call this censorship if you like, but I don't really think you have a constitutional right to publish and consume realistic drawings of child rape and molestation.

But what outcome am I aiming at?  Contrary to the insinuations of some, I have no interest in trying to get Wikimedia shut down; that would be unnecessary, and I doubt it would happen as a result of the violation of the statute.  But I think and hope it may cause pressure on Wikimedia from law enforcement, politicians, and the general public to eliminate this sort of content.  I also hope that Wikimedia will be persuaded, or if necessary forced, to label its "adult" content as such in a consistent and reliable way, so that it can be easily filtered by school system filters.  This would be a win-win, because then Wikipedia would be used in more schools--something I don't at all oppose, except for all the grossly inappropriate material for school children--and, when used in schools, children would be less likely to find content that their parents and teachers regard as grossly inappropriate for their age.

I know that in our cynical world, a lot of people will have trouble believing that my motives and aims as stated are sincere.  Many people have said that I am motivated by a desire to get my projects in the news.  In fact, in posting about my message to the FBI on EDTECH, I had no conviction that it would aid my projects.  Actually, I worried that it might damage them, for exactly the reasons so many Slashdotters are howling now: leveling accusations against a popular project is a highly unpopular thing to do.  But I'm afraid (and again, some will have trouble believing this, but I don't care) I am "old school" on this sort of thing.  When it comes to doing what is right, I often say "Damn the consequences."  This is why I am not very popular, and never have been; I must seem totally tone-deaf socially speaking, because I frequently find myself obligated to do and say unpopular things.  I take my inspiration from Socrates.  To the sort of people who think this claimed, alleged, supposed idealism is obviously false, or stupid, or terminally naive, there is little persuasive that I can say, because they live in a completely different world than mine.  I'm sure all of the things I've been saying now about my motives will just confirm their opinion of me that I'm just a jerk, an authoritarian, or whatever else comes to mind.  C'est la vie.

A lot of people shrieked with indignation that I mentioned my qualifications and--horrors--my websites.  Conflict of interest!  But they omitted to say where I mentioned this, which was grossly misleading on their part.  The context of the statement was: it was in the first paragraph of a message to the FBI.  I stated those things so that they understood exactly who I was, what qualifications I had to post the tip, and--believe it or not--what conflicts of interest I might have, should they find those to be relevant.  If they want to disregard what I say because I have started a newer project competing with the project I am reporting, I want to make it easy for them to do so and move on to other pressing government business.  I did not write the message with public consumption in mind; I posted it on EDTECH only as an afterthought, to underscore a point I had made in that forum.  It only occurred to me later that this might be misconstrued as a plug for my own sites.  Only later did I realize that I should not have quoted that part of the letter at all.

Those of you who think that I have a "conflict of interest" might reflect that with this move I have if anything completely burned the last of my bridges to working in the mainstream (deeply libertarian) world of Web 2.0.  That is something I did realize.  After this, I am sure I have permanently ruined my chances of getting a job (if I had wanted one) or getting funding for a successful for-profit in Silicon Valley.  I know I will probably get a reputation of being a fraud who will say anything for a little publicity, or (much worse) a self-styled moralist who is in favor of censorship.  Neither of these things is true, but I know it is the reputation I will probably have among that crowd.  Well, you see, that's the price you pay for living honestly in the world: you do what you must, even if you wished you didn't have to, and you let the chips fall where they may.  In the end, you have to trust that you will be rewarded in other ways, if only in having a relatively clear conscience.

So, sure, I know that our (pathetic) cynicism is such that many people will be unable to believe the above; they will think I am merely trying to appear noble, and they will mock the stupidity of it, because everyone is cynical and cool and maximally tolerant of sexual proclivities these days, and "noble" motives no longer exist, so any pretense to having such motives will be mocked or discarded out of hand.  Oh well--that's a pity for me.

Still, I hope I will have gotten the public to consider the issue.  As to my career, well, let's just say that I am now interested in the education sector, and in this sector, there aren't so many people who think we have a constitutional right to view drawings of child molestation.

I have just one last comment, in response to a few Slashdot comments.  Some of those comments were written by people who sound like complete creeps to me, people I would not trust anywhere near my little boy.  Here is an example, and not the only one: "why should anyone care if someone masturbates to an image of a drawn child? If that gets his/her kicks so that the person can be a normal productive member of society, all's good, or at least should be good - no child is ever harmed, and the person has taken care of his/her urges."  I find this chilling.  But maybe even more chilling is that Slashdot rated this "Score: 5, Interesting."  Interesting--sure, I'll grant it's that.  But its high rating is chilling because it indicates that one of the most influential sectors of industry today, the geek sector in control of the most massive media production system in history, as represented by Slashdot, is steadfastly non-judgmental when it comes to someone who all but admits that he gets his "kicks" by masturbating to an image of a drawn child.  It's that attitude that explains why Category:Pedophilia and its contents exist on Wikimedia Commons.  Such people should not be making policy for the seventh most popular website in the world.

I suspect the people who make such grotesque remarks (and not all the critical comments are this grotesque) are mostly sick puppies who grew up with little moral guidance, who believe that virtually all desires are brute facts that cannot be criticized and [must be] always respected.  They vainly imagine themselves to be very clever, but they have very little in the way of actual wisdom.  These people will be utterly mortified by their youthful remarks when they actually have children of their own.  But then of course there is the tiny number of real perverts--let's call a spade a spade--who might be older, are probably childless, and who are actually confident in defending sexual relations with children.  They actually have the temerity to pretend that this is the next cutting-edge frontier of the broader movement toward civil rights and equality, and that those who disagree with them must be stupid, knuckle-dragging conservatives.  These people are simply tangled in their own web of rationalizations for behavior and inclinations that, on my view, are simply evil.  (And, yes, I did notice that some Slashdotters mocked the notion that child sexual molestation was "evil.")  I feel no desire whatsoever to dignify such people in any way at all, and I could not care less what they have to say about me.  I can only hope that the rest of society is not so far gone in the way of moral relativism, or pseudo-tolerance, or whatever you want to call it, that they feel they must tolerate the advocates of free sexual relations between adults and children.

I don't want to end on that note, because I really doubt that most of the people who have objected publicly to my position are, as I described them, "sick puppies" and pedophilia activists.  Actually, I think most of them are libertarians, most of whom probably again don't have children, and who are probably disgusted (as well they should be) by depictions of child sexual abuse.  Despite our disagreement on the philosophical issue of where to draw the line regarding free expression, I have some sympathy and affection for this sort of people, who are taking a principled stand, but one that is, I feel, nonetheless wrong.


Should Science Communication Be Collaborative?

Plenary address at PCST-10 (10th conference of the International Network on Public Communication of Science and Technology), Malmö University, Malmö, Sweden, June 25, 2008.  A slightly abbreviated version of this was delivered.

I. The question, and some distinctions

Should science communication be collaborative?  There are two ways to understand this question, and so also two very different reactions to it.  One reaction is that science writing already is very collaborative.  Scientific articles are typically co-written by labs or by other collections of colleagues, because most experiments cannot be done by just one person; scientific discoveries are now typically made by several or many people cooperating.  So, of course science communication should be collaborative.

The other reaction understands me to be talking about collaboration in the wiki sense, or what I call radical collaboration.  And to that question there are typically mixed reactions.  On the one hand, what Wikipedia has done is very exciting, and if scientists can tap into the same sort of collaboration, perhaps great things will result.  On the other hand, scientists and scholars in general are very suspicious of the notion that anybody can edit our words.  Many scholars scoff at Wikipedia's motto—"you can edit this page"—as incontrovertible evidence that it cannot be very reliable.

The question I am interested in is actually the latter one: should science communication be radically collaborative?  So let me define this piece of jargon.  Collaboration is radical if it goes beyond two or more people merely working together.  In addition, the collaborators are self-selecting; they determine what they are going to do, and are not assigned their roles.  Finally, there is equal ownership or equal rights over the resulting work, or in other words, there is no "lead author."

So, should science communication be radically collaborative?  I cannot give you any simple answer to this question, but I do want to say that radical collaboration is part of our future, and will probably result in some amazing new scientific resources.  I'll be asking how big a part of our future it should be—as well as what we should not expect radical collaboration to do.

But first, it will be useful to draw a distinction between two kinds of scientific communication: original and derivative.  Original communication is aimed at advancing knowledge in the field with never-before-published findings, discoveries, first-hand accounts, survey data, theories, arguments, proofs, and so forth.  Typically, such communication takes the form of papers in peer-reviewed journals and online pre-print services, as well as conference presentations, posters, and some other things.  By contrast, derivative communication merely sums up what is already known, and takes the form of news and encyclopedia articles, textbooks, and popular science books and magazines.

I don't pretend that the distinction between original and derivative communication, if one examines it carefully, is easy to make.  One reason that it is difficult is that, whenever one reports scientific and other scholarly findings, analysis almost inevitably occurs; and sometimes, an analysis can be as interesting, challenging, and pathbreaking as the findings reported on.  So I imagine that such interesting analysis can be a borderline case between original and derivative communication.

There is another reason the distinction is difficult.  Frequently, we want to criticize certain published papers, which purport to present original findings, as being almost wholly derivative—they do not really advance the field at all.  I am told that this happens much more than it should, in scientific publishing.  So I admit that sometimes, purportedly original communication is actually derivative.

In fact, I will admit something more: it is far from clear what constitutes an advance in any given field.  If someone merely deduces something from previously published experimental findings, is that an advance?  Sometimes, sometimes not.  If someone does an experiment that is only trivially different from any of many already-published experiments, and obtains similar results, is that an advance?  Not necessarily, it seems to me.  If someone merely applies an established paradigm to a domain of knowledge for the first time in a published article, is that an advance?  Perhaps; but perhaps not, if the application was simply obvious.

So there are, I realize, several reasons to be critical of the distinction between original and derivative communication.  That admitted, I do think there are many perfectly clear cases of both original and derivative communication; in fact, I think most scientists and scholars would not have trouble classifying most communication in their fields as either original or derivative.  When Watson and Crick originally described the double helix, that was definitely original.  When Wikipedia, or a biology textbook, describes the double helix, that is definitely derivative.  And where we are uncertain, on philosophical grounds, about whether some finding really is original, at least we can tell whether the author is treating it as original.

I draw this distinction because I think that we might actually wish to give different answers to the question, "Should science communication be collaborative?" based on what type of science communication we're talking about.  In particular, I think it is very plausible that derivative science communication, like encyclopedia articles and science news reporting, is much more amenable to collaboration than original science communication.  I think, moreover, that in explaining this we will uncover some very interesting insights, or at least questions, about collaboration and perhaps even about science communication itself.

II. Derivative science communication

Let me begin with derivative science communication—again, things like encyclopedias, science news reporting, and textbooks.

Over the last few years, I have conversed with dozens of scholars and scientists about how to set up wikis or other collaborative knowledge communities.  There is a fascinating pattern to these conversations.  They go like this.  The scientist, impressed by the vast quantities of information in Wikipedia, tells me: "It is amazing what can be accomplished when many people come together, from around the world, to sum up what is known.  What would happen if we tried this in our field?  The resulting resource could be a central, authoritative clearing-house of information for everyone in the field, as well as for the general public.  So, what is the best way to set up 'a Wikipedia' in our field?"

This is an interesting question, but it is not the question that they end up answering.  Instead, the scientist goes off and consults with his colleagues, and then I hear this: "We have a couple of concerns.  First, we are concerned about lack of credit in the Wikipedia system.  The careers of scientists depend on names being on their publications.  So we want to make sure that authors are properly named and identified on articles.  Second, we are a little nervous about the idea that just anybody can edit anybody's articles.  We understand that it's important to be collaborative, but we think it is reasonable to nominate a lead author or lead reviewer for each article, and restrict participation to experts.  So, what do you think of that?"

I think that the scientist and his colleagues are confused in a fascinating way.  I try to be diplomatic when I say this, of course.  But the scientist seems not to realize two facts:

  1. If you name authors, you award lead authorship or editorship for articles, and you carefully restrict who may participate, then you are not building a collaborative community in anything like the radical sense.  You are merely using a wiki to replicate an older sort of collaboration, common in scientific writing.
  2. It is precisely the newer, more radical sort of collaboration that explains Wikipedia's success.  Wikipedia is successful in large part precisely because everyone feels empowered to edit any article.  If you disempower people, they won't show up.

As a result, there is no reason to think that the scientist's group will enjoy success anything like Wikipedia's, because they have actually rejected the Wikipedia model.

I am not saying that using wiki software to replicate old-fashioned systems won't work at all. In fact, in 2005, I helped set up such a system myself, called the Encyclopedia of Earth, and it seems to be working reasonably well so far—but, as far as I know, not much actual collaboration goes on, and a large part of the few thousand articles that they have were imported from other sources.  Another scientist-run encyclopedia, Scholarpedia, has a somewhat similar set of policies, and has produced even fewer articles.  To be sure, the quality of the articles produced by these projects is good.  But it seems to me that the articles have little chance of ever fulfilling the original, high hopes of the project designers.  Many of them won't be incredibly detailed, balanced, authoritative, and a pleasure to read, which is what one might hope to get from a large group of experts coming together to work on a piece of text. Nor do such projects have any chance of achieving the depth of coverage that Wikipedia has.  In short, as far as I can tell, the most that projects like the Encyclopedia of Earth and Scholarpedia can hope to achieve is to produce a free version of old-fashioned sorts of encyclopedias.  I do not mean to say that there is something wrong with that.  I merely claim that they will not enjoy the advantages and potential that a radically collaborative project has, the advantages and potential that made them imitate the Wikipedia model in the first place.

This, then, raises a question.  Do those scientists, who have rejected the Wikipedia model, have a legitimate complaint about it?  Or have they made a mistake in rejecting it?  I think they are partly right in rejecting the Wikipedia model, but also partly mistaken.  Let me clarify, first by explaining what they have gotten right.

Essentially, the scientists I've advised are quite right to reject the wide-open Wikipedia model, according to which anyone can alter any article regardless even of whether the person has logged into the system or is using his or her real name.  Wikipedia's rock-solid commitment to anonymous contribution explains many of its problems, in my opinion.  It explains why Wikipedia has so much vandalism and people editing abusively and in bad faith; it also explains why the Wikipedians have never been able to enforce some of their own basic principles, such as neutrality and politeness.  Scientists and scholars generally are very well justified in rejecting Wikipedia's anonymity policy.  I have argued for this thesis elsewhere,[1] and can't spend the time to explain the arguments now.

So that's why my scientist colleagues were right to reject the Wikipedia model.  But they are also mistaken to believe that articles must be signed by their authors, that they must have lead authors, and that participation should be restricted to experts.  They believe they must adopt these policies because, otherwise, the result will be unreliable or of poor quality.  They appear to think that, since all trustworthy encyclopedias in the past had signed articles, lead authors, and participation restricted to experts, there is no way to design an encyclopedia project that changes these features.

Now, I don't have time in this paper to argue for this point in detail, but I simply want to point to the example of the Citizendium, which is a wiki encyclopedia project I started a year and a half ago.  We do not sign articles; we do not have lead authors; and we open participation up to anyone who can make a positive contribution to the project.  But we do make a role for experts.  Despite the fact that we reject so much of the traditional model of content production, the quality of our articles is remarkably good, especially for such a young project.  The articles that have been approved by our expert editors, in particular, are extremely readable, as well as being authoritative.  My point, then, is that it is possible to have a radically collaborative system that produces high-quality, credible content.  So if my scientist colleagues rejected radical collaboration because they thought the results would necessarily be of substandard quality, they were simply mistaken, as our experience with the Citizendium shows.  Moreover, I should point out that we are far more productive than Scholarpedia or the Encyclopedia of Earth; we have over 7,000 articles and are growing daily.

I can imagine a reply to this, however.  One might concede that the Citizendium's articles are, or will be, of reasonably good quality.  But will they be better than articles written by small groups of experts?  Not necessarily, of course.  Still, I would like to give you some general reasons to think that they could be better.  More precisely, I want to answer this question: is there something about radical collaboration per se that improves the quality of articles?  I think so.

Given enough time, an article that is written with a large and diverse set of authors—particularly if it is under the gentle guidance of experts—can be expected to be lengthier, broader in its coverage, and fairer in its presentation of issues, than an article written by a single or a few hand-chosen authors.  It will be longer, because many collaborators will compete with each other to expand the article.  It will be broader in its coverage, because the collaborators often can fill up gaps in exposition that others leave.  It will be fairer in its presentation of issues, because self-selecting collaborators in a very open project will tend to have a diversity of views, and they must compromise in order to work together at all.

In short, radical collaboration naturally pushes articles in the direction of being longer, more detailed, and fairer.  When the collaboration is gently guided—not led and controlled—by experts, and when the collaborators respect the experts and are willing to defer to them from time to time and when necessary, the resulting articles can be outstanding.  A number of the Citizendium's approved articles are outstanding for these very reasons.  We have quite a few outstanding unapproved articles as well.

So far, I have spoken only about one kind of derivative communication: encyclopedias.  But there are other kinds, as I said: journalism, textbooks, and popular science writing, for instance.  I could discuss each of these, but again I lack the time.  Instead, I want to make a general point about all of them.

Often, in expository writing and even more in fiction writing, we derive value from the text precisely because it is personal, because it presents a single, unique point of view that we find compelling.  We find the writing interesting because we find an individual mind interesting.  Why are we fascinated by the minds of Stephen Hawking, Richard Feynman, Stephen Jay Gould, or Steven Pinker?  (And for that matter, why are so many famous scientists named Stephen?)  Well, it seems that, in works by these authors, the addition of another author might subtract from the value of their text.  Why is that?  Why is it that we find individual minds interesting?  It is not because their thoughts are more accurate or more exhaustive.  Rather, a text with a single author, especially one who is expressing his personality, is a window into another mind, and so it represents how we, each of us individually, might also want to think.  Only an individual seems to be able to serve as a credible model of how to think about the world; and, for whatever reason, we do take other thinkers as models.  Collective productions can convey useful information, of course, but they necessarily do not express the views of any one person.  They are largely useless as complex, full-bodied, human models after which we can pattern our own thinking.

But almost all encyclopedia articles,[2] most news articles, and some textbooks are used just to get information, not to serve as an entrée into an interesting perspective on the world.  Idiosyncrasy and personality are annoying when we merely want information.  When it's bare information we want, we don't care about persons—only facts.  The point, then, is that radical collaboration is suitable for gathering impersonal information.  That, we might say, is its proper function.

III. Original science communication

Up to this point, I've been talking about whether derivative science communication should be collaborative; the answer, in short, is yes and no.  So now let me talk about whether original science communication should be collaborative.  But first, I think we need to examine whether, and in what sense, original science communication, such as papers that express new research findings, can be radically collaborative.  Maybe a better question is this: to what extent can original science writing be collaborative?  We already know that scientific research can be collaborative in the old-fashioned sense, because it so often is, in fact.  What is the feasibility of making it more collaborative?

Applying certain aspects of the Wikipedia model to original science communication—and even the Citizendium model—strikes me as simply impossible.  For instance, if research papers were not signed, but instead were attributed to a nameless collective, the traditional motive of scholarship—personal glory, the honor of one's peers and of history—would disappear.  In short, I very much doubt scientists would participate at all in a research collective without definite personal credit.  We may not need prominent personal credit to create derivative works collaboratively, but original works are another matter entirely.  Indeed, the economics of the two kinds of communication are different, because our motives are different.  Many scholars and scientists will not write an encyclopedia article, news article, textbook, or a popular science book without some compensation.  But the same people routinely publish much more difficult research papers and monographs with no monetary compensation.  The glory and honor of discovery is the motivation for such work.  Wiki work is just not that glorious, or at least, not in the same way.

Another aspect of radical collaboration is open authorship, that is, the authors select themselves.  This again seems impossible, or very difficult at best, for original science communication.  For one thing, original communication expresses original thoughts, and such thoughts necessarily tend to be controversial and difficult.  To open up authorship of original work very wide would, hence, permit the participation of persons who disagree with the conclusions or who don't even understand them.  But if participation is limited to like-minded scholars who understand the research, the collaboration can no longer be called "radical."  It's just a variant on old-fashioned collaboration.

In fact, beyond issues of feasibility or difficulty, I detect an incoherence in the very idea that original research might be radically collaborative.  The act of publishing a research paper does more than merely convey some findings; it also stakes a claim, that is, it has the force or effect of attaching some definite name or names to the findings.  To make original science communication radically collaborative would be to nullify the act of taking credit.  If we were to list as co-authors people who are not responsible for the research, the author list would no longer be honoring those people actually responsible for the finding.  It would just be a list of people who happened to work on the paper that summed up the research, even if some of the people listed had none of the thoughts or conclusions contained in the paper.

One might say that open collaboration on communication of original research would help to elaborate the full range of arguments and analysis related to the research.  But that already happens, I suppose, in the give-and-take of scientific and scholarly conversation that happens before and after a paper is published.  Indeed, it has often been observed that science and scholarship generally are massively collaborative in the sense that researchers build on each other's work; it was Newton who pointed this out when he said that he saw farther only because he stood on the shoulders of giants.  I have no doubt that new Internet methods can and already do facilitate this very old sort of scientific collaboration.  But I see no need, in addition, to permit others, who had nothing to do with some research, to participate in the writing itself of original research findings.

That said, there is at least one way that original science communication might be amenable to radical collaboration: I mean what has been called "open research" and "open science."  As I understand it, this involves inviting others to participate actively in a study—not merely collaborating on the writing, but actually doing the research for, designing, and performing experiments, surveys, and so forth.  This is something I know very little about, and I will not embarrass myself by pretending to know more than I do.  An example of such research, perhaps, was the lightning-fast investigation in multiple labs that identified the avian flu virus.  Such research can be somewhat open and self-selecting.  So perhaps that is one sense, and a very interesting sense, in which original science communication can be radically collaborative.  I'm afraid I can't presume to say anything else about that, though.

IV. Conclusion

So, to sum up, should scientific communication be collaborative?  I've made it clear, I hope, that it depends on the type of communication.  Derivative communication that merely aims to express impersonal information can, and in some cases perhaps should, become radically collaborative; the Citizendium system shows how.  But when a specific personality, or point of view, forms an important part of the value of the communication, collaboration is denaturing and devaluing.[3] And original scientific communication should be collaborative only to the extent that the research it reports has been collaborative.

In the interests of keeping this paper short and provocative, I have not answered many important questions.  Perhaps the most important unanswered question is: what constitutes a contribution to knowledge?  Also, I said that some derivative communication should not be collaborative, because its value depends on its coming from an individual mind; I said that the productions of individual minds sometimes have some special value because they "model" how to think about the world.  What do I mean by that, and what is valuable about it?  I also asserted that scientists would not participate in research programs without the expectation of credit.  That seems obvious, but perhaps I should have explained why not; that is really a core issue.  Finally, I only barely glanced at the prospects of open research, or open science.  What is such research, really?  Is it radically collaborative in anything like the wiki sense, or is it merely the practice of making our research available to others for free, and talking a lot?

Without having given clearer answers to these fundamental questions, I can't say I have adequately discussed whether science communication should be collaborative.  Clearly, this is a big question, with many ramifications.  But I do hope I have at least introduced a few of the salient issues and given you something interesting to think and talk about.


[1] "A Defense of Modest Real Names Requirements," delivered at the Harvard Journal of Law & Technology 13th Annual Symposium: Altered Identities, Harvard University, Cambridge, Massachusetts, March 13, 2008.  Available at http://www.larrysanger.org/realnames.html

[2] Diderot's Encyclopedie and the 11th Encyclopedia Britannica could be notable exceptions.  Those encyclopedias, perhaps the best-known encyclopedia editions in English, both featured articles by famous contemporary thinkers who expressed their own idiosyncratic views.  To be sure, some people reject all notions of objectivity and neutrality and prefer the openly personal and idiosyncratic, even in encyclopedias.  This is not the norm, or the ideal at least, for reference work today.

[3] Lawrence Lessig's attempt to make a wiki out of the second version of his book Code (called Code 2.0) demonstrates the difficulty of watering down the ideas and voice of an interesting person.


A Defense of Modest Real Name Requirements

Lunchtime speech at the Harvard Journal of Law & Technology 13th Annual Symposium: Altered Identities, Harvard University, Cambridge, Massachusetts, March 13, 2008.

I. Introduction

Let me say up front, for the benefit of privacy advocates, that I agree entirely that it is possible to have an interesting discussion and productive collaborative effort among anonymous contributors, and I support the right to anonymity online, as a general rule. But, as I'm going to argue, such a right need not entail a right to be anonymous in every community online. After all, surely people also have the right to participate in communities in which real-world identities are required of all participants—that is, they have a right to join voluntary organizations in which everyone knows who everyone else really is. There are actually quite a few such communities online, although they tend to be academic communities.

Before I introduce my thesis, I want to distinguish two claims regarding anonymity: first, there is the claim that personal information should be available to the administrators of a website, but not necessarily publicly; and second, there's the claim that real names should appear publicly on one's contributions. I will be arguing for the latter claim, that real names should appear publicly.

But actually, I would like to put my thesis not in terms of how real names should appear, but instead in terms of what online communities are justified in requiring. Specifically in online knowledge communities—that is, Internet groups that are working to create publicly-accessible compendia of knowledge—organizers are justified in requiring that contributors use their own names, not pseudonyms. I maintain that if you want to log in and contribute to the world’s knowledge as part of an open, community project, it’s very reasonable to require that you use your real name. I don't want, right now, to make the more dramatic claim that we should require real names in online knowledge communities—I am saying merely that it is justified or warranted to do so.

Many Internet types would not give even this modest thesis a serious hearing. Most people who spend any time in online communities regard anonymity, or pseudonymity, as a right with very few exceptions. To these people, my love of real names makes me anathema. It is extremely unhip of me to suggest that people be required to use their real names in any online community. But since I have never been or aspired to be hip, that’s no great loss to me.

What I want to do in this talk is first to introduce the notion of an Internet knowledge community, and discuss how different types handle anonymity as a matter of policy. Then I will address some of the main arguments in favor of online anonymity. Finally, I will offer two arguments that it is justified to require real names for membership in online knowledge communities.

II. Some current practices in online knowledge communities

First, let me give you a definition for a phrase I'll be using throughout this talk. By online knowledge community I mean any group of people that gets organized via the Internet to create together what at least purports to be reliable information, or knowledge. And I distinguish a community that purports to create reliable information from a community that is merely engaging in conversation or mutual entertainment. So this excludes social networking sites like MySpace and FaceBook, as well as most blogs, forums, and mailing lists. Digg.com might be a borderline case; calling that link rating website a “knowledge community” is straining the definition, because I’m not sure that many people really purport to be passing out knowledge when they vote for a Web link. They’re merely stating their opinion about what they find interesting; that’s something different from offering up knowledge, it seems to me.

I want to give you a lot of examples of online knowledge communities, because I want to make a point. The first example that comes to mind, I suppose, would be Wikipedia, but also many other online encyclopedia projects, such as the Citizendium, Scholarpedia, Conservapedia, among many others (and these are only in English, of course). Then there are many single-subject encyclopedia projects, such as, in philosophy, the Stanford Encyclopedia of Philosophy and the Internet Encyclopedia of Philosophy; in biology, there is now the Encyclopedia of Life; in mathematics, there is MathWorld; in the Earth Sciences, there is the Encyclopedia of Earth; and these are only a few examples.

But that’s just the encyclopedia projects. There are many other kinds of online knowledge communities. Another sort would be the Peer to Patent Project, started by NYU law professor Beth Noveck. Perhaps you could consider as an online knowledge community the various pre-print, or e-print, services, most notably arXiv, which has hundreds of thousands of papers in various scientific disciplines. This might be straining the definition, however. If you consider a pre-print service an online knowledge community, then perhaps you should consider any electronic journal such a community; indeed, perhaps we should, but I won’t argue the point. Anyway, I could go on multiplying examples, but I think it would get tedious, so I’ll stop there.

The examples I've given so far have been mostly academic and professional communities. And here I finally come to my point: out of all the projects named, the only ones in which real names are not required, or at least not strongly encouraged, are Wikipedia and Conservapedia. This, of course, proves only that when academics and professionals get online, they tend to use their real names, which shouldn’t be surprising to anyone.

But there are actually quite a few other online knowledge communities that don’t require the use of real names. I have contributed a fair bit to one that is a very useful database of Irish traditional music—it’s got information about tunes and recordings--it's called TheSession.org. There are many other hobbyist communities that don’t require real names; just think of all the communities about games and fan fiction. Of course, then there are all the communities to support open source software projects. I doubt a single one of those requires the use of real names.

I haven't had time to do (or even find) a formal study of this, but I suspect that, as a general rule, academic projects either require or strongly encourage real names, while most other online knowledge communities do not. This should be no great surprise. Academics are used to publishing under their real names, but this is mostly for professional reasons; with the advent of the Internet, many other people are contributing to the world's knowledge, in various Internet projects, but they have no professional motivation to use their own real names. For some people--for example, a lot of Wikipedians--privacy concerns far outweigh any personal benefit they might get from putting their names on their contributions.

So, how should we think about this? Is it justifiable to demand anonymity in every online community, on grounds of privacy, or any other grounds? I don't think so.

III. Some arguments for anonymity

Next, let's consider some arguments for anonymity as a policy, and briefly outline some replies to them. By no means, of course, do I claim to have the last word here. I know I am going very quickly over some very complex issues.

A. The argument from the right to privacy. The most important and I think most persuasive argument that anonymous or pseudonymous contribution should be permitted in online communities is that this protects our right to privacy. The use of identities different from one’s real-world identity helps protect us against the harvesting of data by governments and corporations. Especially in open Internet projects, a sufficiently sophisticated search can produce vast amounts of data about what topics people are interested in, and much other information potentially of interest to one's employers, corporate competitors, criminals, government investigators, and marketers. This is a major and I think growing concern about Google, as well as many online communities like MySpace and FaceBook. Like many people, I share those concerns, even though personally my life is an open book online--maybe too open. Still, I think privacy is an important right.

But I want to draw a crucial distinction here. There is a difference between, on the one hand, using a search engine, or sharing messages, pictures, music, and video with one's friends and family, and on the other hand, adding to a database that is specifically intended to be consulted by the world as a knowledge reference. The difference is very obvious if you think about it. Namely, there is simply no need to make your name or other information publicly available, for you to do all the former activities. When you are contributing to YouTube, for example, you can achieve your aims, and others can enjoy your productions, regardless of the connection or lack thereof between your online persona and your real-world identity. So, in those contexts, the connection between your persona and your identity should be strictly up to you. For example, whether you let a certain other person, or a marketer, see your FaceBook profile also should be strictly up to you. These online services have become extensions of our real lives, the details of which have been and generally should remain private, if we want them to be.

We have a clear interest in controlling information about our private lives; we have that interest, of course, because it can be so easily abused, but also because we want to maintain our own reputations without having the harsh glare of public knowledge shone on everything we do. Lack of privacy changes how we behave, and indeed we might behave more authentically, and we might have more to offer our friends and family, if we can be sure that our behavior is not on display to the entire world.

I've tried to explain why I support online privacy rights in most contexts. But I say that there is a large difference between social networking communities like MySpace and FaceBook, on the one hand, and online knowledge communities like Wikipedia and the Citizendium, on the other hand. When you contribute to the latter communities, the public does have a strong interest in knowing your name and identity when you contribute. This is something I will come back to in the next part of this talk, when I give some positive arguments for real names requirements.

B. The argument from the freedom of speech. But back to the arguments for anonymity. A second argument has it that not having to reveal who you are strengthens the freedom of speech. If you can speak out against the government, or your employer, or other powerful or potentially threatening entities, without fear of repercussions, that allows you to reveal the full truth in all its ugliness. This is, of course, the classic libertarian argument for anonymous speech.

The most effective reply to this is to observe that, in general, there is no reason that online collaborative communities should serve as a platform for people who want to publish without personal repercussions. There are and will be many other platforms available for that. Indeed, specific online services, such as WikiLeaks, have been set up for anonymous free speech. Long may they flourish. Moreover, part of the beauty of the classical right to freedom of speech is that it provides maximum transparency. Anyone can say anything—but then, anyone else can put the first person’s remarks in context by (correctly) characterizing that person. Maximum transparency is the best way to secure the benefits of free speech.

I suspect it is a little disingenuous to suggest that anonymous speech is generally conducive to the truth in online knowledge communities. The WikiScanner, and the various mini-scandals it unearthed, actually helps to illustrate this point. It illustrated something that was perfectly obvious to anyone familiar with the Wikipedia system: that persons with a vested interest in a topic can and do make anonymous edits to information about that topic on Wikipedia. They are not telling truth to power under the cover of anonymity. Rather, they are using the cover of anonymity to obscure the truth. They would behave differently, and would be held to much more rigorous standards, if their identities were known. I want to suggest, as I'll elaborate later, that full transparency--including knowledge of contributor identities--is actually more truth-conducive than a policy permitting anonymity.

IV. Two reasons for real name requirements

Now I am going to shift gears, and advance two positive arguments for requiring real names in online knowledge communities. One argument is political: it is that communities are better governed if their members are identified by name. The other argument is epistemological: it is that the content created by an "identified" community will be more reliable than content created by an "anonymous" community.

A. The argument from enforcement. The first argument is one that I think you legal theorists might be able to sink your teeth into. Let me present it in a very abstract way first, and then give an example. Consider first that if you cannot identify a person who breaks a rule, it is impossible to punish that person, or enforce the rule in that case. Forgive me for getting metaphysical on you, but the sort of entity that is punished is a person. If you can't identify a specific person to punish, you obviously can't carry out the punishment. This is the case not just if you can't capture the perpetrator, but also if you have captured him but you can't prove that he really is the perpetrator. That's all obvious. But it's also the case that you can't carry out the punishment if the perpetrator is clearly identifiable in one disguise, but then changes to another disguise.

So far so good, I hope. Next, consider a principle that I understand is sometimes advanced in jurisprudence, which is that there is no law, in fact, unless it is effectively enforced. A law or rule on the books that is constantly broken and never enforced is not really, in some full-blooded important sense, a law. For example, the 55-mile-per-hour speed limit might not be a full-blooded rule, since you can drive 56 miles per hour in a 55 mile per hour zone, and never get a ticket. Obviously I am not denying that the rule is on the books; obviously it is. I am merely saying that the words on the books lack the force of law.

Now suppose, if you will, that in your community, your worst offenders can only rarely be effectively identified. You have to go to superhuman lengths to be able to identify them. In that case, you've got no way to enforce your rules: your hands are tied by your failure to identify your perpetrators effectively. But then, if you cannot enforce your rules, your rules lack the force of law. In a real sense, your community lacks rules.

I want to suggest that the situation I've just described abstractly is pretty close to the situation that Wikipedia and some other online communities are in. On Wikipedia, you don't have to sign in to make any edits. Or, if you want to sign in, you can make up whatever sort of nonsense name you like; you don't have to supply a working e-mail address, and you can make as many Wikipedia usernames as your twisted heart desires. Of course, no one ever asks what your real name is. In fact, Wikipedia has a rule according to which you can be punished for revealing the real identity behind a pseudonym.

This all means that there is no effective way to identify many rulebreakers. Now, there is, of course, a way to identify what IP address a rulebreaker uses, but as anyone who knows about IP addresses knows, you can't match an IP address uniquely to a person. Sometimes, many people are using the same address; sometimes, one person is constantly bouncing around a range of addresses, and sharing that range with other people. So there is often collateral damage when you block the IP address, or a range of addresses, of a perpetrator. Besides, anyone with the slightest bit of Internet sophistication can quickly find out how to get around this problem, by using an anonymizer or proxy.

That there is no effective way to identify some rulebreakers is a significant practical problem on Wikipedia, in fact. Wikipedians complain often and bitterly about anonymous, long-term, motivated trouble-makers who use what are called "sockpuppets"--that is, several accounts controlled by the same person. Indeed, this is Wikipedia's most serious problem, from the point of view of true-believer Wikipedians.

In this way, Wikipedia lacks enforceable rules because it permits anonymity. I think it's a serious problem that it lacks enforceable rules. Here's one way to explain why. Suppose that we say that polities are defined by their rules. If that is the case, then Wikipedia is not a true polity. In fact, no online community can be a polity if it permits anonymous participation. But why care about being a polity? For one thing, Wikipedia and other online communities, which typically permit anonymity, are sometimes characterized as a sort of democratic revolution. On my view, this is an abuse of the term "democratic." How can something be democratic if it isn't even a polity?

There is another, shorter argument that anonymous communities cannot be democratic. First, observe that if it is not necessary to confirm a person’s identity, the person may vote multiple times in a system in which voting takes place. Moreover, if the identities of persons engaged in community deliberation need not be known, one person may create the appearance of a groundswell of support for a view simply by posting a lot of comments using different identities. But, for voting and deliberation to be fair and democratic, each person’s vote, and voice, must count for just one. Therefore, a system that does not take cognizance of identities is inherently unfair and undemocratic. I think anonymous communities cannot be fair and democratic.

But why should we care about our online communities being fair, democratic polities? Perhaps their governance is relatively unimportant. When it comes to whether a link is placed on the front page of Digg.com, or what videos are highly rated on YouTube, does it really matter if it's not all quite on the up-and-up?

Maybe not. I am not going to argue about that now. But matters are very different, I want to maintain, with online knowledge communities, which is the subject of this paper. Knowledge communities, I think, must be operated as fair, democratic, and mature polities, if they are open to all sorts of contributors and they purport to contain reliable information that can be used as reference material for the world. It makes a difference, I claim, if an online community purports to collect knowledge, and not just talk and share media among friends and family.

Why does it matter if a community collects knowledge? First, it's because knowledge is important; we use information to make important decisions, so it is important that our information be reliable. If you are not convinced, consider that many people now believe that false information caused the United States to go to war in Iraq. Consider how many innocent people are in prison because of bad information. These days, two top issues for scientists are also political issues: global warming and teaching evolution in the schools. Scientists are very concerned that persons in politically-powerful positions do not have sufficient regard for well-established knowledge. Whatever you think of these specific cases, all of which are politically charged, it seems clear enough that there is no shortage of examples that demonstrate that we do, as a society, care very much that our information be reliable--that we do not merely have random unjustified beliefs, but that we know.

The trouble, of course, is that as a society--especially as a global Internet society--we do not all agree on what we know. Therefore, when we come together online from across the globe to create collections of what we call knowledge, we need fair, sensible ways to settle our disputes. That means we must have rules; so we must have a mature polity that can successfully enforce rules. And, to come back to the point, that means we must identify the members of these polities; we are well justified to disallow anonymous membership.

B. The epistemological argument. Finally, I want to introduce briefly an epistemological argument for real names requirements, which is distinguishable from the argument which I just introduced, even though it had epistemological elements too. Now I want to argue that using our real identities not only makes a polity possible, it improves the reliability of the information that the community outputs.

Perhaps this is not obvious. As I said earlier, some people maintain that knowledge is improved when people are free to "speak truth to power" from a position of anonymity. But, as I said, I suspect that in online communities like Wikipedia, a position of anonymity is used specifically to obscure the truth more than reveal it. Now, in all honesty, I have to admit that this might be rather too glib. After all, most anonymous contributors to Wikipedia aren't trying to reveal controversial truths, or cover them up; they are simply adding information, which is more or less correct. Their anonymity doesn't shield them from wrongdoing, it merely shields their privacy. As a result, why not say that the vast quantity of information found in Wikipedia--which is very useful to a lot of people--is directly the result of Wikipedia's policy of anonymity? In that case, anonymity actually increases our knowledge--at least the sheer quantity of our knowledge.

Can I refute that argument? I'm not sure I can, nor would I want to if it is correct. The point being made is empirical, and I don't know what the facts are. If anonymity does in fact have that effect, hooray for anonymity. I merely want to make a few relevant points.

I think that in the next five to ten years, we will see whether huge numbers of people are also willing to come together to work under their own real names. I don't pretend to be unbiased on this point, but I think they will be. I don't think that anonymity is badly wanted or needed by the majority of the potential contributors to online knowledge communities in general. Having observed these communities for about fifteen years, my impression is that people get involved because they love the sense of excitement they get from being part of a growing, productive community. My guess is that anonymity is pretty much irrelevant to that excitement.

Regardless of the role of anonymity in the growth of online resources, a real names policy has a whole list of specific epistemological benefits that a policy of anonymity cannot secure. Consider a few such benefits.

First, the author of a piece of work will be more careful if she puts her real name on it: her real-world reputation is on the line. And I suppose being more careful will lead to more reliable information. This is quickly stated, and very plausible, but it is a very important benefit.

Second, a community all of whose members use their real names will, as a whole, have a better reputation than one that is dominated by pseudonymous people. We naturally trust those who are willing to tell us who they are. As a result, the community naturally has a reputation to live up to. There are no similar expectations of good quality from an anonymous community, and hence no high expectations to live up to.

Third, it is much harder for partisans, PR people, and others to use the system to cover up unpleasant facts, or to present a one-sided view of a complex situation. When real names are used, the community can require the subjects of biographies and the principals of organizations to act as informants. The Citizendium does this. Wikipedia can't, because this would require that people identify themselves.

V. Conclusion

I'm going to wrap up now. I've covered a lot of ground and I went over some things rather fast, so here is a summary.

I began by defining "online knowledge community," and showing with a number of examples that online academic communities tend to use (or strongly emphasize the use of) real names. Other sorts of online communities generally permit or encourage anonymity, because there is no career benefit to being identified, while there is a definite interest in privacy. I considered two main arguments (though I know there are others) for permitting anonymity as a matter of policy. One argument starts from the premise that we have an interest in keeping our personal lives private; I admit that premise, but I say that, when it comes to knowledge communities in particular, society has an overriding interest in knowing your identity. Another argument is a version of the classical libertarian argument for anonymous speech. I grant that society needs venues in which anonymous speech can take place; I simply deny that all online knowledge communities need play that role. Besides, anonymity is probably used more as a way to burnish public images than it is to "speak truth to power."

In the second half of the paper, I considered two main arguments (though again, there are others) for requiring real names as a matter of policy in online knowledge communities. In the first, I argued that rules cannot be effectively enforced when rule-breakers cannot be identified. This is a problem, because we would like online knowledge communities to be fair and democratic polities; but when community members cannot be uniquely identified, this violates the principle of one person, one voice, one vote. Then I argued that the requirement of real names actually increases the reliability of a community's output. Since we want the output of knowledge communities, in particular, to be maximally reliable, we are well justified in requiring real names in such communities.


A compromise position that I favor would involve requiring users’ real names to be visible to other contributors; allowing them to mask their real names to non-contributors; and legally forbidding the use of our database to mine personal information. This compromise does not settle the theoretical issue discussed in the arguments that follow, of course.


Citizendium: A New Vision for Online Knowledge Communities

Speech delivered at Eastern Michigan University, Ypsilanti, Michigan, Feb. 7, 2008, as part of the College of Arts and Sciences Lecture Series, "Wikipedia - Democratization of Knowledge or Triumph of Amateurs," hosted by Marshall Poe.

Familiar territory

Five or ten years ago, if I were introducing a new wiki encyclopedia project, I would have to argue and explain at great length about the advantages of mass collaboration. And you all would be very skeptical. I would explain how people can come together online from around the world and donate their labor to create something that everyone can access freely, and which is controlled by the contributors themselves. I would have to teach lessons about bottom-up methods and free content. But today, most of you are firm believers that enormous amounts of reasonably good, if not perfect, content can be created by online communities. Everybody knows what giant online communities can create, because everyone can see the results in Wikipedia, YouTube, and the many other community-built websites.

So my task isn't to explain everything about how the Citizendium, this new project, works, because in many ways it works similarly to many other Internet community content projects. It is open to everyone--or, everyone willing to work under our rules, anyway. It is built collaboratively, by people working together on a wiki. It is built bottom-up, which means no one is assigning articles, and generally, no one in authority needs to be consulted except when really difficult disputes need to be resolved; instead, the people who make decisions about an article are the people who happen to show up. The resulting content is free, meaning anyone can read and republish it, at will and free of charge. And it is run by a non-profit.

This is familiar territory. It would be boring and banal for me to point out that collaboration on free content represents an interesting opportunity. Of course it does. The Internet has been exploiting that opportunity for almost ten years, at least ever since the Open Directory Project got started in 1998. The real question is whether there are any interesting new free content opportunities. And there are, I think. The most interesting unexploited opportunity before the Internet today is high quality and high relevance. In short, if developing sheer quantity of content was the big exciting problem ten years ago, we've licked that one. The big exciting problem now is quality: how to create enormous amounts of high-quality and highly-relevant content. And this is--I guarantee it--a much more difficult problem, and one that not nearly as many online projects will be able to solve.

The problem of quality and relevance

This is a problem that just cannot be solved by "more of the same." For example, simply throwing more people at the problem of quality will not solve it, for the simple reason that many people do poor quality work in the existing community content systems. Simply look at the results that come up from a typical Google search. It is estimated that there are over one billion people online now. If the number of people were the answer to the problem of high quality, wouldn't we have a brilliantly pristine Internet? But, of course, we don't. Instead, the Internet reflects a wonderfully diverse humanity, from the lows of porn websites on up to professionally edited, highly interesting content collections, written by some of the most brilliant minds. Now, please don't get me wrong. I think that, for example, Wikipedia is very useful, and the contributions of hundreds of thousands of amateurs are crucial to its usefulness. But there is a big difference between being highly useful, on the one hand, and of really high quality, on the other.

The problem of quality and relevance won't be solved by more of the same. You could make projects even more free--you could release them into the public domain, instead of using a Creative Commons license. But this would not solve the quality problem. And again, you could make projects as wonderfully collaborative as you want--even more collaborative than Wikipedia is now--but that still wouldn't help establish reliability or relevance.

Three principles

Clearly, something really important has been left out of the Web 2.0 equation. What? What needs to be added so that our communities produce content that is not merely abundant, useful, and interesting, but also reliable and relevant?

I have three principles, which I will state briefly first but then elaborate, because all three are very easy to misunderstand. They are:

  1. Find a meaningful role for experts within the project.
  2. Require contributors to use their real-world identities.
  3. Establish the rule of law by committing contributors to a social contract that makes them full partners in the project.

Adopting these three principles will help transform Web 2.0 into Web 3.0. Leveraged intelligently, these principles will allow an online community to produce high quality and relevance, without necessarily compromising high productivity. They will, in short, help the Internet to grow up.

Let's consider these principles each briefly in turn.

A role for experts in open projects

First, experts are needed to play meaningful roles, in short, because only they can be counted on to recognize when some content represents the latest expert knowledge. Amateurs and dilettantes are sometimes perfectly capable of creating excellent and reliable material on many subjects, especially if they're good writers and researchers; but they are inconsistent in doing so, and they generally lack the expert's ability to judge when some content actually represents the latest expert opinion on a subject. It seems obvious that the intelligent use of experts in a collaborative project can help to improve the quality of the output.

To this there are some common reactions, which I want to address directly, though I don't have time to do them justice.

Whenever I suggest that experts need a place in some online communities, one of the first things someone says in reply is that there's no way to tell who the experts are. But I find this very puzzling. Society has many ways to identify experts. And not all of them are jokes! There are even better ways than "a person from out of town with slides." To identify its expert editors, the Citizendium asks people to send a CV and we have certain objective criteria, such as terminal degrees and publishing, and other relevant evidence of expertise.

A second thing that people often imply, or assume, is that if one makes a place for experts, that will make the community a top-down, command-and-control system, which is a step backwards. Now, I fully admit that professionals of all sorts have a bit of a fetish for hierarchy and bureaucracy. But that doesn't mean that they cannot participate in a relatively flat, bottom-up community. And this is what the Citizendium does. Our editors have the general authority to make decisions about articles, but they rarely "pull rank." They can also approve articles. Neither of these functions compromises the bottom-up, collaborative, productive nature of the project.

Third, there is the confused thought, which is alarmingly common, that the very concept of expertise is somehow passe, and that experts have been somehow rendered unnecessary in a world that could produce Wikipedia. This sentiment is very confused, as I say. It stems from the insight that the open source community, Wikipedia, YouTube, Flickr, and so forth have all been able to produce enormous amounts of interesting, useful stuff--all without experts. This is actually incorrect. All of those Internet projects have produced interesting, useful content in part because they have experts who are comfortable working in a perfectly open system. What is true is that those projects generally do not have expert supervisors, people chosen specifically because they are qualified to manage content of a certain type. But more importantly, the mere fact that interesting, useful content can be created without expert supervision simply doesn't mean that humanity can't do any better. It is very obvious to me that we can do better than Wikipedia, YouTube, and all the rest.

Why real names

The second of the three principles I stated above is that we should require contributors to use their real-world identities. In other words, when you contribute to a project, you can't call yourself "hipster45"; you have to use your own real name and identity. You can't lie about who you are. I don't say that this is necessary for every Internet community. After all, there are some people who will simply never contribute under their own identities, because they are concerned about privacy matters; or they don't want to be embarrassed later by their bad behavior online. Sometimes it might be better not to require the use of real-world identities. I admit that.

But in the case of a strongly collaborative knowledge community like the Citizendium, it makes good sense to require real names. There are at least three reasons. First, it improves the credibility of the output: people can see who contributed some content, and whether they appear to know anything about the subject. Second, by making people take real-world personal responsibility for their contributions, it becomes possible to enforce rules. When problem contributors can make up a new pseudonym as soon as they get out of line, this makes it in principle impossible to enforce rules effectively. But if you can enforce rules effectively, you can do the work of a project a lot more efficiently. Third, people do tend to behave themselves better when their identities are known and their behavior is out in the open, and good behavior is crucial to a smoothly running knowledge community.

Again, however, there are some common objections to the principle that I want to address. Some people assume that I think there should never be anonymity online. That is simply wrong; I think that anonymity is one of the great advantages of the Internet, actually, and I believe it reinforces the value of free speech. I merely think that, in knowledge communities like the Citizendium, the advantages of requiring real names strongly outweigh the advantages of permitting pseudonyms.

Some might find it unusual that I would claim that the advantage lies in requiring real names. After all, one might well point out that many people will never contribute to the Citizendium simply because we do require real names. And I do have to admit that there are probably quite a few people involved in Wikipedia who will never get involved in the Citizendium precisely because they can't use a pseudonym. How do I respond to that?

Well, I have no data to back me up on this, of course, because it is speculative, but I think that in the long run, there will be more people willing to work as identified, responsible members of an Internet community connected to the real world, than as unidentified avatars, disconnected from the real world. In fact, in the long run, I think there could be more people who will insist on a real names requirement, precisely because it makes the community more mature, and those who use their real names are not at a disadvantage to those who use fake names.

There are also some understandable questions about how we can manage to confirm a person's real name on the Internet. I don't have time to go into that in great detail, but suffice it to say that we merely require some proof. We don't pretend to have an infallible system, which I would think would be impossible to have while remaining truly open and efficient. But so far, very, very few contributors have been exposed as having used an unregistered pseudonym.

The rule of law in online communities

Now to the third principle. Anyone who has spent a lot of time working in online communities is familiar with certain types of problematic characters and certain patterns of bad behavior. Governance of online communities, according to Internet scholar Clay Shirky, is a "certified hard problem." I agree. But what makes it hard is that such communities are generally volunteer communities of equals, and in such communities, it is hard to get buy-in from participants for resting some decisionmaking authority in anyone's hands. I actually think that this is a problem about the Internet's radical egalitarianism. As political philosophers sometimes observe, if you take egalitarianism to an extreme, you've got anarchy. After all, if everyone is supposed to be totally equal, they should all be equal in power; and that means that no one can be a decisionmaking authority, because the decisionmaking authority would have more power than the average person.

Well, I don't mean to go into that, as much as I'd like to. I only wanted to bring up that subject to explain why I think it is so important that online communities adopt constitutions, as it were, just as real, offline communities do. If you think about it, it is bizarre that online communities don't demand this more often, just as offline communities do. Of course, there are many online communities that announce basic ground rules in advance--especially listservs (mailing list discussions).

I think that online communities should go beyond basic ground rules. I think they should require their members to sign onto the rules explicitly, and then give the members a key stake in the governance of the project. In my experience, giving members an active stake in governance gets them personally invested, and great things can result.

But this isn't how many Web 2.0 projects work. Many of them are actually for-profit businesses that essentially exploit their contributors. This has struck me increasingly as a very strange and morally problematic situation. I think that we could be accomplishing a great deal more, and potentially avoid many abuses that plague MySpace and YouTube, if there were mature community governance. But probably, the owners of such websites would not stand for it.

How these principles are interrelated

These might appear to you to be three unrelated principles, but they are in fact closely related and mutually supporting, and together they represent a different vision of what online communities should look like.

I've found that it's very difficult to get experts involved in open projects. Experts generally tend not to take a venue seriously unless it is closed and exclusive. I do note that Wikipedia has had a fair bit of expert involvement, largely owing to the broad influence that it has as a resource, being #8 in the Alexa rankings. But I also note that they tend not to stick around for very long. Most experts are not going to stay involved in an open project if their views are not respected, and frankly, their views aren't going to be respected unless it is a project policy that their views be respected. That's because most people simply assume that online communities are perfectly egalitarian, and no special consideration should be given to expertise. So that's the first principle: respect expertise.

But if there is to be a policy that in some way requires respect of expert knowledge, there has to be an effective way to enforce that policy. So the project first needs to secure the support of participants for the policy, or it will be unenforceable. An excellent way to secure support for basic policies is to require participants to sign onto an explicit "social contract"--that's the third principle.

Some people will go to surprising lengths to disrupt a project--it's literally a hobby for them--if they can hide behind anonymity or pseudonymity. So it isn't enough to get people to say, "I agree with your fundamental policies." The very most disruptive people will say they agree, and then proceed to get the whole community up in arms; some people just thrive on chaos that way. If you attempt to ban such people from the community, but anonymity is allowed, they can and, in some communities, do return--and commit the same offenses all over again. This basically means that anonymity makes it impossible to enforce rules effectively. So if you want to have fundamental rules at all, if you want to have the rule of law, you must require people to reveal their identities, at least to project organizers. That's the second principle.

The growing opportunity

Let's take a step back. Imagine what a successful online community that adopts these three principles would look like. It could still be radically collaborative, bottom-up, free, dynamic, and productive. But it would also welcome experts and give them the credit due to their long years of study and experience. They need not bark orders; they could work alongside the rank-and-file contributors and act as guides rather than as top-down managers. As a result, the quality of the content could be expected to be considerably more reliable, or at least considerably more faithful to the latest expert knowledge, than the typical Web 2.0 project.

Not only would content be more reliable in this way, it would also be more credible. That is because people would be required to use their own real names, and content that comes with a name attached is for that reason at least slightly more credible. I'm sure you all remember the hubbub that the "WikiScanner" caused. For a month or so there was story after story about how different corporations and politicians had removed all negative information about them from Wikipedia. That of course was a result of Wikipedia's anonymity policy. Well, imagine a more reliable wiki encyclopedia that required people to take responsibility for their additions--and their deletions. The sort of abuses that are epidemic in Wikipedia would be much less likely to happen in the new sort of project.

Finally, consider the community from the point of view of the participants. With gentle guidance from experts and their relative maturity, with the requirement of real names, and with the requirement that people agree to the basic project rules, the community that results can be expected to be much cooler, calmer, and saner. I think of this new sort of community like a friendly, open county fair with expert judges, where many older-style communities resemble a street fight between rival gangs, or a free-for-all barroom brawl.

The new sort of online community I've described is a significant opportunity, as I see it--it is the next step in the development, or the maturation, of the Internet. I think in another ten years, this will be regarded by most people as the only sensible sort of online community, at least for knowledge projects.

The Citizendium experience so far

This is the opportunity that the Citizendium project leverages. We employ all three principles. So, of course, you might be interested to know how we're doing. I will conclude by giving you a progress report.

First, I should clarify that we are open to virtually everyone who is willing to work under our rules. If you give us your real name, convincing evidence of who you are, a coherent brief biography about yourself, and you agree with our fundamental policies, then you're in. We have something like 250 editors and over 2,000 authors registered.

A private pilot project got started in November 2006, and we opened the project up to public viewing and broader participation in March 2007, a little less than a year ago. While thousands of people have created accounts, each month over 200 people edit the wiki, and on any given day you can expect to see 40 different people, whom we call Citizens, on the wiki. These Citizens are all named, so that when you examine the Citizendium recent changes page, you see nothing but real names. To someone familiar with regular wikis, this is a very unusual and refreshing sight.

By various measures, our rate of production has been increasing, which is to say that production is accelerating. One way of measuring production is the rate at which new articles are created. A year ago, we were creating about five new articles per day; now we're up to 15 per day, and we're on a decided upward trend. We also added, in our first year, over five million words. That is more words than Wikipedia produced in its first year.

And how many articles? Because some people have uploaded articles from Wikipedia without working on them, we don't take credit for those. We take credit only for those articles that we have started ourselves, which is most of them, and articles to which we've made significant changes. Well, we have 5,200 articles under development, and we added our most recent 2,000 articles in about the last three months. So we are definitely accelerating.

In all honesty, we aren't doing so well approving new articles; we only have about 50 approved articles. I think this is mostly because our editors are more interested in working on new articles than approving old ones. I also think we can make our article approval process much more efficient, and that's something I hope to organize soon, if no one else does.

So much for our productivity. What about our community? What's our quality of life, so to speak? Well, here, I think we really shine. Outside of a short time last year in which we experimented with self-enrollment, we have had virtually no vandalism. That's right, despite being as productive and open as it is, the Citizendium is basically vandalism-free.

As can be expected in any community, online or offline, the Citizendium community has its share of personal unpleasantness. But typically I find people interacting politely and reasonably pleasantly, even when they are disagreeing. I also find very little indeed of what I used to describe on Wikipedia as "trolling"--in other words, hardly anyone ever appears to be disrupting the community just for the sake of doing so, or just to call attention to himself.

There are many developments I lack the time to tell you about, but I'd like to highlight one in particular, because it applies to the university context. Last semester we started a project called "Eduzendium." Essentially, we're inviting college instructors to assign their students Citizendium articles for class assignments. The students get extra help from the Citizendium community with their articles, and are motivated to do a good job not only because their work is visible publicly, but because it will actually be of good use to the whole world. Instructors get a new assignment type in their repertoire. And the Citizendium benefits, of course, from the added activity and content. Well, last semester we had courses at Purdue and the University of the Witwatersrand (in South Africa), and others. This semester, larger classes at University of Colorado, Temple University, and CUNY are engaged in the program. I think the Eduzendium project will inevitably expand, and in a few more years actually become a large source of our content.

Well, that's my report. I think we're doing very well for being about a year old. If we continue to accelerate our growth, you can expect us to have over 100,000 articles within a few years.

So, why don't you help us toward that goal? I would like to conclude by inviting you all, everyone in this audience, to join the Citizendium and start a new article tomorrow.

Thank you very much.
