How to make Wikipedia better (and why we should)

I’ve been involved in a long debate over Wikipedia with a friend who is a respected journalist. His contempt for the project stems from his distrust of anonymous writers and what he perceives as a lack of respect among Wikipedia’s contributors for journalistic standards. He’s not wrong — those who follow the controversy surrounding Wikipedia know about recent scandals. More importantly, however, his views are representative of a large number of influential people who distrust Wikipedia for serious research.

Wikipedia is a good idea. There is a need for a freely available, reliable encyclopedia on the Internet. Commercial alternatives like Britannica clearly have their place. But, if only because users expect information on the Internet to be free, we should be grateful that some people are willing to volunteer their time to make that information reliable.

Are we there yet? The report that Nature Magazine released last December contended that we may be closer than we thought. It is also a positive sign for Wikipedia that prestigious organizations are beginning to take the encyclopedia seriously enough to evaluate its claims. In a follow-up interview during Nature’s Dec. 15 podcast, Jimmy Wales, president of the Wikimedia Foundation and founder of Wikipedia, said his goal is to achieve “Britannica or better” accuracy. Yet he was also modest about the report’s findings, admitting that he had neither expected such a positive review nor thought the level of quality was consistent across all subjects.

There is no question: Wikipedia has a long way to go. In order to make it better, supporters need to shift focus away from isolated articles and genres and first address the system that produces the content. By doing so, contributors will have more tools to ensure the reliability of their articles.

I came up with a list of six easy steps that project leaders could take to make Wikipedia better. It’s not conclusive, but these suggestions arose from my own research and conversations with people who are concerned about the project.

1. Consistently enforce the existing standards

In addition to dozens of clearly written policy pages, Wikipedia has impressive tools for tracing the history of an article through its “recent changes” feature. There are extensive guides on the website that instruct contributors on how to cite sources, format entries, debate controversial passages, and argue effectively. Critics who only know about the wiki format may not understand the standards that the project leaders demand.

The problem? Wikipedia’s policies are not evenly enforced. The scandal over John Seigenthaler’s biography was likely only one manifestation of this problem. Participants in Wikipedia need to find ways to uniformly enforce existing standards on all content. If it’s too much work, then questionable content should be taken offline until it can be addressed. Otherwise, there is no way to claim that the site is reliable.

2. Force editors to take responsibility for their articles by telling us their names

Currently, contributors don’t need to provide any information beyond a user name in order to join the community. Until recently, there wasn’t even a requirement to have a user account in order to edit content.

This suggestion will meet with the most resistance, but it answers the biggest complaint about Wikipedia. The current policy is indefensible. If there is an honest reason to remain anonymous (like the fear of political retribution), then it’s easy to provide an editorial workaround either by having more reviews or clearly indicating that the article was anonymously created for political reasons.

3. Supply references and reasons for content change

Right now, in order to change an article on Wikipedia, after logging in, all somebody has to do is type. Their changes will be preserved along with the article’s history, but the only way to explain the reason for a change is after the fact, in discussion forums.

What if there were an additional field on the edit screen that forced contributors to back up every new piece of content they added with references and reasons for doing so? It would be another tool whose usefulness justified its added obtrusiveness.

4. Make citations clear

Like any good reference tool, Wikipedia provides endnotes for authors to cite references. But these aren’t used consistently. Some of the numbered links in the articles (which resemble endnotes) are merely links to other websites, with no bibliographical information at the end of the article.

5. Let users rate contributors

Trust is the key issue — and online it matters even more. Wikipedia could easily make use of a system similar to eBay’s user rating system. Every contributor should have their own page with a list of articles and feedback. Only users with an account should be able to create feedback for other users.

6. Settle copyright disputes before questionable material is published

There is a page on Wikipedia that lets leaders debate whether an image or text is under copyright protection. The problem is that many times, the resource is already on the website. If someone is reviewing possible copyright violations anyway, why not do it before the material is published?

These steps are easy to implement from a technology perspective, but the cultural challenges are significant. Despite recent scandals, Wales said that vandalism and malicious editing are not the biggest problems the community faces.

Far more difficult is getting contributors who are passionate about their content to agree on what gets published and the reasons for doing so. Experts who join in order to donate their time and knowledge to improving content will be forced to defend themselves — and their credentials — against less qualified opponents. It is conceivable that such an affront to their pride will drive many away. That would be unfortunate.

Perhaps even more challenging will be getting the current community to agree to abide by stricter rules. But those who most appreciate the remarkable qualities of the Wikipedia community should be the first to pressure the project leaders to take the simple steps necessary in order to ensure that its articles have been fact-checked, are clear of libel and copyright violations, and meet certain standards of composition and organization. Until these steps are taken, they may never be able to convince critics — who could otherwise be valuable allies — that Wikipedia is more than just a cute toy.

About Ray Grieselhuber

Ray is the Founder and CEO of Firewatching Media, a full-service interactive agency based in San Diego, California.

Comments

  1. All good ideas. Start a list. This is Wikipedia’s, several clicks from the front page.

    As recently as a year ago, the Wikipedia community had some “clever, creative ideas” about how to manage problems, which they’ll introduce reactively (e.g., not allowing an anonymous person to start a new article). Here’s what he said at a Berkman Center conference on credibility a year ago:

    “The openness is a means to the end and if it becomes a problem in certain areas, we’ll find ways to deal with it. I’m not saying we’ll lock the articles down. We’ve got a lot more clever, creative ideas than that, but we’re ready to adapt to deal with those kinds of problems.”

    There are a lot of ideas one can think up (e.g., show the most recent 5 updates directly ON the article page), but it’s like whistling in the wind. I think they are planning to have a membership actually take part in the governing process; that would be interesting to look into. Also, they could step up how they classify edits to make them easier to review. I mentioned that in my appeal for information service providers to do a better job making “social data” accessible. Right now the

  2. Magnus Timmerby says:

    Ah, constructive suggestions, nice! Here follows my unconstructive critique of it. :-)

    Unfortunately most of the points above mean more bureaucracy.

    Even in a scenario where all points were in some way enforced, I lack the vision to see that they would do better than eliminate some 70% of the crap and still let stuff like the Seigenthaler scandal through. Some may say this is good, but the problem is that Wikipedia would die the slow death of bureaucracy – the best way to drain any collective effort of all energy.

    Your article has been referred to in various blogs, many probably read it, but the fact that one has to register to comment is a sure way to fend off any interesting comments from true wiki fans. (98.506% of them hate to register.)

    In the spectrum with the ultra-open wiki at one end and the old-school group of elite editors at the other, the Internet offers room for all kinds of models in between those two extremes. In my uninformed opinion, Wikipedia has so far been able to keep its balance. I normally don’t enjoy clinging on to the status quo, but in this case I’d like to think the balance is right.

    I think we must look for other ways to digest inaccuracies. One thing is for sure, in the Information Age we have to accept them. I hope all schools these days teach a heavy dosage of source skepticism – my schools did not pay much attention to that.

    I think that in the end most people who grew up with the Internet have no problem understanding the simple fact that Wikipedia is usually correct but may sometimes be incorrect. A few may not understand this, but is that a real problem?

    Indeed, what is the real problem here?

  3. re: What is the real problem here: Jimmy Wales has stated that greater accuracy is the goal for Wikipedia. Hence some suggestions from outsiders.

    Also, the fact that one has to register for the OJR stops no one from commenting on it. This factoid is interesting: “98.506% of them hate to register.” 98.506% of whom? Wiki users? Or the core group of ~700 wikipedians, who I thought all knew each other?

    The important discussion will not happen here; OJR will cover other stories to come. It should happen at Wikipedia. And so far, the Village Pump (policy) is the only page I see that addresses it.

    Though I am hard-pressed to figure how anyone can trust the Wiki format for deliberative discussions. Formal discussion systems (linear or threaded) would make it impossible to rub out past posts.

    The discussion on “Unregisted users creating pages” [sic] on the “Policy” page ought to be interesting, but it now only lists 4 comments from 2/1/2006. Which means that there are no permanent anchors for the discussion points.

  4. In your second section, it appears as though you are advocating for greater accountability for Wikipedia editors. Greater accountability does not necessarily require the elimination of pseudonyms, though; for proof, please re-visit eBay’s user feedback system, which functions beautifully despite rampant pseudonymity.

    (Additionally, that section is erroneous; anonymous editors can still edit existing articles, but the recent changes left them unable to create new ones.)

    There has been discussion recently of implementing a user feedback system, but it is still in a very theoretical phase.

    While your other suggestions have merit, they are not likely to be implemented any time soon due to issues of scale. Wikipedia currently has more than 950,000 articles in the English edition alone, and only 807 administrators to handle the worst of the work (deletions, vandal blocks, etc.). Other editors are free to help as they can, though, so it is only through the efforts of more casual Wikipedians that the administrators are made aware of most problems.

    (Of course, since anybody can edit Wikipedia, this means that anybody – including the author – can begin helping in any number of useful ways. To echo a common refrain on Wikipedia, {{sofixit}}.)

  5. Good comments, Jim.

    I was hoping to have a Wikipedian here. Thanks, also, for clarifying the usage policy for anonymous users.

    I’m glad you brought up the “sofixit” response. As Wikipedia grows and attracts more users who want to use it for serious research, I don’t think this response will get very far.

    Wikipedia is a volunteer effort, but sometimes making suggestions as a user is the best contribution we can make.

    The project leadership, because they have established “Britannica or better accuracy” as a goal, will need to find or allocate the resources to do so. I made these suggestions because I believe that they would help establish a stronger baseline of reliability from a user’s perspective.

    If there are too many articles to fix these problems already then I’m afraid in the minds of Wikipedia’s harshest critics the project will never be useful. That’s too bad, because those critics, if converted, could be Wikipedia’s strongest allies.

    Finally, with regard to the eBay user rating system: eBay transactions are atomic by nature. When one is finished, that’s it. All necessary credentials are verified by eBay itself and that’s what makes the system work.

    On Wikipedia, however, users are truly anonymous. In journalism and research a person’s name and reputation are everything. There is no credit card or bank account to verify a writer’s credibility, nor is there an organization or an editor who can stand behind the writer’s work.

  6. Larry Daley says:

    Perhaps rather than editing controls what should be promoted are rival but similar computer constructs, that would offer different views of the same topics.

  7. Larry Daley says:

    I am continuing to carefully monitor Wikipedia sites. And I am beginning to notice that some “editors” appear to be making spurious accusations of inserting opinion (in Wikipedia jargon, (P)oint (O)f (V)iew), vandalism, personal attacks, etc., apparently as an excuse for blocking access for insertions they disagree with.

  8. Ray: That’s true. But because all edits are stored, there’s a public record of each editor’s history. There’s a wider range of formalities in wikipedia writing and editing than in traditional journalism, so the breadcrumb trail can be very revealing.

    It’s not a system that can’t be outwitted, but outwitting it is almost always more trouble than it’s worth. For example, someone could register multiple user accounts with different “personalities”, but unless they spent all their time editing, each account would have fewer edits credited to it, diminishing its credibility. (This leaves aside the challenge of making acceptable-to-wikipedia edits in different “accents”, as it were, and other difficulties our potential astroturfer would face.)

    This is the advantage of relying on *people* — sheer huge numbers of people — as a counterweight to abuse, rather than automated systems. Even the most complex automated system (reputation system, for example) is simpler than any human, and can therefore be gamed pretty easily. A group of humans can be gamed, too, but with much greater difficulty.

  9. Welcome to the 21st century form of Journalism. Like it or not, it’s decentralized, and Wikipedia shows you both the best and worst of that system.

    There are a couple of things going on here to pay attention to. In a broader sense, it’s a Darwinistic approach to an encyclopedia, relying on the numbers to correct the mistakes and inaccuracies. As a very, very minor-league Wikipedian, I can say that the “sofixit” response is indeed the best, only, and most successful one. It relies on no centralized control, but instead on *us*. That’s one of the basic premises of Wikipedia: harnessing the network of users across the globe, instead of a central hierarchical editor pyramid. If you don’t like open source, then don’t use Wikipedia, because that is the very heart of its concept. Go use Britannica.

    “Sofixit” is, in fact, the only reason I started posting to Wikipedia — I spotted some articles in fields I am knowledgeable in that were egregiously wrong, and corrected them. It was either that, or complain about inaccuracies — I put my money where my mouth was. Why is this too much to ask of, well, in this case, the entire World?

    Now on a case by case basis of your suggestions:

    1.) Reliability. Having more people to police the articles would be great. Of course, the difficulty will remain of having enough of them. It’s like having many hundreds of managing editors. I have no idea how the higher end of Wikipedia works, or if any of its staff get paid, but if it’s anything like the other internet communities I have participated in, administered, or run, it’s probably a primarily volunteer effort. If Wikipedia became a paying concern as part of, say, an expanding Google empire, maybe that could change. But otherwise? That’s a lot of pain to ask for no compensation.

    2.) Real names. I’d have no problem with this, and I think this is the strongest case that you can make. Restricting membership, revealing true identities, and even providing credentials would greatly contribute to a less fast-and-loose method of handling material on the site. Apparently it is a tradition that many on Wikipedia do not log in or use their real names, and certainly, track record can help legitimize a poster. But real names may just be an inevitable development.

    For those needing anonymity, it would be easy enough to arrange such with the Wikipedia admins, with a decent reason why. Just the same as anonymity is conferred in special circumstances in the dead tree media, it can be in new media as well.

    3.) References. There is now a field, when editing, for the “reason” for the edits; the custom tends to be that the editor will place some kind of reference to what he changed and why there, in a very general sense.

    Adding more fields? That would require reworking the database to include another field, and if realized for *every* edit on the page, that could be a *lot* of fields. Example: in one case, I rewrote an article to the extent that 30% of the finished article was rewritten with the correct facts, and 60% was entirely new. With a total of 90% of the finished work being essentially a “change”, can you imagine how many notes would be required? Far more than the article itself, in word count. Let’s add to that the heavy lifting in rewriting the software engine to include the additional fields. I don’t envy working with that size of database.

    4.) Citations. Actually, it’s part of Wikipedia’s policy to require references. However, enforcement is not always consistent, and some articles have few references. They are, however, already supposed to have them, bibliography style at the ends of the articles. So this is an enforcement, not requirement, issue.

    5.) Rating. Umm. No. A thousand times no. Having been in quite a few internet communities, I can definitively say that ratings are a dangerous, deceptive, misleading, pointless tool. Personal disputes, politics, and other external influences almost *always* motivate ratings. Unlike in eBay where you have a financial motivation, and eBay policing you, all you’ve got is a user based system, just as “open source” as the content you want to solidify the basis of. Only this time, people reading can’t read for themselves why a certain user has good or bad votes, and many quickly tire of reading the reasons and figuring out the history. It is very easy to slander someone with a rating system, and there is little recourse. The best rating system is the proof in the pudding — the contributor’s work, which can, if contributed by a registered user, be found without great difficulty.

    And besides, if users were to use real names, ratings would be entirely irrelevant.

    6.) Review prior to publication. Again, the problem is, you’re proposing to change the engine. Now the wiki editor has to await a moderator’s approval before a change takes effect, slowing down the data entry process by large factors. It also means yet more stress being placed on the backs of a more hierarchical staff. More admins, more mods, and still thankless and for free. Would you do the work of an associate editor at a large newspaper for no pay? I wouldn’t. Hierarchy and “open source” rarely mix well if at all; being opposites, they often end up paralyzing an effort, rather than making it more successful.

    In closing, a lot of good points. It’s interesting to see commentary on Wikipedia in a more “established” media outlet on the web at last. It promises to be a very interesting project as the decades wear on, and perhaps some of the suggestions which are now impractical will be less so in the future. But if you want “your father’s Oldsmobile” — i.e. a true 19th century style omnibus of solidity — Wikipedia will never be that. It’s a practical, pragmatic, workingman’s encyclopedia, not a scientific journal.

    A last food for thought: how is Wikipedia all that less secure than, well, OJR? Anyone want to bet me I can’t register a false name here and post inaccurate commentary? That’s right: you rely on the peer review of other users to “call out” a troll for being a troll.

    Just like Wikipedia. Only smaller.

  10. Larry Daley says:

    Ben:

    What you say has much merit. However, the function of the mechanism of mass action that you describe applies only to individual input; group input, as you mention in passing, especially from organized government groups (e.g. Chinese censorship apparatus) which have almost unlimited manpower, is entirely another kettle of fish.

  11. Larry Daley says:

    Notice of “Trial at Wikipedia”
    For the next 36 hours or so is found at:
    http://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment/205.240.227.15#Response

    This “trial” against 205.240.227.15 is entitled:
    Wikipedia:Requests for comment/205.240.227.15

    The charges (in my view spurious) are:
    1: Section blanking.
    2: Anti Castro POV (point of view) and/or fabrications.
    3: Lying about content of source
    4: Vandalism (attack on other editors)

  12. Magnus Timmerby says:

    Here’s another idea that might push automatic rating one step further: assuming some authentication scheme is in place, let’s say some editors join together in a group, and every time one of them writes or amends an article, her own name is signed, as well as the group’s name.

    Now, visually, Wikipedia could show, for each article:

    A) The most recent contributor’s signature, and her group, if any

    B) The top three most active contributors in volume.

    C) Next to every name and group name, a colour blob and/or percentage indicating that contributor’s rating, i.e. trustworthiness.

    This way, not only individuals, but also editing teams can make themselves a name for being trustworthy. My thought is that an editing team can make itself a name as dependable as a newspaper’s brand.

  13. Magnus Timmerby says:

    And a small note to Jon Garfunkel: regarding my “98.506%” factoid, it was simply a less-than-obvious joke regarding the dilemma with facts on the net: can we trust them? This particular number I just made up. I signed with my full name. Who trusts me now?

  14. Larry Daley says:

    Well, that ended badly. Wikipedia “editors” held a mock trial in which “evidence” was entered continuously. And although 205.240.227.15 was able to address all matters (in my view, successfully), additional charges were entered continuously. This process went on until the “editors” ran out the clock… and 205 was banned for 8 days… The accuser, “Colle”, then proceeded to revise the article on Cuba to his liking: eliminating semi-permanently any mention of the persecution of dissidents, especially the “free librarians”; disposing of citations he did not like (e.g. that of Ray Bradbury, author of Fahrenheit 451); and avoiding, and then continuously deleting, mention that his principal source for evidence of a putative “Democracy in Cuba”, Arnold August, was a member of the Canadian Communist Party… etc etc…