The readers will have the final word

Two examples today further drive home the lesson that, thanks to the Internet, the news media no longer have the final word on the day’s news.

Example #1

There’s a stunning story out of the St. Louis area, involving a crowdsourced online effort to get around a newspaper’s editorial decision in covering the suicide of a local teen.

Gawker Media’s Jezebel blog appears to have amplified the controversy, which was brought to the attention of the national journalism community via a letter to Jim Romenesko’s blog on Poynter.org yesterday.

Steve Pokin of the St. Charles Journal broke the story of Megan Meier, a 13-year-old who had some trouble (like many teens) but was reportedly turning her life around, in part due to the friendship of a boy she’d met on MySpace. But when the boy turned on her, insulting her, Megan was devastated, then took her life, Pokin wrote.

The twist? The boy didn’t exist. ‘He’ was the creation of the mother of one of the girl’s former friends. But the Journal didn’t name the woman, citing concerns for *her* teen daughter.

Jezebel and other bloggers went nuts, and soon, they’d uncovered the woman’s name, her address, phone number and business registration records and plastered them all over the Web.

The lessons for journalists? First, we can’t restrict access to information anymore. The crowd will work together to find whatever we withhold.

Second, I wonder if the decision to withhold the other mother’s name didn’t help inflame the audience, by frustrating it and provoking it to do the work of discovering her identity. That frustration may have helped amplify the negative feelings toward this woman, further stirring up the virtual lynch mob.

Interestingly, no blogger or commenter I’ve found has said anything about the other mother’s daughter, the girl the Journal was trying to protect. And I find it hard to believe that the kids in the local community didn’t already know the identity of all the persons involved. By withholding the name, the Journal might have created a larger controversy from an already tragic incident.

Example #2

The Los Angeles Times this morning advanced the L.A. Auto Show, noting, quite correctly, in my opinion, that these shows are more about creating long-term buzz than immediate sales.

From the story:

All this, though, is not for you, dear reader. It’s for us, the media.

Before the show opens to the public, thousands of reporters, photographers and bloggers are wined, dined and entertained for two days by the kingpins of Detroit, Tokyo and Wolfsburg, Germany, all in the pursuit of good press.

Among the heavy hitters in town this week: Ford Motor Co. Chief Executive Alan Mulally; Nissan Motor Co. and Renault CEO Carlos Ghosn; and General Motors Corp. Vice Chairman Bob Lutz. They met and supped with journalists Tuesday, made speeches and shook hands Wednesday, and were gone, along with the reporters, by Thursday. Party over.

And the people who actually buy these cars? Little more than an afterthought.

The article misses the point that the reporting on the show does not stop at the conclusion of media day. Many visitors outside the traditional media will post photos, videos and write-ups to thousands of blogs and discussion boards during and after the show, extending and shaping the buzz that the newspaper, TV and, yes, some invited blog reporters kicked off this week.

Leaders in many industries understand that coverage opportunities now extend beyond traditional news organizations. A spokesperson for Universal Studios’ theme parks told me back in 2004, “we don’t so much care about coverage in the L.A. Times; we want to be on websites. That’s where our customers are.”

Media days are just one step in a modern publicity campaign. If this year’s auto show, or any such event, is to be a success, it needs to elicit strong coverage from consumers across the Web, perhaps even more so than from the pros in print and on air.

Bob Cauthorn returns with CityTools

Newcomers to online journalism might not recognize the name “Bob Cauthorn.” But to industry geezers like me, Bob was the guy you could count on, back in the late 1990s, to rip newspaper companies for their ham-handed, clueless approaches to the emerging Internet marketplace. Bob could be profane, abrasive and loud… but time has shown that he was almost always right.

Then, after stints at a couple of newspapers, Cauthorn essentially disappeared from the industry scene. He went off to some start-up called “CityTools,” which produced… well, many of us weren’t quite sure.

Now, Cauthorn’s back. CityTools is ready to launch, and Cauthorn’s ready to show off his new baby.

In short, CityTools is a social media framework for publishing news articles, lists and classified advertisements. Cauthorn demo’d for me a platform that serves newspapers as well as independent and individual publishers.

Newspapers could use CityTools as an ad hoc wire service, to create with other papers online portals on topics of mutual interest. Interest groups could use the platform to manage collaborative publications. Readers can build lists of their favorite… whatever, and share those lists with others to create aggregated “favorites” lists from designated communities.

And, oh yeah, the platform supports stories, ads and lists in multiple languages. Speak English, Spanish… and Swedish? CityTools will let you read, create, order and distribute content in all three, at once. Registered users can declare which of 13 supported languages they read, and select which one they want to use as their primary language while navigating the site. They can also select their community, which will deliver them content and ads tagged to that community, while allowing them to use breadcrumb trails to navigate to content from all other CityTools communities.
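To make that language-and-community filtering concrete, here is a minimal sketch in Python. It is not CityTools code: the `Item` and `Reader` classes, the field names and the sample headlines are all hypothetical, and it assumes each piece of content carries exactly one language tag and one community tag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    title: str
    language: str   # assumed short language code, e.g. "en", "es", "sv"
    community: str  # assumed community tag, e.g. "East L.A."


@dataclass(frozen=True)
class Reader:
    languages: frozenset   # every language the reader says they read
    primary_language: str  # language used for site navigation
    community: str         # the community the reader selected


def visible_items(reader, items):
    """Return items tagged to the reader's community in a language they read."""
    return [
        item for item in items
        if item.community == reader.community
        and item.language in reader.languages
    ]


# Hypothetical sample content.
items = [
    Item("Council vote tonight", "en", "East L.A."),
    Item("Se vende camioneta", "es", "East L.A."),
    Item("Lägenhet uthyres", "sv", "East L.A."),
    Item("Farmers market opens", "en", "Pasadena"),
]
reader = Reader(frozenset({"en", "es"}), "es", "East L.A.")
titles = [item.title for item in visible_items(reader, items)]
```

A reader who declares English and Spanish gets both mixed together in one stream; the Swedish ad and the Pasadena story are filtered out.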

It’s loaded with cool widgets like this, so my inner geek demanded that I get the scoop. I talked with Cauthorn on the phone earlier this month, and an edited transcript follows.

OJR: You were raising hell in the newspaper.com world there a few years ago and then just kind of disappeared into CityTools. Bring us up to speed on what you’ve been up to.

Cauthorn: I went into the lab. After I left The [San Francisco] Chronicle, I went backpacking along the Pacific Crest Trail and did a lot of thinking about the state of journalism and online newspapers and stuff and, as you probably know, I was one of the very earliest people doing what we now call social news. Back then we didn’t really have a name for it, you know, we’re just doing the community front page which allowed people to decide what was on their front page and share links and vote on things and – but all the stuff that has now become commonplace with Digg and whatnot.

I was thinking a lot about the need for a new kind of journalism online as well as the kinds of things that may help, you know, existing print newspapers to survive. And when I say print newspapers it’s because even though they have online operations, they’re still thinking so much like print operations, you know, and so after, you know, sort of both literally and figuratively going to the mountain, I came back and decided to try to re-imagine this stuff from the ground up.

So that’s what I’m focusing on right now.

On the newspaper side, what we’ve created is what we think is an extraordinarily interesting and brand new thing. We’re giving newspapers the ability to very easily set up ad-hoc wire services if you will, to share content with other newspapers of a like mind as well as to share classified ads.

OJR: I think one of the distinguishing characteristics between let’s say, first generation online publishing versus traditional offline publishing has been that the focus of offline publishing, local newspapers, has been geographic. A lot of early online publications have been organized around topic and they’ve been geographically agnostic, if you will. They don’t care about where you are in the world, just what you want to talk about. And what you’ve just described here seems like it is taking the geographic-based local newspaper and moving it into the more topically based world where you’re creating topic – you’re creating topical networks for local communities so you’re no longer just about the Fort Lauderdale community, you’re about boating.

Cauthorn: Well, geography is still important. What we’re trying to do though is we’re trying to say, “Look. Let’s imagine content as a palette of colors.” Right now we’ve had a very limited palette. You’ve got what the wire services give you and you’ve got what your local folks generate and of course with layoffs and stuff like that, that palette of colors that your local folks is generating is getting less. And what happens is you say, “Okay fine. Why don’t we expand that palette by borrowing colors from other people?”

Let’s use agricultural reporting as an example. The fact of the matter is that agricultural reporting across the country, the numbers have been shrinking and shrinking and shrinking. Right? Because the newspaper has to make a choice between covering agriculture, even if you’re an agricultural market, and covering the statehouse, they’ve got to cover the statehouse. It’s just their natural bias. Whether or not that’s relevant to the reader or not, who knows? But it’s a natural bias.

So what happens if all of a sudden you say, “Okay, but you know what? So we’re not doing a great job of covering all agriculture in our area, but you know what? If we combined four cities, let’s say all the small newspapers in the Imperial Valley, and say okay, we’re gonna share our agricultural coverage and you can put it online or you can put it in print.” It doesn’t matter. It’s up to them. All of a sudden, you’ve got a rich, brand new product that really resonates for the local audience. And guess what? Google can’t match. There’s no way a mass aggregator can match that.

OJR: Let’s talk about some of the other folks that are out there, in this spectrum of social media, from earlier sites like Backfence to Topix to whomever the Knight Foundation’s gonna be funding this year and next. What have you got going that you think distinguishes CityTools?

Cauthorn: Up until now what’s happened is that sites have enforced their view of what local is. So, you say, okay, this site is about Pima County Arizona. That’s our local view and that’s it. And it may be part of a network where you have Pima County here and you’ve got Maricopa County there, but if you’re on a Maricopa County site you don’t see the Pima County stuff. If you’re on a Pima County site, you don’t see the Maricopa County stuff.

What we’re doing to begin with is we’re saying, “Look, what we need to do is put the definition of what local is from the perspective of this site in the hands of the user.” We talk about personalization but what I want to start talking about is context of your life. The user has a context of their life and their context is that I might identify myself as being a local to the Bay Area, but my next-door neighbor might think of San Francisco only as where their local context is. How do you build a site that responds to both of those people’s concerns in a fluid manner? That’s what we’ve built.

So what happens is that, for example in Brooklyn — I think we’ve got twelve or fifteen neighborhoods in Brooklyn, specific neighborhoods. So let’s say you’re looking at Bensonhurst’s stuff. You’re reading a restaurant review in Bensonhurst and you click on Bensonhurst, say, “Show me all the restaurants you got in Bensonhurst,” because what we allow you to do is combine. I don’t know the context. I’m gonna allow you to set the context. Right?

So you say, “The context I’m interested in is Bensonhurst and I want to see all the restaurant reviews in Bensonhurst.” Well, everybody’s posted a restaurant review in Bensonhurst, there they are. If there’s not enough content, and if you think, “Oh well, wait a minute, I’d like to see all the restaurant reviews in Brooklyn,” all you got to do is click Brooklyn [on the page’s bread crumb trail] and suddenly, bang, you get everything in Brooklyn.
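The widening step Cauthorn describes, from one neighborhood to the borough that contains it, can be sketched with a simple parent lookup. This is only an illustration under stated assumptions: the `PARENT` table, the function names and the sample reviews are hypothetical, and nothing here reflects how CityTools actually stores its geography.

```python
# Hypothetical parent links: each place points to the larger place containing it.
PARENT = {
    "Bensonhurst": "Brooklyn",
    "Park Slope": "Brooklyn",
    "Brooklyn": "New York City",
}


def breadcrumb(place):
    """Return the trail from the widest known area down to the given place."""
    trail = [place]
    while trail[-1] in PARENT:
        trail.append(PARENT[trail[-1]])
    return list(reversed(trail))


def is_within(place, area):
    """True if place is the area itself or sits somewhere inside it."""
    return area in breadcrumb(place)


def reviews_in(area, reviews):
    """Return review titles whose tagged place falls inside the chosen area."""
    return [title for title, place in reviews if is_within(place, area)]


# Hypothetical sample reviews, each tagged to one neighborhood.
reviews = [
    ("Trattoria on 18th Avenue", "Bensonhurst"),
    ("Brunch on 5th Avenue", "Park Slope"),
]
narrow = reviews_in("Bensonhurst", reviews)  # only the Bensonhurst review
wide = reviews_in("Brooklyn", reviews)       # everything tagged inside Brooklyn
```

Clicking "Brooklyn" on the breadcrumb trail corresponds to calling `reviews_in` with the wider area: the content is the same, only the reader's chosen context changes.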

OJR: One of the distinguishing characteristics about my hometown, the L.A. area, not that it isn’t beginning to happen in other metro areas as well, is as you go by neighborhood to neighborhood, you’re not just changing geography, you’re also changing, literally, the language spoken by the people in that neighborhood. Tell me a little bit about how CityTools is accommodating language differences.

Cauthorn: We currently support 13 languages. And we believe, we’re not sure about this, but we believe we’re the first multilingual news site in the world. Up until now, if you speak Spanish and you’re in Los Angeles, you have the choice of an English language newspaper or a Spanish language newspaper, either in print or online. But I go down to the Mission in San Francisco and you hear people freely mingling Spanish and English together. That’s the context of their life. Right?

So what we do is we allow you to say, “Okay, I only want to see Spanish language content in East L.A.” So you’ve got it. If you’re comfortable in Spanish and English, you can have Spanish and English and it’s freely mixed in there.

Now think about this in terms of business model, what happens when you have bilingual classifieds? Imagine what would happen if the Hispanic community in Los Angeles had the ability to say, “Okay, I want to see classifieds in Spanish or English.”

That’s what I’m talking about when I say context. I want to know where you live, I want to know what languages you speak, tell me what you’re interested in. I will change the nature of the site to match those things. This is a big deal, we think.

Now, that’s – so that’s all really powerful, but then we get into some other stuff that also we believe is quite new. And you’re getting back to what distinguishes us from the other sites that have come before. We have this entire group publishing model that anybody can create what we call teams.

Let’s say you have a class full of journalism students and you create a team for that class and they write their stories and they assign them to their team. Now you have flexibility. You can say I want it to appear with other team stories, but I don’t want to allow the team members to edit it. Or, you can say I want it to appear with other team stories and I’m gonna allow other team members to edit it. Because we have a draft and edit mode, what happens is that the students can write their stories in edit mode and then they can submit them to the teacher and when the teacher says that they’re good enough, then the teacher can say, “Okay, publish that one, publish that one, publish that one.” It’s just click, click, click, click, click and they get published.

Now here’s what’s slick about that. So all of a sudden what you have is you have got a workflow that resembles an existing news room. Right? But what’s slick about that is two things. One, every university in America is part of our geographic database. So let’s say this is at University of California-Berkeley. Let’s say they assign these stories to the geography of University of California-Berkeley.

All of a sudden then, you’re looking at collaborative group output of content which is tied to a place. And what’s really slick about it is that they can also put those headlines on their own sites because we give you code you can just cut and paste this code on and anytime that your story’s on CityTools, it gets updated on your own site.

Why does that matter? Here’s why. What we’re trying to do is we want to help nonprofits and community organizations, parent teacher organizations and stuff like that. None of them have the ability to conveniently and quickly update content on their own websites on a regular basis. Right? So what we’re saying is all you have to do is put this code in and once you start using CityTools, automatically those headlines will go over on your site, styled the way you want them, looking the way you want them.

But here’s where it gets really cool. So you and I have this organization working on leukemia. And let’s say we have a constituency of 3,000 people out there who have an interest in leukemia. All of a sudden, we can open up a public team that is tied to the organization and we can invite all of our thousands of people to join. So if you’re an activist – imagine if an activist organization, such as an anti-war organization, said, “Everybody join this big team,” then you’ve got 1,000 people looking for stories about anti-war stuff every single day. And, by the way, it also shows up on your own website. Suddenly, that gets interesting.

So we are hoping that what’s gonna happen is we’re gonna start to engage people in the context of their lives – again, getting back to this word, context. Tell me what organizations you belong to and I will help you make life in that organization better.

OJR: Getting more into this idea of the crowd, tell me more about the kind of collaborative list building technology that you’ve built in here.

Cauthorn: When I was on the mountain I was walking down a trail and listing things in my head and I said, you know, if I got two other people doing this, I could build a consensus and that was when I went, “Oh sh-t.” What we do is that we allow people to create rank lists and these rank lists can be about anything. By itself, this is not unknown, it just hasn’t been done in this context.

What we can do is allow you to say, “Okay, here are – here are my five favorite Italian restaurants in all of Los Angeles.” And, by the way, you can adapt that by neighborhood if you want to, and you can do it in Spanish.

But then what happens is somebody else comes along, because none of us can resist a good list. And they go, “Oh no, Robert’s list was good, but he missed this, this and this and I disagree with the order.” So what they can do is what we call linking lists. When you read the list, if you’re a member, you just click, “I want to link to this list,” and create your own list.

Now [the lists] are part of a family and what happens behind the scenes is that we do some heavy lifting on text analysis and we look at the item titles and then we say, okay, we then can allow you to create a consensus view of what the best Italian restaurants are by merging them together.

For example, let’s say there’s a restaurant that you call Paizano and I call it Il Paizano. Our system will recognize that you’re talking about the same place and so Paizano appears on both lists. As you know, consensus building algorithms are not unknown. This is pretty well established, but nobody’s applied them to lists before, we believe.

So all of a sudden what happens [on CityTools] is that then you the reader can say, “Hey, here’s Robert’s list and here’s Bob’s list. I want to see the consensus. Show me the ranked view of what both lists think is the most important.” And that’s cool if it’s two people. It gets really, really interesting if you have 25 people doing it or 100 people doing it and then it gets really, really, really interesting if you can bring it up by geography.
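Cauthorn does not say which consensus algorithm or text analysis CityTools uses, so the following is only a sketch of one well-known approach. It assumes a Borda count (each list awards more points to higher-ranked items) and a deliberately naive title match that strips a leading article. The `ARTICLES` set, the function names and every restaurant other than Paizano are hypothetical.

```python
from collections import defaultdict

ARTICLES = {"il", "la", "le", "the"}  # assumed leading words to ignore


def normalize(title):
    """Lowercase a title and drop a leading article so near-duplicates match."""
    words = title.lower().split()
    if len(words) > 1 and words[0] in ARTICLES:
        words = words[1:]
    return " ".join(words)


def consensus(ranked_lists):
    """Merge ranked lists with a Borda count: higher places earn more points."""
    scores = defaultdict(int)
    longest = max(len(ranked) for ranked in ranked_lists)
    for ranked in ranked_lists:
        for position, title in enumerate(ranked):
            scores[normalize(title)] += longest - position
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda name: (-scores[name], name))


roberts_list = ["Paizano", "Luigi's", "Trattoria Roma"]
bobs_list = ["Trattoria Roma", "Il Paizano", "Nonna's"]
merged = consensus([roberts_list, bobs_list])
```

Here "Paizano" and "Il Paizano" collapse into one entry that collects points from both lists, which is why it tops the merged ranking. A real system would need far sturdier matching than dropping an article, and the same function accepts 25 or 100 lists as easily as two.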

Now imagine if the PTAs in San Francisco all put in their lists of their greatest needs at their school and they link them together. With one click then a reader can say, “Show me what the most serious needs are in the schools.” No one’s ever been able to do this before. And we’re allowing people to determine the context in which it’s done, certainly they can say, “Okay show me what are the worst needs in San Francisco.” Oh guess what, you can expand the view to show me the rank list of the needs of schools in the entire Bay Area.

This gets powerful. I mean that is magic, man. I mean think about what this can mean for a society.

You start to pull these things together and what you’re looking at is a sandbox for community interaction that hasn’t existed before. Up until now, here’s what we had: You had UGC [user generated content] sites where people can create stuff, or you had shared news sites where they could share news. Okay. That’s fine. We do both. We say, “Look. You go in both modes, because sometimes you want to write stuff. Sometimes you want to read stuff.” Okay. There are a couple of sites out there where you can make lists, but you just write lists down. You can’t tie them together. You can’t link them together. You can’t do this other stuff that we’re doing.

When I was doing my big backpacking trip and thinking about this stuff, I decided, on a very cold night in the Sierras, to peel back newspapers to their essential core. You know? And part of that essential core has been creating marketplaces.

But the other part of it is this entire connective tissue argument: the way in which our reporting and the reading of those reports connects individuals to one another.

That’s what we’re trying to do: to get back to that essential core of allowing people to create these connections between the writer and their audience, between groups of people who are trying to get something done in a community.

A journalist's guide to crowdsourcing

Last week, I had the pleasure of conducting some training sessions for the staff at the Orlando Sentinel in Florida. I spent the morning and lunch sessions talking with Sentinel reporters and editors about blogging and discussion forums, and the final session of the day was on my favorite online journalism topic: crowdsourcing.

Few journalists, at the Sentinel or elsewhere, know much about this topic, save, perhaps, for the fact that it’s become one of the industry’s hotter buzzwords. But I believe that crowdsourcing might, in the end, have more of an effect on all forms of journalism than anything else that’s come out of the online journalism revolution.

That’s why I decided to put together this introductory Q&A about crowdsourcing, for OJR readers.

What is crowdsourcing?

Crowdsourcing, in journalism, is the use of a large group of readers to report a news story. It differs from traditional reporting in that the information collected is gathered not manually, by a reporter or team of reporters, but through some automated agent, such as a website.

Stripped to its core, though, it’s still just another way of reporting, one that will stand alongside the traditional “big three” of interviews, observation and examining documents.

The core concept is not new in journalism. At its heart, modern crowdsourcing is the descendant of hooking an answering machine to a telephone “tip line,” where a news organization asks readers to phone suggestions for stories. Or asking readers to send in photos of events in their community.

Such methods require substantial manual labor to sift through submitted material, looking for information that can be used well in a story. Which makes them only marginally more effective than traditional news reporting.

True crowdsourcing involves online applications that enable the collection, analysis and publication of reader-contributed incident reports, in real time.

What are some examples of crowdsourcing?

My favorite example comes not from a news organization, but the U.S. Geological Survey. Its “Did You Feel It” feature builds detailed “shake maps” illustrating the intensity of earthquakes by zip code, through thousands of volunteer reports submitted online by readers.

A simpler example, but very popular this summer, is GasBuddy.com. The site won’t win any awards for soothing graphic design, but it allows readers in more than 100 communities to share real-time reports on gas prices in their area.

I built my first crowdsourcing news feature in 2001, on my theme park website. “Accident Watch” built a reader-written database of injury accidents at U.S. theme parks, in the absence of federal or significant state incident data. Readers submitted reports of injury accidents that they’d witnessed or read about, with reports from just one reader labeled “unverified.” A second report of the same incident from another reader or link to an official police, court or park report or a news story was required for a report to be labeled “verified.”
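The labeling rule described above is simple enough to sketch in a few lines of Python. This is an illustration of the rule as stated, not the site's actual code: the `Incident` class, its field names and the reader IDs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    description: str
    reporters: set = field(default_factory=set)         # distinct reader IDs
    official_links: list = field(default_factory=list)  # police, court, park or news URLs

    def add_report(self, reader_id, link=None):
        """Record a reader's report and any supporting link they supplied."""
        self.reporters.add(reader_id)
        if link:
            self.official_links.append(link)

    @property
    def status(self):
        """Verified needs a second distinct reader or at least one official link."""
        if len(self.reporters) >= 2 or self.official_links:
            return "verified"
        return "unverified"


incident = Incident("Coaster stalled on the lift hill")  # hypothetical report
incident.add_report("reader-1")
first_status = incident.status   # one reader only: "unverified"
incident.add_report("reader-2")
second_status = incident.status  # second distinct reader: "verified"
```

Storing reader IDs in a set means one reader submitting the same incident twice still counts as a single source.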

How can I be sure this information isn’t bogus?

In a true crowdsourced project, information is not verified manually by a reporter between submission and publication. Which inspires concern from many traditional reporters.

A well-designed crowdsourcing project, like a well-edited newsroom, can discourage bogus submissions while minimizing their influence if accepted. Here are my suggestions to avoid bogus data in a crowdsourced project:

  • Request the reader submit personal identification along with the report. On “Accident Watch,” readers must be registered with the site, which requires e-mail verification, in order to submit a report. The earthquake project requires a zip code and requests a reader’s name, phone, e-mail and street address. Asking readers to identify themselves sends the message that you take this project seriously and that you wish them to do the same. Obviously bogus ID allows you to flag bogus records for deletion with ease.
  • If your project publishes individual reports, provide other readers with an opportunity to dispute or verify each individual report. This empowers your readers to help clean your data for you.
  • Even if you are publishing data only in aggregate, be aggressive about encouraging readers who dispute that data to add their report to the database, as more data should help move the mean toward the true value.
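The last suggestion rests on simple arithmetic, which a small made-up example can show. The prices below are invented for illustration, and `TRUE_PRICE` is known only because this is a demonstration. The median is included as a comparison that the suggestions above do not mention.

```python
from statistics import mean, median

TRUE_PRICE = 3.00  # assumed real price per gallon, known only for this demo

# Two honest reports and one bogus entry.
early_reports = [3.01, 2.99, 0.30]
# The same three reports after seven more readers add their own.
later_reports = early_reports + [3.00, 3.02, 2.98, 3.01, 2.99, 3.00, 3.00]

early_error = abs(mean(early_reports) - TRUE_PRICE)
later_error = abs(mean(later_reports) - TRUE_PRICE)
later_median_error = abs(median(later_reports) - TRUE_PRICE)
```

With three reports, the bogus entry drags the average far from the real price; with ten, its pull is much smaller. The median resists the bogus entry even better, which may make it a safer figure to publish when a project expects a few bad submissions.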

    How is crowdsourcing different from polling?

    Obviously, you do not have a controlled random sample of the population in a crowdsourced project, as you would with a carefully executed poll. But that does not prohibit you from collecting accurate and engaging data through crowdsourcing. You just need to be careful in identifying whether a specific project works better with polling or crowdsourcing.

    Polling’s great for constructing an accurate portrait of a community’s demographics, attitudes and behavior. Crowdsourcing’s great for incident reports, which might be incomplete if limited to a small random sample.

    Either the incident (the roller coaster crash, the bottles falling from your kitchen shelves, the three-buck gasoline) happened, or it didn’t. But the more people you have “on the ground” as potential sources in your crowd, the more data points you can collect. If you poll only a few hundred people, you’ll miss incidents.

    Think of another great crowdsourcing project: missing/safe person lists following a disaster, such as Hurricane Katrina or 9/11. A random sample would get you only the family and friends of your sample, instead of the many thousands more who want and need information about their loved ones.

    At the same time, be careful about drawing broad conclusions about community behavior based on your crowdsourced incident reports. Don’t ask people about their income, education or even exercise habits, then claim that your numbers represent the entire community you cover. Use a traditional random sample survey to reveal that kind of descriptive data, instead.

    Do I have to learn programming to do this?

    No, but if you want to attempt a true crowdsourcing project, someone in your newsroom will. Free online survey tools and mapping websites can help you collect and publish great reader-contributed data. But if you want custom information to move from survey form to published report in real time, you can’t do that yet without a programmer on your team.

    Ultimately, journalism is social science, and journalists who want to make best use of crowdsourcing need to get familiar with the mathematics of social science. The interviewing and document searches of 20th-century investigative reporting will look incomplete as savvy journalists and newsrooms learn to harness the Internet’s wide reach and interactivity to gather massive databases that only formal social science techniques can effectively manage and analyze.

    Isn’t this just citizen journalism on steroids?

    Consider crowdsourcing a fork in the citizen journalism movement.

    Unlike more traditional notions of “citizen journalism,” crowdsourcing does not ask readers to become anything more than what they’ve always been: eyewitnesses to their daily lives. They need not learn advanced reporting skills, journalism ethics or how to be a better writer. It doesn’t ask readers to commit hours of their lives in work for a publisher with little or no financial compensation. Nor does it allow any one reader’s work to stand on its own, without the context of many additional points of view.

    For those reasons, I think, crowdsourcing ultimately will revolutionize journalism as “citizen journalism” efforts that rely on more traditional reporting methods fail.

    * * *
    Interested in learning more about crowdsourcing, news blogging or managing reader-contributed news content? Bring OJR to your newsroom for a custom staff training session. Contact editor Robert Niles via rniles[at]usc.edu for rates and more information.