Duke University's new Reporter's Lab for investigative tools

When Sarah Cohen looks back at the exhaustive work she and other Washington Post journalists poured into a Pulitzer-winning investigation on child deaths, she sees not just accomplishment but opportunity — to make such work easier, and to enable more of it.

Cohen now is Knight Professor of the Practice at Duke University and director of the university’s new Reporter’s Lab, which aims to be a central resource for developing and sharing technology to improve and simplify the hands-on work of public-affairs reporting. The lab plans to make its software and other resources available to anyone who wants them.

In a recent chat, Cohen told me the project aims to bring technology innovation to in-depth reporting, which she thinks has been left behind even as digital tools have transformed how news is organized and consumed.

Here’s how she put it in a May 16 blog post introducing the Reporter’s Lab:

“For professional and pro-am journalists who specialize in public affairs, the technological revolution passed them by sometime in the early millennium,” she wrote, adding that the lab aims “to do for modern reporting what photocopiers did in the 1970s, and e-mail, the Web, spreadsheets and databases did in the 1990s. It will go beyond the hype to test, create, commission or apply new methods to make the hard work of original reporting easier or more effective.”

Cohen’s work also attacks the central question facing accountability reporting, especially the highly valued variety that requires significant time and labor: How do we continue to afford it?

“What I’m looking at is how do we reduce the cost of original reporting without losing anything,” Cohen said.

The Reporter’s Lab, which is part of Duke’s DeWitt Wallace Center for Public Policy and related to the center’s computational journalism initiative, has several related goals focused on acting as a central resource for innovation and advances in the core work of in-depth journalism.

“I spent about a year interviewing reporters, editors, technologists, online folks, academic researchers, actual computer scientists, about 100 of them altogether, about what’s needed versus what’s possible using today’s technology,” Cohen said.

She also drew on her own deep experience: 20 years of reporting and editing mostly focused on computer-assisted journalism, including 10 years at the Post.

In a summary describing the lab, formally known as the Duke Project for the Advancement of Public Affairs Reporting, Cohen wrote that analysis of 15 boxes of handwritten forms and other documents for “The District’s Lost Children” series (for which she shared the 2002 investigative Pulitzer) took six months.

“Electronic tools that would have made those documents searchable, extracted the little precise information that was not censored and grouped the recommendations might have cut that effort by a third,” she wrote. “If the analysis were easier more reporters in other cities might have tackled similar projects.”

Part of the lab’s work will be in building, adapting or testing tools for depth reporting. For instance, the lab created a tool called TimeFlow (for reporters to use in organizing material on long-running stories), which has been downloaded 1,500 times.

The lab recently hired Charlie Szymanski, app developer and visualization pro. Szymanski worked previously at the National Journal and the Sarasota Herald-Tribune, where his portfolio includes a Knight-Batten Innovation Award winner and impressive work for the paper’s series on fraud in real-estate “flipping.”

Cohen plans to work with Investigative Reporters and Editors to develop a “test kitchen” approach involving the lab’s tools and technology from other sources.

“I think there’s a great need for this,” said Mark Horvit, IRE executive director. There’s no lack of new tools being used and touted, he said, but even the open ongoing exchange at the NICAR-L listserv for computer-assisted journalism sometimes becomes overwhelming with the range of recommendations.

“It’s very hard to keep up with all of that — it’s very hard to know what tools are best for your needs,” he said.

Additionally, as Cohen notes in her project summary, tools for web-scraping, indexing material or doing other key tasks in simpler ways often are too expensive or technologically daunting to be used effectively or broadly by many reporters — especially on deadline.

Cohen plans to build testing and training into the lab and noted that thanks to Duke’s support, “everything that’s being done here is open source and free.”

What's wrong with us?

SUNNYVALE, Calif. – I am on the road today, attending the “NewsTools 2008: Journalism that Matters” conference at Yahoo! The conference is the work of the Media Giraffe Project, which bills the event as “a concept/design mashup for journalists, technologists and entrepreneurs.”

Today’s event focused on the ‘concept’ half, with ‘design’ – whatever it might turn out to be – left for tomorrow’s agenda. Organizers eschewed a traditional, panel-oriented format in favor of asking participants to design their own sessions on the fly. That format offers great potential for keeping topics fresh and audience members engaged. The risk, however, is an event that resembles a junior high school mixer, with everyone glued to their seats around the perimeter, afraid of initiating contact with anyone.

To the organizers’ credit, many volunteers stepped forward, and the participants I spoke with reported that the sessions they attended were worthwhile, though not revolutionary. I meant to hop between concurrent sessions, but found myself sitting through my first choices until their end, engaged in the discussions.

We will be interviewing some of the participants about their initiatives, in future articles on OJR. And I hope to bring you reports of some of the research and production tools and operational solutions we examine tomorrow in a future column.

But today, I’d like to start with the problems — the challenges and roadblocks that I saw, or heard others describe, during today’s sessions. How many of these sound familiar to you?

Impatience with unsolved problems

I don’t know of anyone who’s launched a website, or other computer application, and had it immediately work well and serve its audience completely. Nope. It’s alpha, beta… then launch, usually followed by a quick succession of revisions and patches. With users providing valuable feedback along the way. Developers accept that a project is not complete until it has had some time to live in the field, used by actual consumers.

This, of course, is not the way most journalists work. They keep their stories internal, password-protected within their newsroom’s publication system, until they’ve been desk-edited, copy-edited, sometimes lawyer-vetted, skedded and copy-edited again. That’s created a cultural expectation within the journalism business that one’s product will be complete when it goes to the public.

With the exception of a few early adopters of open-source journalism, the public beta is a foreign concept to most reporters. But a willingness to test, even to fail, in front of the public is a requirement for technical innovation. If you’ve become used to having everything “just so” before sending it out into the world, you’re bound to feel disappointment, then frustration, when that world changes and people rapidly want new and different things to try.

That’s the tone I heard underneath many frustrations expressed today (and at previous industry events). Even when the industry is making progress (with blogging and with online community management, for example) many journalists feel uncomfortable waiting for initiatives to play out in public. Journalists would do better to think like programmers in the sense of recognizing incremental success and not getting too depressed when initiatives fail. Keep what works, learn from what doesn’t and try again, a little differently next time.

Inbred analysis

One participant mentioned that she kept running into the same people at these types of conferences. That’s a problem. The journalism industry typically looks within itself for potential solutions to technical and business challenges online, when it should be looking to people outside the “news” industry who have taken on, and solved, many of the same challenges.

Newspapers have struggled for years to learn how to build and manage insightful, responsible online discussion communities, ignoring the hundreds of individuals outside the field who have built large, well-run and, sometimes, even financially lucrative, forums online. Some newsrooms have struggled to deploy multi-million-dollar content management systems, while open-source developers have created more stable and scalable systems at a fraction of the cost.

The situation reminds me of political parties that keep hiring the same losing campaign managers, election cycle after election cycle. Again, it’s time to think like a programmer: If you want a different output, you need to try a different input. Just because people publish content online that doesn’t carry the “Big ‘J’” Journalism label doesn’t mean that they haven’t learned, and can’t teach those who do use that label, something valuable about publishing online.

Dinner isn’t all vegetables

Many journalists who whine about their inability to make money online for their “serious journalism” need to take a more thoughtful look at what they are offering their potential audience.

It’s a rare publication that rakes in the cash offering readers nothing but investigative pieces and serious, in-depth profiles. Even The New Yorker runs a hell of a lot of cartoons. Individual journalists may aspire to a career of hard-hitting reporting. But their companies also employ people who are shooting wild art at Little League games, publishing pages filled with comics and Sudoku, and running reader sweepstakes and giveaways.

If you’re going to publish a website, you can’t forget the gimmicks. As one of my colleagues asked, ‘where’s your Wingo?’ What fun, silly, engaging things are you going to do online to help make your potential audience want to spend more time with your website?

As I discussed with several other conference participants over dinner, journalists need to treat their websites like a dinner party. You can’t just dish out a plate of veggies. You need to invite your readers in, chat with them, serve ’em a drink and get them comfortable. Then you can start dishing out the food, including a main course, veggies and dessert.

Reporters who worked as specialists offline need to develop and display additional skills when they move online, including the ability to entertain as well as inform their online readers.

Q&A: Jay Rosen and Assignment Zero

Jay Rosen, a journalism professor at New York University, built NewAssignment.net as a laboratory on the Internet to test whether the same Web-based collaboration that spawned Wikipedia, the Firefox browser from Mozilla and the Linux operating system could spur a new form of journalism.

Assignment Zero, the site’s first experiment in collaboration with Wired, is to cover the small but growing trend of crowdsourcing by using crowdsourcing: that is, having large groups of people spread across the world work together to report and write about the phenomenon of large groups of people working collaboratively from far-flung areas to produce high-quality work.

“We are trying to figure out whether you can do open platform journalism and whether there are advantages to it,” said Rosen, who emphasized that this is “just an initial test.”

Rosen spoke to OJR about how the Assignment Zero experiment is progressing and what he hopes to learn from the results.

OJR: Why this particular topic for Assignment Zero?

Rosen: We used the gift of particularity that an assignment like this with Wired gives us. We don’t have to ask ourselves what on earth should we investigate because we have to investigate something that’s of interest to Wired, and so the collaboration helped focus our first efforts and give some shape to our story.

It’s also an initial test that also helped us launch our site. Because it’s hard to think of what should be in your site if you want to do open source reporting. It’s very difficult to think of something in the abstract and try and build something that works for a practice that doesn’t exist yet. So instead of doing that we just built what we needed for this assignment.

OJR: What are the criteria for whether the final product is a success or a failure?

Rosen: And there’s a number of answers to that. The most important result is learning how to practice in this area. That’s all I am trying to do; we are trying to learn how to practice in this area. Where you have the many reporting and a few editing, is there a way to do it? What do you need to be prepared for? How do you motivate people to contribute? What can volunteer users do? What do they have trouble doing? All these questions are open questions and so our first imperative is simply to learn a lot about that and to learn the lessons that can only be gotten through practice.

The other goal is the work of journalism at the end is exactly going to be lots of pieces of journalism that we can compare to other forms and we can compare to standard methods. So there will be a Wired.com cover story written by Jeff Howe, who is a Wired writer, that would draw on Assignment Zero and link downward to it and at the same time we will publish an editor’s cut or edited package of features and interviews and articles and close-ups–recognizable forms of journalism that can be compared to peer products as it were. So these are all ways of judging what we are doing and I think it will be pretty easy for people to judge. I’m trying to kind of make it easy by doing something recognizable on the one hand but novel for journalism on the other.

OJR: The idea of how open source reporting might work is important given some of the problems that can crop up working in an open source environment. How do you plan to deal with these negative aspects as they come up?

Rosen: Here is the way I think of it. I said we are doing an open platform reporting. We are trying to capture some of the benefits of openness. What are they? Well it’s not a big mystery. It starts with what Dan Gillmor said – readers know more than I do.

So we picked a story where we think users know a lot more than we do because the spread of crowdsourcing and open collaboration is in fact a sprawling story. So we are trying to get the benefits of openness, like that the crowd is more diverse than we are and has more perspectives than we do. And when you try to gain the benefits of the openness you also know that there is a lot of cost, there is a lot of problems that come with openness. And so working in this area is by definition trying to capture the benefits and solve the problems or reduce the costs and if a reporter comes along, as many have, and brings up one of those costs and says “what about this?”

Well, the answer is almost always going to be the same: it’s a problem, we are working on that, and the solution is going to be different in each case, and most of them don’t have magical solutions. They have approximate solutions.

So if you can reduce the costs enough and you can get the benefits it may be worth doing. But I can’t even tell you right now that it is. I don’t know yet, we are trying to find that out. I don’t know that this is going to work. I think it’s the most important thing to mention in this interview. We don’t know yet what the potential is. A lot of people think that there could be potential and I am one of them but by practicing we will reveal some of the problems.

OJR: Based on the volunteers that you have gotten so far, what is it that is driving these people to, as you say, “commit acts of journalism for free?”

Rosen: We hope to have a better understanding of that at the end than we do now, but a lot of them are well aware of the citizen journalism discussion. They see themselves as participants in it. They want to be part of it. It’s sort of like the de-professionalisation of journalism appeals to them, but it’s not that they are terribly ideological about it. They are not. In some cases it’s somebody who took a few journalism courses in college and so it’s a road not taken but still of interest. Some of them are dissatisfied with the way professional journalism has been conducted. Some of them have an interest in this subject that we are investigating here, and a lot of them we don’t exactly know. We don’t know what they intend yet and we don’t know why they joined and this is not unusual in a project like this.

OJR: There is a certain amount of enthusiasm when things are novel that drives people to want to participate. How will you sustain that interest and enthusiasm from start to finish?

Rosen: Definitely, that’s a major challenge, preventing premature disillusionment. I wouldn’t say we totally succeeded at that, and we have seen that happen. Sustaining involvement is definitely a huge puzzle.

I consider the participation part of this work to be puzzles within puzzles. It’s all really fascinating and difficult to understand.

OJR: As a pilot project, everyone is watching Assignment Zero very closely. With the limited resources of independent journalists or small publishers, how might they implement aspects of the Assignment Zero model?

Rosen: The whole point of NewAssignment.net is for people to take what we are doing and develop it. That’s why I founded this project. It’s supposed to give its results away, it is itself a part of the gift economy.

And my belief is that since this is simply a set of tools, you let people practice in this area and they use these tools the way they want to, they will start inventing things, creating things, discovering things that others will be able to pick up very quickly. So I can write 10 blog posts about how open source journalism could work, should work, but if I can send people one URL where a smart editor is organizing a group of people, they will get it like that.

OJR: There are also hurdles in journalism culture that make this a hard sell to some organizations and journalists even if you were to prove it a success.

Rosen: I could think of a zillion and one ways in which it would be a hard sell. And there are also hidden weaknesses and traps in it that I think will come out. Because there’s a ton of problems–and I can’t stress this enough–with an open approach to reporting.

That’s why everybody loves the idea of blogging as in individual writers giving their opinion. But when we are trying to figure out the right route to reliable information then a whole bunch of new problems arise. And I just wanted to steer right towards the biggest problems because I feel that I don’t really think that I am going to figure this puzzle out. I think it’s going to be someone somewhere looking at what we are doing or reading about it who says to themselves – that’s not the way to do it, you know there is a simpler way. And they’ll figure it out. But that’s fine. Again, that’s what NewAssignment.net is; it’s not a company. Its only mission is to spark innovation. So I have a very simple agenda and I don’t care where the innovation happens.

OJR: Did the fact that you have never been a professional journalist help or hinder when putting this project together?

Rosen: I haven’t been a journalist and so I do approach the routines and rituals of American professional journalism in a more anthropological way and a lot of what journalists do seems very strange to me, but I have made a study of the routines and rituals of the press, and there are parts of them I know better than professional journalists. Not in the sense that I know how to do the job better than them, I don’t. I rarely tell journalists what they should do, in terms of like editing their newspaper or covering their story. Usually they know much better than I do but if you look at parts of their professional life, I know them better.

One of them for example is the legitimacy system that they use to derive their rationale. I studied that and know where it comes from. They tend to just reproduce it you know. It’s nature to them. It’s professional culture to me.

But this thing is not really about that. What I am trying to do with Assignment Zero is it doesn’t really have its reference points in the problems of the newsroom. It really has a different reference point which is the fact that open source projects have succeeded in other areas and so we know people can collaborate online and then pool what they know. So we are trying to figure out can this work in journalism too.

When Tim Berners-Lee invented the Web, he wasn’t inventing a new platform for CNN, he wasn’t trying to put the newspaper out of business, and he wasn’t trying to create a multimedia world. What he was trying to do when he invented the World Wide Web was make it possible for people in a scientific community interested in the same problem to share knowledge and to share data and work together.

So collaboration is not something new in use of the Web; it is in fact the original motivation for creating the Web and it’s in the DNA of this wonderful machine. And so I see what we are doing as springing from the original promise of the Web, which is a democratic promise. It’s the idea that we are stronger together if we share what we know than if we are atomized and alone. And it’s not really the obsessions of the newsroom that gave rise to what I am doing now.

OJR: How do you plan to handle info overload in this project as it flows in from the legions of volunteers?

Rosen: If you invite participation that generates a lot of activity and that activity generates certain costs, like returning e-mail, for example, and if you simply try to absorb those costs by hiring more people your project runs out of money within a week. And so this point has been reached many times in open source projects and the way they work and the way they scale, as people say in the valley, is that you have to convert some of those participants to organizing the others.

And those people are frequently called super contributors; if you look in the literature on online organizing you’ll find that these are key players, right, in your volunteer core. And so that’s what you have to do: you have to configure participants so that they themselves absorb the cost of organizing other participants. I can’t say that we have done that completely yet but we are highly aware of the problem.

One of the coolest things I think in NewAssignment.net and this is something that I am going to develop more of as I go along is we have a director of participation. Her name is Amanda Michel and I got her from the world of online organizing and politics. She worked in online organizing for Dean and for Kerry. And she could have worked and made quite a bit of money actually doing the same thing in the ’08 cycle but she is more interested in the media side of things.

And so I went out and this is the person I found when I went out looking for somebody who had actually organized people horizontally on the net to work under high pressure conditions where being wrong could have consequences, right. That’s what I had wanted, somebody who had done that, because that’s what kind of situation we are in. So the director of participation, their job is to organize people while the editors, who are much more traditional figures, organize the story. And learning how those two jobs work and how those two people can work together is another thing that we are trying to discover here. And there has never been a need to organize people to report stories except for the news people. This is all a whole other kind of thing and you need somebody working on that. You need someone who is constantly removing barriers, making participation easier, because participation always has costs and they can get high very quickly either for you, like the institution doing it like Assignment Zero, or for the participants.

And if you are battling those costs constantly your project quickly becomes unworkable.

OJR: What’s the timeline for the work that comes from this reporting?

Rosen: Yeah there will be we are looking right now… this could change of course. But right now we are looking at about June 4th or 5th to publish and so everything has to work backwards from there.

But we are going to have a filter and we are going to have… hope to have a verification round but we might end up like with two days of back checking by crowd you know what I mean. Throw everybody at what we need to check really we don’t know yet but that’s exactly why we are doing this and journalism doesn’t happen until the familiar structures of bylines and deadlines and you know…

So basically we are going to let 40 people steer 40 pages through to completed text and publish the best of them and Jeff Howe will do an overview.