USC Annenberg Online Journalism Review

Document Reading Made Easy
New software might help journalists sort through reams of documents in minutes

For the reporters who have devoted themselves to documents, spending hours reading school board agendas backwards to catch every detail, Murray Craig's invention holds a definite appeal.

Craig, a programmer in Southern California, believes the problem of sorting through reams of government documents can be solved by artificial intelligence. According to a press release, his software "can detect government corruption in five minutes."

The software is several generations beyond the tools typically used by computer-literate journalists these days. While many reporters use keyword searches on documents or hunt for patterns in databases, Craig's software can take documents such as city council minutes and identify potential conflicts of interest. 

Instead of matching keywords, artificial intelligence software incorporates rules within its programming to find results. Basic rules for searching the minutes of meetings include "only members can vote" and "only elected officials are members." With a large enough library of rules, the software can associate not only words, but the meanings of words as well.
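As a rough illustration of how rules differ from keyword matching, the two rules quoted above can be written as small checks over facts pulled from a set of minutes. This is only a sketch: the names, the `Vote` record and the sample data are invented for the example and are not taken from Craig's software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    motion: str
    position: str  # "for", "against" or "absent"


# Hypothetical facts extracted from a set of minutes (invented for this sketch).
ELECTED_OFFICIALS = {"Alderman A", "Alderman B"}
MEMBERS = {"Alderman A", "Alderman B", "Clerk C"}


def non_elected_members(members, elected):
    """Rule: only elected officials are members. Return anyone who breaks it."""
    return sorted(members - elected)


def votes_by_non_members(votes, members):
    """Rule: only members can vote. Return votes cast by anyone else."""
    return [v for v in votes if v.position != "absent" and v.voter not in members]


votes = [
    Vote("Alderman A", "Motion 1", "for"),
    Vote("Resident D", "Motion 1", "against"),
]

rule_breaking_members = non_elected_members(MEMBERS, ELECTED_OFFICIALS)
rule_breaking_votes = votes_by_non_members(votes, MEMBERS)

print(rule_breaking_members)
print([v.voter for v in rule_breaking_votes])
```

A keyword search would only find the names. The rules say what role each name is allowed to play, which is what lets a program flag the entries that do not fit.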

Craig developed an interest in searching for corruption after years of wrangling with the town council of Langley, British Columbia, over a variety of issues. Before he retired to the town ten years ago, he had written software that could predict the failure of engine parts in aircraft, based on the flight history of the machine.

His biggest battle in Langley was trying to prevent mushroom farmers from polluting the streams that were feeding his community salmon hatchery. As it stands, the battle over the hatchery has remained unresolved. He left British Columbia three years ago for Southern California, where he is marketing and improving the software he built to fight the town council.

"I think my best work is in creating technology to help people like me," said Craig, who is 50. "Better to create more people like me."

Craig demonstrated the program recently at the office of eNeuralNet, the company that licenses the software. For the demonstration, he searched 12 years' worth of minutes from the town council of Langley. He clicked on an alderman's name, then selected from a list of four types of searches on council votes: for, against, absent and conflict. He chose to search for conflicts.

The program, called Minutes-N-Motion, kicked out dozens of instances in which the alderman had been excused from votes. In each instance, the full text of the minutes appeared in a window at the bottom of the screen. In most cases, the alderman had excused himself from voting on a contract for a construction firm that was represented by his law firm. But in one case, he voted for a contract for the firm.
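The kind of search shown in the demonstration might look something like the sketch below, assuming the minutes have already been reduced to structured vote records. The record layout, function names and sample data are hypothetical and do not describe how Minutes-N-Motion actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoteRecord:
    meeting: str
    official: str
    beneficiary: str  # party that stands to gain from the motion
    position: str     # "for", "against", "absent" or "conflict"


def search(records, official, position):
    """Return one official's records for a single search type."""
    return [r for r in records if r.official == official and r.position == position]


def votes_despite_recusal(records, official):
    """Flag votes cast on a party the official recused from elsewhere."""
    recused = {r.beneficiary for r in search(records, official, "conflict")}
    return [
        r for r in records
        if r.official == official
        and r.position in ("for", "against")
        and r.beneficiary in recused
    ]


# Invented sample data, for illustration only.
records = [
    VoteRecord("Meeting 1", "Alderman A", "Construction Firm X", "conflict"),
    VoteRecord("Meeting 2", "Alderman A", "Construction Firm X", "conflict"),
    VoteRecord("Meeting 3", "Alderman A", "Construction Firm X", "for"),
    VoteRecord("Meeting 3", "Alderman A", "Paving Firm Y", "for"),
]

conflicts = search(records, "Alderman A", "conflict")
flagged = votes_despite_recusal(records, "Alderman A")

print(len(conflicts))
print([(r.meeting, r.beneficiary) for r in flagged])
```

The flagged record is a lead, not a finding. It shows only that the official voted on a party he had stepped aside from before, and it says nothing about why.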

"We've just been through six years of meetings in 15 minutes," Craig said. "For you to step into Langley and be this expert is impossible. This, in the hands of every person, means politicians will have to be more accountable."

He wants to sell the service to news organizations, political campaigns and political activists. The clients for Craig's software have included pharmaceutical companies looking for patterns in the results of clinical trials and a travel agency looking for credit card fraud. The price ranges from $50,000 to $5 million, depending upon the complexity of the job.

Of course, with prices like those, the company is more likely to make progress with political groups than news organizations, whose managers routinely refuse requests for cheaper software.

"Clearly, he doesn't know what newsroom budgets are like," said Steve Doig, Knight Chair in Journalism at Arizona State University and Pulitzer Prize-winning investigative reporter.

Doig, who has used computers throughout his career, said that "I have certainly come to believe that a lot of tasks that reporters do are things that can be automated. If there's a pattern to it, there's no reason why software can't do the tedious work."

Nevertheless, he said, simple keyword searches of documents can handle most of the work reporters need to do. 

Doig does not dismiss the technology, however. "We certainly have seen in journalism that we are the last to discover the good tools other people are using," he said.

Without question, many industries will make use of artificial intelligence as it advances and the price drops. Tom Mitchell, president of the American Association for Artificial Intelligence, said that technology that reads English is in its infancy.

"The tricky part technically is, if you read the minutes of the local town government meeting, those are plain old English text," he said. "We really don't have software that can read plain old English text."

But the researchers are working on it.

"It's going to be a long time before we have programs that can read novels and reflect on their meaning," Mitchell said. "But we already have software that can read Web pages and pull out employee names and phone numbers."

Also, the chief problem with depending on technology will always remain: the results are only as good as the data. Mark Rasch, a former federal prosecutor and lawyer who specializes in Internet law, said that "where it doesn't work is when the guy's brother-in-law says, 'Vote this way,' and he does. The real corruption cases are where there's no recusal at all."

However, Thomas Kemp, chief executive of eNeuralNet, said that if extensive records that outlined officials' family and business relationships were added to council minutes, the program could signal more potential conflicts.

The key to using this type of tool, policy experts said, is to find the context of the questions it raises.

Bruce Cain, director of the Institute of Governmental Studies at the University of California at Berkeley, said that in the demonstration case, the alderman's vote for the contract could have been approved by a city attorney. The search did not show the context.

"The point is that it's a potentially useful tool, but it's not a replacement for high-quality, intensive reporting," Cain said.

He also wondered whether this application of technology could push more deliberation among elected officials into closed sessions.

"The interesting question is, what effect does this technology have on public officials?" Cain said. "It may create a tendency for things to be driven underground."

Even so, the appeal of artificial intelligence could grow in the news industry, where high turnover rates in newsrooms make 28-year-old reporters seem middle-aged. Over time, these types of automated shortcuts could seem economical to media corporations that employ fewer reporters as well as small-scale Internet publishers who scramble to give readers a good grasp of their communities.

Avi Adelman publishes a Web site in Dallas called BarkingDogs.org, an endeavor that made national news recently when the Dallas Morning News asked him not to link directly to the newspaper's stories. 

The problem with Adelman's neighborhood, he said, is that it contains 60 bars and 6,000 people. He covers the effects of this combination intensively.

Among other things, he publishes police logs, reports of mixed beverage sales and photos of bar patrons peeing in public. Adelman, a freelance Web designer with a journalism degree from Temple University, also tracks the dealings of members of the city council.

"If the sun came up on the wrong side of Lower Greenville," he proclaimed, "they would blame me for it."

But his approach to the idea of using artificial intelligence for news gathering is much less brash.

"I fail to see how a computer can do what we already do," Adelman said. He said he hears about conflicts and corruption all the time, by attending meetings, talking with people in the community and taking tips on the telephone.

Members of the public, he pointed out, cannot get access to the types of records most likely to show conflicts, such as internal corporate documents. "John Q. Public doesn't have the power to subpoena those records," he said.

And as far as finding red flags, he figured the software would present far too many choices for him to chase.

"There's too many flags," Adelman said. "There's too many damn flags."

 

Rebecca Fairley Raney is a freelance writer who has written for The New York Times, Writer's Digest, The Atlantic online, Interactive Week and Red Herring magazine.
