Making smarter readers: Lessons in Trilliscience

Last week, I was chatting via IM with my brother about a party he hosted. Still in college, he’s 7 and a half years younger than me, so when he described the music playlist at his party as “crunk,” I had no idea what he was talking about. But in less than a second, I not only knew the meaning of “crunk” but was able to respond intelligently with: “As in Three 6 Mafia?” I had achieved instant context and instant application. How did I do it?

I was Trilliscient.

A year ago this April, Cerulean Studios, the makers of the popular chat client Trillian, released a new feature as part of version 3.0 of its software: integration with Wikipedia to provide instant lookups. Not to worry — this is not another article about Wikipedia. But what is remarkable about this enhancement is why Wikipedia was chosen for this feature and what that choice could mean for the future of content delivery.

Here’s how it works. Every word that users type in their chat session is scanned and compared against Wikipedia’s database of articles. If a word or phrase matches the title of an article on Wikipedia, that word or phrase is underlined in the chat window. When the user hovers over the link with the mouse, a small window (tooltip) displays the first few paragraphs of the entry. It’s a simple but potentially revolutionary innovation. Appropriately, an entry on Wikipedia notes that the effect of this innovation is called Trilliscience, a wordplay on “Trillian Omniscience.” Wikipedia’s definition of Trilliscience is “the instant application of previously unheld knowledge as the result of passive searching.”
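The mechanism described above can be sketched in a few lines. This is a minimal, hypothetical illustration: a small in-memory dictionary stands in for Wikipedia’s title database, and the function names and sample summaries are my own, not Trillian’s actual implementation.

```python
# Hypothetical sketch of Trillian-style passive lookup: scan a chat
# message against a set of known article titles and return matches
# paired with the summaries a tooltip might display.

ARTICLES = {
    "crunk": "A style of Southern hip hop music.",
    "three 6 mafia": "An American hip hop group from Memphis, Tennessee.",
}

def find_lookups(message):
    """Return (phrase, summary) pairs for known phrases in a message."""
    matches = []
    lowered = message.lower()
    # Check longer titles first so "three 6 mafia" wins over shorter overlaps.
    for title in sorted(ARTICLES, key=len, reverse=True):
        if title in lowered:
            matches.append((title, ARTICLES[title]))
    return matches

print(find_lookups("The playlist was crunk"))
# -> [('crunk', 'A style of Southern hip hop music.')]
```

A real implementation would match against millions of titles (and tokenize rather than substring-scan), but the shape of the idea — passive matching as the user types, with the summary fetched only on hover — is the same.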

So far, Trilliscience hasn’t revolutionized anything. It’s still basically confined to its original, solitary application in Trillian. But as media organizations continue to struggle with the opportunities provided by the Web, two closely related industries are rapidly maturing: search engine optimization and search engine marketing. Understanding the techniques developed by these industries and the example provided by Trillian could help media organizations (independent and otherwise) both inform readers and more efficiently monetize content.

Third time’s the charm?

Cerulean Studios was not the first company to try to embed contextual references. Microsoft tried with its Smart Tags, successfully annoying the hell out of an entire generation of users. The feature — a rollover list of related links to be chosen by Microsoft and its partners — was originally intended for use with IE, but was withdrawn after criticism. Smart Tags are still available within Microsoft Office, but few people seem to use them. Smart Tags embed all sorts of useless and intrusive actions such as “Send Mail” and “Schedule a Meeting” into the word processing interface. If I want to schedule a meeting, I’ll open my calendar. I’m in Word because I want to type.

Characteristically, Google was smarter (and scarier). In a new version of its Toolbar released early last year, a feature called AutoLink made its first appearance. AutoLink, while controversial due to its ability to modify and add links to a Web page, does embed links that are useful. The feature can create links to Google Maps for street addresses and ISBN lookups on Amazon for books. (It’s worth noting that Jon Udell was doing contextual ISBN lookups over three years ago.) Google’s links are less intrusive than Microsoft’s Smart Tags. Smart Tags display a block with a big triangle in the middle if you have the nerve to brush a word with your cursor. The links that AutoLink creates are just links.

But the integration that Cerulean Studios added to Trillian with Wikipedia is different from Smart Tags and AutoLink. And, from both a reader’s and a content provider’s perspective, the Trillian feature is more useful.

Here’s an example. Say I’m reading the New York Times online. I come across the name Mohamed ElBaradei and I don’t know who he is. Newspapers have an easy answer for this: they simply explain who he is in a handful of words. But what if, as an online reader, I am suddenly more interested in learning about Mohamed ElBaradei than I am in finishing the story? As the publisher, do you want me to type “Google.com” into my address bar and navigate away from your page in order to learn about Mohamed ElBaradei? No, you don’t. You want me to stay on your site and view more ads. So how do you meet the needs of the user (to learn more about Mohamed ElBaradei) and satisfy your own need to keep users tuned in? Make readers Trilliscient.

Making it work

Of course, implementing Trilliscience would require publishers to make site-by-site determinations as to the best interface and source of information in the rollover. For Trillian, Wikipedia served this purpose. Because it is not part of Cerulean Studios’ business model to provide content, the company outsourced it. That outsourcing benefited Trillian users by letting them hold smarter conversations more efficiently.

Why Wikipedia? There are a number of reasons imaginable, some of them technical, and some of them legal. First, Wikipedia provides content (of questionable reliability) for free. Nobody is going to get sued for integrating Wikipedia content into their website or application. Second, Wikipedia and the wiki model of organization automatically optimize URLs, an important component of Search Engine Optimization (SEO). This means that a page’s URL reflects the title of the page, making it easy for spiders to crawl a website and make intelligent guesses about content.
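The URL point is easy to see concretely. The sketch below follows Wikipedia’s actual pattern (spaces become underscores under the `/wiki/` path), though the helper function itself is hypothetical:

```python
from urllib.parse import quote

def wiki_url(title):
    """Build a Wikipedia-style URL from an article title: spaces become
    underscores and the slug is percent-encoded, so the URL itself
    reflects the page's subject -- exactly what crawlers reward."""
    slug = title.strip().replace(" ", "_")
    return "https://en.wikipedia.org/wiki/" + quote(slug)

print(wiki_url("Mohamed ElBaradei"))
# -> https://en.wikipedia.org/wiki/Mohamed_ElBaradei
```

A spider encountering that URL can guess the page’s topic without parsing a word of its content, which is the heart of the SEO advantage described above.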

Trilliscience also adds a new twist to search. Remember the definition? Trilliscience is “the instant application of previously unheld knowledge as the result of passive searching.” Readers may not be actively searching for new content, but in those instances where their curiosity or lack of knowledge on a topic would normally switch them into active mode, websites can now provide instant gratification to that impulse. And Trilliscience lets publishers do that without losing the reader.

Websites that already publish massive amounts of content (like newspapers) can make great use of this technique. Even if they don’t have URLs optimized for crawling, they do have access to the interfaces that can provide related content. For example, to return to the New York Times, when I’m reading about Mohamed ElBaradei, what if I could hover over his name with my cursor and instantly get a short history, as well as links to other Times stories about him? I would most likely either finish reading the story, having satisfied my initial curiosity, or click one of the links offered within the tooltip. Either way, I stay on the New York Times’ website.

This also relieves producers of the burden of manually adding links to an article. Some sites already provide links for referenced terms, but inevitably they either just redirect me to a search page (again, away from the article I was trying to read) or, in the case of things like stock quotes, redirect me to an entirely new page, again potentially losing revenue.

It’s possible to go even further. In addition to a tooltip with a definition or introductory paragraph from a free or partnered content provider like Wikipedia or Britannica, you could provide links to other articles on your site. Or you could add search results from search engines on the same topic using keywords on which you have heavily bid, or your own text advertising. The level to which you make use of advertising in these lookups is up to you (and how much your users will tolerate), but it does present another opportunity to monetize content.

In all of this, the question of fair use and copyright violations must be addressed. The safest course for publishers is to only use their own content. A second tier would be to integrate freely available content, but again, reliability varies and it is necessary for publishers to assess the risk to their reputation against the potential for profit. Partner relationships with commercial content providers or resellers (like Amazon) may be another means for profitable affiliate payouts. The worst thing publishers could do, from a legal perspective, would be to arbitrarily include commercial content without permission. There are strict laws that govern the derivative use of works, and they are becoming stricter as courts struggle to keep up with new technology.

Keeping all of the above in mind, judicious use of the model provided by Trillian could help publishers increase profitability while enhancing the reader’s experience.

How to make Wikipedia better (and why we should)

I’ve been involved in a long debate over Wikipedia with a friend who is a respected journalist. His contempt for the project stems from his distrust of anonymous writers and what he perceives as a lack of respect among Wikipedia’s contributors for journalistic standards. He’s not wrong — those who follow the controversy surrounding Wikipedia know about recent scandals. More importantly, however, his views are representative of a large number of influential people who distrust Wikipedia for serious research.

Wikipedia is a good idea. There is a need for a freely available, reliable encyclopedia on the Internet. Commercial alternatives like Britannica clearly have their place. But, if only because users expect information on the Internet to be free, we should be grateful that some people are willing to volunteer their time to make that information reliable.

Are we there yet? The report that the journal Nature released last December contended that we may be closer than we thought. It is also a positive sign for Wikipedia that prestigious organizations are beginning to take the encyclopedia seriously enough to evaluate its claims. In a follow-up interview during Nature’s Dec. 15 podcast, Jimmy Wales, president of the Wikimedia Foundation and founder of Wikipedia, said his goal is to achieve “Britannica or better” accuracy. Yet he was also modest about the report’s findings, admitting that he neither expected such a positive review nor did he think the level of quality was consistent across all subjects.

There is no question: Wikipedia has a long way to go. In order to make it better, supporters need to shift focus away from isolated articles and genres and first address the system that produces the content. By doing so, contributors will have more tools to ensure the reliability of their articles.

I came up with a list of six easy steps that project leaders could take to make Wikipedia better. It’s not exhaustive, but these suggestions arose from my own research and conversations with people who are concerned about the project.

1. Consistently enforce the existing standards

In addition to dozens of clearly written policy pages, Wikipedia has impressive tools for tracing the history of an article through its “recent changes” feature. There are extensive guides on the website that instruct contributors on how to cite sources, format entries, debate controversial passages, and argue effectively. Critics who only know about the wiki format may not understand the standards that the project leaders demand.

The problem? Wikipedia’s policies are not evenly enforced. The scandal over John Seigenthaler’s biography was likely only one manifestation of this problem. Participants in Wikipedia need to find ways to uniformly enforce existing standards on all content. If it’s too much work, then questionable content should be taken offline until it can be addressed. Otherwise, there is no way to claim that the site is reliable.

2. Force editors to take responsibility for their articles by telling us their names

Currently, contributors don’t need to provide any information beyond a user name in order to join the community. Until recently, there wasn’t even a requirement to have a user account in order to edit content.

This suggestion will meet with the most resistance, but it answers the biggest complaint about Wikipedia. The current policy is indefensible. If there is an honest reason to remain anonymous (like the fear of political retribution), then it’s easy to provide an editorial workaround, either by requiring additional review or by clearly indicating that the article was anonymously created for political reasons.

3. Supply references and reasons for content change

Right now, in order to change an article on Wikipedia, after logging in, all somebody has to do is type. Their changes will be preserved along with the article’s history, but the only place to explain the reason for a change comes after the fact, in discussion forums.

What if there were an additional field on the edit screen that forced contributors to back up every new piece of content they added with references and reasons for doing so? It would be another tool whose usefulness justified the added obtrusiveness.
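The proposed edit-screen fields amount to a simple validation rule. A minimal sketch, assuming hypothetical field names (`text`, `reference`, `reason`) rather than anything in MediaWiki itself:

```python
# Hypothetical sketch: reject an edit submission unless it carries both
# a reference and a stated reason, as the proposed extra fields would
# require. Returns a list of problems; an empty list means the edit
# may be saved.

def validate_edit(edit):
    errors = []
    if not edit.get("text", "").strip():
        errors.append("edit text is empty")
    if not edit.get("reference", "").strip():
        errors.append("a reference is required for new content")
    if not edit.get("reason", "").strip():
        errors.append("a reason for the change is required")
    return errors

ok = validate_edit({
    "text": "New fact.",
    "reference": "Nature, Dec. 2005",
    "reason": "add study result",
})
bad = validate_edit({"text": "New fact."})  # missing reference and reason
```

The check is trivial to implement; as the article argues later, the hard part is cultural, not technical.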

4. Make citations clear

Like any good reference tool, Wikipedia provides endnotes for authors to cite references. But these aren’t used consistently. Some of the numbered links in the articles (which resemble endnotes) are merely links to other websites, with no bibliographical information at the end of the article.

5. Let users rate contributors

Trust is the key issue — and online it matters even more. Wikipedia could easily make use of a system similar to eBay’s user rating system. Every contributor should have their own page with a list of articles and feedback. Only users with an account should be able to create feedback for other users.
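An eBay-style rating system for contributors is also straightforward to model. The sketch below is hypothetical (the class and method names are mine); it records signed feedback from registered users and exposes the aggregate a contributor page would display:

```python
# Hypothetical sketch of an eBay-style contributor rating: registered
# users leave +1/-1 feedback on a contributor, and each contributor's
# page shows the aggregate score alongside the feedback count.

from collections import defaultdict

class ContributorRatings:
    def __init__(self):
        # contributor -> list of (rater, vote) pairs
        self._feedback = defaultdict(list)

    def rate(self, rater, contributor, vote):
        """Record feedback; only +1 or -1 votes are accepted."""
        if vote not in (1, -1):
            raise ValueError("vote must be +1 or -1")
        self._feedback[contributor].append((rater, vote))

    def score(self, contributor):
        """Return (net score, total feedback count) for a contributor."""
        votes = [v for _, v in self._feedback[contributor]]
        return sum(votes), len(votes)

ratings = ContributorRatings()
ratings.rate("alice", "bob", 1)
ratings.rate("carol", "bob", 1)
ratings.rate("dave", "bob", -1)
print(ratings.score("bob"))  # -> (1, 3)
```

Restricting `rate` to logged-in accounts (as the suggestion requires) would be enforced at the application layer; the point here is only that the data model is simple.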

6. Settle copyright disputes before questionable material is published

There is a page on Wikipedia that lets leaders debate whether an image or text is under copyright protection. The problem is that many times, the resource is already on the website. If someone is reviewing possible copyright violations anyway, why not do it before the material is published?

These steps are easy to implement from a technology perspective, but the cultural challenges are significant. Despite recent scandals, Wales said that vandalism and malicious editing are not the biggest problems the community faces.

Far more difficult is getting contributors who are passionate about their content to agree on what gets published and the reasons for doing so. Experts who join in order to donate their time and knowledge to improving content will be forced to defend themselves — and their credentials — against less qualified opponents. It is conceivable that such an affront to their pride will drive many away. That would be unfortunate.

Perhaps even more challenging will be getting the current community to agree to abide by stricter rules. But those who most appreciate the remarkable qualities of the Wikipedia community should be the first to pressure the project leaders to take the simple steps necessary to ensure that its articles have been fact-checked, are clear of libel and copyright violations, and meet certain standards of composition and organization. Until these steps are taken, Wikipedia’s supporters may never be able to convince critics — who could otherwise be valuable allies — that the project is more than just a cute toy.