Making smarter readers: Lessons in Trilliscience

Last week, I was chatting via IM with my brother about a party he hosted. Still in college, he’s 7 and a half years younger than me, so when he described the music playlist at his party as “crunk,” I had no idea what he was talking about. But in less than a second, I not only knew the meaning of “crunk” but was able to respond intelligently with: “As in Three 6 Mafia?” I had achieved instant context and instant application. How did I do it?

I was Trilliscient.

A year ago this April, Cerulean Studios, the makers of the popular chat client Trillian, released a new feature as part of version 3.0 of its software: integration with Wikipedia to provide instant lookups. Not to worry: this is not another article about Wikipedia. What is remarkable about this enhancement is why Wikipedia was chosen for it and what the feature could mean for the future of content delivery.

Here’s how it works. Every word that users type in their chat session is scanned and compared against Wikipedia’s database of articles. If a word or phrase matches the title of an article on Wikipedia, that keyword or phrase is underlined in the chat window. When the user hovers over the link with the mouse, a small window (tooltip) displays the first few paragraphs of the entry. It’s a simple but potentially revolutionary innovation. Appropriately, an entry on Wikipedia notes that the effect of this innovation is called Trilliscience, a wordplay on “Trillian Omniscience.” Wikipedia’s definition of Trilliscience is “the instant application of previously unheld knowledge as the result of passive searching.”
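
The matching step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Cerulean Studios' actual code: the KNOWN_TITLES set is a made-up stand-in for Wikipedia's database of article titles, and the function name and the max_words cutoff are assumptions of mine.

```python
import re

# Hypothetical stand-in for a lookup against Wikipedia's article titles.
KNOWN_TITLES = {"crunk", "three 6 mafia", "wikipedia"}

WORD = re.compile(r"[A-Za-z0-9']+")


def find_title_matches(message, titles=KNOWN_TITLES, max_words=4):
    """Return (start, end, phrase) spans of the message that match a title.

    At each word, the longest candidate phrase is tried first, so
    "Three 6 Mafia" is matched as one phrase rather than word by word.
    """
    words = [(m.start(), m.end()) for m in WORD.finditer(message)]
    matches = []
    i = 0
    while i < len(words):
        matched = False
        for n in range(min(max_words, len(words) - i), 0, -1):
            start, end = words[i][0], words[i + n - 1][1]
            phrase = message[start:end]
            # Normalize case and whitespace before comparing to titles.
            if " ".join(phrase.lower().split()) in titles:
                matches.append((start, end, phrase))
                i += n  # skip past the words consumed by this match
                matched = True
                break
        if not matched:
            i += 1
    return matches


for start, end, phrase in find_title_matches("As in Three 6 Mafia?"):
    print(start, end, phrase)
```

Each returned span tells the chat window which characters to underline; the tooltip text would then be fetched for the matched title when the user hovers over it.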

So far, Trilliscience hasn’t revolutionized anything. It’s still basically confined to its original, solitary application in Trillian. But as media organizations continue to struggle with the opportunities provided by the Web, two closely related industries are rapidly maturing: search engine optimization and search engine marketing. Understanding the techniques developed by these industries and the example provided by Trillian could help media organizations (independent and otherwise) both inform readers and more efficiently monetize content.

Third time’s the charm?

Cerulean Studios was not the first company to try to embed contextual references. Microsoft tried with its Smart Tags, successfully annoying the hell out of an entire generation of users. The feature, a rollover list of related links chosen by Microsoft and its partners, was originally intended for use with IE but was withdrawn after criticism. Smart Tags are still available within Microsoft Office, but few people are likely to use them. They embed all sorts of useless and intrusive actions such as “Send Mail” and “Schedule a Meeting” into the word processing interface. If I want to schedule a meeting, I’ll open my calendar. I’m in Word because I want to type.

Characteristically, Google was smarter (and scarier). In a new version of its Toolbar released early last year, a feature called AutoLink made its first appearance. AutoLink, while controversial due to its ability to modify and add links to a Web page, does embed links that are useful. The feature can create links to Google Maps for street addresses and ISBN lookups on Amazon for books. (It’s worth noting that Jon Udell was doing contextual ISBN lookups over three years ago.) Google’s links are less intrusive than Microsoft’s Smart Tags. Smart Tags display a block with a big triangle in the middle if you have the nerve to brush a word with your cursor. The links that AutoLink creates are just links.

But the integration that Cerulean Studios added to Trillian with Wikipedia is different from Smart Tags and AutoLink. And, from both a reader’s and a content provider’s perspective, the Trillian feature is more useful.

Here’s an example. Say I’m reading the New York Times online. I come across the name Mohamed ElBaradei and I don’t know who he is. Newspapers have an easy answer for this: they simply explain it using a handful of words. But what if, as an online reader, I am suddenly more interested in learning about Mohamed ElBaradei than I am in finishing the story? As the publisher, do you want me to type “Google.com” into my address bar and navigate away from your page in order to learn about Mohamed ElBaradei? No, you don’t. You want me to stay on your site and view more ads. So how do you meet the needs of the user (to learn more about Mohamed ElBaradei) and satisfy your own need to keep users tuned in? Make readers Trilliscient.

Making it work

Of course, implementing Trilliscience would require publishers to make site-by-site determinations as to the best interface and source of information in the rollover. For Trillian, Wikipedia served this purpose. Because it is not part of Cerulean Studios’ business model to provide content, the company outsourced it. That provided a benefit to Trillian users by allowing them to have smarter conversations, more efficiently.

Why Wikipedia? There are a number of possible reasons, some technical and some legal. First, Wikipedia provides content (of questionable reliability) for free. Nobody is going to get sued for integrating Wikipedia content into their website or application. Second, Wikipedia and the wiki model of organization automatically optimize URLs, an important component of Search Engine Optimization (SEO). This means that a page’s URL reflects the title of the page, making it easy for spiders to crawl a website and make intelligent guesses about content.
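
To make the point about URLs concrete, here is a small Python sketch of how a title maps to a wiki-style URL and back. It is a simplification: the helper names are mine, and it assumes only the familiar pattern of an English Wikipedia address, where spaces in the title become underscores. Real wiki software applies further normalization rules that are not modeled here.

```python
from urllib.parse import quote, unquote, urlsplit

# Assumed base address; any wiki with title-based URLs would work the same way.
BASE = "https://en.wikipedia.org/wiki/"


def title_to_url(title):
    """Build a title-based URL: spaces become underscores, the rest is escaped."""
    return BASE + quote(title.strip().replace(" ", "_"))


def url_to_title(url):
    """Recover a readable title from the last path segment of the URL."""
    slug = urlsplit(url).path.rsplit("/", 1)[-1]
    return unquote(slug).replace("_", " ")


url = title_to_url("Mohamed ElBaradei")
print(url)
print(url_to_title(url))
```

Because the mapping works in both directions, a matched phrase can be turned into an address without a search, and a crawler can guess what a page is about from the address alone.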

Trilliscience also adds a new twist to search. Remember the definition? Trilliscience is “the instant application of previously unheld knowledge as the result of passive searching.” Readers may not be actively searching for new content, but in those instances where their curiosity or lack of knowledge on a topic would normally switch them into active mode, websites can now provide instant gratification to that impulse. And Trilliscience lets publishers do that without losing the reader.

Websites that already publish massive amounts of content (like newspapers) can make great use of this technique. Even if they don’t have URLs optimized for crawling, they do have access to the interfaces that can provide related content. For example, to return to the New York Times, when I’m reading about Mohamed ElBaradei, what if I could hover over his name with my cursor and instantly get a short history, as well as links to other Times stories about him? I would most likely either finish reading the story, having satisfied my initial curiosity, or click one of the links offered within the tooltip. Either way, I would be more likely to stay on the New York Times website.
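
As a rough sketch of what a publisher's side of this might look like, the Python below assembles the contents of such a tooltip: a short summary plus links to other stories from the site's own archive. Everything here is hypothetical, including the Story and Tooltip types, the build_tooltip function, and the example.com addresses; a real site would query its own content management system rather than scan a list in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    headline: str
    url: str
    body: str


@dataclass
class Tooltip:
    term: str
    summary: str
    related: list = field(default_factory=list)


def build_tooltip(term, summaries, archive, current_url, limit=3):
    """Collect a short summary and up to `limit` related stories for a term.

    Stories match if the term appears in the headline or the body, and the
    story being read right now is left out of the related list.
    """
    needle = term.lower()
    summary = summaries.get(needle, "")
    related = [
        story
        for story in archive
        if story.url != current_url
        and needle in (story.headline + " " + story.body).lower()
    ]
    return Tooltip(term=term, summary=summary, related=related[:limit])


archive = [
    Story("ElBaradei urges talks", "https://example.com/a",
          "Mohamed ElBaradei called for talks."),
    Story("Markets rally", "https://example.com/b", "Stocks rose."),
    Story("Agency report", "https://example.com/c",
          "Mohamed ElBaradei presented a report."),
]
summaries = {"mohamed elbaradei": "Placeholder summary text."}

tip = build_tooltip("Mohamed ElBaradei", summaries, archive,
                    current_url="https://example.com/c")
print(tip.summary, [story.url for story in tip.related])
```

Matching on the body as well as the headline is a deliberate choice in this sketch, since a name often appears in stories whose headlines never mention it.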

This also removes the burden on producers of manually adding links to an article. Some sites do a better job than others of providing links for referenced terms, but inevitably they either just redirect me to a search page (again, away from the article I was trying to read) or, in the case of things like stock quotes, redirect me to an entirely new page, again potentially losing revenue.

It’s possible to go even further. In addition to a tooltip with a definition or introductory paragraph from a free or partnered content provider like Wikipedia or Britannica, you could provide links to other articles on your site. Or you could add search results from search engines on the same topic using keywords on which you have heavily bid, or your own text advertising. The level to which you make use of advertising in these lookups is up to you (and how much your users will tolerate), but it does present another opportunity to monetize content.

In all of this, the question of fair use and copyright violations must be addressed. The safest course for publishers is to use only their own content. A second tier would be to integrate freely available content, but again, reliability varies, and publishers must weigh the risk to their reputation against the potential for profit. Partner relationships with commercial content providers or resellers (like Amazon) may be another route, with the potential for profitable affiliate payouts. The worst thing publishers could do, from a legal perspective, would be to arbitrarily include commercial content without permission. There are strict laws that govern the derivative use of works, and they are becoming stricter as courts struggle to keep up with new technology.

Keeping all of the above in mind, judicious use of the model provided by Trillian could help publishers increase profitability while enhancing the reader’s experience.

Comments

Terry Steichen says:

March 22, 2006 at 3:32 pm

The idea of providing links that the reader of an article can use to get relevant background information is intriguing. It can be very useful when the text contains a lot of jargon that’s unfamiliar to the reader.

However, the approach described in this article isn’t necessarily particularly useful for newspaper article readers.

The strategy described in the article links words within the article to corresponding words in the title of an article in a repository (implemented via Wikipedia in the example). What that will do is miss a lot of relevant matches where that particular word isn’t in the title, but is strongly present in the body. Or, where that word isn’t used but a synonym is.

It can also clutter the original article if a good proportion of the key words are converted into links – it can make the article more difficult to read. Take a look at some of the articles on Yahoo News to see what I mean.

Even more importantly, the meanings of individual words in an article are often tangential to the article’s underlying theme or concept. Yet the reader looking for more background is most likely to be interested in background relevant to the overall theme, rather than to a bunch of component details.

And, as to an implementation, IMHO the notion of having a client-based lookup could work for relatively small volume text like IM. But for a high volume news website with non-tiny articles, the linking would have to be done at the server. And, while a browser plugin could handle simple pattern-matching, I doubt if it could handle the kind of context annotations we’re talking about here.

I don’t mean to diminish the potential value of adding background annotation of news articles. I simply want to point out that it’s a tremendously difficult challenge.

(This comment was also posted in E-Media Tidbits – http://www.poynter.org/article_feedback/article_feedback_list.asp?user=91647&id=98759)

Ray Grieselhuber says:

March 23, 2006 at 11:37 pm

A few things to keep in mind…

“The strategy described in the article links words within the article to corresponding words in the title of an article in a repository (implemented via Wikipedia in the example). What that will do is miss a lot of relevant matches where that particular word isn’t in the title, but is strongly present in the body. Or, where that word isn’t used but a synonym is.”

This particular comment misses the point a bit. The idea is an interface enhancement that works in favor of the user’s experience and the publisher’s desire to hang onto readers. Clearly, because there are so many content management systems out there, how this enhancement fits is dependent on the architecture of the system itself. Wikipedia was an easy example because of the way titles are strongly present in the URL. That being said, there is a lot of value to be gained from an SEO perspective in optimizing content so that there is a strong correlation between the title of the article and its content.

The concern about clutter is valid, although again this is an implementation-specific detail. Sites that care about the user experience would obviously have to take this into consideration. I think, however, that it could be done nicely without clutter.

Your point that “the meanings of individual words in an article are often tangential to the article’s underlying theme or concept” is absolutely correct.

At the same time, we have witnessed the growth of bookmarking and tagging tools to help readers make connections between large numbers of disparate content sources. These tools have shown that it is possible to narrow vast numbers of keywords down to a relatively small number of tags, within a given topic. This sort of organization is still in its infancy, but there is a lot of promise, in my opinion, not only for supporting what I proposed but also for allowing users to organize the news sites they read, by themselves. That’s a whole other discussion, however.

For some sites, you’re right, it would be tremendously challenging to implement this. For those sites that are highly optimized for automated referencing, however, it would not be as difficult. As with anything in business, if the potential reward is greater than the cost, then it’s worthwhile.
