Personalized news done right?

I’m not ashamed to admit it: The first time I saw Twitter, I thought, “What’s the point?” Maybe you did too, or maybe you’re just more perceptive than I am. Even Twitter’s founders have said they didn’t know exactly what it was when they started working on it. (Biz Stone: “If anything we sort of thought it a waste of time.”)

For every Twitter enthusiast, there was, I suspect, a point of realization that this thing could actually be incredibly useful. Some have cited the plane-in-the-Hudson story as their aha! moment. For me, it was less of a moment and more of a gradual understanding. I began to see its potential as a real-time information source when I first learned of a few important news items — both big international stories and news of a more personal nature — through Twitter.

I began following like-minded people for the interesting links they would post. Before long, information overload took hold. I tried to cull my follow list so I could read everything. I worried I would miss something. Finally, I learned to embrace the firehose and not try to process the whole stream.

But still I thought there must be a better way to separate signal from noise. And then I noticed that the most interesting and important items were appearing maybe three or four times in my Twitter feed. Since then, I’ve wished for a way to mine my feed for those links.

Last week I heard about and was thrilled to find it does exactly what I wanted. I spoke with Maxim Grinev, the project’s technical lead, about and where it’s headed.

How does work?

We look at the tweets that your friends send, and also tweets that friends of your friends send. So, first circle and second circle. And then we extract links from those tweets. Usually links are shortened, so we get the long versions. Then we group by links and calculate how many times each link is posted by your friends and friends of friends to build your personalized “newspaper”. (NB: Links posted by friends get more weight than links posted by friends of friends.) Right now, every “newspaper” is updated about every half an hour. It can be updated more frequently, but we don’t want to stress Twitter.
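(For the programmers in the audience, here is a rough Python sketch of the counting step Maxim describes. It is my own illustration, not the project's code: the function name `rank_links`, the `expand` callback and both weight values are placeholders I made up. The interview says only that friends' links count for more than friends-of-friends' links, not by how much.)

```python
from collections import defaultdict

# Hypothetical weights: the only thing known from the interview is that
# first-circle links outweigh second-circle links.
FRIEND_WEIGHT = 2.0
FRIEND_OF_FRIEND_WEIGHT = 1.0


def rank_links(posts, expand=lambda url: url):
    """Rank links by a weighted count of how often they were posted.

    posts  -- iterable of (url, circle) pairs, where circle is 1 for a
              friend and 2 for a friend of a friend.
    expand -- callable that maps a shortened URL to its long form; the
              default leaves URLs unchanged (an assumption for this sketch).
    """
    scores = defaultdict(float)
    for url, circle in posts:
        weight = FRIEND_WEIGHT if circle == 1 else FRIEND_OF_FRIEND_WEIGHT
        # Group by the expanded link so different short URLs for the
        # same story are counted together.
        scores[expand(url)] += weight
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Calling `rank_links` with a list of `(url, circle)` pairs returns the links in "front page" order, most-shared first.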

How did the project start?

As usual, it was a side project. We had been working on some semantic search technology. It’s about using semantic relationships extracted from Wikipedia to organize other data (blogs, news, etc.). As we were working on this, we started using Twitter. We didn’t have any idea in advance of what we wanted to build. We just analyzed how people used Twitter, what information could be detected. We understood that Twitter is not only good for spreading news, but it’s also a good voting system. So we can collect and analyze how many times links are voted on in Twitter. Analyzing this data, we can understand how important this link or this event is.

Who is working on the project, and what’s your business model?

We have 5 people working on this project: 4 developers located in Moscow, and one business guy in San Francisco. We are computer scientists, and we specialize in data management. We are self-funded; there are no external investors. As concerns the business model, we are considering various partnership schemes and selling advertising on but we have not decided on anything yet. Now we are mainly focusing on attracting users.

Will take advantage of Twitter’s new lists feature?

Right now we don’t do anything with lists. We are thinking about how to incorporate this. One of the options could be to generate newspapers based on some list. So if you have a list of people, you can collect the second-circle friend-of-friend information and build a newspaper for a list.

What other things are in the offing for

We are currently collecting feedback from users. Usually our users request relatively small features — for example, they want to improve the retweet feature. We are going to handle this feedback and add features. In addition to that, we are planning to extend the system in two ways: First, we want to extend the sources that are processed — so, in addition to Twitter, we are thinking about collecting posts and links from Facebook, mainly, and maybe Friendfeed. Second, we are going to allow ranking of news by global popularity. So you would have two different tabs: The first tab is personal news. The second tab is global news. In this sense we will compete with Tweetmeme.

What are your thoughts on the future of news?

I can’t say how it will be. I can just share my own experience, and I think it’s typical: Since I started using Twitter, I’ve nearly stopped collecting news from other sources. Before, for example, I watched news on TV and read more magazines. Now I get nearly all my news from Twitter. I’m quite confident that if I read Twitter, I will not miss some important piece of news. So if a war has started, or there’s some disaster, it will be mentioned at least once in my Twitter timeline.

I have heard a lot of discussion about media sources dying — The New York Times has problems, etcetera. Of course, I think that all these major newspapers and magazines are very important, because journalists have the ability to travel places and work at this full-time. But with regard to selecting what I will read, I’m not going to visit The New York Times website, for example. If there’s some interesting and important article posted there, I will find it in my Twitter timeline.

Also, by the way, there’s an interesting idea we’re looking at: When you visit The New York Times website, for example, you might be interested in getting all the links published there, but ranked according to the judgment of your friends and friends of friends. So it’s the same as, but restricted to a single source — The New York Times, in this case. We are talking to one major newspaper about this.

This headline not written for Google

I’m amused by a discussion on SEO and headline-writing taking place at the Nieman Journalism Lab site and on the Canadian blog MediaStyle. It seems a seminar on SEO for editors at The Globe and Mail offended the Canadian paper’s online books editor, who interpreted it as a charge to dumb down headlines.

Most commentary has focused on the question of why his post was removed from the Globe and Mail’s books blog, In Other Words. I’ll let others tackle that angle. What I’m interested in is whether the writer, Peter Scowen, has a point. I believe he does, even if it’s poorly expressed:

Last week, our headline on the review for Nicholson Baker’s The Anthologist was one of those sweetly goofy and slightly shopworn plays on words that newspapers are rightly famous and infamous for. The book is about a self-doubting poet in midlife crisis mulling (and procrastinating) over an essay about rhyme; the headline was “The marinating of the ancient rhymer.”

Our merriment came to a screeching halt on Tuesday after I went to a seminar on search engine optimization and discovered that it was actually a really really crappy headline. I learned that this kind of badinage, so peculiar to newspapers, has no place on the Internet. The reason is both simple and deranged: The most important reader of Internet news headlines is not you, the sentient, curious human being, but the robots at Google that scan headlines and return search results based on what their cold, lifeless eyes tell them.

Above all, we were taught that Internet headlines have to be written with a certain kind of hipster doofus in mind. This person was embodied by the groovy, ever-pacing journalism professor who led the class on writing for robots (he didn’t call it that), and who whipped out his iPhone and boasted that he will not click on anything whose headline doesn’t hand the story to him on a digital platter.

I happen to know the journalism professor in question. His name is Alfred Hermida, and he is anything but a “hipster doofus”. He’s a keen observer of the changes taking place in the practice of journalism, and I’m happy to be joining him on the faculty of the University of British Columbia in January.

I have taught on the subject of headlines and findability, both at the L.A. Times and for Poynter’s NewsU, and I have always stressed this point: It’s not about writing for Google. It’s about writing for humans, with search engines in mind — a theme Alf says he raised in his seminar. But if we’re going to write with an eye toward findability, we have to understand how search engines work and how people use them, and I presume that’s why The Globe and Mail invited Alf to speak.

I wasn’t there, but I suspect there may have been some nuance in Alf’s presentation that was lost on Scowen. In any case, there are ways — both technical and editorial — to publish great headlines without killing search relevance:

  • At the very least, most content management systems these days will allow editors to write a literal, search-friendly headline for the story and put a more creative, punny headline on their homepage and section fronts, where keywords don’t matter as much.
  • Better yet, if your CMS supports it, you could put your literal headline in a story’s <title> tag and on RSS feeds, and get more abstract in the display headline that readers see when they pull up the story. The New York Times has been doing this a bit, I’ve noticed.
  • Finally, if you have control over the words in a post’s URL — and with many blogging tools you do — you can put full names and keywords there instead of in the headline and still get them seen by search engines. Mashable, a popular blog on social media, seems to be optimizing the URLs on its posts in that way.
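
For the URL approach in that last bullet, here is a minimal Python sketch of turning a literal, keyword-rich headline into a URL slug. The `slugify` function is my own illustration, not the way Mashable or any particular blogging tool does it, and real tools usually handle more edge cases.

```python
import re
import unicodedata


def slugify(headline):
    """Turn a literal headline into a lowercase, hyphen-separated URL slug."""
    # Drop straight and curly apostrophes so "Baker's" becomes "bakers".
    text = re.sub("['\u2019]", "", headline)
    # Strip accents, then discard anything that is not plain ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Keep runs of letters and digits and join them with hyphens.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)
```

The display headline can then stay as playful as you like, while the slug built from the literal version carries the names and keywords.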

Scowen raises an important topic, but he cheapens his argument by suggesting that readers who arrive at news content via search — about a third of the audience of many news sites — are intellectually incurious and that journalists who cater to them are dumbing down the craft.

(For what it’s worth, I think his post does kind of fit in a blog about books, because it captures an important difference between online and print writing. And readers’ reactions could have been illuminating for the Globe and Mail staff. It’s a shame that this conversation wasn’t allowed to take place on the G&M site and had to happen elsewhere instead.)

Measuring user engagement: Lessons from BusinessWeek

Think about the traffic statistics you refer to when you look at Omniture or Google Analytics data for your site. Unique visitors? Pageviews? What do they actually tell you about your audience? The ubiquitous unique visitor metric treats your most passionate and thorough users exactly the same as those of the one-hit scan-and-scram variety. And pageview tallies are so apples-to-oranges in these days of Flash and AJAX that they’re rendered almost meaningless. If you really want to describe your audience, it’s time for some new metrics.

But what else is there? The folks at BusinessWeek think they have an answer, and it’s not about how much content users consume but rather what they do with it. I asked BusinessWeek’s online editor, John Byrne, about his team’s efforts to go beyond pageviews and visits to quantify something more inscrutable: user engagement.

What is BusinessWeek’s definition of user engagement and why is it important?

User engagement is how we nurture and build a community. Our reader engagement index is a comments-to-postings measure for a given month: We tally how many comments were posted on the X stories and blog posts that were published that month. This gives us a ratio figure that we track to determine our monthly reader engagement index and growth. In February of this year, we received from our community 28.2 perspectives and insights for every story or blog post we published. A year earlier, we received 23.7. So we know we’re moving in the right direction.

It’s important because we value, and so measure and gauge, all our interactions with our readers on, including commenting on a story or blog post. The next level is how our writers and editors engage our readers in a conversation, and also welcoming our readers to write longer pieces for us, or to report (at least once a week) a reader-suggested story. We’re also engaging with BW readers on other sites, such as our Ning network that served as a forum to generate and debate stimulus spending priorities for the Obama administration, or interactions involving our 50+ staffers on Twitter. If we don’t listen to our readers and interact with them, and then act on the feedback and suggestions they’re giving us, we’re dead in the water. That applies to any media brand today, not just BusinessWeek. We’re just making it more of a priority, including featuring readers on an equal plane with our writers: on our home page, for example, our featured reader is given more prominence than even a Jack & Suzy Welch.
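
(The index John describes is simple enough to sketch in a few lines of Python. The function names are mine, and any raw comment or post counts you feed in are up to you; the only figures from the interview are the two ratios, 28.2 and 23.7.)

```python
def engagement_index(comments, posts):
    """Comments received per story or blog post published in a month."""
    if posts <= 0:
        raise ValueError("posts must be a positive number")
    return comments / posts


def percent_change(old, new):
    """Year-over-year change in the index, as a percentage."""
    return (new - old) / old * 100


# The two index values quoted in the interview.
growth = percent_change(23.7, 28.2)
print(f"Index grew by about {growth:.0f}% year over year")
```

By this reckoning, going from 23.7 to 28.2 comments per post is growth of roughly 19 percent.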

What information sources and tools do you use in measuring engagement?

Our reader engagement index involves Omniture (for stories) and Movable Type to track numbers of blog comments and posts.

Beyond our reader engagement index, other measures include how much you’re retweeted — for instance, one of my tweets on March 25 was retweeted 130 times. We also look at referring traffic from blogs or Twitter on Omniture, or by running a Twitter search: or

We also look at Google BlogSearch or BlogPulse (owned by Nielsen BuzzMetrics) for mentions of

What do engagement metrics tell you that conventional metrics do not?

It shows, in quantifiable/measurable terms, how much our readers care about us. To post a comment or submit a suggestion is a strong indicator of a BW loyalist, someone we need to nurture and engage and reward. It also tells us how much (and how well) our staffers are interacting with readers. The problem with time spent on a site is that it also measures, in the case of a portal, email time, or in the case of a site heavy on video, time spent watching video, which can be like TV. I also argue that simple pageview metrics are heavily influenced by slideshows and email. There is no better sign of commitment or engagement than the act of reading a substantive piece of journalism, thinking about it and then forming a point of view on that story that you’re willing to write and share with others. That is true engagement.

How do you use this information to improve the site?

You can’t manage something if you don’t measure it. So having a point of reference for exactly how we’re doing drives other ideas and initiatives to increase engagement.

Does increased user engagement translate into benefits for advertisers?

Yes. Anecdotally, our sales team is selling our engagement story and using it to differentiate what we do versus our competition. It also helps to better position our Business Exchange, a new Web 2.0 product we launched last September, as a key component of our engagement efforts.

So, let’s look at some potential engagement metrics and what they might tell us. This is by no means a comprehensive list; it’s just what came to my mind. If you have other thoughts on ways to measure engagement or how you might use this data, please, um, “engage” in the comments below.

  • Internal metrics: Statistics about engagement that takes place on your site
    • Comments posted: Shows how much users are inclined to react to a topic, or supply insights of their own.
    • Return commenters: In other words, how many people comment multiple times on the same item? This is a measure of conversation around a topic. (Kudos to the Guardian’s Kevin Anderson for this idea.)
    • Times e-mailed: Reveals how often users are sharing this information with friends. This metric probably skews toward neophyte users, as more experienced users are presumably less likely to use an “e-mail this” feature.
    • Average time spent on page: Shows how thoroughly users are consuming the content, perhaps? Lots of asterisks, though, as John points out.
  • External metrics: Statistics about how people share and discuss your content elsewhere
    • Tweets/retweets: Measures how “viral” this content is in a social network. There’s also geographic information embedded in these tweets that could tell you where a topic resonates particularly strongly.
    • Diggs: Another measure of the viral nature of a topic. Given Digg’s audience, this metric might favor content that appeals to a techie crowd.
    • Delicious saves: Shows how many users stored this page with an eye toward returning to it. This metric could be particularly useful for ongoing features that you want to build a regular user base for.
    • Inbound links from blogs: Quantifies the discussion taking place in the blogosphere. This could help you identify the blogs that are most attuned to the content you produce — as opposed to just the ones that send you the most visitors (which are not necessarily the most engaged users).

Each of these metrics is easily available for a given URL at a given moment, but keeping track of all your stories over time would be impossible without some automated assistance — particularly with regard to the external metrics.

Here’s what I’d like to see: A web service that will track a URL across several services (Technorati, Delicious, Digg, and maybe internal analytics packages too) to see how it’s being referenced in each medium, then tabulate all those metrics into a single “engagement score”. (And I’d love to hear from any programmers who want to take a stab at building this!)
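
To give would-be builders a starting point, here is a minimal Python sketch of the tabulation step only. It assumes the per-service counts for a URL have already been collected somehow; the metric names and the weights are entirely hypothetical, and choosing good weights is the hard part.

```python
# Hypothetical weights for each metric; these are placeholders, not
# recommendations, and would need tuning for a real site.
DEFAULT_WEIGHTS = {
    "comments": 3.0,
    "tweets": 2.0,
    "diggs": 1.0,
    "delicious_saves": 1.5,
    "blog_links": 2.5,
}


def engagement_score(counts, weights=DEFAULT_WEIGHTS):
    """Combine per-service counts for one URL into a single number.

    counts  -- dict mapping a metric name to its count for the URL.
    weights -- dict mapping a metric name to its weight; metrics with
               no weight are ignored.
    """
    return sum(weights.get(name, 0.0) * value for name, value in counts.items())


# Example with made-up counts for a single story.
story_counts = {"comments": 10, "tweets": 5}
print(engagement_score(story_counts))
```

A weighted sum is the simplest possible design; a real service might normalize each metric first so that one high-volume source doesn't swamp the rest.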

Meanwhile, anybody have any engagement metrics tips they’d like to share?