Personalized news done right?

I’m not ashamed to admit it: The first time I saw Twitter, I thought, “What’s the point?” Maybe you did too, or maybe you’re just more perceptive than I am. Even Twitter’s founders have said they didn’t know exactly what it was when they started working on it. (Biz Stone: “If anything we sort of thought it a waste of time.”)

For every Twitter enthusiast, there was, I suspect, a point of realization that this thing could actually be incredibly useful. Some have cited the plane-in-the-Hudson story as their aha! moment. For me, it was less of a moment and more of a gradual understanding. I began to see its potential as a real-time information source when I first learned of a few important news items — both big international stories and news of a more personal nature — through Twitter.

I began following like-minded people for the interesting links they would post. Before long, information overload took hold. I tried to cull my follow list so I could read everything. I worried I would miss something. Finally, I learned to embrace the firehose and not try to process the whole stream.

But still I thought there must be a better way to separate signal from noise. And then I noticed that the most interesting and important items were appearing maybe three or four times in my Twitter feed. Since then, I’ve wished for a way to mine my feed for those links.

Last week I heard about a project that does just that, and I was thrilled to find it does exactly what I wanted. I spoke with Maxim Grinev, the project’s technical lead, about how it works and where it’s headed.

How does it work?

We look at the tweets that your friends send, and also tweets that friends of your friends send. So, first circle and second circle. And then we extract links from those tweets. Usually links are shortened, so we get the long versions. Then we group by links and calculate how many times each link is posted by your friends and friends of friends to build your personalized “newspaper”. (NB: Links posted by friends get more weight than links posted by friends of friends.) Right now, every “newspaper” is updated about every half an hour. It can be updated more frequently, but we don’t want to stress Twitter.
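The grouping and weighting Grinev describes can be illustrated with a minimal Python sketch. It is not the project's actual code: the two weight values are assumptions (the interview says only that friends count for more than friends of friends), the function name `rank_links` is hypothetical, and the links are assumed to have already been expanded from their shortened form.

```python
from collections import defaultdict

# Assumed weights: the interview gives no numbers, only that a friend's
# link counts for more than a friend-of-friend's link.
FRIEND_WEIGHT = 2.0
FRIEND_OF_FRIEND_WEIGHT = 1.0


def rank_links(friend_links, fof_links):
    """Score each link by who posted it and return (url, score) pairs,
    highest score first. Each list holds one entry per post, so a link
    posted three times appears three times."""
    scores = defaultdict(float)
    for url in friend_links:
        scores[url] += FRIEND_WEIGHT
    for url in fof_links:
        scores[url] += FRIEND_OF_FRIEND_WEIGHT
    # Sort by descending score; break ties alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


front_page = rank_links(
    friend_links=["a", "b", "a"],
    fof_links=["b", "c", "c", "c"],
)
```

Here link "a", posted twice by friends, outranks link "c", posted three times by friends of friends, which is the effect the weighting is meant to have.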

How did the project start?

As usual, it was a side project. We had been working on some semantic search technology. It’s about using semantic relationships extracted from Wikipedia to organize other data (blogs, news, etc.). As we were working on this, we started using Twitter. We didn’t have any idea in advance of what we wanted to build. We just analyzed how people used Twitter, what information could be detected. We understood that Twitter is not only good for spreading news, but it’s also a good voting system. So we can collect and analyze how many times links are voted on in Twitter. Analyzing this data, we can understand how important this link or this event is.

Who is working on the project, and what’s your business model?

We have 5 people working on this project: 4 developers located in Moscow, and one business guy in San Francisco. We are computer scientists, and we specialize in data management. We are self-funded; there are no external investors. As concerns the business model, we are considering various partnership schemes and selling advertising, but we have not decided on anything yet. Now we are mainly focusing on attracting users.

Will it take advantage of Twitter’s new lists feature?

Right now we don’t do anything with lists. We are thinking about how to incorporate this. One of the options could be to generate newspapers based on some list. So if you have a list of people, you can collect the second-circle friend-of-friend information and build a newspaper for a list.

What other things are in the offing?

We are currently collecting feedback from users. Usually our users request relatively small features — for example, they want to improve the retweet feature. We are going to handle this feedback and add features. In addition to that, we are planning to extend the system in two ways: First, we want to extend the sources that are processed — so, in addition to Twitter, we are thinking about collecting posts and links from Facebook, mainly, and maybe Friendfeed. Second, we are going to allow ranking of news by global popularity. So you would have two different tabs: The first tab is personal news. The second tab is global news. In this sense we will compete with Tweetmeme.

What are your thoughts on the future of news?

I can’t say how it will be. I can just share my own experience, and I think it’s typical: Since I started using Twitter, I’ve nearly stopped collecting news from other sources. Before, for example, I watched news on TV and read more magazines. Now I get nearly all my news from Twitter. I’m quite confident that if I read Twitter, I will not miss some important piece of news. So if a war has started, or there’s some disaster, it will be mentioned at least once in my Twitter timeline.

I have heard a lot of discussion about media sources dying — The New York Times has problems, etcetera. Of course, I think that all these major newspapers and magazines are very important, because journalists have the ability to travel places and work at this full-time. But with regard to selecting what I will read, I’m not going to visit The New York Times website, for example. If there’s some interesting and important article posted there, I will find it in my Twitter timeline.

Also, by the way, there’s an interesting idea we’re looking at: When you visit The New York Times website, for example, you might be interested in getting all the links published there, but ranked according to the judgment of your friends and friends of friends. So it’s the same idea, but restricted to a single source: The New York Times, in this case. We are talking to one major newspaper about this.

How AOL can transform its sow's ear

[Tom Grubisich is a former managing editor of news for Digital City/AOL and a regular contributor to OJR. Today, he offers his ideas on how the once-dominant online publisher can regain its momentum.

In the weeks to come, we would like to feature more dot-com veterans’ takes on how other once-dominant online publishers can turn things around. If you are interested in writing such a piece, please contact OJR editor Robert Niles at editor – at –]

Reviving AOL may be as big a challenge as making a Marc Jacobs purse out of a sow’s ear. But AOL still has about 10 million subscribers. Ten million! Any other site would die for that number.

AOL’s subscribers are voracious consumers of pages. But as AOL now chases after other users all over the Web in another of its frenetic but behind-the-curve strategies, those 10 million monthly fee payers are treated as stepchildren, if not orphans.

They get zilch. Why doesn’t AOL woo these long-suffering loyalists with richer content, particularly news that will give them a reason not to finally opt out — like 20 million other AOL subscribers since 2002?

With its new myAOL, AOL now invites everybody, including subscribers, to create a tailored “start” page of news from baskets of sources or RSS feeds. But myAOL does no more than copy other, earlier personalized sites, like Yahoo, netvibes, pageflakes and iGoogle.

Why doesn’t myAOL take personalization to the next, logical level – let users not only choose a news provider, but also specific subjects?

I’d go to myAOL in a one-click heartbeat if it gave me a start page that scoured the Internet every day for stories about climate change, Sunni-Shiite relations in Iraq, certain movie directors like David Fincher and Paul Thomas Anderson, and other subjects that I follow closely. I suspect many Internet users have their own special news interests.

Sure, give me a box on today’s biggest headlines — like presidential primaries — but in today’s wraparound news environment (radio, TV, cellphone, print, not to mention my omnivorous Web browsing), I see and hear those same commodified headlines many times during my 16 waking hours. I’m sure many other people go through a similar experience.

My ideal start page would not only bring me specific news — text, photos, videos — but also make it easy for me to share it with friends (that I would arrange in clusters on a social widget a la Facebook).

The widget would let me and friends discuss the story, and, if we wanted, do something in response (send a letter to elected officials, get together for a meeting, etc.).

Personalization at this level would require AOL to develop or at least tweak some search algorithms so it could find and send me my news, however arcane and wherever it resides in cyberspace: say, the transcript of a new hearing by Rep. Henry Waxman’s Committee on Oversight and Government Reform on the impact of proposed mega-coal-fired energy plants. (I would also expect AOL not to send me five different versions of one story, which would be easy enough to avoid.)

Personalization taken to this level is not the “Daily Me” echo chamber detailed so worrisomely in a recent Financial Times opinion piece. It would bring to your computer news you want, but not necessarily news you are happy about.

If myAOL granted me my news wishes, I would reward it with loyalty. My social-network friends — some of them, anyway — might decide to make myAOL their start page too. Other Internet users might find their way to myAOL — through tagging, clouds and all the other prompts of Web 2.0.

A myAOL that did these things could help Web 2.0 achieve its grand but unrealized mission of making the Internet an agent of change, not just in stock valuations but in our civic life.

It’s just possible that AOL could turn its sow’s ear into a purse. Perhaps not a Marc Jacobs. But how about a Michael Kors?

Bob Cauthorn returns with CityTools

Newcomers to online journalism might not recognize the name “Bob Cauthorn.” But to industry geezers like me, Bob was the guy you could count on, back in the late 1990s, to rip newspaper companies for their ham-handed, clueless approaches to the emerging Internet marketplace. Bob could be profane, abrasive and loud… but time has shown that he was almost always right.

Then, after stints at a couple of newspapers, Cauthorn essentially disappeared from the industry scene. He went off to some start-up called “CityTools,” which produced… well, many of us weren’t quite sure.

Now, Cauthorn’s back. CityTools is ready to launch, and Cauthorn’s ready to show off his new baby.

In short, CityTools is a social media framework for publishing news articles, lists and classified advertisements. Cauthorn demo’d for me a platform that serves both newspapers as well as independent and individual publishers.

Newspapers could use CityTools as an ad hoc wire service, to create with other papers online portals on topics of mutual interest. Interest groups could use the platform to manage collaborative publications. Readers can build lists of their favorite… whatever, and share those lists with others to create aggregated “favorites” lists from designated communities.

And, oh yeah, the platform supports stories, ads and lists in multiple languages. Speak English, Spanish… and Swedish? CityTools will let you read, create, order and distribute content in all three, at once. Registered users can declare which of 13 supported languages they read, and select which one they want to use as their primary language while navigating the site. They can also select their community, which will deliver them content and ads tagged to that community, while allowing them to use breadcrumb trails to navigate to content from all other CityTools communities.

It’s loaded with cool widgets like this, so my inner geek demanded that I get the scoop. I talked with Cauthorn on the phone earlier this month, and an edited transcript follows.

OJR: You were raising hell in the world there a few years ago and then just kind of disappeared into CityTools. Bring us up to speed on what you’ve been up to.

Cauthorn: I went into the lab. After I left The [San Francisco] Chronicle, I went backpacking along the Pacific Crest Trail and did a lot of thinking about the state of journalism and online newspapers and stuff and, as you probably know, I was one of the very earliest people doing what we now call social news. Back then we didn’t really have a name for it, you know, we’re just doing the community front page which allowed people to decide what was on their front page and share links and vote on things and – but all the stuff that has now become commonplace with Digg and whatnot.

I was thinking a lot about the need for a new kind of journalism online as well as the kinds of things that may help, you know, existing print newspapers to survive. And when I say print newspapers it’s because even though they have online operations, they’re still thinking so much like print operations, you know, and so after, you know, sort of both literally and figuratively going to the mountain, I came back and decided to try to re-imagine this stuff from the ground up.

So that’s what I’m focusing on right now.

On the newspaper side, what we’ve created is what we think is an extraordinarily interesting and brand new thing. We’re giving newspapers the ability to very easily set up ad-hoc wire services if you will, to share content with other newspapers of a like mind as well as to share classified ads.

OJR: I think one of the distinguishing characteristics between let’s say, first generation online publishing versus traditional offline publishing has been that the focus of offline publishing, local newspapers, has been geographic. A lot of early online publications have been organized around topic and they’ve been geographically agnostic, if you will. They don’t care about where you are in the world, just what you want to talk about. And what you’ve just described here seems like it is taking the geographic-based local newspaper and moving it into the more topically based world where you’re creating topic – you’re creating topical networks for local communities so you’re no longer just about the Fort Lauderdale community, you’re about boating.

Cauthorn: Well, geography is still important. What we’re trying to do though is we’re trying to say, “Look. Let’s imagine content as a palette of colors.” Right now we’ve had a very limited palette. You’ve got what the wire services give you and you’ve got what your local folks generate and of course with layoffs and stuff like that, that palette of colors that your local folks is generating is getting less. And what happens is you say, “Okay fine. Why don’t we expand that palette by borrowing colors from other people?”

Let’s use agricultural reporting as an example. The fact of the matter is that agricultural reporting across the country, the numbers have been shrinking and shrinking and shrinking. Right? Because the newspaper has to make a choice between covering agriculture, even if you’re an agricultural market, and covering the statehouse, they’ve got to cover the statehouse. It’s just their natural bias. Whether or not that’s relevant to the reader or not, who knows? But it’s a natural bias.

So what happens is all of a sudden you say, “Okay, but you know what? So we’re not doing a great job of covering all agriculture in our area, but you know what? If we combined four cities, let’s say all the small newspapers in the Imperial Valley, and say okay, we’re gonna share our agricultural coverage and you can put it online or you can put it in print. It doesn’t matter. It’s up to them.” All of a sudden, you’ve got a rich, brand new product that really resonates for the local audience. And guess what? Google can’t match it. There’s no way a mass aggregator can match that.

OJR: Let’s talk about some of the other folks that are out there, in this spectrum of social media, from earlier sites like Backfence to Topix to whomever the Knight Foundation’s gonna be funding this year and next. What have you got going that you think distinguishes CityTools?

Cauthorn: Up until now what’s happened is that sites have enforced their view of what local is. So, you say, okay, this site is about Pima County Arizona. That’s our local view and that’s it. And it may be part of a network where you have Pima County here and you’ve got Maricopa County there, but if you’re on a Maricopa County site you don’t see the Pima County stuff. If you’re on a Pima County site, you don’t see the Maricopa County stuff.

What we’re doing to begin with is we’re saying, “Look, what we need to do is put the definition of what local is from the perspective of this site in the hands of the user.” We talk about personalization but what I want to start talking about is context of your life. The user has a context of their life and their context is that I might identify myself as being a local to the Bay Area, but my next-door neighbor might think of San Francisco only as where their local context is. How do you build a site that responds to both of those people’s concerns in a fluid manner? That’s what we’ve built.

So what happens is that, for example in Brooklyn — I think we’ve got twelve or fifteen neighborhoods in Brooklyn, specific neighborhoods. So let’s say you’re looking at Bensonhurst’s stuff. You’re reading a restaurant review in Bensonhurst and you click on Bensonhurst, say, “Show me all the restaurants you got in Bensonhurst,” because what we allow you to do is combine. I don’t know the context. I’m gonna allow you to set the context. Right?

So you say, “The context I’m interested in is Bensonhurst and I want to see all the restaurant reviews in Bensonhurst.” Well, everybody’s posted a restaurant review in Bensonhurst, there they are. If there’s not enough content, and if you think, “Oh well, wait a minute, I’d like to see all the restaurant reviews in Brooklyn,” all you got to do is click Brooklyn [on the page’s bread crumb trail] and suddenly, bang, you get everything in Brooklyn.
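The breadcrumb behavior Cauthorn describes, where widening the context from Bensonhurst to Brooklyn pulls in more content, can be sketched as a lookup over a place hierarchy. This is a minimal Python illustration only, not CityTools code: the `PARENT` table, the sample items, the "Park Slope" and "New York City" entries, and both function names are hypothetical.

```python
# Hypothetical place hierarchy: each place maps to the place that contains it.
PARENT = {
    "Bensonhurst": "Brooklyn",
    "Park Slope": "Brooklyn",
    "Brooklyn": "New York City",
}


def breadcrumb(place):
    """Return the trail from a place up to its outermost container,
    e.g. Bensonhurst, then Brooklyn, then New York City."""
    trail = [place]
    while trail[-1] in PARENT:
        trail.append(PARENT[trail[-1]])
    return trail


def items_in_context(items, context, category):
    """Keep items of the given category whose place lies inside the
    reader's chosen context (the place itself or anything it contains)."""
    return [
        item for item in items
        if item["category"] == category and context in breadcrumb(item["place"])
    ]


# Made-up sample content for illustration.
items = [
    {"title": "Review A", "category": "restaurant review", "place": "Bensonhurst"},
    {"title": "Review B", "category": "restaurant review", "place": "Park Slope"},
    {"title": "Ad C", "category": "classified", "place": "Bensonhurst"},
]

narrow = items_in_context(items, "Bensonhurst", "restaurant review")
wide = items_in_context(items, "Brooklyn", "restaurant review")
```

With this sample data, the Bensonhurst context yields one review and clicking up to Brooklyn yields two. The reader, not the site, decides how wide "local" is.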

OJR: One of the distinguishing characteristics about my hometown, the L.A. area, not that it isn’t beginning to happen in other metro areas as well, is as you go by neighborhood to neighborhood, you’re not just changing geography, you’re also changing, literally, the language spoken by the people in that neighborhood. Tell me a little bit about how CityTools is accommodating language differences.

Cauthorn: We currently support 13 languages. And we believe, we’re not sure about this, but we believe we’re the first multilingual news site in the world. Up until now, if you speak Spanish and you’re in Los Angeles, you have the choice of an English language newspaper or a Spanish language newspaper, either in print or online. But I go down to the Mission in San Francisco and you hear people freely mingling Spanish and English together. That’s the context of their life. Right?

So what we do is we allow you to say, “Okay, I only want to see Spanish language content in East L.A.” So you’ve got it. If you’re comfortable in Spanish and English, you can have Spanish and English and it’s freely mixed in there.

Now think about this in terms of business model, what happens when you have bilingual classifieds? Imagine what would happen if the Hispanic community in Los Angeles had the ability to say, “Okay, I want to see classifieds in Spanish or English.”

That’s what I’m talking about when I say context. I want to know where you live, I want to know what languages you speak, tell me what you’re interested in. I will change the nature of the site to match those things. This is a big deal, we think.

Now, that’s – so that’s all really powerful, but then we get into some other stuff that also we believe is quite new. And you’re getting back to what distinguishes us from the other sites that have come before. We have this entire group publishing model that anybody can create what we call teams.

Let’s say you have a class full of journalism students and you create a team for that class and they write their stories and they assign them to their team. Now you have flexibility. You can say, “I want it to appear with other team stories, but I don’t want to allow the team members to edit it.” Or, you can say, “I want it to appear with other team stories and I’m gonna allow other team members to edit it.” Because we have a draft and edit mode, what happens is that the students can write their stories in edit mode and then they can submit them to the teacher and when the teacher says that they’re good enough, then the teacher can say, “Okay, publish that one, publish that one, publish that one.” It’s just click, click, click, click, click and they get published.

Now here’s what’s slick about that. So all of a sudden what you have is you have got a workflow that resembles an existing news room. Right? But what’s slick about that is two things. One, every university in America is part of our geographic database. So let’s say this is at University of California-Berkeley. Let’s say they assign these stories to the geography of University of California-Berkeley.

All of a sudden then, you’re looking at collaborative group output of content which is tied to a place. And what’s really slick about it is that they can also put those headlines on their own sites because we give you code you can just cut and paste this code on and anytime that your story’s on CityTools, it gets updated on your own site.

Why does that matter? Here’s why. What we’re trying to do is we want to help nonprofits and community organizations, parent teacher organizations and stuff like that. None of them have the ability to conveniently and quickly update content on their own websites on a regular basis. Right? So what we’re saying is all you have to do is put this code in and once you start using CityTools, automatically those headlines will go over on your site, styled the way you want them, looking the way you want them.

But here’s where it gets really cool. So you and I have this organization working on leukemia. And let’s say we have a constituency of 3,000 people out there who have an interest in leukemia. All of a sudden, we can open up a public team that is tied to the organization and we can invite all of our thousands of people to join. So if you’re an activist – imagine if an activist organization, such as an anti-war organization, said, “Everybody join this big team,” then you’ve got 1,000 people looking for stories about anti-war stuff every single day. And, by the way, it also shows up on your own website. Suddenly, that gets interesting.

So we are hoping that what’s gonna happen is we’re gonna start to engage people in the context of their lives – again, getting back to this word, context. Tell me what organizations you belong to and I will help you make life in that organization better.

OJR: Getting more into this idea of the crowd, tell me more about the kind of collaborative list building technology that you’ve built in here.

Cauthorn: When I was on the mountain I was walking down a trail and listing things in my head and I said, you know, if I got two other people doing this, I could build a consensus and that was when I went, “Oh sh-t.” What we do is that we allow people to create rank lists and these rank lists can be about anything. By itself, this is not unknown, it just hasn’t been done in this context.

What we can do is allow you to say, “Okay, here are – here are my five favorite Italian restaurants in all of Los Angeles.” And, by the way, you can adapt that by neighborhood if you want to, and you can do it in Spanish.

But then what happens is somebody else comes along, because none of us can resist a good list. And they go, “Oh no, Robert’s list was good, but he missed this, this and this and I disagree with the order.” So what they can do is what we call linking lists. When you read the list, if you’re a member, you just click, “I want to link to this list,” and create your own list.

Now [the lists] are part of a family and what happens behind the scenes is that we do some heavy lifting on text analysis and we look at the item titles and then we say, okay, we then can allow you to create a consensus view of what the best Italian restaurants are by merging them together.

For example, let’s say there’s a restaurant that you call Paizano and I call it Il Paizano. Our system will recognize that you’re talking about the same place and so Paizano appears on both lists. As you know, consensus building algorithms are not unknown. This is pretty well established, but nobody’s applied them to lists before, we believe.

So all of a sudden what happens [on CityTools] is that then you the reader can say, “Hey, here’s Robert’s list and here’s Bob’s list. I want to see the consensus. Show me the ranked view of what both lists think is the most important.” And that’s cool if it’s two people. It gets really, really interesting if you have 25 people doing it or 100 people doing it and then it gets really, really, really interesting if you can bring it up by geography.
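Cauthorn does not say which consensus algorithm CityTools uses, so the following Python sketch substitutes a simple Borda-style count as a stand-in. The title matching is also an assumption: it only lowercases titles and drops a leading article such as "Il", which is far cruder than the "heavy lifting on text analysis" he mentions. The restaurant names other than Paizano are made up, and the function names are hypothetical.

```python
from collections import defaultdict

# Assumed set of leading articles to ignore when matching titles.
LEADING_ARTICLES = {"the", "il", "la", "el", "le"}


def normalize(title):
    """Reduce a title to a matching key: lowercase it and drop a leading
    article, so "Paizano" and "Il Paizano" produce the same key."""
    words = title.lower().split()
    if len(words) > 1 and words[0] in LEADING_ARTICLES:
        words = words[1:]
    return " ".join(words)


def consensus(ranked_lists):
    """Merge several ranked lists with a Borda-style count: on a list of
    n items, first place earns n points, second n - 1, and so on."""
    scores = defaultdict(int)
    display = {}  # first spelling seen for each key, used for output
    for ranked in ranked_lists:
        n = len(ranked)
        for position, title in enumerate(ranked):
            key = normalize(title)
            display.setdefault(key, title)
            scores[key] += n - position
    # Highest score first; ties broken alphabetically by key.
    ordered = sorted(scores, key=lambda k: (-scores[k], k))
    return [display[k] for k in ordered]


roberts_list = ["Paizano", "Luigi's", "Roma"]
bobs_list = ["Il Paizano", "Roma", "Vito's"]
merged = consensus([roberts_list, bobs_list])
```

Because both lists put Paizano first under different spellings, it tops the merged list. The same function accepts 25 or 100 lists without change.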

Now imagine if the PTAs in San Francisco all put in their lists of their greatest needs at their school and they link them together. With one click then a reader can say, “Show me what the most serious needs are in the schools.” No one’s ever been able to do this before. And we’re allowing people to determine the context in which it’s done, certainly they can say, “Okay show me what are the worst needs in San Francisco.” Oh guess what, you can expand the view to show me the rank list of the needs of schools in the entire Bay Area.

This gets powerful. I mean that is magic, man. I mean think about what this can mean for a society.

You start to pull these things together and what you’re looking at is a sandbox for community interaction that hasn’t existed before. Up until now, here’s what we had: You had UGC [user generated content] sites where people can create stuff, or you had shared news sites where they could share news. Okay. That’s fine. We do both. We say, “Look. You go in both modes, because sometimes you want to write stuff. Sometimes you want to read stuff.” Okay. There are a couple of sites out there where you can make lists, but you just write lists down. You can’t tie them together. You can’t link them together. You can’t do this other stuff that we’re doing.

When I was doing my big backpacking trip and thinking about this stuff, I decided, on a very cold night in the Sierras, to peel back newspapers to their essential core. You know? And part of that essential core has been creating marketplaces.

But the other part of it is this entire connective tissue argument: the way in which our reporting and the reading of those reports connects individuals to one another.

That’s what we’re trying to do: to get back to that essential core of allowing people to create these connections between the writer and their audience, between groups of people who are trying to get something done in a community.