My response to The Hartford Courant’s “Spanish-language strategy” with Google Translate

NOTE: This post originally appeared on Web Journalist Blog.

“Como una cortesía para The Courant, por demostrando ignorancia y falta de respeto a su propia comunidad, déjeme decir: lo cagaron.”

If you were to translate this using Google Translate, guess what… it would be wrong. Anyone who is bilingual wouldn’t be surprised. But they would be surprised to hear that a news organization would depend solely on this primitive service as its “Spanish-language strategy.”

Sadly, this isn’t a joke. As Poynter reported: Hartford Courant’s Spanish site is Google Translate.

But, instead of just being disgusted or insulted by The Courant’s “strategy,” let me offer some tips for an actual strategy:

  1. Hire a diverse staff, and in this case, a Spanish speaker. Listen to them. Anyone in their right mind would have told you this was a bad idea.
  2. I know resources are tight, so as an affordable alternative to hiring more staff, partner with local Spanish-language news organizations. Believe me, they are there. And they’d love to help you inform the community. (Hey Courant, have you tried working with Connecticut’s Latino News Source: ctlatinonews.com?)
  3. No Spanish-language news organization in your town? Look again. Think radio, newsletters or neighboring towns. Any of these will be better than an automated site.
  4. Still confused? Reach out to the National Association of Hispanic Journalists to find local members in your area, including Spanish-language news organizations.
  5. But, let’s say there are no Spanish-language news outlets. Partner up with the largest, Spanish-language local business. They know their community and are fully aware of the information network that is functioning now.

Lastly, apologize to the fastest growing demographic in your community for treating them with such little respect. It’s not a smart business move to belittle them, especially if you want to tap into their growing influence.

I preach experimentation, risk taking and embracing failure. You experimented and took a risk… and you failed. Oh, did you fail.

Learn from your big mistake and start genuinely engaging with your own diverse community.

Do you have any tips for The Courant or any other news organization trying to serve its Latino community? Please share them in the comments.

Oh, and if you are wondering, here’s how I’d translate my statement:

“As a courtesy to The Courant, for displaying its ignorance and lack of respect to its own community, let me say: you f&*#d up.”

Robert Hernandez is a Web Journalism professor at USC Annenberg and co-creator of #wjchat, a weekly chat for Web Journalists held on Twitter. You can contact him by e-mail (r.hernandez@usc.edu) or through Twitter (@webjournalist). Yes, he’s a tech/journo geek.

Wanted: human editors. Scrapers and robots need not apply

My world is awash in crap data.

Several times a week, I open my snail mail box to find bulk-mail solicitations for some member of one of my websites, but sent to the site’s street address. Every month or so, I’ll get a series of calls to my business phone (which is listed on my website), but the caller will ask for a name I’ve never heard. For the rest of that week, I’ll get dozens of similar calls, from different people calling on behalf of some work-at-home scheme, all asking for the same fake name.

And whenever I’m stuck searching for information via Google or Bing, I inevitably have to scroll past link after link to scraped websites – pages written not by any human being, but slapped together by scripts created to blend snippets from other webpages into something that will fool Google’s or Bing’s algorithm into promoting them.

If Google really wants to make its search engine results pages more meaningful, forget about adding links from my Google+ friends. How about creating a scraper-free search engine, instead?

I have no doubt that the reason why I get all those misaddressed letters and wrong-number phone calls is that some fly-by-night “data” company scraped together a database by mashing up names, street addresses and phone numbers it crawled on various websites. That database gets laundered through some work-at-home company, which sells it to suckers via the Internet as a “lead list” for commission sales.

It’s bad enough to take phone calls from these poor chumps, who think that they’ve taken a step toward earning some honest income. But I’m stunned when I see the bogus-name letters coming to my office from established colleges and non-profit institutions, which clearly have also bought crap mailing lists.

(FWIW, all my phone numbers are on the National Do-Not-Call Registry, and I’m opted out of commercial snail mail with the Direct Marketing Association, so no legitimate data company should be selling my contact information to businesses and organizations I’ve not dealt with before.)

Maybe it’s too much to hope for a solution that frees me from having to throw away all these unwanted letters and beg off these unwanted phone calls. (Not to mention saving the people contacting me the expense of pursuing bogus leads.) But maybe I can hope for a scraper-free Internet experience instead.

I know it’s possible, because there used to be a scraper-free search engine – one that searched just hand-picked Web sites created by actual human beings. It was called Yahoo!, and the latest crew of new managers at Yahoo! could do far worse than recreating a 2012 version of its Web directory, then using it to populate a Google-killing search engine.

For an example of the garbage polluting search engines today, this site came up high in the SERPs when I searched recently for my wife’s name and the name of her website.

[Screen grab of a scraper site]

If you know anything about the violin, you should be ROTFL now. For those who aren’t violin fans, allow me to explain that Ivan Galamian, one of the great violin pedagogues of the 20th century, has been dead for over 30 years. While we would have loved to have someone of his stature working for us at Violinist.com, only an idiot scraper script would think he works for us now.

It kills me that good websites, blogs and journals written by thoughtful correspondents get pushed down in the SERPs – and overlooked by potential fans – because of this garbage.

I want a search engine that knows better – that excludes Web domains populated by scraped data and instead searches online sites written by actual human beings. I wouldn’t limit such a search engine to sites written by paid, professional staff. There’s too much rich content to be found in the conversations of others. But blogs, discussion boards and rating-and-review sites included in this search engine should be composed of information submitted by human beings, not scraped from other websites and edited together by bots.

The original Yahoo! lost when start-up rival Google indexed more pages than Yahoo, giving Google an edge over its established competition. But I – and, I suspect, many others – don’t care about the size of a search engine’s database any longer. Google’s right on in its attempt, announced today, to build a more human-driven search engine. But I’m not convinced that adding Google+ links to the SERPs is enough of a change to make a difference in quality.

First, not enough people use Google+. Its 18-and-over-only age limit also disqualifies the millions of teen-agers who help drive the digital conversation. And I fear that Google’s new “Search Plus Your World” approach simply will encourage spammers to flood Google+ with even more bogus accounts and friend requests, in order to boost their reach into the Google SERPs those new “friends” see.

It’s great to use social media to help bring more people into the process of selecting which websites should be indexed in a search engine. But, ultimately, organizations at this point still need more aggressive in-house human oversight to vet the results.

Google lost its quality control over its SERPs long ago. Whether it’s search engine results or business lead lists, there’s too much crap data on the market today. That illustrates the continued need for more, and better, human leadership of data cultivation. There’s a market need out there. So who’s going to step forward to fulfill it?