Can tech conquer fake news?
Fraudulent news sites threaten the credibility of online content, but the question is whether technology can understand what is false.
Just as fake ad traffic threatens the online ad industry, so fake news content threatens the credibility of online content.
It is, essentially, the Orwellian vision of modern propaganda, updated. Millions believed preposterous stories this past election because they seemed to come from credible publications, even if the publication (e.g., the fictional “Denver Guardian”) existed only as a fake website.
“It’s a big problem,” mobile adtech firm AerServ’s COO Andrew Gerhart said. “It’s really easy to put up a site — five minutes and $5.”
One question is whether tech — which is utilized to battle ad fraud at scale — can be similarly employed against fake news sites. It’s tricky, since even humans might disagree about some sites’ classification. As our own Danny Sullivan noted about Google’s attempt to technologically filter out or de-emphasize fake news in its search engine:
“The difficulty here is that Google has a real challenge in automatically assessing whether something is actually true or not. As smart as Google is, it can still be very dumb on complex and nuanced topics. It can also be misled by those who accidentally or deliberately post material that seems to be factual in nature but is not.”
And the threat extends beyond politics, since false info can and will be used to smear brands.
The New York Times reported last week, for instance, about a pizzeria in Washington, DC, that was accused of being a child trafficking hub, because it had been implicated in fake news accounts. And Pepsi and other brands were blacklisted by Trump followers because of fake news posted on Reddit.
Whether in politics or commerce, the fight against this threat to the free flow of accurate information involves several clear battlefronts: search, social, consumers and ad tech.
The major ecosystem players in search and social, of course, are Google and Facebook, both of which announced earlier this month that they are working to cut off advertising from sites that misrepresent content.
But the key hurdle is detecting what is fake. And even when content is determined to be fake, it may be mixed with valid content.
Case in point: The Washington Post recently reported that the Russian government fed “a hurricane” of fake news reports to 200 news sites during the recent election. Given the Russians’ devotion of resources to email theft and dissemination during the recent election, that certainly seems plausible.
But there’s a catch. As Rolling Stone magazine and others pointed out, much of the Post’s information about the Russian sourcing came from a mysterious site called PropOrNot, which bills itself as “Your Friendly Neighborhood Propaganda Identification Service.” According to a story in The Intercept:
“[PropOrNot’s] list of Russian disinformation outlets includes WikiLeaks and the Drudge Report, as well as Clinton-critical left-wing websites such as Truthout, Black Agenda Report, Truthdig, and Naked Capitalism, as well as libertarian venues such as Antiwar.com and the Ron Paul Institute.”
In other words, if human observers disagree about whether those 200 or so sites are fake news sites, sometimes employ fake news or just represent different political opinions, how can technology sort this out?
Earlier this month, CUNY professor and author Jeff Jarvis and venture capital firm Betaworks’ founder and CEO John Borthwick offered a variety of actions that could combine human and technological approaches to help determine how the 200 sites cited by the Post might be classified. Their basic principle:
“We do not believe that the platforms should be put in the position of judging what is fake or real, true or false as censors for all. We worry about creating blacklists. And we worry that circular discussions about what is fake and what is truth and whose truth is more truthy masks the fact that there are things that can be done today.”
A list of actions
Their list of 15 actions includes:
- crowdsourced reporting by users made easier by better reporting tools.
- the addition of identifying metadata for trusted news sources.
- an expanded system of verified sources.
- linkbacks to factual sources.
- more accurate extended listings in search when searching dubious news sources.
- the ability to edit items you share that you later find to be false.
- the use of human editors.
Their call for a new system of verification could be backed by a new tech approach that was recently built in 36 hours by four student programmers, who created an open source Chrome browser extension called “FiB: Stop living a lie.”
It uses artificial intelligence to classify any content — text, pictures or links — as verified or non-verified, based on a site’s reputation, comparison to known malware/phishing sites and automated searches on Google and Bing.
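To make that combination of signals concrete, here is a minimal sketch in Python of how a link might be labeled from the three inputs described above. It is not FiB’s actual code: the domain lists, the reputation scores, the thresholds and the function names are all hypothetical, and the search-corroboration count is passed in by hand rather than fetched from Google or Bing.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical data for illustration only; a real tool would load these
# from maintained reputation and malware/phishing feeds.
KNOWN_BAD_DOMAINS = {"malware.example", "phish.example"}
REPUTATION = {"trusted-news.example": 0.9, "unknown-blog.example": 0.3}


@dataclass
class Signals:
    reputation: float           # 0.0 (unknown or poor) to 1.0 (well regarded)
    on_blocklist: bool          # domain appears on a malware/phishing list
    corroborating_results: int  # independent search hits backing the claim


def gather_signals(url: str, corroborating_results: int) -> Signals:
    """Collect the three signals for one URL (search count supplied by caller)."""
    domain = (urlparse(url).hostname or "").lower()
    return Signals(
        reputation=REPUTATION.get(domain, 0.0),
        on_blocklist=domain in KNOWN_BAD_DOMAINS,
        corroborating_results=corroborating_results,
    )


def classify(signals: Signals, min_reputation: float = 0.5,
             min_corroboration: int = 2) -> str:
    """Return "verified" or "non-verified" using assumed thresholds."""
    if signals.on_blocklist:
        return "non-verified"
    if (signals.reputation >= min_reputation
            or signals.corroborating_results >= min_corroboration):
        return "verified"
    return "non-verified"


if __name__ == "__main__":
    print(classify(gather_signals("https://trusted-news.example/story", 0)))
    print(classify(gather_signals("https://unknown-blog.example/post", 0)))
```

Even in this toy form, the hard part is visible: someone still has to decide which domains earn a high reputation score, which is the same disagreement among humans that the PropOrNot episode exposed.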
France’s Le Monde is already implementing a comparable approach, building an open source database of verified and non-verified sources. The newspaper is also working on a Google-funded initiative to automatically spot hoax news by querying relevant databases.
And Santa Clara University’s The Trust Project at the Markkula Center for Applied Ethics is developing what it describes as “an online indicator intended to signal whether a news operation is trustworthy.”
If search blocks or de-emphasizes the finding of such stories, social networks diminish their viral sharing, and consumers help with tagging, there is one more key engine that needs to be dismantled: ad revenue.
Ad networks, AerServ’s Gerhart told me, “can certify a publisher as ‘not fake’” in terms of their content, just as they certify that a publisher’s inventory actually exists and is displaying their ads properly.
His company already employs Pixalate and Moat to combat fraudulent impressions or viewability issues, and, he said, they could filter out inventory with fraudulent content, as determined by such approaches as Jarvis and Borthwick’s recommendations.
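As a rough illustration of what that filtering step could look like, the Python sketch below drops ad placements whose page domain appears on a list of sites flagged for fraudulent content. It is an assumption-laden example, not AerServ’s, Pixalate’s or Moat’s system: the `page_url` field, the function name and the flagged-domain list are all hypothetical.

```python
from urllib.parse import urlparse


def filter_inventory(placements, flagged_domains):
    """Keep only placements whose page domain is not on the flagged list.

    `placements` is a list of dicts with a hypothetical "page_url" key;
    `flagged_domains` is any iterable of domain names judged to carry
    fraudulent content by some upstream (human or automated) process.
    """
    flagged = {d.lower() for d in flagged_domains}
    kept = []
    for placement in placements:
        domain = (urlparse(placement["page_url"]).hostname or "").lower()
        if domain not in flagged:
            kept.append(placement)
    return kept


if __name__ == "__main__":
    inventory = [
        {"id": 1, "page_url": "https://real-news.example/politics"},
        {"id": 2, "page_url": "https://fake-guardian.example/shock-story"},
    ]
    clean = filter_inventory(inventory, ["fake-guardian.example"])
    print([p["id"] for p in clean])
```

The filter itself is trivial. The value lies in the flagged list, which would have to come from the kinds of verification efforts described earlier.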
Eric Franchi, co-founder of New York City-based adtech firm Undertone, told me that he “absolutely” agreed that anti-ad fraud solutions should include anti-content fraud, adding that it’s “plausible to think” some brands might eventually publish fake news to damage competitors.
A key question is where the Interactive Advertising Bureau (IAB) is on this topic. It has taken a leadership position in the complicated matter of ad fraud, but an IAB PR representative told me via email that the “IAB does not have policies or standards that relate to editorial content — just the digital advertising itself.” Obviously, though, it does have standards about inventory, and content is part of inventory.
Selective supply-side platforms, private exchanges and direct ad sales have an easier time of screening out fake content than others, because they are more selective about where their ads run. Advertisers buying audiences programmatically, by contrast, might reach “males 18–34 who like soccer,” but they are less picky about the inventory.
Performance advertising platform SourceKnowledge’s Director of Product and Marketing Justin Adler told me that his demand-side company could flag fake news sites if advertisers made noise.
But no advertisers have asked them to.
“I think, for the most part, they don’t care,” he said. “Many advertisers are rather indifferent about where their advertisements are running.”
But some adtech providers are taking the initiative to screen their inventory via human auditors.
AppNexus, for instance, recently banned Breitbart News from using its platform after a human audit caught the site’s use of hate speech. Breitbart News is sometimes also cited as a site conveying fake news.
And display and video ad provider Engage:BDR has had a human scanning the pages of sites where it serves ads since it was founded in 2009. A year ago, it set up a formal inventory quality department to weed out fake news sites, Marketing Director Sydney Goldman told me, and to flag any other content “that is not brand-friendly.”
The current flood of informational propaganda is a characteristically web-based problem, where an infinite number of sources are constantly being shared.
So it’s appropriate that web-based communities — search engines, social platforms, users and advertisers — can, if they so choose, use tech to bring information back to reality.