Someday, this story may be written by a computer
Computers are already graduating from writing financial reports and short marketing messages to self-creating ads and attempts at essays.
If you write marketing or advertising text for a living, you may want to get a second job skill.
That’s because software that writes text is here, and it is tackling a growing list of assignments.
Several companies offer software that regularly churns out thousands of stories and reports based on structured data, like financial results. Ads that literally write themselves emerged last week, as IBM announced a new service based on its Watson supercomputer. A program called Quakebot has generated earthquake stories for the LA Times. And at least one software service is regularly creating short email and marketing text, while another is attempting to deliver full essays on any subject.
A billion and a half stories
The Wordsmith natural language generation software from Durham, North Carolina-based Automated Insights, launched last year, currently creates about 3,700 quarterly earnings stories for the Associated Press.
It uses raw earnings data from Zacks Investment Research to generate about a dozen times more such stories than AP reporters could write by hand in the same period, all in the AP’s style. So far, the displaced reporters are reportedly delighted, since earnings stories are among their dullest writing chores.
AP is only one of Automated Insights’ 200+ clients, for which it creates about a billion and a half stories annually from structured data. Wordsmith also generates car descriptions for Edmunds and Fantasy Football recaps for Yahoo, as well as product category landing pages, fund reports, ratings reviews, ticket sales reports and other formats. The company claims the “world’s only open API” for natural language generation, allowing it to “use any [structured] data source to automatically generate unique narratives on a massive scale.”
Here’s a sample from Friday’s Yahoo! Finance, credited as: “This story was generated by Automated Insights”:
Capstone Turbine reports fourth-quarter loss but tops expectations
CHATSWORTH, Calif. (AP) _ Capstone Turbine Corp. (CPST) on Thursday reported a loss of $5.3 million in its fiscal fourth quarter.
On a per-share basis, the Chatsworth, California-based company said it had a loss of 25 cents. The results topped Wall Street expectations. The average estimate of five analysts surveyed by Zacks Investment Research was for a loss of 31 cents per share.
The maker of turbine systems for energy generation posted revenue of $18.9 million in the period.
For the year, the company reported that its loss narrowed to $25.2 million, or $1.39 per share. Revenue was reported as $85.2 million.
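The sample above reads like a fixed narrative frame whose slots are filled from structured data. Here is a minimal Python sketch of that general idea. The field names, the template wording and the `verdict` helper are my own illustrative assumptions, not Automated Insights’ actual method, which the company did not describe in that detail.

```python
# Hypothetical record layout; the field names are illustrative assumptions.
# The figures are taken from the sample story above.
record = {
    "company": "Capstone Turbine Corp.",
    "ticker": "CPST",
    "day": "Thursday",
    "quarter": "fiscal fourth quarter",
    "net_loss_musd": 5.3,
    "loss_per_share_cents": 25,
    "estimate_loss_cents": 31,
    "analysts": 5,
    "revenue_musd": 18.9,
}


def verdict(actual_loss_cents, estimated_loss_cents):
    """Choose the sentence that compares the loss with the analyst estimate."""
    if actual_loss_cents < estimated_loss_cents:
        return "The results topped Wall Street expectations."
    if actual_loss_cents > estimated_loss_cents:
        return "The results fell short of Wall Street expectations."
    return "The results matched Wall Street expectations."


def earnings_story(r):
    """Fill a fixed sequence of sentence templates from one data record."""
    sentences = [
        f"{r['company']} ({r['ticker']}) on {r['day']} reported a loss of "
        f"${r['net_loss_musd']:.1f} million in its {r['quarter']}.",
        f"On a per-share basis, the company said it had a loss of "
        f"{r['loss_per_share_cents']} cents.",
        verdict(r["loss_per_share_cents"], r["estimate_loss_cents"]),
        f"The average estimate of {r['analysts']} analysts was for a loss of "
        f"{r['estimate_loss_cents']} cents per share.",
        f"The company posted revenue of ${r['revenue_musd']:.1f} million "
        f"in the period.",
    ]
    return " ".join(sentences)


print(earnings_story(record))
```

The only "judgment" in this sketch is the comparison inside `verdict`. A production system would presumably carry many more templates and rules, but the pattern of data in, templated sentences out, is the same.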
“Numbers get articulated”
A competitor, Chicago-based Narrative Science, generates earnings previews for Forbes Magazine and other clients with its Quill software. Through its natural language generation, VP of Professional Services Keelin McDonell told me, “numbers get articulated” by drawing on any data in a consistent form, such as CSV, XML or JSON files.
Once a report structure is set up, she said, the generated content usually goes out to the intended audience without human review.
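To make the idea of numbers getting "articulated" from any consistently structured source more concrete, here is a small Python sketch. It reads the same kind of record from CSV and from JSON, then turns each record into a sentence. The field names (`page`, `visits`, `change_pct`), the sample data and the sentence wording are hypothetical, and this is an illustration of the general approach, not Quill’s implementation.

```python
import csv
import io
import json

# Hypothetical sample data in two formats; the field names are assumptions.
CSV_DATA = "page,visits,change_pct\nHome,1200,15\nPricing,300,-8\n"
JSON_DATA = '[{"page": "Blog", "visits": 450, "change_pct": 0}]'


def normalize(row):
    """Convert one raw row into a record with consistent types."""
    return {
        "page": row["page"],
        "visits": int(row["visits"]),
        "change_pct": float(row["change_pct"]),
    }


def load_csv(text):
    """Parse CSV text into normalized records."""
    return [normalize(row) for row in csv.DictReader(io.StringIO(text))]


def load_json(text):
    """Parse a JSON array of objects into normalized records."""
    return [normalize(row) for row in json.loads(text)]


def describe(rec):
    """Turn one record into a plain-English sentence."""
    change = rec["change_pct"]
    if change > 0:
        trend = f"up {change:.0f} percent"
    elif change < 0:
        trend = f"down {abs(change):.0f} percent"
    else:
        trend = "unchanged"
    return (
        f"The {rec['page']} page drew {rec['visits']:,} visits, "
        f"{trend} from the prior period."
    )


for rec in load_csv(CSV_DATA) + load_json(JSON_DATA):
    print(describe(rec))
```

Once both loaders produce the same record shape, the narrative step does not need to know where the data came from, which is the point of requiring a consistent form.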
McDonell noted that financial institution Credit Suisse found many users couldn’t interpret charts in one of its online dashboards, so Narrative Science’s software was employed to generate narrative reports. Its big enterprise customers include USAA, MasterCard, Deloitte and an investment firm called In-Q-Tel, which backs “innovative technology solutions to support the missions of the US Intelligence Community.” Narrative Science declined to provide any detail on what it does for In-Q-Tel.
The company’s Quill Engage application serves about 14,000 small to medium-sized businesses, for such uses as reports about websites’ or AdWords’ performance, based on rules set up by the business. Here, for example, is a sample of a generated website report:
Another platform that draws on financial data is a crowd-sourced neural network called Emma/Mansi (Machine Augmented Neural Search Interface), an artificial intelligence system from a company named Stealth Mode Inc. One of its specialties is predicting which Exchange-Traded Funds will be most profitable. It also moonlights as a financial writer, as in a story Emma recently created in competition with a human writer for the Financial Times:
While Automated Insights and Narrative Science are doing their best to turn structured data-heavy reports and narratives into just another output option, New York City-based Persado intends to become marketers’ Best Assistant Ever for short marketing messages.
It offers what co-founder and VP of product Assaf Baciu described to me as a “cognitive content generation platform,” employing natural language processing, machine learning, statistical analysis and computational linguistics.
The platform needs to know what the marketer’s offer is about, but it can then generate content from scratch or take existing content and suggest more effective versions.
Up to 400,000 versions
The platform can create huge numbers of variations — as many as 400,000 versions, according to Baciu — which are winnowed down algorithmically. It then sends out, say, the 10 best ones, and A/B tests those.
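The generate-then-winnow step can be sketched in a few lines of Python. Everything here is a stand-in: the phrase lists are tiny, the `score` function is a made-up heuristic where a real platform would presumably use a trained model, and the 600-character cap simply mirrors the message limit Persado describes. Treat it as an illustration of the pattern, not a description of Persado’s algorithm.

```python
import itertools

# Hypothetical building blocks; a real system would have far more options.
OPENERS = ["Awesome news:", "You're about to miss out:", "Just for you:"]
OFFERS = ["20% off all shoes", "free shipping on every order"]
CALLS = ["Shop now.", "Claim your offer today.", "See what's new."]


def generate_variants():
    """Yield every combination of opener, offer and call to action."""
    for opener, offer, call in itertools.product(OPENERS, OFFERS, CALLS):
        yield f"{opener} {offer}. {call}"


def score(message):
    """Made-up heuristic (an assumption): reward urgency words, penalize length."""
    text = message.lower()
    urgency = sum(word in text for word in ("now", "today", "miss out"))
    return urgency * 10 - len(message) * 0.1


def shortlist(variants, keep=10, max_chars=600):
    """Drop over-length messages, then keep the highest-scoring ones."""
    eligible = [v for v in variants if len(v) <= max_chars]
    return sorted(eligible, key=score, reverse=True)[:keep]


finalists = shortlist(generate_variants())
for message in finalists:
    print(message)
```

Three openers, two offers and three calls to action already yield 18 variants. Adding more slots and more options per slot multiplies the total quickly, which is how counts in the hundreds of thousands become plausible. The shortlist would then go to a live A/B test rather than being trusted on score alone.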
Baciu pointed out that the key difference from what, say, Automated Insights provides is that Persado’s output is specifically intended as short marketing messages that drive action.
Persado’s output includes subject lines, email bodies, short landing pages, Facebook posts, display ads and direct (that is, post office) mail — each of which can be up to 600 characters, including spaces and punctuation. Full blog posts and other kinds of long-form narration, he said, are outside Persado’s job description. Here’s a Persado graphic comparing its output to humans’:
If you’re thinking of taking comfort in a human marketer’s ability to convey emotion, think again. Persado says it understands how to convey feelings.
It has catalogued the “cognitive triggers” for 19 different emotions, including excitement, pride, exclusivity, achievement and gratification. For instance, Baciu said, the platform might use the phrase “awesome news” or “you’re about to miss out” to show excitement.
New York City-based digital marketing agency MondoLabs recently started using Persado to write ads for programmatic delivery. VP of marketing Laura McGarrity told me that a staff person enters draft text for ads, parameters are set, and the platform spits out multiple variations that are specifically designed for social, web or email. Results are then A/B tested.
In each case, she said, there are multiple suggestions, and although MondoLabs is still in the early stages of using the platform, she noted that Persado claims its copy can boost engagement up to 70 percent.
“I would have to have 10 people to pump out that quality and volume without Persado,” she told me, instead of the one worker dedicated to the task. “It’s pretty impressive.”
Watson gets in the game
But Persado’s Darwinian approach to writing marketing and ad text — generate zillions, whittle them down algorithmically, and then test — may someday soon be seen as a shotgun approach compared to the new sharpshooting from systems like IBM’s Watson.
Last week, IBM’s Weather Channel subsidiary announced its new “cognitive ads,” where supercomputer Watson takes a break from finding medical cures to write original, responsive ads.
For instance, a viewer of an ad for seasoned tomato sauce Manwich might type in any ingredient, along with a question like, “What can I make for dinner using Manwich and these ingredients?” The Watson-powered ad will return an appropriate, original recipe, based on Watson’s understanding of how human taste works. I also saw a remote demo where the Watson ad returns written advice about using the allergy medicine Flonase after the viewer of the ad asks a question, such as about the drug’s safety for small children.
IBM told me via email that Watson Ads can “surface insights from massive amounts of data, but it does not generate original content.”
But, while Watson may just be pulling and combining knowledge or even excerpts from massive amounts of content, it sure looks like it is generating original text to a viewer of the ad.
Writers of marketing-related text might still claim a refuge in long-form posts and articles. But New York City-based Articoolo is trying to enter that refuge, with claims it can generate long-form narratives based on a few keywords you supply.
Here, for instance, is the text generated to my entered keywords of “software that writes”:
“As is the scenario with any technology that’s used, one must consider the good and the bad points and the same holds true for the resume writing applications. The most evident advantage of using free software is that there’s absolutely no cost involved. Only access to the internet and an OS that’s appropriate with the merchandise are needed. Another good thing about using applications that writes vitae is the fact the fact that all the templates are generic enough to adjust to nearly every career kind or field of experience or interest. This kind of generalization makes it possible for people to ensure their resumes are customized as far as possible as well as give them templates and recommendations which could help them in completing the resume.
“The templates are extremely helpful and the resume writing applications typically offers a variety of styles and templates including functional, chronological, specialized and hybrid. For new employees and for all those with limited work history like latest graduates, the functional template is ideal. For individuals with a significant work history and number of accomplishments to their credit in their career, the chronological template will work best. There are people who’ve worked for a long time period, but have held only one job title regardless of work duties improving and changing with time and for such individuals the hybrid template, which is a blend of chronological and practical, is best suited.
“For individuals with very specific abilities like compute professionals, trades people, medical specialists and others, the technical template is appropriate. Most free programs which write resumes provide choices for contacting ones resume, posting ones resume on-line and creating follow up and addressing letters or emails. With the advantages, there are also a few disadvantages included with the resume writing software. Probably the biggest disadvantage is that all of the templates tend to be structured and resistant to change and change, meaning that one can’t really deviate from the template much, despite the need for additional space or the need to do away with a few sections or headings of the resume. There’s also an extremely limited selection of fonts, graphics and characters you can select from to utilize in the covering letter or resume. These free softwares usually don’t have the advancement necessary for professional and high level resumes since the resume writing software doesn’t have options that extremely trained and qualified professionals require in their covering letter or resume.”
“Not yet perfect”
Given that Articoolo took the term “software that writes” to mean resume-writing software, the entry is not bad for a computer’s first cut, although it needs a good edit to be readable and clean.
But when The Next Web tried it out, it found on two occasions that Articoolo-generated text contained rewritten content from two EzineArticles essays that had already been published online.
I asked CEO and co-founder Doron Tal about the roughness and apparent plagiarism. Via email, he replied:
“The quality of our content is not yet perfect. It can’t completely replace a human writer, at [least] not yet. But we never claim that our goal is to replace human writers. Instead, we are aiming at helping writers do their job quicker and more efficient. Using Articoolo one can save about 90% of the time and effort it would take to write an article. The cost of using our tool is so low [99 cents to $1.25/article] that it is almost a no-brainer. And by the way, since the review on TNW was written, more than 2 months ago, we’ve accomplished few more development milestones which improved the quality of the articles our algorithm writes so I urge you to try it yourself.”
“Regarding the claim that an article that was written with our tool appeared to contain content from other sources on the Web — it is true that our algorithm sometimes uses different kind of content sources from the web but the AI process that it does with the content leaves almost no trace of the original text. Each and every article goes through our own thorough clearance process to make sure it is completely unique and then also a Copyscape verification so we can be 100% positive about its uniqueness.”
When I asked if this involved “spinning” existing content, so that some words are changed to disguise the source, he replied:
“We take content bits from several resources after analyzing them (semantic, sentiment, etc.) and constructing it into a new coherent piece of content. The rephrasing phase at the end is also not just replacing words for synonyms, but a broader concept of syntax and grammar sentence rebuilding that we developed.”
Humans’ “one big edge?”
Articoolo’s struggle with creating usable, much less excellent, long-form content offers a brief but not permanent respite for those of us who write long-form marketing-related content.
After all, Watson and other supercomputing systems have demonstrated their ability to understand, relate and utilize concepts, so it would seem that the day when they can articulate their insights is coming.
Certainly, reports and short articles based on structured data appear to have succumbed to writing software. Narrative Science’s McDonell told me that her company foresees its Quill software being embedded in all kinds of software, leading to the day when you generate a written report from structured data as readily as you can now generate a PDF.
MondoLabs’ McGarrity said that Persado and similar software are “absolutely the future,” because of the huge cost savings in generating so many useful variations. And Persado’s Baciu told me that he expects 60 to 80 percent of marketing content will be generated by platforms in five years, given that much of marketing text is shorter form.
The question, then, is not whether such software will be able to generate most if not every kind of marketing content. The question is whether there is some irreducible… something… that humans will be able to claim as their own, forever.
Slate’s Will Oremus has been sanguine about humans retaining “one big edge,” preventing such software from putting all writers out of a job.
“Humans are already better at thinking like a human than computers will ever be,” he wrote in a story on this subject two years ago. He cited humans’ unique abilities to tell stories and to put data and events into a larger narrative context of trends or related developments.
Our one thing
And Baciu told me that he expects any writing that requires “the human creative genius” will be off-limits to computer-generated text, at least for the next five years. He pointed to poetry, literature, marketing slogans and any content marketing “with an opinion.”
Unfortunately, there is every reason to believe that storytelling, contextualization or creative language and written thought will eventually be replicated by intelligent computing systems.
As Baciu points out, “a machine can learn style,” like how to write “cool” prose. Creativity and contextualization both require unique insights derived from massive amounts of data, a skill computers are already demonstrating.
After all, machine learning is all about generalizing from experience — which is the definition of an informed opinion. The Financial Times noted that the Emma-generated story “even included relevant context such as the possibility of Brexit (although she [Emma] was of the dubious opinion that it would be a ‘tailwind’ for the UK economy.)”
And if Watson can digest reams of information to answer questions about a drug, he already has opinions.
In fact, there’s only one arena that, at the moment, appears sacrosanct for human writers: writing about our experiences in the real world.
We might recreate, for instance, the exceptional customer service we get when we walk into a Trader Joe’s grocery store, or how valuable it is to get a discount coupon on our phone for a nearby iced coffee on a hot July day.
While we should expect software to learn how to opinionize, create original insights or language, write fully engaging stories and contextualize all kinds of data, computer systems still can’t draw on the pleasure of an iced coffee on a hot day.
That is, of course, until Quill, Emma, Watson or one of their supercomputing colleagues can put on a pair of shoes.