Google Feeds Novels to AI to Improve Its Conversation
The Guardian (NY) (09/28/16) Lea, Richard
A paper by researchers at Google Brain detailing how they processed thousands of novels through a neural network as part of a study to improve the software’s conversational fluency has caused quite a stir among authors, who say their work was used without permission. According to Google, after 11,038 novels were “fed” into a neural network, the system was able to generate fluent, natural-sounding sentences. Google says that products such as its Google app will be “much more useful if they can capture the nuance of language better.” Google notes that “in this case, it was particularly useful to have language that frequently repeated the same ideas, so the model could learn many ways to say the same thing–the language, phrasing, and grammar in fiction books tends to be much more varied and rich than in most nonfiction books.” Researchers say the novels used were available online, describing them as “free books written by [as] yet unpublished authors.” Google says the entire collection is available for download from the University of Toronto and has been used by other artificial intelligence researchers. However, many writers whose work was used are adamant that Google should have contacted them for permission, especially if their work is being used by the company to gain a commercial advantage. “If there’s one thing that’s niggling at me it’s that I would have liked to have known,” says Rebecca Forster, whose thriller “Hostile Witness” was used by Google. “With all the technology at their fingertips, it wouldn’t have been too hard to let everyone know.” According to Mary Rasenberger, executive director of the Authors Guild, the project represents “blatant commercial use of expressive authorship” and is a “plain and brazen” violation of copyright law. 
“Why shouldn’t authors be asked permission, or even informed–not to mention compensated–before their work is used in this manner?” Rasenberger asks. Google has not said whether it plans to reward the authors, or whether the people whose expertise was harvested to train its network were ever considered as individuals. Google says the researchers clearly identified where they got the data. “The machine learning community has long published open research with these kinds of datasets, including many academic researchers with this set of free e-books,” Google says. “It doesn’t harm the authors and is done for a very different purpose from the authors’, so it’s fair use under U.S. law.”
Article taken from the ATA Newsbriefs from 10/17/2016.