Bloomberg’s Amanda Stent on NLP, Research, and Ethics

When Amanda Stent, a Natural Language Processing (NLP) Architect at Bloomberg, took on the role of program co-chair for a prestigious conference in computational linguistics, she knew it would be a lot of work. “I had been an area chair several times and I’d developed a lot of ideas about what I wanted to do,” she says. As Stent approaches the culmination of this effort – the opening of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT) today in New Orleans – she says, “Of course, you never really know what you’re getting yourself into.” Her colleagues from Bloomberg will be presenting three papers at the event.

Stent had a number of goals going into the conference – goals she says were shared by her “fantastic” co-chair, computer science professor Heng Ji of Rensselaer Polytechnic Institute (RPI). Stent wanted to increase the diversity of the reviewers. She wanted to make it easier for reviewers to give good, useful reviews of the papers they were assigned. And she wanted to use the conference as a platform to help data scientists, computational linguists, and others reconsider the way they view data.

Stent will be giving a talk on these topics at the Second ACL Workshop on Ethics in Natural Language Processing (Ethics-NLP18) during the conference. “Informed consent at the point of data collection and the point of data sharing is rare,” says Stent. “The scientific community has to jointly adopt principles for ethical data sharing, or we’re going to be regulated out of existence.” Stent’s co-chair Ji is involved in one research community project around data science ethics. Bloomberg sponsors another that Stent is involved in – the Community Principles on Ethical Data Sharing (CPEDS).

In the past, says Stent, NAACL HLT has not collected demographics about its reviewers. But last year, Stent read a CRA-W conference best practices guideline that recommended this and thought it would be wise for NAACL to follow suit. The results were not what she expected. “It used to be the case that 40 percent of people in ACL were women,” she says. But only 20 percent of reviewers were female (even less than the 22 percent of women authors at recent ACL-sponsored conferences). She’s not sure exactly why there’s been such a change, but she does note that graduate programs focusing on neural networks and deep learning seem to have fewer women. “And those areas of study have pretty much taken over our field,” Stent says.

Stent also found only 1.1 percent of reviewers, and less than 1 percent of authors, came from Central and South America. “Maybe they’re publishing at other conferences, or maybe we’re just not plugged in there,” says Stent.

Stent and Ji also wanted to make it easier for reviewers to provide good feedback to authors. That was a bit challenging because the field is growing so quickly, and many of the reviewers are less experienced. She says that more than 850 reviewers had completed their Ph.D. four or fewer years ago, compared to fewer than 250 who completed their Ph.D. five or more years ago. Stent thought a longer and more structured review form would help.

On that form, she also included new questions about the use of data. One question asked reviewers whether the authors documented that they had the rights to use the data. Another asked whether the authors had documented informed consent for use of any data from human participants. A plurality of reviewers noted that the authors of papers they looked at didn’t say how they got the data, “but they scraped it from the web, because, who cares?” says Stent. “Not only do many authors not care, many reviewers don’t care.”

She had expected the problem to be worse in industry, because academics are supposed to submit their planned research to an Institutional Review Board (IRB), which is charged with ensuring that research is conducted ethically. But in many cases, the academics didn’t even mention IRB review in their papers.

Sometimes, says Stent, researchers will use “fair use” as a shield without understanding what it means. Reddit, she says, “is explicitly a free-for-all,” while Twitter most definitely is not. If someone deletes one of their own posts on Twitter, anyone else with a copy of that post is supposed to delete it as well. “As a field, we have policies around appropriate citation,” says Stent. “We need a policy around data.” She notes that other associations, such as the Linguistic Society of America, already have policies around data.

Stent’s not sure why researchers in linguistics would be ahead of computer scientists, but she says that engineers tend to worry more about the system they’re building than the data they use to run it. “Data is just something they need to feed the system,” says Stent. “People in linguistics are used to thinking about data as a first-class object.”

When Stent and her co-chair make their confidential report to the Association’s leadership, they’ll have to recount the few submissions that had to be rejected or withdrawn because their use of data was inappropriate. But not everything gets caught in time.

“Computational linguistics cares deeply about open science, which is great,” says Stent. “I want the field to care just as deeply about informed consent and ethical data sharing.”