COLING’2020: Computational Linguistics, Conversation Modeling, and Argumentative Features in News Classification

One of the top NLP conferences in the world, the 28th International Conference on Computational Linguistics (COLING’2020), is taking place this week, December 8-13, 2020. Normally held once every two years, the conference is being held virtually this year. (Note: Bloomberg is a Gold Sponsor of this year’s event.)

On Wednesday, December 9, 2020, at 15:00 CET / 9:00 AM EST, NLP Architect Amanda Stent, who leads the People+Language AI team in the Office of the CTO, will look at the history of Computational Linguistics research on conversational systems during one of the conference’s invited talks. She will give an editorial overview of the history of conversation modeling in NLP, with a special focus on COLING publications over time, and then discuss how conversation modeling is used in finance applications, including at Bloomberg. She will conclude with some areas where conversation modeling research could have impact over the next decade.

AI researcher Daniel Preoţiuc-Pietro is also presenting a research paper at the conference together with his academic collaborators: Tariq Alhindi, a Ph.D. student in Columbia University’s Department of Computer Science, and his Ph.D. advisor, Smaranda Muresan, a research scientist with the Data Science Institute (DSI) at Columbia University and Adjunct Associate Professor of Computer Science in the Fu Foundation School of Engineering and Applied Science. Their paper, entitled “Fact vs. Opinion: The Role of Argumentative Features in News Classification,” highlights their interest in the discourse structure of conversations and news. It will be presented on Friday, December 11, 2020, at 17:00 CET / 11:00 AM EST (Session LONG42 – Text Classification).

First page of “Fact vs. Opinion: The Role of Argumentative Features in News Classification”

We asked Daniel to summarize the team’s research and explain why the results were notable:

Please summarize your research.

Daniel: It is estimated that only 41% of publishers label the type of article (e.g., editorial, review, analysis), and even among those that do, the labeling lacks consistency and clarity. Automatically classifying opinion pieces and editorials can help orient readers in discerning factual information from opinion.

In this paper, we studied the role of argumentation features in distinguishing opinion and editorial pieces from factual news stories. Our hypothesis was that a key difference between news stories and opinion articles lies in their structure, particularly the argumentative and persuasive devices typical of opinion writing. We show that argumentative features improve the performance of opinion classification and help models generalize better to news sources that were not seen in training.
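The paper’s exact feature set is not detailed here, but the general idea of augmenting a text classifier with argumentation-inspired signals can be sketched as follows. This is a minimal, illustrative example: the marker lists, feature names, and example sentences are hypothetical, not taken from the paper.

```python
# Illustrative sketch: extracting simple argumentation-inspired features
# that could supplement lexical features in an opinion-vs-news classifier.
# The marker lists below are hypothetical examples, not the paper's features.

CLAIM_MARKERS = {"should", "must", "ought", "believe", "argue"}
HEDGE_MARKERS = {"perhaps", "possibly", "likely", "seems", "may"}
FIRST_PERSON = {"i", "we", "our", "my"}

def argumentative_features(text: str) -> dict:
    """Count surface cues often associated with argumentative writing."""
    tokens = [t.strip(".,;:!?") for t in text.lower().split()]
    return {
        "claim_markers": sum(t in CLAIM_MARKERS for t in tokens),
        "hedge_markers": sum(t in HEDGE_MARKERS for t in tokens),
        "first_person": sum(t in FIRST_PERSON for t in tokens),
        "num_tokens": len(tokens),
    }

# Opinion-style writing tends to score higher on these cues
# than straight news reporting.
opinion = "We believe the city must act now, and it should fund transit."
news = "The city council approved the transit budget on Tuesday."

print(argumentative_features(opinion))
print(argumentative_features(news))
```

In practice, feature vectors like these would be concatenated with standard lexical features (e.g., bag-of-words or embeddings) before training a classifier; because they capture structure rather than vocabulary, they are one plausible reason such models generalize better across unseen news sources.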

Why are these results notable? How does it advance the state-of-the-art in the field of computational linguistics?

Daniel: A major finding of a 2018 study led by the Media Insight Project was that nearly 80% of journalists think their news organizations should clearly mark what is news reporting and what is opinion/commentary in order to combat fake news and regain public trust. Our approach can help news publishers or aggregators display this information to users more clearly. Further, the ability to identify the argumentative elements in news articles (e.g., claims, premises) can be used to highlight these elements to readers, helping them distinguish facts from opinions.