A Conversation about Anomaly Detection in Finance with Anju Kambadur

Anju Kambadur, Bloomberg’s Head of AI Engineering, helped organize today’s 2nd KDD Workshop on Anomaly Detection in Finance, which is taking place during the 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2019) in Anchorage, Alaska this week.

During the workshop, Adrian Benton, a Senior Research Scientist in the AI Group, will present a paper entitled “Calibration for Anomaly Detection,” which proposes an extension of Platt scaling, Charcoal Grill Scaling, for calibrating the statistical models used to estimate the probability of anomalies. Machine learning engineer Sidney Fletcher will present the Bloomberg Law team’s paper entitled “Textual Outlier Detection and Anomalies in Financial Reporting,” which highlights their recent research on applying outlier detection methods that identify topical outliers in high-dimensional datasets to textual data about the Risk Factors listed in companies’ annual reports. Clay Eltztroth, a Product Manager for Bloomberg News, will also deliver an Invited Talk entitled “Fake News: A Path to the Truth” about disinformation’s potential impact on the financial markets and how newsrooms are using technology to identify it and fight back.

We sat down with Anju to learn more about the workshop, the importance of anomaly detection in finance, and why Bloomberg is tackling this problem. The conversation has been edited for length and clarity.

What was your inspiration to hold this workshop? What do you hope that attendees learn from this event? Why now?

One of the co-organizers of this workshop is a good friend from IBM T. J. Watson Research Center, where I worked before coming to Bloomberg, and he had organized this workshop at KDD a few years ago. He called me earlier this year and we got to talking about organizing the second edition of this workshop in order to bring attention to the unique problems we solve in the finance industry, especially the importance of anomaly detection. Some of the problems are unique to our industry and they’re not well explored in academia. For example, a paper being presented in this workshop by our own Adrian Benton talks about the importance of more accurately calibrating probabilities from classifiers, since we care not only about identifying rare events, but also estimating how confident we are in our predictions. I believe that workshops such as this one help academicians understand the finance industry better, and that will hopefully inspire them to work on problems that are important to us.
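To make the calibration point concrete, here is a minimal sketch of plain Platt scaling, the baseline technique that the paper is described as extending. It is not the method from the paper. The scores, labels, learning rate, and step count are all hypothetical, and this simplified version omits the label smoothing used in Platt's original formulation.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # gradient of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def calibrate(score, a, b):
    """Map a raw detector score to a calibrated anomaly probability."""
    return sigmoid(a * score + b)

# Hypothetical held-out detector scores and labels (1 = anomaly).
scores = [-2.1, -1.7, -1.2, -0.9, -0.4, 0.1, 0.3, 0.8, 1.5, 2.2]
labels = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
probs = [calibrate(s, a, b) for s in scores]
```

The fitted probabilities, rather than the raw scores, are what one would read as a statement of confidence in each prediction.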

How are financial markets and the media connected? What effect does one have on the other? How is fake news interconnected with the finance industry?

Events move markets. Since it is impossible for everyone to physically observe every event, the main way these events are communicated to the public — and investors — is through media channels such as news wires, television, radio, and, increasingly, social media. As a result, there has always been a deep connection between media and finance — this is why News is considered to be one of the key pillars of Bloomberg’s offering to its customers across the global capital markets.

Fake news hits at the heart of finance. Our customers count on the information being provided to them being accurate. There are several examples where the markets moved not immediately after an event was reported, but only after Bloomberg reported on the event. This demonstrates both the trust that our industry has in Bloomberg’s reporting, as well as the caution exercised by our customers when they hear about events that might affect their investments.

How are you hoping to bridge the gap between academia and the financial industry?

I wouldn’t necessarily say that there is a gap or divide, but there certainly is some mystery surrounding the problems solved in finance. With this workshop, our goal is to bring together smart people to talk about one another’s interests, to discover some new problems, and to discuss potential new approaches and solutions. I see this as a continuation of Bloomberg’s ongoing investment to foster deeper academic and institutional collaborations through programs such as the Bloomberg Data Science Research Grant Program and our Data Science Ph.D. Fellowships.

What part does Bloomberg have in the financial industry?

As a global information and technology company, Bloomberg quickly and accurately delivers business and financial information, news and insights to customers around the world — driving the world’s financial markets. For example, the Bloomberg Terminal connects finance professionals to a dynamic network of information, people, and ideas. At the core of this network is the ability to deliver real-time data, news, and analytics to some of the most influential people and companies around the world – in business, finance, government, policy or philanthropy – enabling them to make well-informed decisions about their business and financial strategies.

Why did you choose the papers for this workshop? What were your criteria?

The papers being presented in this workshop were chosen as they would be for any other peer-reviewed conference: based on the merit and novelty of the ideas presented and their relevance to the topics we listed in the call for proposals. These criteria were especially important to us for keeping the focus on topics related to anomaly detection and finance.

What is anomaly detection? How is that relevant to KDD?

In layman’s terms, anomaly detection aims to help detect events that are rare and/or different from the norm. You can imagine why this is of importance to the finance industry. For example, in consumer banking, anomalies might be bad things, like credit card fraud. In other cases, an anomaly might be something that creates a great investment opportunity for our clients. KDD is one of the premier conferences in data mining, machine learning, and natural language processing. These happen to be the tools we use for anomaly detection, so it’s a great venue at which to hold this workshop.
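As a rough illustration of flagging events that differ from the norm, the sketch below marks values that sit far from the median, using a modified z-score based on the median absolute deviation. This is one common unsupervised heuristic, not a description of any Bloomberg system. The transaction amounts and the 3.5 cutoff are assumptions chosen for the example.

```python
import statistics

def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to measure against, so flag nothing
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical card transaction amounts; one is far outside the usual range.
amounts = [42.0, 38.5, 45.1, 40.2, 39.9, 41.7, 980.0, 43.3]
flagged = mad_outliers(amounts)
```

Using the median rather than the mean keeps a single extreme value from distorting the baseline it is being compared against.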

What are the applications of anomaly detection? What business problems are you looking to solve?

There are many applications that can be framed as anomaly detection problems. Take the example of a financial analyst changing their buy/hold/sell rating on a particular stock following the company’s quarterly earnings announcement. When an analyst changes their rating, the event in itself is not as important as knowing and understanding the context. Was this analyst expected to change their rating? Were they simply moving towards the market’s expectations? Do markets typically move when this analyst changes their rating? Recasting this problem in the framework of anomaly detection forces us to take the relevant context into account, and leads us to interpretations that make sense for our customers.

What are some of the current solutions to these challenges? How do you hope to move this research forward?

There are several approaches to anomaly detection, spanning a wide variety of techniques, including supervised, semi-supervised, and unsupervised methods. The solution you choose depends on the amount and type of data you have, as well as the business problem you’re facing. For example, if you’re trying to detect an anomaly that is expensive to miss, you might invest in systems that have higher recall, with a human in the loop as backup. We are primarily hoping to move this research forward by helping our practitioners discover new problems and solutions and by opening them up for discussion with a wider community.
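The recall trade-off can be sketched as follows: given labeled historical scores, pick the highest alert threshold that still catches a target fraction of known anomalies, then route everything above it to human reviewers. The function names, scores, and labels are hypothetical and exist only for illustration.

```python
import math

def threshold_for_recall(scores, labels, target_recall):
    """Highest score threshold whose recall on labeled data meets the target."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one labeled anomaly")
    # Number of known anomalies that must score at or above the threshold.
    # The small epsilon guards against floating-point overshoot before ceil.
    needed = max(1, math.ceil(target_recall * len(positives) - 1e-9))
    return positives[needed - 1]

def review_queue(scores, threshold):
    """Indices of items a human reviewer would be asked to check."""
    return [i for i, s in enumerate(scores) if s >= threshold]

# Hypothetical model scores and ground-truth labels (1 = anomaly).
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

strict = threshold_for_recall(scores, labels, 0.75)   # catch 3 of 4 anomalies
cautious = threshold_for_recall(scores, labels, 1.0)  # catch all 4 anomalies
strict_queue = review_queue(scores, strict)
cautious_queue = review_queue(scores, cautious)
```

In this toy data, moving from 75% to 100% recall grows the review queue from four items to seven, which is the cost that the human-in-the-loop backup absorbs.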

What are some of the challenges that researchers face? How do you hope to start removing these roadblocks?

There are several types of challenges. Foremost among them is the lack of knowledge of the types of problems we face in finance. In turn, this leads to a lack of academic and industrial research on these topics. I feel this is the primary roadblock we need to overcome. I’m confident that workshops such as this one will help advance the state of the art in solutions to the wide variety of problems that are grouped together under anomaly detection.