Four Engineers’ Posters Presented at the 2019 Grace Hopper Celebration

At the 2019 Grace Hopper Celebration of Women in Computing in Orlando, FL this week, four Bloomberg software engineers are showcasing technical projects in the conference’s poster sessions. Their posters highlight a variety of AI, data science and NLP applications in finance, as well as new forms of human-computer interaction to enhance the management of software infrastructure.

We asked the presenters to summarize their work and explain why it was notable:

Wednesday, October 2

Poster Session 1: Natural Language Processing (12:00-2:30 PM EDT)
Classifying Sentiment of Tweets Based on External Factors
Vicki Liu

How would you summarize your poster?

My poster explores the concept of grounded emotions, focusing on how external factors – such as weather, news exposure, social network emotion charge, timing and mood predisposition – may have a bearing on one’s emotional level throughout the day. By testing the correlation between certain external factors and Twitter sentiment, we explore which of these are most significant in grounding emotions. As a result, we can gain a deeper understanding of the connections that exist between contextual factors and one’s internal emotional state.

What problem were you trying to solve? And what were the key results?

There has been great success using sentiment classifiers based on the textual content of a tweet to classify its sentiment as positive, negative or neutral. However, psychologists have posited that emotions are personal reactions which give validity to interactions with real-world phenomena. For example, they have found interesting patterns in human emotion related to daily, seasonal and weather-related factors. Not only is this something worth learning more about, but also – in today’s political climate – there are many entities that are publishing content to try to sway people to feel positive or negative about certain products or political issues. It is clear that the more we publicly study what primes people toward a particular emotional response, the more we will be able to avoid and inoculate against such unscrupulous tactics. The main result of our study was that, by using a combination of external features, we were able to achieve 66.9% accuracy without analyzing the text of the tweet. This is on par with, or even better than, some state-of-the-art text-based sentiment classifiers that have traditionally been used.
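As a rough illustration of the correlation testing described above, the following minimal sketch ranks a few external factors by how strongly they correlate with a sentiment label. It is not the study's method or data: the field names (`temperature_c`, `hour`, `is_raining`), the rows and the helper `rank_factors` are all made up for illustration, and the actual study used a combination of features in a classifier rather than a single correlation score.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

# Hypothetical rows: one per tweet, external context only, no tweet text.
# sentiment: 1 = positive, 0 = negative (invented values).
tweets = [
    {"temperature_c": 24.0, "hour": 9,  "is_raining": 0, "sentiment": 1},
    {"temperature_c": 12.0, "hour": 22, "is_raining": 1, "sentiment": 0},
    {"temperature_c": 30.0, "hour": 14, "is_raining": 0, "sentiment": 1},
    {"temperature_c": 8.0,  "hour": 7,  "is_raining": 1, "sentiment": 0},
    {"temperature_c": 18.0, "hour": 18, "is_raining": 0, "sentiment": 1},
    {"temperature_c": 15.0, "hour": 12, "is_raining": 1, "sentiment": 0},
]

def rank_factors(rows, label="sentiment"):
    """Return (factor, correlation) pairs, strongest absolute correlation first."""
    factors = [key for key in rows[0] if key != label]
    labels = [row[label] for row in rows]
    scores = {f: pearson([row[f] for row in rows], labels) for f in factors}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

ranked = rank_factors(tweets)
for factor, r in ranked:
    print(f"{factor}: r = {r:+.2f}")
```

A ranking like this only suggests which factors are worth feeding into a classifier; it says nothing about causation, and real data would need far more rows than this toy table.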

What is notable about your research or technical development?

To the best of my knowledge, this is the first study that looks at the ability to predict a tweet’s sentiment without relying on the textual content of the tweet itself.

Why are you excited to share it with GHC attendees and other experts in your field?

I’m very excited to share this work with the attendees of the Grace Hopper Celebration because I think it is fun to think about how this study might be expanded or continued. You just need a beginner’s understanding of ML concepts and an interest in what drives the human psyche to try it out yourself.

Poster Session 1: Human Computer Interaction (12:00-2:30 PM EDT)
CATBOT: Automated Triage Bot for Incident Management
Kanika Sabharwal

How would you summarize your poster?

Incident management is key to limiting any disruption caused by a crash or failure and restoring normal business operations as quickly as possible. CATBOT, a Python-based automation bot we built, assists in the incident management process by performing real-time detection and accelerating resolution of critical incidents for my team at Bloomberg. It does this by providing diagnostic information related to the failure to our infrastructure engineers, while also isolating the potential root cause owner of the incident.

What problem were you trying to solve? And what were the key results?

Bloomberg acts as the central nervous system of finance, connecting market participants with financial information, data, news and analytics. Ensuring the reliability of this information delivery service is essential. Incident management is one of the many ways we achieve this. By employing CATBOT, we can be even more efficient and consistent in the way we respond. Since we implemented CATBOT to automate incident management, it has saved our team a significant amount of time in both detecting and resolving incidents. Our engineers can now spend that time building new infrastructure features and tools.

What is notable about your research or technical development?

CATBOT is an end-to-end system which holistically automates the triage process, starting with incident detection and continuing through resolution assistance. It is quick enough to instantaneously identify critical incidents and smart enough to predict the next step for resolution. It can also be used by other teams to trigger their own custom runbook scripts.
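The general shape of such a triage flow (detect, classify, identify a likely owner, trigger a team's runbook) can be sketched in a few lines of Python. This is only an illustrative sketch and not CATBOT's actual design: the `Incident` fields, the keyword rules, the team names and the `runbook` registration decorator are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Incident:
    service: str
    message: str
    severity: str = "unknown"
    owner: str = "unassigned"
    diagnostics: List[str] = field(default_factory=list)

# Hypothetical rule table: keyword in the alert text -> (severity, likely owner).
RULES = [
    ("out of memory", "critical", "platform-team"),
    ("connection refused", "critical", "network-team"),
    ("disk usage", "warning", "storage-team"),
]

# Registry that lets each team plug in its own runbook step.
RUNBOOKS: Dict[str, Callable[[Incident], str]] = {}

def runbook(owner: str):
    """Decorator that registers a runbook function for a given owning team."""
    def register(func: Callable[[Incident], str]):
        RUNBOOKS[owner] = func
        return func
    return register

@runbook("platform-team")
def memory_runbook(incident: Incident) -> str:
    return f"Collect a heap dump from {incident.service}, then restart it."

def triage(incident: Incident) -> Incident:
    """Classify the incident, pick a likely owner, and attach a suggested next step."""
    text = incident.message.lower()
    for keyword, severity, owner in RULES:
        if keyword in text:
            incident.severity = severity
            incident.owner = owner
            break
    handler = RUNBOOKS.get(incident.owner)
    if handler is not None:
        incident.diagnostics.append(handler(incident))
    return incident

result = triage(Incident("pricing-api", "Process killed: Out of memory"))
print(result.severity, result.owner, result.diagnostics)
```

A registry keyed by owner is one simple way to let other teams add their own runbook steps without touching the triage loop; a production system would presumably use richer signals than keyword matching.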

Why are you excited to share it with GHC attendees and other experts in your field?

Incident management is crucial in the tech industry, where every firm needs to perform it as quickly and efficiently as possible. By telling CATBOT’s story, I would like to promote how automation can improve the incident management workflow, thereby encouraging other engineers to adopt a similar approach. Showcasing my project at an event like the Grace Hopper Celebration has two purposes. It helps me gain feedback and thoughts from other women in technology, while encouraging them to pursue careers in coding and software development.

Thursday, October 3

Poster Session 3: Data Science (10:00 AM-12:30 PM EDT)
Stock Price Prediction using Convolutional Neural Networks
Huayan Zhong

How would you summarize your poster?

In our poster, we treat daily stock market data as images and apply a convolutional neural network (CNN), which is very powerful at image processing, to the market images in order to predict future stock price changes.

What problem were you trying to solve? And what were the key results?

We want to find out whether modeling groups of stocks with similar price movement can perform better than modeling each stock – and its historical price movement – separately. The result shows that our market-image CNN model works better than our baseline in both short-term and long-term stock price prediction.

What is notable about your research or technical development?

The way we construct our stock market images is unique in that we cluster stocks based on their historical stock price changes instead of using pre-defined stock sectors, such as GICS, a standardized industry classification system for equities used across the global finance community.

While this model cannot be used as a trading strategy (i.e., it does not take into account trading costs, time out of market, and many other factors), its utility is in analyzing the market as a whole and how individual stocks may be related.
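To make the idea of clustering stocks by historical price changes and stacking them into a "market image" more concrete, here is a minimal sketch. It rests on several assumptions: the tickers and prices are invented, the greedy correlation-threshold grouping is a stand-in (the poster's actual clustering method is not described here), and the CNN that would consume the image is omitted.

```python
import math

def daily_returns(prices):
    """Fractional day-over-day price changes."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def correlation(xs, ys):
    """Pearson correlation of two equal-length return series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def cluster_by_correlation(returns, threshold=0.8):
    """Greedy grouping: a stock joins the first cluster whose first member
    it correlates with at or above the threshold, else it starts a new one."""
    clusters = []
    for ticker in returns:
        for cluster in clusters:
            if correlation(returns[ticker], returns[cluster[0]]) >= threshold:
                cluster.append(ticker)
                break
        else:
            clusters.append([ticker])
    return clusters

def market_image(returns, cluster):
    """2D grid: one row per stock in the cluster, one column per day."""
    return [returns[ticker] for ticker in cluster]

# Invented closing prices for four made-up tickers.
prices = {
    "AAA": [100.0, 102.0, 101.0, 105.0, 107.0],
    "BBB": [50.0, 51.0, 50.5, 52.5, 53.5],
    "CCC": [80.0, 78.0, 79.0, 76.0, 75.0],
    "DDD": [40.0, 39.0, 39.5, 38.0, 37.5],
}
returns = {ticker: daily_returns(series) for ticker, series in prices.items()}
clusters = cluster_by_correlation(returns)
image = market_image(returns, clusters[0])
print(clusters)
```

Each resulting grid places similarly behaving stocks in adjacent rows, which is what lets a convolutional filter pick up patterns shared across the group rather than within a single stock's history.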

Why are you excited to share it with GHC attendees and other experts in your field?

Although it’s very hard to predict stock price movement due to its complexity and volatility, we are presenting a different way to transform financial data by applying machine learning techniques to it. Machine learning has lots of applications in financial services and there are lots of job opportunities for students who are interested in both machine learning and finance.

Poster Session 4: Artificial Intelligence (2:00-4:30 PM EDT)
Customized Spell Checker With Multinomial Bayes Classifier
Sheryl Zhang

How would you summarize your poster?

The use of deep learning for classification and regression is growing in real-world applications. However, sometimes a simple probabilistic classifier can be very powerful. I will demonstrate how we utilized a classic multinomial Bayes classifier to perform customized spell-checking on some of Bloomberg’s text datasets. I will also propose some enhancements to make our approach more precise and tailored to our datasets.

What problem were you trying to solve? And what were the key results?

At Bloomberg, we are building the world’s most trusted information network for financial professionals, and data quality is key. We have a massive amount of financial text data, and there is a combination of manual and automated processes that we use to bring the data into our system. As a result, it is possible for data errors to be introduced during the data entry process. The solution presented in this poster is one of the approaches we developed to capture and correct errors at different data ingestion points. This approach, which has helped us clean up errors in multiple datasets, has outperformed a deep learning-based approach to this task in both precision and cost-effectiveness.

What is notable about your research or technical development?

When we researched how to detect and correct anomalies in text data, we carefully looked into what kinds of errors there were and tried to use as much information as possible from the dataset itself to develop the most suitable solution. In this approach, we chose not to depend on external dictionaries and corpora, so the model can learn the probability distribution of text in the dataset itself. The result is a highly accurate and computationally efficient solution, which makes it both more useful in our real-world applications and easier to integrate into them.
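For readers unfamiliar with the technique, the sketch below shows one way a multinomial naive Bayes model can correct spellings using only the dataset's own vocabulary, with no external dictionary. It is an assumption-laden toy, not the poster's implementation: the choice of character bigrams as features, the smoothing constant `alpha`, the class name `BigramBayesCorrector` and the sample tokens are all hypothetical.

```python
import math
from collections import Counter

def char_bigrams(word):
    """Character bigrams with start (^) and end ($) markers."""
    padded = f"^{word}$"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]

class BigramBayesCorrector:
    """Multinomial naive Bayes: classes are words seen in the dataset,
    features are the character bigrams of the observed token."""

    def __init__(self, tokens, alpha=0.1):
        self.word_counts = Counter(tokens)       # class priors come from the data
        self.total = sum(self.word_counts.values())
        self.alpha = alpha                       # additive smoothing constant
        self.bigram_counts = {w: Counter(char_bigrams(w)) for w in self.word_counts}
        self.bigram_vocab = {bg for c in self.bigram_counts.values() for bg in c}

    def log_posterior(self, observed, candidate):
        counts = self.bigram_counts[candidate]
        # One extra smoothing slot stands in for bigrams never seen in training.
        denom = sum(counts.values()) + self.alpha * (len(self.bigram_vocab) + 1)
        score = math.log(self.word_counts[candidate] / self.total)
        for bg in char_bigrams(observed):
            score += math.log((counts[bg] + self.alpha) / denom)
        return score

    def correct(self, observed):
        if observed in self.word_counts:
            return observed  # already a known dataset word
        return max(self.word_counts, key=lambda w: self.log_posterior(observed, w))

# Invented sample of dataset tokens.
tokens = "coupon maturity coupon yield maturity issuer coupon yield".split()
corrector = BigramBayesCorrector(tokens)
print(corrector.correct("cuopon"))
print(corrector.correct("maturiy"))
```

Because both the priors and the likelihoods are estimated from the dataset, domain-specific terms that a general-purpose dictionary would flag as errors are treated as valid, which is the property the answer above emphasizes.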

Why are you excited to share it with GHC attendees and other experts in your field?

Data quality assurance is a common problem space in the industry, and the approach we proposed is a very cost-effective solution to one often-faced scenario. Therefore, I believe many people can benefit from the work we did and the lessons we learned. In addition, the Grace Hopper Celebration is a conference full of passionate women engineers. Sharing my work with them and hearing their thoughts will also give me good feedback that I can use to improve my work. I also hope that sharing not only my work, but also my experience of being a female software engineer, will inspire the young professionals attending the conference and encourage more women to work in the field of software engineering.