Data Science Research Grants: Announcing Our Second Round of Winners

The Bloomberg Data Science Research Grant Program aims to support cutting-edge research in the broad field of machine learning, including areas such as natural language processing, information retrieval, machine translation, and deep neural networks. In April, we announced the winners of the first round and invited proposals for the second round. Today, we are happy to announce the winners of the second round.

We received over sixty proposals spanning a diverse set of areas, and a committee of Bloomberg researchers – from our R&D and CTO offices – selected proposals by Chris Dyer and Alex Smola from Carnegie Mellon University, Ameet Talwalkar from the University of California, Los Angeles, Mark Dredze and Benjamin Van Durme from Johns Hopkins University, and Mausam from the Indian Institute of Technology, Delhi.

About the winners:

Deep Topic Models, Professor Alexander Smola and Professor Chris Dyer, Carnegie Mellon University

Representations of language anchored in deep neural networks have led to highly flexible and accurate models. However, these models fail to accurately capture temporal dynamics in content, e.g., in news. Prof. Smola and Prof. Dyer propose to combine deep networks and topic models to accurately model events and address these shortcomings of existing approaches.

Distributed Local Learning via Random Forests, Professor Ameet Talwalkar, University of California at Los Angeles

The machine learning community is actively developing methods that leverage distributed computing architectures to tackle the scale and diversity of modern datasets. The standard approach of retrofitting classical statistical models has been fruitful, but faces some fundamental computational and statistical limitations. To address these issues, Prof. Talwalkar intends to study a family of machine learning methods (random forests) that directly target massive data sets.

Establishing Trust in Tweets, Professor Mark Dredze, Johns Hopkins University

Twitter data provides a valuable source of information for a variety of applications. Yet even when automated methods discern the relevance or meaning of a tweet, a question remains: can we trust this information? As humans, we are quite good at determining whether a piece of information matches the person delivering it: can we trust this information from this person? The goal of this work is to develop algorithms that can make this determination automatically.

Report Linking, Professor Benjamin Van Durme, Johns Hopkins University

Prof. Van Durme proposes a global modeling strategy to detect and link relations and entities across multiple documents, such as corporate press releases and newspaper articles. This report linking framework will be used to isolate facts stated in multiple documents and then help determine which of these facts are expressed for the first time. The work will result in an open-source platform that helps organize the information contained in large document collections.

Coherent Multi-Document Summarization, Professor Mausam, Indian Institute of Technology (IIT), Delhi

Prof. Mausam proposes to build a multi-document summarization system that can read a large set of news documents on a given topic and automatically summarize them in a coherent and readable fashion.

For more information on our grants program, we invite you to follow us on Twitter @TechAtBloomberg for updates.