Learning how to predict future events from patterns of past events is a critical challenge in the field of artificial intelligence. As machine learning pioneer Yann LeCun writes, “prediction is the essence of intelligence.” Researchers have recently experienced breakthroughs with neural networks, which have led to more accurate predictions in a wide variety of situations.
Despite these advances, neural models may “overfit,” picking up on spurious patterns during the training process due to significant over-parameterization. Overfitting remains a chronic weakness of neural systems, particularly in situations where abundant neural links in the network outweigh patterns found in the training examples. Under these circumstances, relying purely on neural models may lead to bad predictions on novel inputs. Machine learning systems designed to make predictions about creditworthiness, for instance, may erroneously end up identifying businesses with high debt levels and low profits as “safe.” Overfitting remains a barrier in applying machine learning to a range of high-stakes domains like medicine.
This week, a joint team of researchers from the Department of Computer Science at Johns Hopkins University and Bloomberg is presenting its paper “Neural Datalog Through Time: Informed Temporal Modeling via Logical Specification” at the Thirty-seventh International Conference on Machine Learning (ICML 2020). The team includes Hongyuan Mei, Guanghui Qin, and Professor Jason Eisner from Johns Hopkins, and AI/ML researcher Minjie Xu from Bloomberg.
The paper attacks the problem of event prediction through a unique method: a neural-symbolic hybrid. “Neural Datalog” is built around a deductive database, based on the Datalog programming language, that defines and dynamically reconfigures a neural generative model.
The paper proposes a specific modeling language for encoding the database of facts and their mutual influences, thereby allowing practitioners to define facts and rules that are relevant to a problem space. The resulting structural neural network architectures work to prevent the model from overfitting on irrelevant patterns in the data.
For a system that predicts future travel to Chicago, for instance, the database might pre-define that a specific individual’s travel plans are determined by certain facts about Chicago, such as the weather. This prevents the model from latching onto irrelevant factors, such as sports scores, that may be incidentally correlated with travel behavior but do not determine it.
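As a loose illustration of the Chicago example, the sketch below shows how a declared dependency can restrict which facts a predictor is allowed to see. It is a plain-Python analogy, not the paper's mechanism, which derives a neural architecture from a Datalog program. The names `INFLUENCES` and `visible_facts`, and the example fact names, are hypothetical.

```python
# Hypothetical declaration: each event lists the facts allowed to influence it.
INFLUENCES = {
    "travel_to_chicago": {"chicago_weather"},
}


def visible_facts(event, all_facts):
    """Return only the facts that were declared relevant to the given event."""
    allowed = INFLUENCES.get(event, set())
    return {name: value for name, value in all_facts.items() if name in allowed}


# The sports score is present in the data but is never passed to the predictor.
facts = {"chicago_weather": "snow", "cubs_score": "5-3"}
inputs = visible_facts("travel_to_chicago", facts)
```

Because the sports score never reaches the predictor, the model cannot pick up a spurious correlation with it, however strong that correlation is in the training data.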
Datalog itself harkens back to a different era in artificial intelligence research. Originally developed in the 1970s, Datalog was designed to enable researchers to build deductive databases, systems that can make deductions based on sets of formal facts and rules that are stored in the database. The researchers discovered that the language was ideally suited to the problem space and was a straightforward interface for encoding domain-specific knowledge.
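To make the idea of a deductive database concrete, here is a minimal sketch of Datalog-style deduction written in Python. It is not the modeling language proposed in the paper. The facts, the single rule, and the helper names (`match`, `solve`, `forward_chain`) are illustrative assumptions, as is the convention that uppercase strings are variables.

```python
# Facts are tuples: (predicate, arg1, arg2, ...). Constants are lowercase.
facts = {
    ("lives_in", "alice", "chicago"),
    ("lives_in", "bob", "chicago"),
    ("snowing", "chicago"),
}

# One rule, in Datalog notation: affected(P) :- lives_in(P, C), snowing(C).
# Each rule is (head, body); uppercase strings such as "P" are variables.
rules = [
    (("affected", "P"), [("lives_in", "P", "C"), ("snowing", "C")]),
]


def is_var(term):
    """Assumed convention: a term starting with an uppercase letter is a variable."""
    return term[:1].isupper()


def match(pattern, fact, bindings):
    """Try to unify one body atom with one fact; return extended bindings or None."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    new = dict(bindings)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if new.setdefault(p, f) != f:  # variable already bound to something else
                return None
        elif p != f:  # constant mismatch
            return None
    return new


def solve(body, known, bindings):
    """Yield every variable binding that satisfies all atoms in the rule body."""
    if not body:
        yield bindings
        return
    for fact in known:
        new = match(body[0], fact, bindings)
        if new is not None:
            yield from solve(body[1:], known, new)


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new facts can be deduced."""
    derived = set(facts)
    while True:
        new_facts = set()
        for head, body in rules:
            for b in solve(body, derived, {}):
                fact = (head[0],) + tuple(b[t] if is_var(t) else t for t in head[1:])
                if fact not in derived:
                    new_facts.add(fact)
        if not new_facts:
            return derived
        derived |= new_facts


result = forward_chain(facts, rules)
```

Here `result` contains the three stored facts plus two deduced ones, `("affected", "alice")` and `("affected", "bob")`, neither of which was stored explicitly.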
This unusual combination of new and old pays off in practice. The researchers tested the neural Datalog approach on two very different real-world problems: the prediction of TV viewership behavior, and the passing behavior of robot soccer players during the annual RoboCup competition. In both cases, the neural Datalog approach significantly outperformed strongly competitive neural methods like Know-Evolve and DyRep.
This paper involved extensive work over two years. As Mei describes it, the publication of the paper is “a rest stop in the middle of a long journey.” The research originated in Mei’s experience as a member of the inaugural class of Bloomberg Data Science Ph.D. Fellows. Launched in 2018, the program provides support and encouragement for groundbreaking publications in both academic journals and conference proceedings by exceptional Ph.D. students who may be early in their careers in broadly-construed fields of data science, including natural language processing, machine learning, and artificial intelligence.
Fellows participate in a 12-week summer internship with Bloomberg data scientists in the Office of the CTO and/or AI Engineering group, collaborating with the teams to bring cutting-edge research and engineering firepower to bear on tough problems. The fellowship provides exposure to the finance industry and Bloomberg’s broad, high-quality data. As Minjie Xu, Mei’s co-author and fellowship program mentor, remarks, “the Fellows program is a great opportunity for top-tier researchers to share their insights with industry and apply the latest techniques to some of the hardest data challenges out there.”
The method developed by Mei and his collaborators was inspired during the fellowship by, among other technical motivations, the complex challenge of modeling market events, such as earnings reports, news articles, and stock movements. These events are numerous and have complex dependencies among them, making them difficult to model with pure neural models that do not incorporate domain-specific knowledge.
Mei believes this paper provides practitioners with a useful tool to automatically configure and reconfigure a neural network based on their domain knowledge written in Datalog. For example, medical experts need only specify their knowledge about pharmaceuticals and clinical measurements to obtain a neural sequential model that can predict future patient outcomes, without having to learn how to code the model.
The paper also opens up many future directions for research. For example, the framework may be adapted to enforce logic constraints on a natural language generator to ensure that its output is meaningful and coherent.
In the current framework, however, the knowledge is still hand-specified by humans. Mei sees learning to generate useful and meaningful Datalog programs automatically from the data as an interesting, but very challenging, task for the future. This would massively expand the areas where the hybrids explored by him and his co-authors could be applied.
Also at ICML 2020, senior software engineer Dan Sun co-authored a paper “Serverless Inferencing on Kubernetes” about KFServing, which Seldon CTO Clive Cox will be discussing in an invited talk during the Workshop on Challenges in Deploying and Monitoring Machine Learning Systems on Friday, July 17th. During this same workshop, senior software engineer Yuzhui Liu will participate in a panel about open problems in machine learning systems, addressing the unsolved issue of deploying ML models serverlessly at scale.
Finally, machine learning quant researcher Achintya Gopal will present a paper “Quasi-autoregressive residual flows (QuAR flows)” during the ICML Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models (INNF+ 20) on Saturday, July 18th. Watch his poster presentation during the workshop’s poster session (1:00-1:40 PM EDT).