Type Bill Gates’ name into Google and the search engine will eagerly suggest things you might want to know about him. His net worth? Book recommendations? Information about his house?
This functionality is called auto-complete, and it’s crucial to allowing users to ask a search algorithm questions in conversational English, or natural language. The Bloomberg Terminal already features natural language question answering systems for a number of financial domains, such as bonds or equities. An effective auto-complete system is the key to rolling out a single, unified question answering system across the Terminal.
“Auto-complete is very useful for guiding our users toward formulating their queries in a way the system can actually understand,” says Konstantine Arkoudas, a senior research scientist and software engineer who leads Bloomberg’s Question Answering team in the Machine Learning Engineering group. If the system is suggesting completions, it means that it’s understanding user input and can come up with appropriate answers. If the system stops offering completions, says Arkoudas, it’s a signal to the user that, “at that point, we have lost them.”
Generating completions is simpler if the system has a very large vault of queries to draw from (although for complex queries, the combinatorics are such that no vault will ever be large enough). The system can then generate a set of candidates by consulting similar queries that users have issued in the past. Bloomberg, by contrast, is rolling out its question answering capability while simultaneously debuting auto-complete. “We are facing a so-called ‘cold-start’ problem,” says Arkoudas. “It’s not as if the system has been up and running for many years and we have millions and millions of queries to choose from.”
Arkoudas and Mohamed Yahya, another senior research scientist and software engineer at Bloomberg, have developed a series of algorithms to address this problem. They presented the challenges and their solutions earlier today in a talk entitled “Auto-completion for Question Answering Systems at Bloomberg” at the SIGIR Symposium on IR in Practice (SIRIP 2018), formerly known as the SIGIR Industry Track, which is part of the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018), being held this week at the University of Michigan in Ann Arbor, Michigan.
While Arkoudas and Yahya don’t have billions of queries to work from, they rely on the fact that the question answering system maps queries to their semantics, and they have an idea of what a proper query structure looks like. Among other things, this allows them to generate queries synthetically. “We can automatically generate millions of queries as if they were typed by the users,” says Arkoudas. A neural language model ranks those queries based on how well they fit natural language patterns.
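To make the idea concrete, here is a minimal sketch of template-based synthetic query generation followed by language-model-style ranking. The templates, slot values, and the bigram scorer are all hypothetical stand-ins: the real system derives queries from its semantic grammar and ranks them with a neural language model, not a bigram table.

```python
import itertools
import math

# Hypothetical templates and slot fillers for illustration only; the real
# system generates queries from the question answering system's semantics.
TEMPLATES = [
    "bonds issued by {issuer} maturing before {year}",
    "{issuer} bonds denominated in {currency}",
]
SLOTS = {
    "issuer": ["IBM", "Apple"],
    "year": ["2025", "2030"],
    "currency": ["USD", "EUR"],
}

def generate_queries():
    """Expand every template against every combination of slot values,
    producing queries 'as if they were typed by the users'."""
    queries = []
    for template in TEMPLATES:
        fields = [f for f in SLOTS if "{" + f + "}" in template]
        for combo in itertools.product(*(SLOTS[f] for f in fields)):
            queries.append(template.format(**dict(zip(fields, combo))))
    return queries

def score(query, bigram_logprobs):
    """Stand-in for a neural language model: sum bigram log-probabilities,
    so queries that match natural phrasing score higher than ones that don't.
    Unseen bigrams get a small floor probability."""
    words = query.split()
    return sum(bigram_logprobs.get((a, b), math.log(1e-6))
               for a, b in zip(words, words[1:]))
```

With two templates and two values per slot, this toy setup already yields eight distinct queries; the combinatorics scale quickly, which is why the real system can generate millions.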
Arkoudas and Yahya then created an algorithm to extract small chunks of meaning, or atomic constraints, from each query. A query that asked for every bond issued by IBM or Apple that matures before 2025 and is denominated in U.S. dollars would have three constraints: IBM/Apple, maturity before 2025, and the USD denomination. Using semantics, those constraints then become candidates for completing an input and can be suggested as someone is typing. The algorithm then uses statistical techniques (beyond language models) to score the candidate completions in their new (previously unseen) contexts.
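The constraint-extraction step can be sketched as follows. The (field, operator, value) triples and the frequency-based ranking are a simplification invented for this example; the real system works on the question answering system's semantic representation and uses richer statistical scoring in context.

```python
# Each parsed query is represented as a list of atomic constraints,
# modeled here as hashable (field, operator, value) triples.
PARSED_QUERIES = [
    [("issuer", "in", ("IBM", "Apple")),   # bonds issued by IBM or Apple...
     ("maturity", "<", "2025"),            # ...maturing before 2025...
     ("currency", "=", "USD")],            # ...denominated in U.S. dollars
    [("issuer", "in", ("Tesla",)),
     ("currency", "=", "USD")],
]

def constraint_index(parsed_queries):
    """Collect every atomic constraint seen in any query, with counts,
    so each can later be offered as a completion on its own."""
    counts = {}
    for constraints in parsed_queries:
        for c in constraints:
            counts[c] = counts.get(c, 0) + 1
    return counts

def suggest(active_constraints, index, k=3):
    """Rank constraints not already in the partial query by how often they
    occurred; the real system also scores them in their new context."""
    candidates = [(c, n) for c, n in index.items()
                  if c not in active_constraints]
    candidates.sort(key=lambda cn: -cn[1])
    return [c for c, _ in candidates[:k]]
```

The key point the sketch captures is that constraints mined from one query can be recombined to complete a different, previously unseen query.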
But Bloomberg is going beyond the use of auto-complete to answer questions that users already have in mind. There’s a second goal as well: “The idea is to use auto-complete as a way of showing people things they are able to do with the Terminal that they might have no idea about,” says Arkoudas. If someone is searching news stories, a given word in their query might refer to a person, a company or a topic. “The completion list is limited in number because you can only suggest up to ten completions,” says Arkoudas. “You want the presented options to be as diverse as possible.” In this case, the list of completions would ideally contain news about someone or something that the user did not know was covered by Bloomberg.
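One simple way to keep a ten-item completion list diverse — sketched here as a round-robin over interpretation types, which is an assumption, not necessarily the technique Bloomberg uses — is to interleave top-ranked candidates from each type (person, company, topic) rather than letting one type fill the list:

```python
from collections import defaultdict

def diversify(candidates, limit=10):
    """Round-robin over interpretation types so the short completion list
    covers people, companies and topics instead of a single type.
    `candidates` is a ranked list of (completion, type) pairs."""
    by_type = defaultdict(list)
    for completion, kind in candidates:
        by_type[kind].append(completion)
    picked = []
    while len(picked) < limit and any(by_type.values()):
        for kind in list(by_type):
            if by_type[kind]:
                picked.append(by_type[kind].pop(0))
                if len(picked) == limit:
                    break
    return picked
```

Capping the list at ten and spreading it across types is what lets the completions double as a discovery surface, showing users coverage they didn't know existed.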
Bloomberg is testing this model internally to gauge how well it might work as users begin to type more natural language queries into the Terminal. To evaluate the model, testers are asked if the suggested completions look like natural language, if they seem complete, and if the completed questions were properly understood by the system, says Yahya. The ultimate test lies in how often someone clicks on an auto-completion rather than manually typing in an entire question.
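That ultimate test reduces to a simple metric — the fraction of sessions where a suggested completion was clicked rather than the question being typed out in full. A minimal sketch, with a hypothetical session format:

```python
def completion_acceptance_rate(sessions):
    """Fraction of query sessions that ended with a clicked completion
    rather than a fully hand-typed question. Each session is assumed to
    be a dict with a boolean 'accepted_completion' flag."""
    if not sessions:
        return 0.0
    accepted = sum(1 for s in sessions if s["accepted_completion"])
    return accepted / len(sessions)
```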
Yahya points out that, until now, the vast majority of the work on auto-complete has been done in the context of document search. The goal has been to find a document that matches a few keywords, rather than to find concrete answers to a potentially complex, multi-constraint question. But, as more companies begin to experiment with dialog systems and natural language in general, they’re more likely to find themselves in a similar position to Arkoudas and Yahya: trying to build an auto-completion system without an existing user base to fall back on. “That really means that auto-complete and question answering systems must go hand-in-hand,” says Yahya. “Otherwise people are asking questions of a system without understanding its full capabilities and limitations.”