Bloomberg’s Global Data & CTO Data Science Teams Publish Best Practices for Data Annotation Projects

Annotation is the labeling of data sets to make them more valuable to human readers or machines. It is quickly becoming an important sub-discipline within machine learning, where data – both structured and unstructured – is labeled to build models and improve their performance. And since “garbage in, garbage out” is the first rule of working with data, data that has been improperly or insufficiently annotated can render a machine learning effort fruitless.

Annotation is crucial at a technology company like Bloomberg, where massive amounts of financial data move through its data pipelines and analytics every day. Because the company’s engineers and data scientists leverage annotated data to create powerful machine learning models, Bloomberg clients can find exactly the information about global capital markets they are looking for at the moment they need it most. For example, when dividend quotes are displayed to a Bloomberg client on the Terminal, the majority of them have been extracted by a machine learning algorithm from financial documents – such as company press releases or stock exchange releases – that Bloomberg may have ingested, in a different format, only seconds earlier. Human curation cannot compete with this speed, but for the algorithm to select the most relevant new data, the model must first be trained on data that has been annotated by humans.

Machine learning models need annotations that are both valid and reliable, according to Tina Tseng, a Legal Analyst for Bloomberg Law. Not only must annotations be accurate, conforming to the user’s understanding of the task, but they must also be consistent, so that the model can recognize patterns in the data.
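
Reliability is typically checked by having more than one person annotate the same items and measuring inter-annotator agreement. As a minimal sketch – assuming a simple two-annotator, single-label setting with invented data – agreement can be quantified with Cohen’s kappa, which corrects raw agreement for chance:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labeled the same way.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: expected overlap if each annotator labeled at random
    # according to their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Invented example: two annotators tag the same five entity mentions.
annotator_1 = ["ORG", "LOC", "ORG", "ORG", "LOC"]
annotator_2 = ["ORG", "LOC", "LOC", "ORG", "LOC"]
print(round(cohens_kappa(annotator_1, annotator_2), 2))  # 0.62 – good but imperfect agreement
```

Low agreement is usually a signal to revisit the guidelines rather than a fault of the annotators.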

Tina has been managing annotation initiatives for the past decade. When co-workers approached her for advice on the subject, she realized that the successful execution of these projects is an essential component of machine learning that is often overlooked, because the skill is generally learned through practical experience rather than formal training.

Amanda Stent, NLP Architect and head of the People+Language AI team, which provides technical oversight for the company’s human computation strategy as part of the Data Science team in Bloomberg’s Office of the CTO, laments this gap in computer science education.

“I wish they taught annotation and data management best practices at universities, but they do not. A majority of computer science graduate students, including those that are involved with machine learning, use data that is handed to them, and I have met many Ph.D. graduates who don’t have the faintest clue about the data that they themselves spent three years working on while pursuing their Ph.D.”

Often, computer science Ph.D. graduates don’t have the domain knowledge required to fully understand the complex data sets they must leverage. Instead, they must rely on subject matter experts with deep knowledge of the field and problem to which they’re trying to apply machine learning (e.g., biology, finance, or law) to manage the annotation process. As a result, annotation is typically a team effort, so “soft” skills like effective communication and collaboration are required.

Tina and Amanda, together with Domenic Maida, the Chief Data Officer who leads Bloomberg’s Global Data department, recognized that Bloomberg would benefit by establishing best practices for annotation, and by making them accessible throughout the company and beyond. This resulted in the publication of “Best Practices for Managing Data Annotation Projects,” a practical guide to the planning, execution, and evaluation of this increasingly critical work. It captures a wealth of wisdom for applied annotation projects, collected from more than 30 experienced annotation project managers from different teams in Bloomberg’s Global Data department.

[Image: Cover of “Best Practices for Managing Data Annotation Projects” – the full annotation handbook is available to read and download.]

The guide provides detailed advice for each step of the annotation process, from preliminary tasks like identifying stakeholders, establishing goals, selecting communication methods, and weighing budgets and timelines, all the way through to quality assurance and the detection of data drift and other anomalies.
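
To give one deliberately simplified illustration of that last step: drift can be flagged by comparing the label distribution of newly annotated data against a historical baseline. The method, threshold, and labels below are assumptions made for the sketch, not recommendations from the guide:

```python
from collections import Counter

def label_distribution(labels):
    """Normalize label counts into a probability distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

def total_variation(p, q):
    """Half the L1 distance between two distributions: 0 = identical, 1 = disjoint."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(label, 0.0) - q.get(label, 0.0)) for label in support)

# Invented counts: the class mix in an older batch vs. a newly annotated batch.
baseline = label_distribution(["DIVIDEND"] * 70 + ["OTHER"] * 30)
recent = label_distribution(["DIVIDEND"] * 45 + ["OTHER"] * 55)

DRIFT_THRESHOLD = 0.15  # illustrative cutoff; a real project would tune this
if total_variation(baseline, recent) > DRIFT_THRESHOLD:
    print("Possible drift: sample recent data, review guidelines, re-check annotations")
```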

Tina and Amanda emphasize two best practices in particular. The first is the need for clear guidelines. If the workforce isn’t united in its understanding of what data to review and how to annotate it, problems can – and will – arise. Guidelines need to include nuances about the data, says Amanda.

“You might think something is really simple. Everybody knows what an ‘organization’ is. However, there are actually many wrinkles. Let’s take the name ‘New York,’ for example. Is it an organization or a location? Well, it depends. New York sold bonds – New York is an organization. I live in New York – New York is a location. And, is ‘New York’ part of the organization’s name if you have the New York Jets and the New York Mets?”
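
One common way to pin such nuances down is to bake worked examples directly into the guidelines. Below is a minimal sketch of how the “New York” cases might be recorded, assuming a simple character-span format; the schema is illustrative, not Bloomberg’s internal representation:

```python
# Invented guideline entries: the same surface string, labeled differently by context.
guideline_examples = [
    {
        "text": "New York sold bonds",
        "spans": [{"start": 0, "end": 8, "label": "ORG"}],
        "note": "A government entity acting as a bond issuer is an organization.",
    },
    {
        "text": "I live in New York",
        "spans": [{"start": 10, "end": 18, "label": "LOC"}],
        "note": "A plain geographic reference is a location.",
    },
    {
        "text": "The New York Jets won",
        "spans": [{"start": 4, "end": 17, "label": "ORG"}],
        "note": "Location words inside a team name belong to the organization span.",
    },
]

for example in guideline_examples:
    for span in example["spans"]:
        mention = example["text"][span["start"]:span["end"]]
        print(f"{mention!r} -> {span['label']}: {example['note']}")
```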

Large data sets can be riddled with similar tricky semantic wrinkles, resulting in the need for comprehensive guidelines that provide annotators with clear instructions for how to handle these subtleties. However, even the most detailed set of guidelines can be insufficient in a scenario where annotation needs are highly complex. For this reason, annotation project managers should always consider the type of workforce they are engaging: in-house employees, outside vendors, or a “crowd.” Highly specialized workforces will consume a larger portion of the budget, but may be needed for annotation tasks requiring deep expertise or where data privacy concerns restrict the type of workforce available. These are just a few of the decision criteria involved in planning an annotation project.

Second, the Best Practices document provides advice for evaluating the success of an organization’s annotation projects. To determine how a machine learning model is performing and where it can be improved, Tina emphasizes the value of concrete metrics.

“You cannot rely on anecdotal evidence; you need to have an understanding of how the overall data set is being treated by the model and use quantitative evaluation techniques to identify trends.”
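
In practice, that often means computing standard metrics – such as precision, recall, and F1 – per label or per data slice, so that weaknesses show up in the numbers rather than in anecdotes. The sketch below assumes a simple classification setting with invented labels and predictions:

```python
from collections import defaultdict

def per_label_metrics(gold, predicted):
    """Precision, recall, and F1 for each label in a simple classification setting."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # the model predicted p, but p was wrong here
            fn[g] += 1  # the gold label g was missed
    report = {}
    for label in set(gold) | set(predicted):
        precision = tp[label] / (tp[label] + fp[label]) if (tp[label] + fp[label]) else 0.0
        recall = tp[label] / (tp[label] + fn[label]) if (tp[label] + fn[label]) else 0.0
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
        report[label] = {"precision": round(precision, 2), "recall": round(recall, 2), "f1": round(f1, 2)}
    return report

# Invented gold annotations vs. model predictions.
gold = ["DIVIDEND", "DIVIDEND", "OTHER", "OTHER", "DIVIDEND"]
pred = ["DIVIDEND", "OTHER", "OTHER", "OTHER", "DIVIDEND"]
for label, metrics in per_label_metrics(gold, pred).items():
    print(label, metrics)
```

Tracking these numbers over time, rather than spot-checking individual outputs, is what makes trends visible.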

When discussing project evaluation, Amanda is quick to note that it’s a common misconception to think that annotation projects have a defined endpoint. Instead, she encourages project managers to think of these as ongoing, iterative tasks requiring continuous improvement over time.

“You constantly need to keep the data up to date, whether you are serving clients directly or keeping a machine learning model up to date, so it’s a beautiful end-of-the-rainbow fantasy that many project managers hold, that annotation projects will finish.”

Tina recommends constant evaluation of the strategic decisions made throughout the process, including how the project is defined, its guidelines, and the choice of workforce. “You should always be revisiting your decisions and re-evaluating whether they are producing the results that you want in the most efficient way.”

The annotation process can be a heavy lift when performed properly. But given its ongoing importance for any organization deploying machine learning models – regardless of industry – thoughtful planning and design pay substantial dividends over the long term.