Humans in the Loop: AI & Machine Learning in the Bloomberg Terminal
Originally published on bloomberg.com
The Bloomberg Terminal provides access to more than 35 million financial instruments across all asset classes. That’s a lot of data, and to make it useful, AI and machine learning (ML) are playing an increasingly central role in the Terminal’s ongoing evolution.
Machine learning is about scouring data at a speed and scale far beyond what human analysts can manage. The patterns or anomalies it discovers can then be used to derive powerful insights and to automate all kinds of arduous or tedious tasks that humans used to perform manually.
While AI continues to fall short of human intelligence in many applications, there are areas where it vastly outshines the performance of human agents. Machines can identify trends and patterns hidden across millions of documents, and this ability improves over time. Machines also behave consistently, in an unbiased fashion, without committing the kinds of mistakes that humans inevitably make.
“Humans are good at doing things deliberately, but when we make a decision, we start from whole cloth,” says Gideon Mann, Head of ML Product & Research in Bloomberg’s CTO Office. “Machines execute the same way every time, so even if they make a mistake, they do so with the same error characteristic.”
The Bloomberg Terminal currently employs AI and ML techniques in several exciting ways, and we can expect this practice to expand rapidly in the coming years. The story begins some 20 years ago…
Keeping Humans in the Loop
When we started in the 80s, data extraction was a manual process. Today, our engineers and data analysts build, train, and use AI to process unstructured data at massive speeds and scale — so our customers are in the know faster.
The rise of the machines
Prior to the 2000s, all tasks related to data collection, analysis, and distribution at Bloomberg were performed manually, because the technology did not yet exist to automate them. The new millennium brought some low-level automation to the company’s workflows, with the emergence of primitive models operating by a series of if-then rules coded by humans. As the decade came to a close, true ML took flight within the company. Under this new approach, humans annotate data in order to train a machine to make various associations based on their labels. The machine “learns” how to make decisions, guided by this training data, and produces ever more accurate results over time. This approach can scale dramatically beyond traditional rules-based programming.
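To make the contrast between hand-coded rules and learned models concrete, here is a minimal sketch in Python. It is an illustration only, not Bloomberg's implementation: the headlines, the labels ("m&a" and "other"), and the choice of a tiny naive Bayes classifier are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def rule_based_is_merger(headline: str) -> bool:
    """Hand-coded if-then rule: fires only on words a human listed in advance."""
    text = headline.lower()
    return "merger" in text or "acquires" in text


def train_naive_bayes(labeled_headlines):
    """Learn per-label word counts from (headline, label) pairs annotated by humans."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for headline, label in labeled_headlines:
        label_counts[label] += 1
        word_counts[label].update(headline.lower().split())
    return word_counts, label_counts


def predict(model, headline: str) -> str:
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts = model
    vocab = {word for counts in word_counts.values() for word in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in headline.lower().split():
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical annotated training data (invented company names and headlines).
TRAINING_DATA = [
    ("acme acquires widget maker", "m&a"),
    ("globex to buy initech in cash deal", "m&a"),
    ("hooli agrees takeover of pied piper", "m&a"),
    ("acme reports quarterly earnings", "other"),
    ("globex names new chief executive", "other"),
    ("initech opens new office", "other"),
]

model = train_naive_bayes(TRAINING_DATA)
headline = "umbrella agrees to buy soylent in takeover deal"
rule_verdict = rule_based_is_merger(headline)
learned_verdict = predict(model, headline)
```

The rule misses this headline because nobody anticipated the phrasing "agrees to buy", while the trained model picks up on words it saw in annotated examples. Adding more labeled examples improves the learned model without anyone writing new rules, which is the scaling advantage described above.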
In the last decade, there has been explosive growth in the use of ML applications within Bloomberg. According to James Hook, Head of the company’s Data department, there are a number of broad applications for AI/ML and data science within Bloomberg.
One is information extraction, where computer vision and/or natural language processing (NLP) algorithms are used to read unstructured documents — data that’s arranged in a format that’s typically difficult for machines to read — in order to extract semantic meaning from them. With these techniques, the Terminal can present insights to users that are drawn from video, audio, blog posts, tweets, and more.
Anju Kambadur, Head of Bloomberg’s AI Engineering group, explains how this works:
“It typically starts by asking questions of every document. Let’s say we have a press release. What are the entities mentioned in the document? Who are the executives involved? Who are the other companies they’re doing business with? Are there any supply chain relationships exposed in the document? Then, once you’ve determined the entities, you need to measure the salience of the relationships between them and associate the content with specific topics. A document might be about electric vehicles, it might be about oil, it might be relevant to the U.S., it might be relevant to the APAC region — all of these are called ‘topic codes’ and they’re assigned using machine learning.”
All of this information, and much more, can be extracted from unstructured documents using natural language processing models.
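The questions Kambadur describes can be pictured as filling in a structured record for each document. The Python sketch below shows only the shape of such a record. Simple dictionary lookups stand in for the trained NLP models the article refers to, and the company names, topic code names, and the mention-share definition of salience are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical lookup tables; a real system would use trained models instead.
KNOWN_COMPANIES = {"acme motors", "voltcell"}
TOPIC_KEYWORDS = {
    "ELECTRIC_VEHICLES": {"ev", "electric", "battery"},
    "OIL": {"oil", "crude"},
}


@dataclass
class Enrichment:
    entities: dict = field(default_factory=dict)     # entity name -> salience in [0, 1]
    topic_codes: list = field(default_factory=list)  # sorted topic code names


def enrich(document: str) -> Enrichment:
    """Answer two of the questions: which entities appear, and which topics apply."""
    text = document.lower()
    # Salience here is just each entity's share of all entity mentions (an assumption).
    mentions = {name: text.count(name) for name in KNOWN_COMPANIES}
    total = sum(mentions.values())
    entities = {name: n / total for name, n in mentions.items() if n > 0}
    # A topic code applies if any of its keywords appears as a word in the text.
    words = set(text.replace(".", " ").replace(",", " ").split())
    topics = sorted(code for code, keywords in TOPIC_KEYWORDS.items() if words & keywords)
    return Enrichment(entities, topics)


press_release = (
    "Acme Motors signed a battery supply deal with VoltCell. "
    "Acme Motors will use the cells in its electric trucks."
)
result = enrich(press_release)
```

In this toy example, Acme Motors receives the higher salience because it is mentioned twice, and the document is tagged with the electric vehicles topic code but not the oil one.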
Another area is quality control, where techniques like anomaly detection are used to spot problems such as inaccuracies in a dataset. Using anomaly detection methods, the Terminal can also surface a potentially hidden investment opportunity or flag suspicious market activity. For example, if a financial analyst were to change their rating of a particular stock following the company’s quarterly earnings announcement, anomaly detection could provide context on whether this is typical behavior or whether the change is worth presenting to Bloomberg clients as a data point to consider in an investment decision.
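One simple way to ask "is this typical behavior?" is a z-score test against an analyst's own history. The article does not say which anomaly detection methods Bloomberg uses, so the Python sketch below is an assumption throughout: the rating scale, the history of past changes, and the threshold of 3 standard deviations are all invented for illustration.

```python
from statistics import mean, stdev


def z_score(history, new_value):
    """How many standard deviations new_value lies from the historical mean."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least two points
    if sigma == 0:
        return 0.0 if new_value == mu else float("inf")
    return (new_value - mu) / sigma


def is_anomalous(history, new_value, threshold=3.0):
    """Flag the new value if it is more than `threshold` deviations from the mean."""
    return abs(z_score(history, new_value)) > threshold


# Hypothetical history: how many notches this analyst moved a rating after
# each past earnings announcement (+1 = one-notch upgrade, -1 = downgrade).
past_changes = [1, -1, 0, 1, 0, -1, 1, 0]

routine_change = is_anomalous(past_changes, 1)    # a one-notch upgrade
unusual_change = is_anomalous(past_changes, -4)   # a four-notch downgrade
```

Under these assumptions, the one-notch upgrade is treated as routine, while the four-notch downgrade sits far outside the analyst's usual pattern and would be a candidate for surfacing to clients.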
And then there’s insight generation, where AI/ML is used to analyze large datasets and unlock investment signals that might not otherwise be observed. One example is using data that correlates closely with company results, such as credit card transactions, to gain visibility into recent company performance and consumer trends. Another is analyzing and summarizing the millions of news stories that are ingested into the Bloomberg Terminal each day to understand the key questions and themes driving specific markets, economic sectors, or trading volume in a specific company’s securities.
Humans in the loop
When we think of machine intelligence, we imagine an unfeeling autonomous machine, cold and impartial. In reality, however, the practice of ML is very much a team effort between humans and machines. Humans, for now at least, still define ontologies and methodologies, and perform annotations and quality assurance tasks. Bloomberg has moved quickly to increase staff capacity to perform these tasks at scale. In this scenario, the machines aren’t replacing human workers; they are shifting those workers’ workflows away from tedious, repetitive tasks and toward higher-level strategic oversight.
“It’s really a transfer of human skill from manually extracting data points to thinking about defining and creating workflows,” says Mann.
Ketevan Tsereteli, a Senior Researcher in Bloomberg Engineering’s Artificial Intelligence (AI) group, explains how this transfer works in practice.
“Previously, in the manual workflow, you might have a team of data analysts that would be trained to find mergers and acquisition news in press releases and to extract the relevant information. They would have a lot of domain expertise on how this information is reported across different regions. Today, these same people are instrumental in collecting and labeling this information, and providing feedback on an ML model’s performance, pointing out where it made correct and incorrect assumptions. In this way, that domain expertise is gradually transferred from human to machine.”
Humans are required at every step to ensure the models are performing optimally and improving over time. It’s a collaborative effort involving ML engineers who build the learning systems and underlying infrastructure, AI researchers and data scientists who design and implement workflows, and annotators — journalists and other subject matter experts — who collect and label training data and perform quality assurance.
“We have thousands of analysts in our Data department who have deep subject matter expertise in areas that matter most to our clients, like finance, law, and government,” explains ML/AI Data Strategist Tina Tseng. “They not only understand the data in these areas, but also how the data is used by our customers. They work very closely with our engineers and data scientists to develop our automation solutions.”
Annotation is critical, not just for training models, but also for evaluating their performance.
“We’ll annotate data as a truth set, what they call a ‘golden’ copy of the data,” says Tseng. “The model’s outputs can be automatically compared to that evaluation set so that we can calculate statistics to quantify how well the model is performing. Evaluation sets are used in both supervised and unsupervised learning.”
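The comparison Tseng describes can be sketched in a few lines of Python. Precision, recall, and F1 are common choices for such statistics, though the article does not say which ones Bloomberg calculates. The document IDs and labels below are invented for the example.

```python
def evaluate(golden: dict, predicted: dict, positive_label: str) -> dict:
    """Compare model outputs to a human-annotated golden set for one label."""
    tp = fp = fn = 0
    for doc_id, truth in golden.items():
        guess = predicted.get(doc_id)  # None if the model produced no output
        if guess == positive_label and truth == positive_label:
            tp += 1  # model and annotators agree on a positive
        elif guess == positive_label:
            fp += 1  # model said positive, annotators disagreed
        elif truth == positive_label:
            fn += 1  # model missed a positive the annotators found
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical golden set (human annotations) and model outputs.
golden = {"d1": "m&a", "d2": "m&a", "d3": "other", "d4": "m&a", "d5": "other"}
predicted = {"d1": "m&a", "d2": "other", "d3": "m&a", "d4": "m&a", "d5": "other"}

scores = evaluate(golden, predicted, positive_label="m&a")
```

Because the golden set is fixed, the same comparison can be rerun after every model change, which turns "is the model getting better?" into a number that can be tracked over time.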
Check out “Best Practices for Managing Data Annotation Projects,” a practical guide published by Bloomberg’s CTO Office and Data department about planning and implementing data annotation initiatives.