Innovation, or How Thomson Reuters Won the Other Information War


Shawn Malhotra

The brand that is now frequently cited as Canada’s most globally recognized was born in the mining towns of Northern Ontario in the early 1930s. Roy Thomson, the son of a Toronto barber, launched North Bay’s first radio station, then the Timmins Daily Press. Today, Thomson Reuters is the world’s largest multimedia news agency, providing business and international news, financial market data and regulatory information to the financial, legal, tax and accounting industries as well as a range of other content to consumers around the world.

Thomson Reuters has been serving professional knowledge workers in more than 100 countries for more than 100 years. As all good Canadians do, we bring a grounded, common-sense approach to the way we run our global business. But perhaps the most inherently Canadian trait we share with all enduring businesses is a healthy focus on innovation. 

We did ‘Big Data’ long before the term became corporate jargon. We began applying machine learning 15 years before Watson played Jeopardy. Today, our AI tools help separate fact from fiction on social media, enabling our journalists to filter out the noise and identify breaking news events. 

Ours is a great story, and a very Canadian one. Thomson began as a newspaper company in Northern Ontario in the 1930s. Over the decades, the business expanded across industries and borders, including a highly successful ownership stake in a North Sea oil and gas consortium. But the company’s leaders had the foresight to know that oil was a non-renewable resource. They needed to find the oil wells of the 21st century: information. 

Fast-forward to roughly 30 years ago: Thomson was a holding company that published about 200 newspapers, along with textbooks and professional journals, as well as the largest leisure travel business in the UK. Thomson understood the sweeping change the internet and the digitization of content would bring, and began divesting its print assets in favour of higher-margin digital information and professional services businesses. 

While Thomson remained, at its core, an information publishing company, early investment in electronic delivery became a corporate priority. At the time, the Thomson Corporation provided much of the specialized information content the world’s financial, legal and research organizations relied on to make business-critical decisions. In 2008, the company bought Reuters Group, a global financial information and news business.

Throughout our journey, we have been innovating and building a company designed to compete in the Information Age. Customers used to pay for printed volumes of need-to-know data. They moved on to networks of information stored in databases and delivered in electronic form. Today, they pay for the right answer, delivered at the moment they need it in their working lives. 

With the explosion of data and proliferation of “free” information, it has become harder than ever to extract true value from this wealth of opportunity. The foundation of our innovation efforts has been the work we do under the hood. Need-to-know proprietary data is important. But the key has always been the information architecture around the data that enables us to extract value-added meaning from the information. 

To make better use of the data we had, it needed to be “freed” from the silos it was created for and managed in. So, in the early 1990s, Thomson Reuters began phasing in artificial intelligence, natural language processing and machine-learning technologies with increasing sophistication. 

Back in 1975, the CN Tower had just been completed and journalists were still writing stories on typewriters and filing them by phone. There was no internet and there were no personal computers. West Publishing (a future division of Thomson) launched Westlaw, one of the first online legal research services. Attorneys used ‘dumb’ terminals to dial up to a mainframe. The content was limited (disk space was expensive) and the search language simplistic. Soon the search was enhanced to allow the use of Boolean terms. Full-text search only came much later. Over the next decade, the content expanded significantly, but the search engine technology remained much the same. 
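The Boolean retrieval model of that era can be sketched as an inverted index with set operations per query term. This is a minimal illustration, not Westlaw's actual implementation; the documents and queries below are invented:

```python
# Minimal sketch of Boolean retrieval over an inverted index.
# Illustrative only, not Westlaw's implementation; documents and
# queries are hypothetical.

docs = {
    1: "the court granted the motion to dismiss",
    2: "the appellate court reversed the lower court",
    3: "the motion for summary judgment was denied",
}

# Build an inverted index: term -> set of document ids.
index = {}
for doc_id, text in docs.items():
    for term in text.split():
        index.setdefault(term, set()).add(doc_id)

def boolean_and(*terms):
    """Documents containing every term (Boolean AND)."""
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()

def boolean_or(*terms):
    """Documents containing any term (Boolean OR)."""
    result = set()
    for t in terms:
        result |= index.get(t, set())
    return result

print(sorted(boolean_and("court", "motion")))   # → [1]
print(sorted(boolean_or("dismiss", "denied")))  # → [1, 3]
```

The model's limitation is visible here: a document either matches the Boolean expression or it does not, with no notion of which match is most relevant.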

In 1992, we launched the first commercial natural language search engine. It was the first search engine on the market to introduce probabilistic ranked searches for natural language queries, a form of machine learning. The program used statistics to estimate which answer a lawyer was most likely looking for. Before this, results were simplistically ordered, and users had to wade through long lists of irrelevant responses to find their answer. 
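The ranking idea can be sketched with a toy term-weighting scorer. TF-IDF is used here as a simplified stand-in for the probabilistic model described above, and the documents and query are invented for illustration:

```python
import math
from collections import Counter

# Toy TF-IDF ranking: a simplified stand-in for probabilistic ranked
# retrieval. Documents and the query are hypothetical.

docs = {
    "case_a": "negligence claim dismissed for lack of duty of care",
    "case_b": "breach of contract damages awarded to plaintiff",
    "case_c": "duty of care owed by employer negligence established",
}

def tokenize(text):
    return text.lower().split()

n_docs = len(docs)
# Document frequency: in how many documents each term appears.
df = Counter()
for text in docs.values():
    df.update(set(tokenize(text)))

def score(query, text):
    """Sum of tf * idf over query terms; higher means more relevant."""
    terms = tokenize(text)
    tf = Counter(terms)
    total = 0.0
    for q in tokenize(query):
        if q in df:
            idf = math.log(n_docs / df[q])
            total += (tf[q] / len(terms)) * idf
    return total

def rank(query):
    """Document ids ordered by descending relevance score."""
    return sorted(docs, key=lambda d: score(query, docs[d]), reverse=True)

print(rank("negligence duty of care"))  # → ['case_c', 'case_a', 'case_b']
```

Unlike Boolean retrieval, every document gets a graded score, so the best match surfaces first instead of leaving the user to sift an unordered list.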

We were also one of the earliest in the information industry to introduce full machine-assisted automation at scale for text-mining and content-enhancement technologies. This enabled the search of massive amounts of unstructured data and dramatically reduced the time it took to sift through hundreds of legal documents.  

By 2000, the internet and the web were growing exponentially. Machine-learning approaches to many information tasks were gaining more and more traction. We built machine-learning technology that enabled us to manage the massive scale of our data. And, in contrast to the general public, our customers (lawyers, accountants and bankers) had very specific information needs. 

In short, while a Google search allows users to ask a simple question and receive a factual compilation of information, a Westlaw search goes a step further: the software comes back to the inquirer with a complete set of jurisprudence. The sweat and hours that lawyers would have devoted to unearthing the individual pieces of information needed for understanding a precedent are now handled by our software. The platform includes natural language search, which further simplifies the way legal research is conducted, helping researchers find answers in one tenth of the time.

In the last twenty years, the amount of information has exploded—for all business professionals. The quantity is overwhelming, and it is accelerating rapidly. For context, more data has been created in the past two years than in the entire previous history of the human race. At Thomson Reuters, we now process and collect more data in a single day than we did in a month five years ago. Even as we move to the cloud, we still store 60,000 terabytes of data in our data centers. To put that in context, the U.S. Library of Congress contains 200 terabytes of data and the total size of Wikipedia is 3 terabytes. Thomson Reuters data is used to price $3 trillion in assets daily—nearly 2.5 million price updates per second.

We believe that the key to extracting value is to do more with data. In order to effectively use data, it’s important to understand how it connects to the real world. By using shared platforms and working across our businesses we are making our data more accessible and valuable for our customers, no matter how they access it. Our customers rely on us for the answers they need. 

Today, across Thomson Reuters, we use our subject matter experts, artificial intelligence and machine learning to continually improve how we find, extract, tag and structure data. We fuse world-class content with emerging technology and deep domain expertise, ensuring our answers stay ahead of the curve. We have transformed ourselves from a publishing company into an information and technology company. 

Our customers depend on knowing about events and risks that can affect their companies, their clients, markets or supply chains. Staying up-to-date is critical, but the amount of data produced daily is overwhelming. We use AI technologies to automatically consume and analyze the fire hose of data from news, markets and social media.

Behind the scenes, AI technologies have been deployed across Thomson Reuters. The vast data sources that we have create an almost unlimited number of opportunities for specialized information extraction. The various solutions further expanded our knowledge base and connected the content, making research easier and enabling new forms of analytics.

That is why we are investing in technology. Globally, we invest more than $3 billion per year in technology. We have more than 12,000 software engineers, systems architects and data scientists around the world who design and develop products that address the complex needs of conducting business in today’s world and advance our customers’ experiences. 

In 2016, we opened our Thomson Reuters Technology Centre in downtown Toronto, which is also home to our global Centre for AI and Cognitive Computing. The Toronto Technology Centre is expected to create 400 new technology jobs by the end of 2018 and up to 1,500 jobs in total. In the fall of 2017, we announced we are investing $100 million in a permanent location for our technology centre. 

Ten years after the first iPhone®, a new digital world powered by big data, cognitive computing and the cloud promises to change the way we live, work and interact. We have been a pioneer of digital product development for decades. From using blockchain to bring developing countries the confidence of secure land records, to using machine learning to help journalists and readers alike separate fact from fiction, we are applying cutting-edge technologies to emerging challenges.  


Shawn Malhotra is Vice President, Thomson Reuters Technology Centre.