How Successful Investors Are Using AI to Get ESG Data: A Quick Guide

By: Jorge Alvarez | November 16, 2022

Environmental, social, and governance (ESG) data. 

It’s a valuable tool that’s become a standard measurement in sustainable finance for corporate stakeholders.

However, given the growing demand for accurate and timely ESG data in investment decision-making and ESG finance, it’s also difficult to obtain.

And if you’re reading this, it’s because you likely use ESG data regularly and are looking to improve your data or insights into the data. Or you’re new to ESG data and want to understand it better and how to get accurate and timely data using AI. Whatever your reason, we’ve got you covered.

But before we dive into these points, let’s cover a quick history of ESG.

Who created ESG (plus when and why)

Kofi Annan, former United Nations Secretary-General, invited a group of financial institutions to develop policies and guidance on how to better incorporate ESG issues in securities brokerage services, asset management, and associated research functions. In 2004, this joint initiative published “Who Cares Wins: The Global Compact Connecting Financial Markets to a Changing World,” a report that the UN later shared in the 2006 United Nations Principles for Responsible Investment (PRI) report. It was the first time ESG criteria were incorporated into companies’ financial performance evaluations.

Stronger, more resilient, and sustainable

According to the “Who Cares Wins” report, the contributors were convinced that “in a more globalized, interconnected, and competitive world, the way that environmental, social, and corporate governance issues are managed is part of companies’ overall management quality needed to compete successfully.” The report goes on to state that “Companies that perform better with regard to these issues can increase shareholder value by, for example, properly managing [ESG risks], anticipating regulatory action or accessing new markets, while at the same time contributing to the sustainable development of the societies in which they operate.” 

The cohort believes ESG issues can significantly affect a company’s reputation and brand, an essential part of its value. And as the report puts it, “Endorsing institutions are convinced that a better consideration of environmental, social, and governance factors will ultimately contribute to stronger and more resilient investment markets, as well as contribute to the sustainable development of societies.”

As a standard measurement, ESG becomes a way for companies to demonstrate accountability, trust, and transparency in their ESG goals to appeal to customers, employees, and investors. But how is this data produced and seen?


Where does ESG data come from?

Besides implementing ESG principles and policies, companies are asked to provide information and reports on related performance in a consistent and standardized format. This ESG reporting includes identifying and communicating key challenges and value drivers through normal investor relations communication channels. Companies are also encouraged to mention ESG information in their annual reports.

As you might notice in this scenario, ESG data comes primarily from the very companies we want to evaluate. See a conflict here?


Today’s ESG data challenges

At their core, ESG metrics capture a company’s performance on a given ESG issue. When this aim is achieved, investors can use the data to evaluate and hold companies accountable for their ESG performance. But how would you know whether ESG data accurately capture a firm’s performance?

In the Journal of Applied Corporate Finance, Sakis Kotsantonis and George Serafeim share “Four Things No One Will Tell You About ESG Data.” Here’s a summary:

    1. ESG measurements, the underlying data, and the ways companies report them are inconsistent.
    2. Lack of benchmarking transparency undermines the reliability of peer performance ranking.
    3. ESG data providers deal with “data gaps” differently, and their gap-filling approaches could lead to significant discrepancies.
    4. Interpretation differences among ESG data providers are considerable and are growing with the quantity of data becoming publicly available.
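
To make the third point concrete, here is a minimal Python sketch of how two gap-filling choices can move the same company up or down a peer ranking. The company names, scores, and both fill rules are hypothetical and are not taken from the Kotsantonis and Serafeim article or from any real data provider.

```python
from statistics import mean

# Hypothetical ESG scores (0-100, higher is better); None marks a disclosure gap.
reported = {"AlphaCo": 80.0, "BetaCo": None, "GammaCo": 50.0, "DeltaCo": 90.0}

def fill_with_peer_mean(scores):
    """Assumed approach A: replace each gap with the mean of reported peer scores."""
    peer_mean = mean(v for v in scores.values() if v is not None)
    return {name: (peer_mean if v is None else v) for name, v in scores.items()}

def fill_with_floor(scores, floor=0.0):
    """Assumed approach B: treat non-disclosure as the worst possible score."""
    return {name: (floor if v is None else v) for name, v in scores.items()}

def rank(scores):
    """Return company names ordered from best to worst score."""
    return sorted(scores, key=scores.get, reverse=True)

rank_a = rank(fill_with_peer_mean(reported))
rank_b = rank(fill_with_floor(reported))
print("Peer-mean fill:", rank_a)
print("Floor fill:    ", rank_b)
```

In this toy data set, BetaCo lands mid-table under the peer-mean rule and last under the floor rule, even though nothing about the company changed.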

“Although 92% of S&P companies were reporting ESG metrics by the end of 2020, according to a 2020 BlackRock survey of clients, 53% of global respondents cited ‘poor quality or availability of ESG data and analytics’ and another 33% cited ‘poor quality of sustainability investment reporting’ as the two biggest barriers to adopting sustainable investing.” (Deloitte)


Artificial intelligence (AI) to meet rising ESG data demands

Even as investors consider ESG one of many major market factors, sourcing and analyzing data remains a problem. “The absence of standardized ESG datasets and reporting methodologies makes it difficult for issuers to disclose meaningful information on sustainability,” according to a post on WorldQuant.

But despite the limitations of ESG data, demand for ESG investing continues to grow. For instance, in its 2021 Key Findings, RBC Global Asset Management found that 75% of the 800-plus institutional investors it surveyed had integrated ESG principles into their investment approach, an increase from 67% since 2017.

Machine learning helps meet this demand. For instance, advances in machine-learning techniques for natural language processing (NLP) have made it possible to extract information from unstructured web sources, like news, blogs, forums, and social media, to gain timely and accurate ESG insights. This alternative data has been integral for seeing an entity’s ESG controversies or events in near real time, providing a unique perspective on ESG data and filling the data gaps more accurately.

How to get ESG data using natural language processing (NLP)

NLP algorithms can read billions of news articles and other text-based web data. They categorize the extracted data and can determine positive and negative sentiment, producing potential predictive indicators. Investors and researchers can use NLP to mine keywords and categories of underlying data to evaluate portfolio companies or see their exposure to ESG factors.
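
As a rough illustration of the categorize-and-score idea, the Python sketch below tags a headline with ESG categories using keyword lists and computes a simple lexicon-based sentiment score. The keyword sets, the scoring rule, and the "ExampleCorp" headlines are all assumptions made up for this example. Production NLP systems rely on trained models rather than hand-written word lists.

```python
import re
from collections import Counter

# Hypothetical keyword lists, for illustration only.
ESG_KEYWORDS = {
    "environmental": {"emissions", "pollution", "spill", "renewable"},
    "social": {"strike", "layoffs", "safety", "diversity"},
    "governance": {"fraud", "bribery", "board", "audit"},
}
POSITIVE = {"improved", "award", "renewable", "reduced"}
NEGATIVE = {"spill", "fraud", "strike", "fine", "bribery", "layoffs"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def categorize(text):
    """Return the ESG categories whose keywords appear in the text."""
    tokens = set(tokenize(text))
    return sorted(cat for cat, words in ESG_KEYWORDS.items() if tokens & words)

def sentiment(text):
    """Score from -1.0 (all negative hits) to 1.0 (all positive hits); 0.0 if no hits."""
    counts = Counter(tokenize(text))
    pos = sum(counts[w] for w in POSITIVE)
    neg = sum(counts[w] for w in NEGATIVE)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

headline = "Regulator issues fine after oil spill; audit finds fraud at ExampleCorp"
print(categorize(headline), sentiment(headline))
```

Run over a stream of articles, scores like these could be aggregated per company and per category to flag potential controversies for a human analyst to review.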

Some ESG rating agencies are now integrating or outsourcing NLP-derived datasets into their processes to extrapolate ESG scores. Likewise, investment firms, like asset managers, are incorporating NLP-enhanced web data into risk management, especially when looking into private-equity-type assets. Many are meeting their needs with NLP companies, such as SESAMm and others.

Learn more about NLP.

What makes SESAMm better at extracting ESG data

SESAMm is better at extracting ESG-related data for two main reasons. First, it draws on one of the largest collections of source data to extract from (its data lake). Second, its NLP machine-learning algorithms are tuned specifically to key indicators.

1. SESAMm’s massive data lake

What makes SESAMm’s data lake unique and ideal for investment research and advanced analytics? SESAMm’s data lake is:

      • Broad and large
      • Multilingual, covering more than 100 languages
      • Updated in near real time

Including data since 2008, the data lake consists of more than four million data sources made up of more than 20 billion articles, forums, and messages, such as professional news sites, blogs, and social media, increasing by an average of six million per day. The data lake is also updated hourly to give investors near real-time insights into their investment interests.

Moreover, the coverage is global, with 40% of the sources in English (U.S. and international) and 60% in other languages, including Japanese, Chinese, and Eastern European languages. We select and curate these sources to maximize coverage of both public and private companies, focusing on quality, quantity, and frequency to ensure a consistently high input value.

Learn more about data lakes.

2. SESAMm’s machine-learning process

SESAMm’s developers tune the machine-learning algorithms for key indicators such as mention volume, sentiment and emotion analysis, ESG, and SDG. Additionally, they optimize the data structure and schema for efficient SQL queries.
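
To show what indicators like mention volume and sentiment can look like once the data sits in a SQL-friendly schema, here is a small sketch using Python's built-in sqlite3 module. The table name, columns, index, and sample rows are hypothetical and do not describe SESAMm's actual schema.

```python
import sqlite3

# In-memory database with a hypothetical one-row-per-mention table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mentions (company TEXT, day TEXT, sentiment REAL)")
# An index on the grouping columns is one common way to speed up such queries.
conn.execute("CREATE INDEX idx_company_day ON mentions (company, day)")

rows = [
    ("ExampleCorp", "2022-11-14", -0.8),
    ("ExampleCorp", "2022-11-14", -0.4),
    ("ExampleCorp", "2022-11-15", 0.2),
    ("OtherCo", "2022-11-14", 0.5),
]
conn.executemany("INSERT INTO mentions VALUES (?, ?, ?)", rows)

# Daily mention volume and average sentiment per company.
daily = conn.execute(
    """
    SELECT company, day, COUNT(*) AS mention_volume, AVG(sentiment) AS avg_sentiment
    FROM mentions
    GROUP BY company, day
    ORDER BY company, day
    """
).fetchall()

for company, day, volume, avg_sentiment in daily:
    print(company, day, volume, round(avg_sentiment, 2))
```

A spike in mention volume paired with a drop in average sentiment is the kind of pattern an analyst might then investigate as a possible controversy.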

For example, our knowledge graph, a digital representation of a network of real-world entities, puts the schema in context through semantic metadata and linking, providing a framework for analytics, data integration, sharing, and unification. In other words, we map and label the concepts, entities, and events and connect and identify their relationships for quick and accurate recall.
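
The mapping-and-linking idea can be sketched in a few lines of Python. The class below is a toy knowledge graph with labeled entities and directed relations, and the entity names, labels, and relation types ("owns", "involved_in") are invented for illustration. It is not a description of SESAMm's knowledge graph.

```python
from collections import defaultdict

class KnowledgeGraph:
    """Toy graph: labeled entities connected by (predicate, object) edges."""

    def __init__(self):
        self.labels = {}                 # entity name -> label such as "Company"
        self._edges = defaultdict(list)  # subject -> list of (predicate, object)

    def add_entity(self, name, label):
        self.labels[name] = label

    def add_relation(self, subject, predicate, obj):
        self._edges[subject].append((predicate, obj))

    def related(self, subject, predicate):
        """Return all objects linked to the subject by the given predicate."""
        return [o for p, o in self._edges.get(subject, []) if p == predicate]

def company_events(graph, company):
    """Events tied to a company directly or through the brands it owns."""
    events = list(graph.related(company, "involved_in"))
    for brand in graph.related(company, "owns"):
        events.extend(graph.related(brand, "involved_in"))
    return events

kg = KnowledgeGraph()
kg.add_entity("ExampleCorp", "Company")
kg.add_entity("ExampleBrand", "Brand")
kg.add_entity("Oil spill (hypothetical)", "Event")
kg.add_relation("ExampleCorp", "owns", "ExampleBrand")
kg.add_relation("ExampleBrand", "involved_in", "Oil spill (hypothetical)")

print(company_events(kg, "ExampleCorp"))
```

Because the brand is linked to its parent company, an article that names only the brand can still be attributed to the company an investor actually holds.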

Learn more about knowledge graphs.


To learn how you can generate NLP-enhanced ESG data for your firm, or to request a demo, reach out today.