NLP | AI | Text Analysis

What Investors Ought to Know About Knowledge Graphs: The Core of Text Analysis

June 2, 2022 • 5 mins read

Researching and analyzing investment opportunities can be challenging for asset management professionals, including private equity and hedge fund portfolio managers, researchers, and analysts, because you want to be a good steward of your clients' investments.

And when you find and source data, such as traditional or alternative data, you also want to make sure it's reliable and that the methods used to gather it are tried and true.

This article aims to give you an inside look into SESAMm's knowledge graph—one of the key reasons SESAMm's NLP-derived alternative data is reliable and trusted. We'll explain what a knowledge graph is, why it's important, how it works, and what makes SESAMm's knowledge graph unique.

What is a knowledge graph?

A knowledge graph is a digital representation of a network of real-world entities and the relationships between them. It often serves as the foundation of a search engine or question-answering service. This structured data model puts data in context through linking and semantic metadata, providing a framework for data integration, analytics, unification, and sharing. In other words, it's like a map and legend, with the legend labeling the concepts, entities, and events and the map connecting and identifying their relationships. These details are stored in a graph database and visualized as a graph representation, hence the term knowledge graph.

Fun fact: The term "knowledge graph" gained popularity after Google used it in 2012 to name its semantic network.

Two types of knowledge graphs

There are two general types of knowledge graphs: open and private. Open knowledge graphs are open to the public. They're created and made available by projects such as Wikidata, DBpedia, and YAGO. Private knowledge graphs are often used only by the organizations that create them, such as Google, WolframAlpha, Facebook, and SESAMm (of course). Some organizations, such as Crunchbase and OpenCorporates, offer theirs for a fee or subscription.

Why a knowledge graph is important

Knowledge graphs are important because they equip us with a model to see how everything relates from a big-picture view, creating new knowledge. Their benefits include:

Incorporating disparate data sources, avoiding data silos
Integrating structured and unstructured data
Revealing insights from hierarchical data
Outlining relationships
Defining communities

Knowledge graphs inform machine learning algorithms

From a data science and artificial intelligence (AI) perspective, knowledge graphs provide machine-readable details, adding context and depth to data-driven AI techniques such as machine learning. Using knowledge graphs and machine learning models together improves system accuracy and extends the range of machine learning capabilities for better explainability and trustworthiness.

How a knowledge graph works

The core of a knowledge graph is its knowledge model, a collection of interconnected descriptions of concepts, entities, events, and relationships known as an ontology. This model provides a framework for statements and taxonomies. Each statement consists of a subject, a predicate, and an object (Figure 1), a structure known as a triple, and each subject or object is represented only once in the graph, in the context of the other entities and their relationships. For example, in the simple sentence "The boy kicks the ball," the boy is the subject, kicks is the predicate, and the ball is the object.

*Figure 1: Apple is the subject, chief executive officer is the predicate, and Tim Cook is the object.*
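To make the triple idea concrete, here is a minimal Python sketch that stores the Figure 1 statement as a subject-predicate-object record. The `Triple` class and the `objects_of` helper are hypothetical names chosen for illustration; this is not SESAMm's implementation, only one simple way to model such statements.

```python
from typing import List, NamedTuple, Set


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# A set keeps each statement only once, even if it is added twice.
triples: Set[Triple] = set()
triples.add(Triple("Apple", "chief executive officer", "Tim Cook"))
triples.add(Triple("Apple", "chief executive officer", "Tim Cook"))  # duplicate, ignored


def objects_of(subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` by `predicate`."""
    return sorted(t.obj for t in triples if t.subject == subject and t.predicate == predicate)


print(len(triples))
print(objects_of("Apple", "chief executive officer"))
```

Asking for the objects of ("Apple", "chief executive officer") returns Tim Cook, which is the kind of lookup a question-answering service would build on.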

Likewise, each statement maps to three graph components: nodes, edges, and labels. A node, or vertex, represents an entity, which can be anything existing in the real world, such as a person, company, or object. For instance, in this example (Figure 2), Barack Obama is the subject node, Malia and Sasha are object nodes, and the edges, or relationships, are labeled father (from Barack Obama to each daughter) and sibling (between Malia and Sasha).

*Figure 2: How the relationships between nodes can be labeled.*
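The node, edge, and label structure can be sketched as a small labeled graph in Python. The function names `add_edge` and `neighbours` are assumptions made for this example, and a production system would typically use a graph database rather than an in-memory dictionary.

```python
from collections import defaultdict

# Maps each node to a list of (edge label, neighbouring node) pairs.
graph = defaultdict(list)


def add_edge(source, label, target):
    """Add a directed, labeled edge from `source` to `target`."""
    graph[source].append((label, target))


add_edge("Barack Obama", "father", "Malia")
add_edge("Barack Obama", "father", "Sasha")
add_edge("Malia", "sibling", "Sasha")
add_edge("Sasha", "sibling", "Malia")


def neighbours(node, label):
    """Return the nodes reachable from `node` over edges with this label."""
    return [target for edge_label, target in graph.get(node, []) if edge_label == label]


print(neighbours("Barack Obama", "father"))
print(neighbours("Malia", "sibling"))
```

Following the father edges from Barack Obama reaches Malia and Sasha, while the sibling edges connect the two daughters to each other.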

What makes SESAMm's knowledge graph unique?

SESAMm combines open and private datasets with custom, curated information to create its proprietary knowledge graph. The result is a vast map connecting and integrating over 70 million related entities and their keywords. It relates each organization to its brands, products, associated executives, names, nicknames, and, in the case of public companies, exchange identifiers. The underlying data repository holds more than 18 billion articles and messages and is still growing.

The knowledge graph is updated regularly

Entities within the knowledge graph are updated weekly and tagged to ensure we correctly track their changes. For instance, the CEO of a company today might not be its CEO tomorrow. And brands might be bought and sold, changing the parent company with each sale. So, weekly updates within the knowledge graph ensure the system is aware of these changes.
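One common way to keep track of facts that change over time is to attach a validity period to each statement. The sketch below assumes that approach. The company, the people, the dates, and the names `DatedFact` and `lookup` are all hypothetical, and nothing here describes how SESAMm actually stores its updates.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DatedFact:
    subject: str
    predicate: str
    obj: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the fact is still true


# Hypothetical company and people, for illustration only.
facts = [
    DatedFact("ExampleCorp", "chief executive officer", "Alice",
              date(2015, 1, 1), date(2021, 6, 30)),
    DatedFact("ExampleCorp", "chief executive officer", "Bob",
              date(2021, 7, 1)),
]


def lookup(subject, predicate, on):
    """Return the object that was valid for (subject, predicate) on the given date."""
    for fact in facts:
        started = fact.valid_from <= on
        not_ended = fact.valid_to is None or on <= fact.valid_to
        if fact.subject == subject and fact.predicate == predicate and started and not_ended:
            return fact.obj
    return None


print(lookup("ExampleCorp", "chief executive officer", date(2020, 3, 1)))
print(lookup("ExampleCorp", "chief executive officer", date(2022, 3, 1)))
```

With dated facts, an article from 2020 is matched to the CEO at that time, and a recent article is matched to the current one.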

NLP-driven accuracy

At SESAMm, named entity disambiguation (NED), a natural language processing (NLP) technique, identifies named entities based on their context and usage. Text referencing "Elon," for example, could refer indirectly to Tesla through its CEO or to a university in North Carolina. Only the context allows us to differentiate, and NED considers that context when classifying entities. This method is superior to simple pattern matching, which limits the number of possible matches, requires frequent manual adjustments, and can't distinguish homonyms, words that are spelled the same but mean different things.
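To illustrate why context matters, here is a deliberately simple Python sketch that picks between the two readings of "Elon" by counting overlapping context words. The candidate list and its keywords are hand-picked assumptions, and real NED systems, including SESAMm's, rely on far richer models than this keyword overlap.

```python
import re

# Hypothetical candidate entities with hand-picked context keywords.
CANDIDATES = {
    "Tesla (via its CEO)": {"tesla", "ceo", "electric", "vehicle", "shares"},
    "Elon University": {"university", "campus", "students", "carolina"},
}


def disambiguate(text):
    """Pick the candidate whose keywords overlap most with the text.

    Returns None when no candidate shares any keyword with the text.
    """
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {name: len(keywords & tokens) for name, keywords in CANDIDATES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate("Elon said the electric vehicle maker will raise prices"))
print(disambiguate("Elon welcomed new students to its North Carolina campus"))
```

The same mention resolves to different entities depending on the surrounding words, which plain pattern matching on "Elon" cannot do.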

SESAMm uses three other NLP tools to identify entities and create actionable insights: lemmatization, embeddings, and similarity. The lemmatization process normalizes a word into its base form (morphology) to help identify and aggregate entities. Embedding represents a word or entity as a numerical vector, which helps analyze how words change meaning depending on context and capture the subtle differences between words that refer to the same concept. Similarity measures whether two words, sentences, or objects are close to one another in meaning.
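Similarity between embeddings is often measured with cosine similarity, which is the measure assumed in the sketch below. The three-dimensional vectors are made up for illustration, since real embeddings come from a trained model and have hundreds of dimensions.

```python
import math

# Made-up 3-dimensional vectors; real embeddings are learned, not hand-written.
embeddings = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(cosine_similarity(embeddings["car"], embeddings["automobile"]))
print(cosine_similarity(embeddings["car"], embeddings["banana"]))
```

In this toy setup, "car" and "automobile" score close to 1, while "car" and "banana" score close to 0.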

Learn more: “Gain Insights From Financial and ESG Data Using AI: A Comprehensive Guide.”

How SESAMm's knowledge graph benefits you

SESAMm tailored its knowledge graph to find, extract, and analyze data about public and private entities that isn't readily available from the web or standard rating firms. This implementation provides insights that give you an edge when researching, analyzing, and submitting recommendations to portfolio managers or clients.

SESAMm's premier platform, TextReveal®, allows you to leverage NLP-driven insights fully and receive high-quality results through data streams, a modular API, dashboard visualization, and signals and alerts. It's well suited to many quantitative, quantamental, and ESG investment use cases.

Learn how SESAMm can support you in your investment decision-making and request a demo today.

Related Blogs

ESG | NLP | Risk Alerts

S&P 500 ESG Index Drops Tesla: This Analysis Supports the Decision

July 6, 2022 • 5 mins read

May 2, 2022. The S&P 500 ousts Tesla, Inc. from the S&P 500 ESG Index. Tesla is widely recognized as the firm that ushered electric vehicle making into the mainstream. So, to many, the index’s move seems unreasonable or possibly made in error, raising some interesting questions:

How does an environmentally friendly corporation like Tesla get dropped from an ESG index?
Why does a potentially environmentally unfriendly company like Exxon make the ESG index and remain on it?
What do these moves mean about the integrity and validity of ESG scores and ratings?

Before we go on, let’s add some context.

Why did the S&P 500 ESG Index drop Tesla?

May 18, 2022. In an S&P blog post, "The (Re)Balancing Act of the S&P 500 ESG Index," a spokesperson announces and explains their decision. Here are the bullet points:

Global industry group peers pushed Tesla’s S&P DJI ESG Score further down the ranks in the GICS industry group: Automobiles & Components.
A decline in criteria level scores related to Tesla’s low carbon strategy and codes of business conduct contributed to its 2021 S&P DJI ESG Score.
A media and stakeholder analysis identified "two separate events centered around claims of racial discrimination and poor working conditions at Tesla’s Fremont factory."
The analysis also highlights "the handling of the NHTSA investigation after multiple deaths and injuries were linked to its autopilot vehicles, affecting the company’s S&P DJI ESG Score at the criteria level, and its overall score."

Companies, including Tesla, left out of the S&P 500 ESG Index post-rebalance. Image courtesy of Indexology Blog.

The S&P blog post summarizes the case for dropping Tesla: "While Tesla may be playing its part in taking fuel-powered cars off the road, it has fallen behind its peers when examined through a wider ESG lens." In this statement lies the crux of why the index dropped Tesla and why others remain on it.

Analyzing Tesla’s web data

SESAMm’s TextReveal® insights suggest that the S&P 500’s decision to remove Tesla could be justified based on increasing controversy levels concerning discrimination, ethical standards, and work health and safety. By analyzing text related to ESG topics across the web, we picked up trends for the following subtopics:

climate_change_atmospheric_pollution
ethical_standards
discrimination_racism_sexism
labor_standards
health_and_safety_at_work
general_environmental_impact

Tesla’s ESG scores (six subtopics)

ESG scores, 1-year moving average, Tesla, all source types — *Figure 1: Tesla ESG scores for volumes and sentiments (1-year moving average), all source types.*

Regarding the volume features (Figure 1), we observed a significant increase in the scores related to ethical standards, discrimination, and atmospheric pollution for Tesla before the controversy. The conclusions are mostly the same for ESG sentiment (negative) scores. An interesting note is that the negative score of health and safety at work slightly increased in the months before the removal of Tesla from the index.

ESG scores, 1-year moving average, Tesla, all source types, select subtopics — *Figure 2: Tesla ESG scores for volumes and sentiments (1-year moving average), all source types, select subtopics.*

Comparing Tesla’s sentiment with other S&P 500 ESG Index companies

To see how Tesla’s ESG sentiment scores compare with those of other companies, we must rescale them with respect to a large universe of companies. For a given company, this means converting each subtopic’s ESG score to its percentile within the distribution of that score across the S&P 500 ESG constituents list after the 2022 rebalancing. Rescaling allows us to compare the companies with each other because the rescaled score indicates how badly a company fares compared with the others on a specific ESG subtopic.
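The rescaling step can be sketched in Python as a percentile rank: each raw score is replaced by the share of the universe scoring at or below it. The company names and scores below are made up, the exact percentile convention is an assumption, and the helper names `percentile_rank` and `mean_rescaled` are hypothetical.

```python
def percentile_rank(value, universe):
    """Share of the universe (0 to 100) with a score at or below `value`."""
    at_or_below = sum(1 for score in universe if score <= value)
    return 100.0 * at_or_below / len(universe)


# Made-up raw scores for one ESG subtopic; not real data.
raw_scores = {"Company A": 0.12, "Company B": 0.45, "Company C": 0.30, "Company D": 0.80}

universe = list(raw_scores.values())
rescaled = {name: percentile_rank(score, universe) for name, score in raw_scores.items()}


def mean_rescaled(per_subtopic_scores):
    """Average a company's rescaled scores when several subtopics are combined."""
    return sum(per_subtopic_scores) / len(per_subtopic_scores)


print(rescaled)
print(mean_rescaled([25.0, 75.0]))
```

Because every company ends up on the same 0 to 100 scale, a higher rescaled score means more negative signal relative to the rest of the universe, whatever the raw units were.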

The following graphs show different sets of subtopics, plotting the mean of the respective rescaled scores if several topics are considered. Here are the companies considered.

Companies removed from the index:

Tesla
Delta Air Lines
Chevron Corporation

Companies that joined the index after the 2022 rebalancing:

American International Group
Expedia Group

Companies still part of the index:

Exxon Mobil
Apple
Amazon

Tesla, Delta, Chevron, AIG, and Expedia compared

*Figure 3: Six-subtopic rescaled scores for Tesla, Delta, Chevron, AIG, and Expedia.*

Apple, Amazon, and Exxon compared

The S&P 500’s choice is reasonable

Our analysis shows that the S&P 500’s decision to oust Tesla from the ESG index is reasonable. We found significant subtopic volumes and negative sentiment that support the S&P 500’s claims of racial discrimination, poor working conditions, and other controversies.

Thanks for reading this quick analysis. For a more detailed report, including Chevron’s and Delta’s ESG scores, reach out to a representative today.

SESAMm’s ready-to-use alternative data

Leverage our alternative data streams to incorporate systematic insights into your alpha signals or to monitor risk across your entire portfolio. From tracking global sentiment to analyzing retail communities like WallStreetBets and integrating ESG alternative data into your systems, our solutions make it easy to generate value from web insights.

NLP | Alternative Data | AI

Summer Roundup: Our 10 Most-Read Blog Posts This Year (So Far)

September 7, 2022 • 5 mins read

Summer is almost over for us in the northern hemisphere. (We know. It's sad for us, too.) And with this seasonal shift come back-to-school and back-to-work activities, including taking a last-minute vacation. And vacations mean time for reading, right?

While they may not be beach reads, we think we have some great choices. These are the posts that have been most popular on SESAMm's blog in the past five months. Let's get started with SESAMm's most-read blog posts since this spring, starting with number 10.

#10 What Investors Ought to Know About Natural Language Processing: A Quick Guide

Read this quick guide about what natural language processing is, how it’s used, and why it's important for uncovering financial alternative data. Bonus: Get an overview of how NLP works at SESAMm.

#9 S&P 500 ESG Index Drops Tesla: This Analysis Supports the Decision

Review SESAMm's analysis based on its ready-to-use data streams, revealing red flags that support the decision to oust Tesla, Inc. from the S&P 500 ESG Index.

#8 Alternative Data Trends: NLP Analysis on Commercial Real Estate

Read the takeaways of the current commercial real estate market we extracted using SESAMm’s NLP-powered engine to analyze web data.

#7 VIDEO: ESG Data Challenges and How AI and NLP Offer Solutions

Watch CEO Sylvain Forté at Japan Investor Forum, discussing ESG data, its challenges, and how to use AI and NLP to generate insights on millions of companies.

#6 How Organizations Are Using NLP To Detect Greenwashing

See how we apply our NLP capabilities to identify companies likely to engage in greenwashing practices by analyzing text in billions of web-based articles.

#5 Alternative Data Trends: 5 Effects of the Failing Musk-Twitter Deal

Based on alternative data, discover how Elon Musk’s personal and related brands measure up to public sentiment following his failed acquisition of Twitter.

#4 What Investors Ought to Know About Data Lakes: A Quick Guide

Discover why SESAMm’s data lake is ideal for investment research and other basics like what a data lake is, why it’s important, what it does, and how it works.

#3 Gain Insights From Financial and ESG Data Using AI: A Comprehensive Guide

Learn how SESAMm’s AI and NLP platform is used to gain financial and ESG insights from alternative data for systematic trading, fundamental research, and more.

#2 What Investors Ought to Know About Knowledge Graphs: The Core of Text Analysis

Learn what SESAMm’s Knowledge Graph is, what it does, and how it’s used in text analysis for financial research, such as in private equity and hedge funds.

#1 Predicting stock price movements using news and social media data

Tokio Marine & Nichido Fire Insurance Company and SESAMm work together to predict stock price movements using NLP-generated data from news and social media.

Thank you for reading through our Summer Roundup: the 10 most-read blog posts this year.

Which is your favorite? How would you rate these posts? Let us know what you think on Twitter or LinkedIn.

Webinar: Secondaries Investing

03/13/2026 • 5 mins read

Secondaries investors evaluate large, diversified portfolios under compressed timelines, with the level of detail and underlying company visibility differing by transaction type.

In this context, screening is embedded in the underwriting workflow, not a one-off exercise: it helps apply investment guidelines, support LP opt-outs, prioritize follow-up diligence, and enable ongoing monitoring over the life of the investment.

Watch this webinar replay to hear Jessica Huang, Private Equity and Secondaries ESG Lead at Ares Management, and Sylvain Forté, CEO at SESAMm, discuss:

The operational and data challenges secondaries teams face
How screening is applied in secondaries investing in practice
How AI helps teams scale screening and support ongoing monitoring workflows
