Ebook: Unmasking Greenwashing: How to Identify Genuine and Deceiving Sustainability Initiatives with AI
November 15, 2023
In our latest research, “Unmasking Greenwashing with AI,” our ESG and Research & Analytics teams provide a comprehensive analysis of greenwashing trends using AI-powered text analysis.
Notable increase in mentions of greenwashing, with a 3.3x rise since 2021.
This increase suggests both a real growth in deceptive sustainability practices and a rise in public awareness.
Greenwashing represents 55% of all reputational laundering, underscoring a major shift towards environmental deception.
Climate change lawsuits have more than tripled since 2020, highlighting the legal risks of misleading practices.
The research also underscores the financial sector’s dual role in both contributing to and fighting greenwashing through its investment practices.
Researching and analyzing investment opportunities can be challenging for asset management professionals (private equity and hedge fund portfolio managers, researchers, and analysts) because you want to be a good steward of your clients' investments.
And when you find and source data, such as traditional or alternative data, you also want to make sure it's reliable and that the methods used to gather it are tried and true.
This article aims to give you an inside look into SESAMm's knowledge graph—one of the key reasons SESAMm's NLP-derived alternative data is reliable and trusted. We'll explain what a knowledge graph is, why it's important, how it works, and what makes SESAMm's knowledge graph unique.
What is a knowledge graph?
A knowledge graph is a digital representation of a network of real-world entities, the foundation of a search engine or question-answering service. This structured data model puts the schema in context through linking and semantic metadata, providing a framework for data integration, analytics, unification, and sharing. In other words, it's like a map and legend, with the legend labeling the concepts, entities, and events and the map connecting and identifying their relationships. These details are stored in a graph database and visualized as a graph representation, hence the term knowledge graph.
Fun fact: The expression "knowledge graph" gained popularity after Google used it in 2012 to name its semantic network.
Two types of knowledge graphs
There are two general types of knowledge graphs: open and private. Open knowledge graphs are open to the public. They're created and made available by organizations and projects such as Wikidata, DBpedia, and YAGO. Private knowledge graphs are often used only by the organizations that create them, like Google, WolframAlpha, Facebook, and SESAMm (of course). Some organizations, such as Crunchbase and OpenCorporates, offer theirs for a fee or subscription.
Why a knowledge graph is important
Knowledge graphs are important because they equip us with a model to see how everything relates from a big-picture view, creating new knowledge. Their benefits include:
Incorporating disparate data sources, avoiding data silos
From a data science and artificial intelligence (AI) perspective, knowledge graphs provide machine-readable details, adding context and depth to data-driven AI techniques such as machine learning. Using knowledge graphs and machine learning models together improves system accuracy and extends the range of machine learning capabilities for better explainability and trustworthiness.
How a knowledge graph works
The core of a knowledge graph is its knowledge model, a collection of interconnected descriptions of concepts, entities, events, and relationships known as an ontology. This model provides a framework for statements or taxonomy. Each statement consists of a subject, predicate, and object (Figure 1), a structure known as a triple, and each subject or object is represented only once in the graph, no matter how many statements refer to it. For example, in the simple sentence "The boy kicks the ball," the boy is the subject, kicks is the predicate, and the ball is the object.
Figure 1: Apple is the subject, chief executive officer is the predicate, and Tim Cook is the object.
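To make the triple idea concrete, here is a minimal Python sketch of a tiny triple store. It is only an illustration, not SESAMm's implementation: the `Triple` type, the `match` helper, and the second statement (Apple makes iPhone) are assumptions added for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# The first statement comes from Figure 1; the second is an illustrative addition.
triples = [
    Triple("Apple", "chief executive officer", "Tim Cook"),
    Triple("Apple", "makes", "iPhone"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return every triple that agrees with the fields that are not None."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Ask the store: who is Apple's chief executive officer?
ceo = match(triples, subject="Apple", predicate="chief executive officer")
print(ceo[0].obj)
```

Leaving a field as `None` turns it into a wildcard, so the same helper answers "what do we know about Apple?" and "who is the CEO of anything?".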
Likewise, the graph that holds these statements consists of three components: nodes, edges, and labels. A node, or vertex, represents an entity, which can be anything existing in the real world, such as a person, company, or object. An edge connects two nodes, and its label names the relationship. For instance, in this example (Figure 2), Barack Obama is the subject node, Malia and Sasha are object nodes, and the edges, or relationships, are labeled as father or sibling.
Figure 2: How the relationships between nodes can be labeled.
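The nodes, edges, and labels can be sketched as an adjacency list in Python. This is a simplified illustration: the direction of each edge and the sibling edge between Malia and Sasha are assumptions about how Figure 2 is drawn, and the `neighbors` helper is hypothetical.

```python
# (source node, edge label, target node); read as "source is <label> of target".
# Edge directions and the Malia-Sasha sibling edge are assumed for illustration.
edges = [
    ("Barack Obama", "father", "Malia"),
    ("Barack Obama", "father", "Sasha"),
    ("Malia", "sibling", "Sasha"),
]

# Adjacency list: each node appears once as a key, however many edges touch it.
graph = {}
for source, label, target in edges:
    graph.setdefault(source, []).append((label, target))
    graph.setdefault(target, [])  # register nodes that only appear as targets


def neighbors(node, label):
    """Targets reachable from `node` over edges carrying `label`."""
    return [target for edge_label, target in graph.get(node, []) if edge_label == label]


print(sorted(graph))
print(neighbors("Barack Obama", "father"))
```

Sasha is the target of two edges but is stored as a single node, which is what lets the graph connect facts that arrive from different statements.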
What makes SESAMm's knowledge graph unique?
SESAMm uses open and private datasets with custom, curated information to create our proprietary knowledge graph. The result is a vast map connecting and integrating over 70 million related entities and their keywords. It relates each organization to its brands, products, associated executives, names, nicknames, and, in the case of public companies, exchange identifiers. The underlying data repository holds more than 18 billion articles and messages and is growing.
The knowledge graph is updated regularly
Entities within the knowledge graph are updated weekly and tagged to ensure we correctly track their changes. For instance, the CEO of a company today might not be its CEO tomorrow. And brands might be bought and sold, changing the parent company with each sale. So, weekly updates within the knowledge graph ensure the system is aware of these changes.
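One common way to keep track of such changes is to attach a validity period to each fact. The sketch below shows that idea only; it is not a description of SESAMm's system. The `Fact` class, both helper functions, and the company, names, and dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the fact is still current


def replace_value(facts, subject, predicate, new_obj, effective):
    """Close the currently open fact and append its successor."""
    for f in facts:
        if f.subject == subject and f.predicate == predicate and f.valid_to is None:
            f.valid_to = effective
    facts.append(Fact(subject, predicate, new_obj, effective))


def value_on(facts, subject, predicate, day):
    """Return the object that was valid on `day`, or None if nothing matches."""
    for f in facts:
        if (f.subject == subject and f.predicate == predicate
                and f.valid_from <= day
                and (f.valid_to is None or day < f.valid_to)):
            return f.obj
    return None


# Hypothetical company and people, for illustration only.
facts = [Fact("ExampleCo", "chief executive officer", "Alice", date(2020, 1, 1))]
replace_value(facts, "ExampleCo", "chief executive officer", "Bob", date(2023, 6, 5))

print(value_on(facts, "ExampleCo", "chief executive officer", date(2023, 6, 4)))
print(value_on(facts, "ExampleCo", "chief executive officer", date(2023, 6, 5)))
```

Because old facts are closed rather than deleted, an article from last year can still be linked to the person who was CEO at the time.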
NLP-driven accuracy
At SESAMm, named entity disambiguation (NED), a natural language processing (NLP) technique, identifies named entities based on their context and usage. Text referencing "Elon," for example, could refer indirectly to Tesla through its CEO or to a university in North Carolina. Only the context allows us to differentiate, and NED considers that context when classifying entities. This method is superior to simple pattern matching, which limits the number of possible matches, requires frequent manual adjustments, and can't distinguish homonyms (different entities that share the same name).
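The role of context can be shown with a deliberately simple Python sketch that scores each candidate by how many of its typical context words appear in the text. Production NED relies on trained models rather than keyword overlap, so treat this as a toy: the `CANDIDATES` table, its word lists, and the `disambiguate` function are hypothetical.

```python
import re

# Hypothetical candidates for the mention "Elon", each with words that tend to
# appear around it. A real system would learn this from data.
CANDIDATES = {
    "Elon": {
        "Tesla (via its CEO)": {"tesla", "musk", "ceo", "electric", "cars", "shares"},
        "Elon University": {"university", "campus", "students", "carolina", "college"},
    }
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def disambiguate(mention, text):
    """Pick the candidate whose context words overlap most with the text."""
    tokens = tokenize(text)
    scores = {
        entity: len(tokens & context)
        for entity, context in CANDIDATES[mention].items()
    }
    best = max(scores, key=scores.get)
    # With no overlapping context at all, decline to guess.
    return best if scores[best] > 0 else None


print(disambiguate("Elon", "Elon said Tesla shares would rise"))
print(disambiguate("Elon", "Students returned to the Elon campus in North Carolina"))
```

Plain pattern matching would treat both sentences identically because the string "Elon" is the same in each. Only the surrounding words separate them.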
SESAMm uses three other NLP tools to identify entities and create actionable insights: lemmatization, embeddings, and similarity. The lemmatization process normalizes a word into its base form (morphology) to help identify and aggregate entities. Embedding represents the entity as a numerical vector to help analyze how words change meaning depending on context and understand the subtle differences between words that refer to the same concept. Similarity measures whether two words, sentences, or objects are close to one another in meaning.
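Here is a small Python sketch of how the three pieces fit together. It is a toy under stated assumptions: the lemma table is a hand-written lookup, the three-dimensional vectors are made up for illustration (real embeddings come from trained models and have hundreds of dimensions), and cosine similarity is one common choice of similarity measure, not necessarily the one SESAMm uses.

```python
import math

# Hand-written lemma table; a real lemmatizer uses a full vocabulary and rules.
LEMMAS = {"acquired": "acquire", "acquires": "acquire", "acquiring": "acquire"}

# Made-up vectors, chosen so that related words point in similar directions.
EMBEDDINGS = {
    "acquire": [0.9, 0.1, 0.0],
    "buy": [0.8, 0.2, 0.1],
    "sue": [0.0, 0.1, 0.9],
}


def lemmatize(word):
    """Reduce a word to its base form, falling back to the lowercased word."""
    return LEMMAS.get(word.lower(), word.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity(word_a, word_b):
    """Lemmatize both words, look up their vectors, and compare them."""
    return cosine_similarity(
        EMBEDDINGS[lemmatize(word_a)], EMBEDDINGS[lemmatize(word_b)]
    )


# "Acquired" and "buy" score much higher than "acquired" and "sue".
print(similarity("Acquired", "buy"))
print(similarity("Acquired", "sue"))
```

Lemmatizing first means "acquired," "acquires," and "acquiring" all share one vector, so mentions are aggregated before any comparison is made.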
SESAMm tailored its knowledge graph to find, extract, and analyze data about public or private entities, which isn't readily available from the web or standard rating firms. This unique implementation of a knowledge graph provides insights to give you an edge when researching, analyzing, and submitting recommendations to the portfolio manager or clients.
SESAMm's premier platform, TextReveal®, allows you to leverage NLP-driven insights fully and receive high-quality results through data streams, modular API and dashboard visualization, and signals and alerts. It's perfect for many quantitative, quantamental, and ESG investment use cases.
Learn how SESAMm can support you in your investment decision-making and request a demo today.
A big thanks to SESAMm's investors, partners, and clients
We thank our clients, investors, and partners for your support and patronage. Thank you for being such a big part of SESAMm; you're why we do what we do, and many of you have been involved since day one. And your generous and encouraging attitude has helped get us here today.
About the award
This award is granted by a panel of leading industry experts based on our exceptional client service, innovative product development, and strong and sustainable business growth over the past 12 months.
Honored and excited
We're honored to earn Best Data Provider, Alternative Data Sources at the 2023 Fund Intelligence Operations and Services Awards. We're also excited for our clients and partners because our products and services are game-changers for hedge fund services. And while we have more work to do and clients to serve, we think the future looks bright for us, our partners, and our clients.
About SESAMm and TextReveal
SESAMm is a leading NLP technology company serving global investment firms, corporations, and investors, such as private equity firms, hedge funds, and other asset management firms. Through TextReveal, we give you NLP capabilities to generate your own alternative data for use cases, such as ESG and SDG, sentiment, private equity due diligence, corporation studies, and more. And with access to SESAMm’s massive data lake, made up of 20 billion articles and messages and growing, you can make better investment decisions.
Reach out to SESAMm
For a personal demonstration of our award-winning platform, reach out to a representative.
Controversial business involvement screening is moving beyond its origins as a compliance exercise.
Under frameworks like SFDR and the EU Taxonomy, investors must prove that their portfolios not only promote sustainability but also exclude activities fundamentally at odds with environmental, social, or ethical principles. This marks a shift from static disclosure toward dynamic accountability, and it has broadened both the scope and ambition of ESG screening.
Historically, exclusions focused on a narrow range of activities - weapons, tobacco, or fossil fuels - and primarily applied to public equities. Today, that universe has expanded dramatically. Private markets, secondaries portfolios, and private credit exposures are now expected to undergo the same scrutiny as listed assets. This reflects not only regulatory alignment but also diversifying investor expectations, as institutions incorporate reputational, cultural, and mission-based constraints into their investment frameworks.
Modern exclusion policies increasingly include areas not yet covered by regulation but relevant to ethics, faith, or social impact. Examples range from pork-related activities in Sharia-compliant portfolios to emerging debates over cryptocurrency mining and trading, and even biotechnology topics such as human cloning or genetic manipulation that raise profound ethical questions. These additions illustrate how business involvement screening is evolving from a rule-based checklist into a reflection of each investor’s worldview and stakeholder commitments.
This evolution, however, brings complexity. Private assets and novel sectors often lack standardized data or public disclosures. ESG, compliance, and deal teams must process incomplete information, document decisions, and adapt quickly to new mandates - all without expanding headcount. The result is a growing need for automation that can adapt to human nuance.
SESAMm’s AI-powered business involvement screening meets that need. By allowing investors to screen based on their own exclusion categories and thresholds, it translates varied mandates - from regulatory to reputational - into a single, automated process.
Automating Controversial Business Involvement Screening in Public and Private Assets
SESAMm’s platform uses a new AI agent approach that scans and analyzes vast amounts of information. Below, we provide an overview of SESAMm’s business involvement screening capabilities and how they address investors’ needs for automation, thresholding, and flexible outputs.
Comprehensive Coverage through Big Data
SESAMm utilizes its AI engine to monitor over 30 billion articles and 10 million new documents daily from various sources, including news sites and NGOs. This extensive data collection spans multiple languages and local outlets, enabling it to detect obscure references to companies and raise alerts for issues such as misconduct. SESAMm's coverage encompasses millions of public and private companies, enabling users to conduct thorough screenings of any entity, including private companies and subsidiaries.
Customizable Exclusion Frameworks
SESAMm’s business involvement screening gives investors control over what to screen and how to classify it. Users can request customization of exclusion categories to mirror their own policy, whether based on regulation (e.g., SFDR, EU Taxonomy) or internal mandates (e.g., faith-based or reputational constraints). In addition to standard ESG categories like fossil fuels or weapons, investors can add custom topics. This flexibility allows ESG, compliance, and secondaries teams to tailor the tool to their precise needs.
Threshold-Based Classification
SESAMm’s business involvement screening module is built around the concept of threshold-based flags. The AI utilizes structured data and unstructured signals to determine involvement levels. The output for each company is a clear classification: No Involvement, Limited Involvement, or Significant Involvement for each category. These classifications correspond to thresholds: Limited might mean some involvement that stays below the exclusion threshold, while Significant means involvement above the threshold or that the activity is a core business. By encoding the thresholds in the system, SESAMm ensures consistency with the investor’s policy. This is crucial for automation: rather than an analyst manually checking revenue percentages and news, the system does it automatically and provides clear justification.
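The threshold logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SESAMm's actual rules: the categories, the threshold values, the use of revenue share as the only signal, and the choice to treat a share exactly at the threshold as Limited are all hypothetical.

```python
from enum import Enum


class Involvement(Enum):
    NONE = "No Involvement"
    LIMITED = "Limited Involvement"
    SIGNIFICANT = "Significant Involvement"


# Hypothetical investor policy: maximum tolerated revenue share per category.
POLICY_THRESHOLDS = {"tobacco": 0.05, "thermal coal": 0.10}


def classify(category, revenue_share, is_core_business=False):
    """Map a company's exposure to one of the three involvement levels."""
    threshold = POLICY_THRESHOLDS[category]
    # Above the threshold, or a core business, counts as significant.
    # Assumption: a share exactly at the threshold stays in the limited band.
    if is_core_business or revenue_share > threshold:
        return Involvement.SIGNIFICANT
    if revenue_share > 0:
        return Involvement.LIMITED
    return Involvement.NONE


print(classify("tobacco", 0.02).value)
print(classify("tobacco", 0.08).value)
print(classify("thermal coal", 0.08).value)
```

The same 8% revenue share is Significant for tobacco and Limited for thermal coal, because the outcome depends on the investor's own policy rather than on a single fixed rule.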
Rapid Portfolio Screening Process
The system is designed for fast, self-contained screening. A user simply uploads a list or portfolio, and within hours receives a complete file summarizing involvement across all exclusion categories. The output includes company-level classifications, summaries of supporting evidence, and references to sources. This enables investors to integrate the results directly into due diligence workflows, risk committees, or regulatory reporting, with no ongoing manual data maintenance required.
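To picture what such an output file might contain, here is a hedged Python sketch that turns a list of companies into CSV rows with a classification, an evidence summary, and a source reference. The column names, the `FINDINGS` lookup, the categories, and the company names are all hypothetical placeholders, and the real file format may differ.

```python
import csv
import io

CATEGORIES = ["tobacco", "weapons"]

# Hypothetical findings, standing in for whatever the screening engine returns:
# (company, category) -> (classification, evidence summary, source reference).
FINDINGS = {
    ("ExampleCo", "tobacco"): (
        "Limited Involvement",
        "Minor distribution revenue",
        "https://example.com/report",
    ),
}


def screen_portfolio(companies):
    """Return a CSV report with one row per company and exclusion category."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(
        ["company", "category", "classification", "evidence_summary", "source"]
    )
    for company in companies:
        for category in CATEGORIES:
            # Simplification: a pair with no finding is reported as no involvement.
            classification, summary, source = FINDINGS.get(
                (company, category), ("No Involvement", "", "")
            )
            writer.writerow([company, category, classification, summary, source])
    return buffer.getvalue()


report = screen_portfolio(["ExampleCo", "OtherCo"])
print(report)
```

Keeping the evidence summary and source next to each classification is what lets the file go straight into a due diligence pack or a risk committee review.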
Cost and Resource Efficiency
Automating this process saves substantial analyst time, particularly for rating agencies and secondaries investors managing high volumes of entities. Rating agencies can use the pre-classified results as a baseline input for their own ESG or credit assessments, reducing the manual data-gathering burden. LPs and GPs can run large private company universes in-house without additional research teams. In secondaries, where a full portfolio review can take days of analyst effort, SESAMm’s workflow compresses that timeline to just a few hours, enabling ESG validation to fit seamlessly into transaction schedules.
Auditability and Verification
Each classification is fully transparent. Analysts can drill down into the evidence behind a flag, including links to original articles, filings, or corporate statements, and verify the AI’s reasoning. Automatic translation ensures accessibility across languages. This transparency builds trust in the results and provides auditable documentation for LP reporting or regulator reviews.
As ESG investing matures, the leaders will be those who can implement exclusions transparently, efficiently, and in alignment with evolving norms. The next frontier is no longer just regulatory compliance - it is the ability to anticipate what clients and society will expect tomorrow, and to operationalize those expectations across all asset classes. SESAMm’s technology makes that possible: a platform that keeps pace with both policy evolution and moral expectations, bringing consistency and clarity to an increasingly complex ESG landscape.