Blogs

Insights & Updates

A Score You Can Rely On: How SESAMm Defines What Its Controversy Rating Measures

July 21, 2026

•

5 mins read

SESAMm's public methodology defines exactly what its Controversy Exposure Score measures, and where it stops. Why marking the boundaries builds trust.

Most data companies lead with what their product can do. SESAMm's public methodology does that, and then goes a step further. It defines, in plain terms, exactly what the Controversy Exposure Score measures and where its boundaries lie. That precision, now public and free to access under the EU ESG Rating Regulation that entered into force on 2 July 2026, is what makes the score dependable.

The logic is straightforward. A number is only as useful as the user's understanding of it. SESAMm would rather its clients understand the score completely than take it on trust, because a well-understood score is a score that can be used with confidence.

Built From the Public Record

The CES is built entirely from public and licensed media and web content. That foundation gives it a clear and well-defined scope, and SESAMm is precise about what that scope includes.

Coverage is richer for some entities than others. Large and high-profile companies generate far more reporting than small or private ones. SESAMm addresses this directly by rebasing each event against an entity's own media history rather than absolute volume, so a company is measured against its own baseline rather than penalised for simply attracting more press. For entities with a persistently low profile, the methodology is explicit that the underlying signal is thinner, which tells a user precisely where to bring additional sources to bear.
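To make the idea of rebasing concrete, here is a minimal sketch in Python. It is not SESAMm's formula, which this post does not give. It assumes, purely for illustration, that an event's intensity is the ratio of its mention count to the entity's average mention count over earlier periods. The function name `rebased_intensity` and all figures are hypothetical.

```python
from statistics import mean


def rebased_intensity(event_mentions: int, baseline_history: list[int]) -> float:
    """Express an event's mention count relative to the entity's own typical volume.

    Assumption for illustration only: the baseline is a plain average of
    mention counts from earlier periods. The real methodology may differ.
    """
    if not baseline_history:
        raise ValueError("baseline_history must contain at least one period")
    baseline = mean(baseline_history)
    # A floor of 1 mention avoids dividing by zero for very quiet entities.
    return event_mentions / max(baseline, 1.0)


# Hypothetical figures: a heavily covered company and a rarely covered one.
large_cap = rebased_intensity(500, [400, 450, 350, 400])
small_cap = rebased_intensity(50, [4, 6, 5, 5])

print(f"large cap: {large_cap:.2f}x its baseline")
print(f"small cap: {small_cap:.2f}x its baseline")
```

On absolute volume the large company looks ten times more exposed (500 mentions against 50). Measured against its own history, the small company's event stands out far more, which is the effect the paragraph above describes.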

Language and source access define the rest of the scope. The pipeline reads a broad and growing set of languages and ingests an extensive range of public and licensed sources. Where a controversy is reported mainly in a language or a publication outside that set, the methodology says so plainly. Defining these edges is what allows a user to place the score accurately within a wider process.

The Discipline of Not Guessing

One principle deserves particular attention, because it sets SESAMm apart from a common industry habit. Where direct coverage of an entity is thin, SESAMm does not fill the gap with proxy data, sector benchmarks or estimated values.

This is a deliberate quality choice. Substituting averages would produce a tidier-looking dataset, but it would manufacture information that does not exist. SESAMm reports only what the evidence supports. For a low-visibility entity, that means a low score reflects the controversies actually detected, and the methodology is clear that this is a measure of detected exposure rather than a clean bill of health. The result is a number a client can stand behind, because nothing in it is invented.

This also clarifies how the score is best read. The CES measures exposure to negative controversies, which makes it a sharp, single-purpose instrument. It is designed to surface risk, not to certify virtue, and pairing it with positive-performance data is exactly how SESAMm intends it to be used.

An Early Signal, Drawn From Public Reporting

The CES reflects controversies as reported in public sources, which can include allegations that are still moving through the courts. The methodology is precise about what this means: the score records the existence and salience of reporting, and it is built to give risk teams an early signal rather than a legal conclusion.

This is one of the score's most valuable properties. Reputational and ESG risk very often crystallises long before any legal process concludes. A measure that captures reported exposure as it emerges, while being clear that reporting is not a verdict, gives a risk team time to act early and to weigh the signal appropriately. That combination of timeliness and precision is precisely what makes it useful in practice.

Rigorously Engineered, Openly Documented

Because the score is produced by an AI pipeline, SESAMm documents both how that pipeline works and the controls that keep it accurate. The engineering is the headline here, and it is substantial.

A language-model filter screens for false positives before any event is surfaced. A dual-layer human quality-assurance process, run daily by SESAMm's Research and Analytics team and escalated where needed to the Methodology Lead, reviews accuracy, corrects confirmed issues at the source, and feeds recurring patterns back into the training corpus so the system improves over time. Before any material change to the methodology is deployed, it is backtested against a historical event database, reviewed on the entities it most affects, and signed off by the Methodology Lead. The methodology is formally reviewed at least once a year.

SESAMm also names the structural properties of statistical models openly, because describing a system you understand and control is what gives these safeguards their meaning. The point is not that any model is flawless. It is that the controls are designed for exactly the points where models need them, and that the whole arrangement is documented for anyone to inspect.

Why Defining the Boundaries Builds Trust

There is a quiet truth in ESG data. The providers willing to define the edges of their product are often the ones most worth trusting, because they are describing a system they genuinely understand and operate. A score presented as all-seeing invites misuse. A score presented with its scope clearly marked can be integrated thoughtfully, weighted sensibly, and combined with other inputs exactly as the methodology intends.

SESAMm's view is that this clarity is part of doing AI well, not a step back from it. The same analysts who design the methodology are the ones who test and refine it every day, and the new regulation now gives the whole market a reason to hold itself to the same standard. Precision about what a score means is not a limitation on its value. It is the foundation of it.

To read the full methodology, including how the Controversy Exposure Score is built and the safeguards that keep it accurate, visit sesamm.com/methodology.

NLP | Alternative Data | AI

Summer Roundup: Our 10 Most-Read Blog Posts This Year (So Far)

September 7, 2022

•

5 mins read

Summer is almost over for us in the northern hemisphere. (We know. It's sad for us, too.) And with this seasonal shift comes back-to-school and back-to-work activities, including taking a last-minute vacation. And vacations mean time for reading, right?

While they may not be beach reads, we think we have some great choices. These are the posts that have been most popular on SESAMm's blog in the past five months. Let's get started with SESAMm's most-read blog posts since this spring, starting with number 10.

#10 What Investors Ought to Know About Natural Language Processing: A Quick Guide

Read this quick guide to what natural language processing is, how it’s used, and why it's important for uncovering financial alternative data. Bonus: Get an overview of how NLP works at SESAMm.

#9 S&P 500 ESG Index Drops Tesla: This Analysis Supports the Decision

Review SESAMm's analysis based on its ready-to-use data streams, revealing red flags that support the decision to oust Tesla, Inc. from the S&P 500 ESG Index.

#8 Alternative Data Trends: NLP Analysis on Commercial Real Estate

Read the takeaways of the current commercial real estate market we extracted using SESAMm’s NLP-powered engine to analyze web data.

#7 VIDEO: ESG Data Challenges and How AI and NLP Offer Solutions

Watch CEO Sylvain Forté at Japan Investor Forum, discussing ESG data, its challenges, and how to use AI and NLP to generate insights on millions of companies.

#6 How Organizations Are Using NLP To Detect Greenwashing

See how we apply our NLP capabilities to identify companies likely to engage in greenwashing practices by analyzing text in billions of web-based articles.

#5 Alternative Data Trends: 5 Effects of the Failing Musk-Twitter Deal

Based on alternative data, discover how Elon Musk’s personal and related brands measure up to public sentiment following his failed acquisition of Twitter.

#4 What Investors Ought to Know About Data Lakes: A Quick Guide

Discover why SESAMm’s data lake is ideal for investment research and other basics like what a data lake is, why it’s important, what it does, and how it works.

#3 Gain Insights From Financial and ESG Data Using AI: A Comprehensive Guide

Learn how SESAMm’s AI and NLP platform is used to gain financial and ESG insights from alternative data for systematic trading, fundamental research, and more.

#2 What Investors Ought to Know About Knowledge Graphs: The Core of Text Analysis

Learn what SESAMm’s Knowledge Graph is, what it does, and how it’s used in text analysis for financial research, such as in private equity and hedge funds.

#1 Predicting stock price movements using news and social media data

Tokio Marine & Nichido Fire Insurance Company and SESAMm work together to predict stock price movements using NLP-generated data from news and social media.

Thank you for reading through our Summer Roundup: the 10 most-read blog posts this year.

Which is your favorite? How would you rate these posts? Let us know what you think on Twitter or LinkedIn.

ESG | AI | SDG

How SDGs and AI Impact Investment Strategies: A Comprehensive Guide

August 24, 2022

•

5 mins read

The modern world is in a peculiar place right now. We’ve got the technology and resources to improve our planet, but we often don’t know how to use them despite our best intentions. Or, at the very least, we don’t know where to put our efforts.

Consequently, some investors are looking into Sustainable Development Goals (SDGs). Not only do they want their investments to earn more, but they also want them to do good. If you’re also interested in doing good with your investments, it’s essential to understand the SDGs and their meaning for your portfolio.

In this article, we’ll break down the SDG basics, SDG scores, their relevance to investing, and how SESAMm can help you get and read SDG metrics. But first, a quick review of SDGs.

What SDG means

SDGs, or Sustainable Development Goals, are a set of 17 goals that the United Nations set in 2015 to be achieved by the year 2030, a framework that “provides a shared blueprint for peace and prosperity for people and the planet, now and into the future.” The global goals and the 2030 Agenda for Sustainable Development cover issues such as human rights, poverty, health, education, gender equality, and environmental sustainability, and they were designed to be universal across countries and continents worldwide. Here are the 17 UN Sustainable Development Goals:

SDG 1: No Poverty: Striving to end poverty in all its forms everywhere. This goal underscores the importance of equitable resource distribution and access to basic needs.

SDG 2: Zero Hunger: Aiming to end hunger, achieve food security, improve nutrition, and promote sustainable agriculture, thereby ensuring that everyone, everywhere, has enough quality food to lead a healthy life.

SDG 3: Good Health and Well-being: It emphasizes the need for universal healthcare access, including reproductive, maternal, and child healthcare, and combats health threats by supporting research and development of vaccines and medicines.

SDG 4: Quality Education: Envisioning inclusive and equitable quality education and lifelong learning opportunities for all, this goal recognizes education as the foundation of empowerment and prosperity.

SDG 5: Gender Equality: Achieving gender equality and empowering all women and girls to participate fully in societal, economic, and political spheres.

SDG 6: Clean Water and Sanitation: This goal aims to ensure the availability and sustainable management of water and sanitation for all, recognizing the essential role of water resources in sustaining life and ecosystems.

SDG 7: Affordable and Clean Energy: Promoting access to affordable, reliable, sustainable, and modern energy for all; this goal underscores the critical nature of energy in achieving other SDGs and the transition towards renewable energy sources to combat climate change.

SDG 8: Decent Work and Economic Growth: It focuses on promoting sustained, inclusive economic growth, full and productive employment, and decent work for all, highlighting the role of the private sector in initiating impactful initiatives.

SDG 9: Industry, Innovation, and Infrastructure: Aiming to build resilient infrastructure, promote inclusive and sustainable industrialization, and foster innovation, this goal recognizes the importance of a robust infrastructure and an innovative ecosystem as drivers of economic growth and development.

SDG 10: Reduced Inequalities: This goal seeks to reduce inequality within and among countries, focusing on policies designed to achieve greater equity and involve stakeholders from all sectors of society in decision-making processes.

SDG 11: Sustainable Cities and Communities: It aims to make cities and human settlements inclusive, safe, resilient, and sustainable, emphasizing the need for green public spaces, improved urban planning, and sustainable construction practices.

SDG 12: Responsible Consumption and Production: Focusing on promoting resource and energy efficiency, sustainable infrastructure, and providing access to a better quality of life for all, this goal underscores the importance of adopting sustainable practices and reducing waste.

SDG 13: Climate Action: Taking urgent action to combat climate change and its impacts, this goal underscores the necessity for countries, stakeholders, and the private sector to collaborate in reducing emissions and enhancing renewable energy usage.

SDG 14: Life Below Water: Aimed at conserving and sustainably using the oceans, seas, and marine resources for sustainable development, this goal addresses the critical importance of our aquatic ecosystems.

SDG 15: Life on Land: Protecting, restoring, and promoting sustainable use of terrestrial ecosystems, sustainably managing forests, combating desertification, halting and reversing land degradation, and halting biodiversity loss.

SDG 16: Peace, Justice, and Strong Institutions: Promoting peaceful and inclusive societies for sustainable development, providing access to justice for all, and building effective, accountable, and inclusive institutions at all levels.

SDG 17: Partnerships for the Goals: This goal recognizes the importance of revitalizing the global partnership for sustainable development and the role of strong partnerships in achieving the SDGs, involving governments, the private sector, civil society, and others.

*The UN’s 17 Sustainable Development Goals. Image courtesy of UN.org.*

What are SDG scores?

Each Sustainable Development Goal has specific targets and indicators that help measure progress toward that goal over time. SDG scores are numerical values given to each entity (country, company, person, etc.) based on its performance in meeting specific targets or indicators for each particular goal. Incorporating these evaluations into the decision-making process is crucial for stakeholders across various sectors, including the private sector, healthcare, financial services, and more. These stakeholders can leverage insights from SDG scores to prioritize initiatives that address critical issues like climate change, emissions reduction, and ecosystem preservation.

How do SDGs relate to ESG?

The environmental, social, and governance (ESG) framework is a tool to achieve and comply with the SDG goals. From a company’s perspective, ESG and SDG frameworks emphasize the importance of measuring and reporting progress. Companies incorporating ESG criteria into their operations often report on their sustainability performance, which can directly show their contribution towards achieving specific SDGs. For investors, ESG metrics provide a tangible way to evaluate companies' potential risks and opportunities related to sustainability, which can also align with the broader objectives of the SDGs.

The SDGs primarily focus on global challenges such as poverty, inequality, climate change, and environmental degradation, which represent the environmental and social pillars of ESG.

Within the same principles, several of these goals directly relate to the governance pillar of ESG. On the one hand, goal 16 aims to reduce corruption and bribery, develop effective and transparent institutions, and ensure inclusive and representative decision-making. On the other hand, goal 17 strives to enhance international cooperation, encourage effective public, public-private, and civil society partnerships, and ensure that policies are coherent and integrated, all of which are governance-related issues.

While the SDGs might not explicitly label these aspects as 'governance' in the way the ESG framework and regulatory landscapes do, the inclusion of these goals demonstrates a clear recognition of the importance of governance in achieving sustainable development.

SDGs and ESG also have different purposes. ESG measures companies’ environmental, social, and governance performance risks and initiatives, while SDGs evaluate any entity’s performance in reaching its goals. Put another way, SDGs represent the goals, while ESG concerns methodology and processes.

Learn more: “Gain Insights From Financial and ESG Data Using AI: A Comprehensive Guide.”

Why are SDGs important to investors?

At the company level, SDGs help align corporate strategy with society’s needs. Because the UN designed SDGs to be measurable, countries, companies, and people can hold themselves accountable for progress toward achieving them. And because the goals are measurable, we can score a company’s efforts, giving you an indicator to invest responsibly by aligning your portfolios with SDGs.

According to a publication by McKinsey & Company, the effect of sustainable investing on returns, if any, appears to be positive. In other words, investors care about SDGs not only because they benefit society but also because they measurably support better investment decisions. For example, by incorporating SDGs into company assessments, investors can identify well-run businesses that are better positioned to benefit from the positive effects of improved social and economic conditions. SDGs also allow investors to make better-informed decisions within a defined investment time horizon by focusing on a company’s business exposure toward them. Investors can thus better measure and track a company’s opportunity exposure as a result of its achievement of the SDGs.

How to measure an entity’s SDG score

There are tools available to measure progress toward each goal—and those tools will play an essential role in helping investors decide which entities they want to invest in and which ones they don’t want to support. For example, SESAMm’s platform, TextReveal®, can analyze web data to generate SDG scores for virtually any entity in our data lake.

How SESAMm provides SDG scores

SESAMm provides SDG scores through TextReveal, a platform that allows investors to gain insights into companies, people, or topics. Specifically, we use artificial intelligence (AI) to track the contributions of entities, including public and private companies, toward the SDGs.

We track the 17 Sustainable Development Goals and the 169 underlying targets to detect negative news and positive events, using an algorithm similar to the one we use for ESG alerts and for gathering alternative data. Each UN SDG item displays a score from 0 to 5 to show the intensity of the company’s positive impact. Then, we translate the information into multiple languages.

dashboard view of Aker Carbon Captures SDG performance — *This dashboard view example shows some SDG scores for Aker Carbon Capture.*

We queried the Norwegian carbon capture company, Aker Carbon Capture, using our SDG positive impact dashboard over the past three years. As you might notice, Aker contributes to the goals associated with Partnerships, Climate Action, Clean Energy, and Sustainability. Maybe they could do more regarding Decent Work and Economic Growth, and Responsible Consumption and Production, but overall, the company’s online data shows a positive contribution.

See how SESAMm can help you with your SDG research

SESAMm is the leading provider of AI solutions and analytics for investment firms and corporations.

Our AI and NLP platform, TextReveal:

Analyzes text in billions of web-based articles and messages
Generates investment insights, ESG and SDG analysis used in systematic trading, fundamental research, risk management, and sustainability analysis
Enables a more quantitative approach to leveraging the value of web data that’s less prone to human bias
Addresses a growing need in public and private investment sectors for robust, timely, and granular sentiment and SDG data

SESAMm’s AI Technology Reveals ESG Insights

Discover unparalleled insights into ESG controversies, risks, and opportunities across industries. Learn more about how SESAMm can help you analyze millions of private and public companies using AI-powered text analysis tools.

Alternative Data | Text Analysis | Sentiment Analysis

Alternative Data Trends: 5 Effects of the Failing Musk-Twitter Deal

August 17, 2022

•

5 mins read

On April 24, 2022, Elon Musk, CEO of Tesla, SpaceX, The Boring Company, and Neuralink (and one of the most popular people on Twitter, with one of the largest followings), reached an agreement to buy Twitter for roughly 44 billion dollars. On July 8, 2022, the deal failed to materialize after Musk withdrew from the negotiations due to his concerns about the company's alleged overabundance of fake Twitter user accounts, aka bots. The Twitter stock price plummeted by 15% after the announcement.

Now that his deal to buy Twitter has failed and culminated in a legal battle, Musk's public sentiment has reached all-time lows. The public sentiment for Twitter has also taken a hit. In general, public sentiment surrounding this deal was largely negative from both sides:

Musk's fans were disappointed because they thought it would allow him to spread his message about sustainable energy sources further.
Twitter's users were happy because they believed his involvement would have led to changes that would have made the platform less accessible than ever before.

But how exactly was public sentiment affected by the fallout of Elon Musk's failed Twitter acquisition? Let's find out. Here are five effects of the failed Musk-Twitter deal.

Notes: Dates on the charts follow the day/month/year format. Additional timeline source: “A timeline of Elon Musk's tumultuous Twitter acquisition attempt,” ABCnews.com, July 13, 2022.

1. Merger and acquisition sentiment dropped from the beginning

Mention volume and polarity chart for M&A and Twitter — *Figure 1: Twitter M&A sentiment took a hit at key events during Musk’s evaluation period.*

Musk had been exploring the possibility of purchasing Twitter as early as January 2022 when he began increasing his positions in Twitter stock. By March 14, Musk became the largest shareholder in the company, according to a securities filing. And that's when the sentiment toward the acquisition began to drop.

M&A sentiment experienced a further drop when Musk officially announced his offer to purchase Twitter on April 14, 2022. On Reddit, for example, members of the r/Economics community posted and engaged with the following: Elon Musk Launches $43 Billion Hostile Takeover of Twitter, a post that has since been removed but represents one of many sources feeding sentiment toward the topic.

In May 2022, Musk announced a hold on the deal, pushing M&A sentiment even further down. And more recently, in late June and early July when Twitter sued Musk for breaching the M&A agreement, M&A sentiment fell deeper into negative territory.

2. Sentiment for Elon Musk and Twitter declined likewise

Mention volume and polarity chart for Elon Musk, Twitter, and M&A — *Figure 2: Overall, Musk’s sentiment polarity suffers the most.*

But how do Elon Musk's and Twitter's sentiments evolve with M&A mentions?

In measuring and analyzing M&A mentions in web data, we found that Twitter's brand suffered but not nearly as much as Musk's. Both of their sentiments dropped in April when Musk announced his offer. However, Musk's sentiment suffered more when he put the deal on hold in May and again in June when Twitter filed a lawsuit against him.

Figure 2 shows two additional drops in Musk's sentiment for July. These correspond to news events regarding the trial, including news about the trial's start date in October.

3. Musk's brands, Tesla and SpaceX, suffered, too

Tesla stock price and polarity chart for Musk, SpaceX, and Tesla sentiment — *Figure 3: SpaceX sentiment polarity nearly matches Musk’s.*

Unfortunately for Musk, his other brands also experienced a drop in sentiment. For example, Tesla's sentiment declined in step with Musk's, but not nearly as much as SpaceX's (Figure 3). One reason for this disparity could be the open letter SpaceX's workers wrote. The workers voiced their concern about Musk's behavior in this letter, stating, "Elon's behavior in the public sphere is a frequent source of distraction and embarrassment for us."

Further, in Figure 3, we track Tesla's stock performance. Initial data shows a possible correlation between Tesla's stock price and sentiment. However, further analysis and backtesting are needed to confirm this correlation.

4. Musk's sentiment suffered more than Twitter's

Twitter stock price and polarity chart for Twitter and Musk — *Figure 4: Twitter’s sentiment polarity isn’t as affected as Musk’s.*

Twitter's sentiment remained relatively stable, seeing only a minor drop when Musk became the largest shareholder. Even Twitter's stock price remained stable, experiencing a temporary increase when Musk purchased Twitter stock but settling after. It's worth noting that Twitter's stock price was declining before January 2022, which might have influenced Musk's decision to buy.

In contrast, Musk's sentiment took a huge hit when he became the largest shareholder.

5. It’s not only about Musk and Twitter

Mention volume and polarity chart for the open source code topic — *Figure 5: Musk possibly gained the open-source community’s favor, if the rise in polarity is an indication.*

In late 2021 and early 2022, open-source sentiment polarity was dipping. It experienced its biggest dip after February 2022 when Twitter admitted it had mistakenly removed Ukraine open-source intelligence accounts.

However, in April 2022, Musk said that one of the ways he wanted to improve Twitter was to make its algorithms open source to increase trust. How did the open-source community take the news? According to the chart (Figure 5), well. Open-source sentiment polarity jumped back up.

Analyzing the M&A sentiments

Overall, Elon Musk’s sentiment polarity reached lower levels than those of Twitter and his other brands—although SpaceX took a significant hit, too. Whether because of his brash public statements or his employees criticizing his focus and intentions, data shows that netizens were not supportive of his attempted acquisition. And with the Twitter v. Musk court battle scheduled and looming, his sentiment doesn’t seem like it will be improving anytime soon.

Reach out to SESAMm

To learn more about how we analyze web data or to request a demo, reach out to one of our representatives.

NLP | Alternative Data | AI

Predicting stock price movements using news and social media data

August 10, 2022

•

5 mins read

Tokio Marine & Nichido Fire Insurance Co., Ltd. (TMNF) tapped SESAMm for a joint research venture to predict future stock price movements and discovered two key findings:

NLP data from news and social networking websites can have strong relationships with investor behavior. Thus, it’s possible to forecast investors’ rational reactions to changes in data and price movements based on those relationships.
NLP data proved helpful in anticipating tail events. For example, the stock market performed well in the macroeconomic environment of the last 10 years. In that context, investors are sensitive to negative narratives in times of uncertainty, such as the 2015 market sell-off, the U.S.-China trade war, the coronavirus pandemic, and the start of the Russia-Ukraine war, and they post their concerns online.

Providing safety and security since 1879

Tokio Marine Insurance Company was first established in 1879. Over the years, it has added products and services, acquired other businesses, and merged with other companies to eventually become Tokio Marine & Nichido Fire Insurance Co., Ltd. Commonly called Tokio Marine Nichido today, the company is a property and casualty insurance subsidiary of Tokio Marine Holdings, the largest non-mutual private insurance group in Japan. Its products and services provide safety and security to its clients and partners, contributing to more fulfilling lifestyles and business development.

One of the company’s philosophies is to be a good corporate citizen and fulfill its social responsibilities, including protecting the global environment, promoting human rights, creating a responsible working environment, and contributing to society and individual local communities. Recently, the Emperor of Japan awarded Tokio Marine Holdings, Inc. the Medal with Dark Blue Ribbon for donating to the Japan Student Services Organization to support students who face financial difficulty during the COVID-19 pandemic. Individuals, corporations, or organizations are awarded the Medal with Dark Blue Ribbon for their outstanding contributions to the public.

Transforming and accepting the challenge to grow

According to TMNF, “The business environment surrounding the insurance industry is changing at a faster pace than ever due to changes in demographics, advances in technologies, such as autonomous driving and AI, and longer-term trends, such as the intensification and frequent occurrence of natural disasters, as well as further progress in digitalization due to the COVID-19 pandemic.”

“While these changes in the business environment pose a threat, we consider them to be excellent opportunities for transformation and the creation of new value.” So they’ve adopted the concept, “Transformation (“X”) and Challenge to Growth 2023: Aiming to be the company most chosen for quality and its passion.” Ultimately, it strives to support customers and local communities in times of need while contributing to social responsibility. Five social issues that it will prioritize are:

Global climate change and the increase in natural disasters
The increased burden of long-term care and healthcare due to the aging of society and advances in medical technology
Technological innovation and its effects on the environment
Symbiotic society and responding to the novel coronavirus
Industrial infrastructure and how it supports economic growth and innovation

Leveraging a partner with the right technology

To secure and protect its clients’ assets while elevating social issues, Tokio Marine Nichido sought out an edge in the stock market. Under these circumstances, it was fortunate that TMNF discovered SESAMm in 2020 through the Plug and Play Japan program, a platform with an event that connects Japan to markets abroad. SESAMm presented its NLP alternative data solution, TextReveal®, after which TMNF considered the platform for access to alternative data and sought collaboration with the SESAMm team on a research project.

“SESAMm has the technology to extract text sentiment from news data with a neural network.”
– Tokio Marine & Nichido Fire Insurance Co. Ltd representative

Extract relations between NLP data and the financial market

In 2021, Tokio Marine Nichido Insurance began collaborating with SESAMm to develop an AI analytics model for alternative data. It models the impact of news and social networking data on investor behavior for stock and bond markets, transforming text information into knowledge usable by TMNF. For instance, when the model detects a negative narrative raising uncertainty in the market, investors can use this signal to reduce their risk exposure.

Predicting future stock price movements from news and social media data

Tokio Marine Nichido and SESAMm’s joint research found that natural language data from news and social networking sites effectively predict future stock price movements. During the pandemic, for example, there was a time lag of as long as a month between the time COVID-19 became news and the time it affected the U.S. stock market (Figure 1). By using SESAMm’s technology to analyze news data during this period, the team found that U.S. news and social networking sentiment had already deteriorated sharply before stock prices reacted. This deterioration reflected fear of the effect the coronavirus’s spread would have on the global economy. With the S&P 500 at an all-time high, U.S. investors did not initially consider this risk. In comparison, HSI (Hang Seng Index) companies were closer to the risk of the coronavirus spreading, so HSI investors reacted ahead of those in the U.S.

Chart comparing U.S. news sentiment to Hang Seng Index and S&P 500 Index — *Figure 1: In 2020, U.S. news sentiment falls ahead of the stock market in response to COVID-19 concerns.*

The model can calculate sentiment for each company by analyzing the news of individual companies. It’s also possible to create a composite to measure the sentiment related to a stock index. The sentiment data also helps management and investor relations because it provides a quantitative means of understanding the extent to which investors are concerned about certain news about their company.
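As a rough illustration of how per-company sentiment might roll up into an index-level composite, the sketch below takes a weighted average. The weighting scheme, the function name `composite_sentiment`, the tickers, and the sample figures are all assumptions made for illustration; the case study does not describe how the actual composite is built.

```python
def composite_sentiment(scores, weights):
    """Weighted average of per-company sentiment scores.

    scores  -- dict mapping ticker -> sentiment score (hypothetical -1 to 1 scale)
    weights -- dict mapping ticker -> index weight (need not sum to 1)
    """
    total_weight = sum(weights[ticker] for ticker in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(scores[ticker] * weights[ticker] for ticker in scores)
    return weighted_sum / total_weight


# Hypothetical inputs, not real data.
scores = {"AAA": 0.5, "BBB": -0.5, "CCC": 0.0}
weights = {"AAA": 2.0, "BBB": 1.0, "CCC": 1.0}
index_sentiment = composite_sentiment(scores, weights)
print(index_sentiment)
```

A heavily weighted constituent pulls the composite toward its own score, which is why the choice of weights (market cap, equal weight, or something else) matters as much as the sentiment scores themselves.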

Verifying the results

Verification using Japanese-language data has revealed that the troughs and peaks of text sentiment precede those of stock prices. The collaborating team compared the performance of:

A model that uses only orthodox financial and economic data as inputs
A model that considers NLP data alongside financial and economic data

The comparison confirmed that the latter could generate higher alpha.

SESAMm equity model performance and characteristics chart and tables — *Figure 2: Back-testing confirms that SESAMm’s equity model can predict a market downturn, capturing changes in text sentiment and reducing positions ahead of market crashes.*

Since sentiment is mean-reverting by nature, the TMNF team believes it provides good support for position management during rallies and crashes. It’s also valuable for avoiding forced loss-cutting at the bottom, when liquidity temporarily evaporates and the market crashes.

Expanding the research to other use cases

In addition to analyzing the stock market, Tokio Marine Nichido also expanded the scope of the research to include R&D on using natural language data in trading U.S. high-yield bonds. Research shows that NLP data can help provide a hedging signal for the negatively skewed high-yield market (Figure 3) by capturing deteriorating text sentiment (Figure 5). For example, these signals can inform investors to reduce positions before market reactions.

U.S. High-Yield Total Return index monthly returns chart — *Figure 3: NLP data can help provide a hedging signal by capturing deteriorating text sentiment.*

High-yield model performance comparison chart — *Figure 4: An NLP-informed high-yield strategy can outperform the U.S. high-yield total return index and a strategy without NLP. Same volatility level for the three back-tests.*

TMNF is also applying the research to estimate the Fed’s stance, hawkish or dovish, using natural language data. It hypothesizes that the market will focus on the Fed’s stance on interest rate hikes over the next few years.

“The model developed in collaboration with SESAMm is simple in structure, yet, it’s an orthodox and robust model that uses valid data as input.”

Summarizing the collaboration

In developing models, Tokio Marine Nichido believes it is essential to consider what data to use and to keep things simple, and TMNF achieved both. The model developed in collaboration with SESAMm is simple in structure, yet it’s an orthodox and robust model that uses valid data as input, which is preferable to risking overfitting by adding complexity.

A representation of the collaborative NLP alternative data model — *Figure 6: The joint Tokio Marine Nichido and SESAMm NLP alternative data model: Simple yet robust.*

Get in touch with SESAMm

To learn more about Tokio Marine Nichido’s case study or to request a TextReveal demo, reach out to us here:

Alternative Data | Big Data | AI

What Investors Ought to Know About Data Lakes: A Quick Guide

July 27, 2022

•

5 mins read

If you’ve taken a basic computer course, you might have learned this famous phrase: Garbage in. Garbage out. It’s become so popular that people use it in other references, like diet and exercise and video or audio signal flow. But I digress.

What does the garbage in, garbage out phrase have to do with data lakes? Think of it this way: if you were to build an ideal lake for leisure, would you pump in just any water? Probably not. My guess is that you’d want the cleanest, bluest, purest water you could find, water that would provide an ideal place for swimming, fishing, or whatever activity you like to do at a lake. Just as we’d pump good water into an actual lake to create an ideal vacation spot, we want to pump good data into a data lake because it yields ideal results.

Before we discuss SESAMm’s data lake, we’ll cover a few of these basics:

What is a data lake?
Why is a data lake needed?
How does a data lake work?

What is a data lake?

Data lakes are centralized repositories organizations use to store large amounts of unstructured, semi-structured, and structured data.

Data lake vs. data warehouse

The main differences between a data lake and a data warehouse are how they store your data and how the data is used. For example, data warehouses typically store hierarchically structured data in files or folders. In contrast, data lakes use flat architecture and object storage. Also, with a data lake, the data is raw with no specific purpose. But with a data warehouse, the information is structured, filtered, and processed for a particular purpose.

Why is a data lake needed?

Organizations like SESAMm employ a data lake for two main reasons:

Take advantage of advanced and sophisticated analytical techniques applied to complex and diverse data.
Perform data access and retrieval activities more efficiently and easily.

More specifically, companies employ data lakes for simple data management, to store and catalog data securely, and to conduct data analytics. For instance, data lakes allow you to import any data amount from multiple sources in their original format.

They also allow various roles within your organization—business analysts, data developers, and data scientists—to access data sets. Moreover, they can use their preferred frameworks and tools, such as Apache Hadoop, Spark, and Presto, to name a few, without moving data to a separate analytics system.

Furthermore, data lakes allow companies to generate various insights, from reporting on historical data to forecasting likely outcomes through incorporating AI and machine learning models, practices that can prescribe suggested actions to achieve better results.

Benefits of a data lake

The biggest benefit of a data lake is that you can ingest your raw data in its native format. This raw unstructured format allows you to use the data in various applications and understand the data from multiple perspectives, running different types of analytics from dashboards and visualizations to big data processing and machine learning. However, if you have a specific intent for your data lake, including applying AI and machine learning, structured data input is ideal.

Another benefit of a data lake is business performance. According to AWS, “Organizations that successfully generate business value from their data will outperform their peers.” AWS further explains, “An Aberdeen survey saw organizations who implemented a data lake outperforming similar companies by 9% in organic revenue growth. These leaders were able to do new types of analytics like machine learning over new sources like log files, data from click-streams, social media, and internet-connected devices stored in the data lake. This [ability] helped them to identify and act upon opportunities for business growth faster by attracting and retaining customers, boosting productivity, proactively maintaining devices, and making informed decisions.”

How a data lake works (not technical)

As an investor, you probably won’t be building your own data lake because that’s what companies like SESAMm are for, but this section will give you a quick overview of how a data lake works.

Without getting too technical, you only need a few elements to make a data lake work. First, you need to source data. Sources can include:

Binary data (audio, images, and video)
Semi-structured data (CSV, JSON, logs, and XML)
Structured data from relational databases (columns and rows)
Unstructured data (documents, emails, and PDFs)

Second, you need reliable, secure, and fast data storage for your sourced data. Cloud storage providers could provide better scalability and affordability compared to on-premises solutions. Third, you need an analytics platform to access and analyze your sourced data. There are many open source and commercial platforms to choose from should creating a data lake be of interest to you, but we won’t get into the details here.

Last, you need to store the data in an open format like object storage. Object storage stores data with metadata tags, identifiers that make it easier to locate and retrieve data across regions. Overall, object storage and similar open formats enable many apps to take advantage of the data inexpensively while improving performance.

Four reasons SESAMm's data lake provides a unique foundation for data scientists' and investors' use cases

What makes SESAMm’s data lake unique and ideal for investment research and advanced analytics? SESAMm’s data lake is:

Broad and large
Includes more than 100 languages
Tuned to key indicators
Updated in near real time

Including data since 2008, the data lake consists of more than four million data sources made up of more than 20 billion articles, forums, and messages, such as professional news sites, blogs, and social media, increasing by an average of six million per day.

Moreover, the coverage is global, with 40% of the sources in English (the U.S. and international) and 60% in multiple languages. We select and curate these sources to maximize coverage of both public and private companies, focusing on quality, quantity, and frequency to ensure a consistently high input value.

SESAMm’s developers also tune the machine learning algorithms for key indicators such as mention volume, sentiment and emotion, ESG, and SDG. Additionally, they optimize the structure and schema for efficient SQL queries. The data lake is also updated hourly to give investors near real-time insights into their investment interests.

To learn how you can generate alternative data from text using NLP algorithms on our industry-leading, ready-to-use data lake, request a demo today.

ESG | NLP | Alternative Data

Alternative Data Trends: NLP Analysis on Commercial Real Estate

July 21, 2022

•

5 mins read

Housing and construction fees have skyrocketed over the past few years. This increase goes back to multiple factors: economic unrest, raw materials disruption, and labor shortage, to name a few. What does web data have to say about all this?

In this week’s “Alternative Data Trends” issue, we’ll talk about commercial real estate, unveiling the industry’s ESG and SDG conformity and the effects of COVID-19 on the supply chain and labor.

Commercial real estate volume of mentions

While analyzing web data dealing with commercial real estate, we detected an evident increase in the industry’s volume of mentions. This trend spiked in April 2020 and was initially hindered by the COVID pandemic, which resulted in a drop in sentiment polarity. Still, it witnessed a rapid recovery leveraging digitalization and e-solutions (Figure 1).

Commercial real estate market mentions chart — *Figure 1: Commercial real estate market mentions Feb 2015 to Mar 2022.*

Case study: Unibail-Rodamco-Westfield

To further understand the commercial real estate industry, we studied Unibail-Rodamco-Westfield and its competitors. Unibail-Rodamco, a French commercial real estate company, acquired Westfield, a U.S. company, in December 2017. This acquisition increased its market share and grew its share of voice on the web compared to its competitors (Figure 2).

Unibail volume of mentions chart — *Figure 2: Unibail volume of mentions compared to the market.*

The chart in Figure 3 shows that the company’s volume of mentions has been increasing ever since the acquisition occurred. However, a negative sentiment polarity has been steadily increasing due to social ESG risks related to collective health crises during COVID and security-disrupting threats. In addition, the company faced difficulties collecting rent from retailers leading to lawsuits.

The arrows in this chart indicate Unibail ESG risks in time. The first arrow points to the social risks generated by security threats, in 2016, and the second arrow points to the issue of unpaid rent and lawsuits filed regarding the matter, in 2020.

Unibail ESG risks chart — *Figure 3: Unibail ESG risks.*

According to web data, Unibail has the second highest volume of sustainability mentions among analyzed groups. The company was notably related to sustainable development goals number 8* and number 12**. This volume is manifested in their initiatives to help unemployed people and maintain sustainable ethics and practices when launching their malls and shopping centers (Figure 4).

* Sustainable Development Goal for decent work and economic growth.

** Sustainable Development Goal for responsible consumption and production.

Unibail SDG mentions chart — *Figure 4: Unibail SDG volume of mentions compared to the market.*

The impact of COVID on the emerging commercial real estate market

As previously mentioned, COVID had several effects on the industry, both negative and positive. It also reshaped the market and its work policies, with some companies choosing to switch to remote work and digitalization. In Figure 5, we can see that sentiment related to remote work policies has steadily improved since the pandemic started. However, in the last few months, we’ve seen a sharp decline, potentially signaling a negative reaction to some companies requiring employees to return to their offices.

Remote work policies volume of mentions chart — *Figure 5: Remote work policies’ volume of mentions.*

In addition, the pandemic has resulted in labor shortage and supply chain disruption, eventually leading to tremendous inflationary pressure. Raw materials prices, including oil, gas, iron, and wood, have witnessed a drastic increase and a disequilibrium between the volume of demand and the quantity available (Figure 6).

Labor shortage and supply chain disruption chart — *Figure 6: Labor shortage and supply chain disruption Feb 2015 - Dec 2021.*

Data source

To produce this analysis, we applied natural language processing to billions of web-based text documents related to the real estate market, commercial real estate in particular. Using NLP-powered models gives us an edge as we can extract ESG, SDG, and financial insights that aren’t necessarily obvious or easy to detect. These insights help investors make better investment decisions.

SESAMm leverages artificial intelligence and machine learning to help you decipher and understand timely sentiments, trends, and ESG metrics on a wide range of public and private companies.

Stay in touch with SESAMm

Thanks for reading this issue of Alternative Data Trends. Be sure to catch the next issue by subscribing to our blog. And if you'd like a TextReveal® demo, send us a message via the form.

NLP | Alternative Data | Big Data

What Investors Ought to Know About Natural Language Processing: A Quick Guide

July 13, 2022

•

5 mins read

In this issue of the "what investors ought to know about…" series, we'll cover natural language processing (NLP), a tool that draws from the computer science and computational linguistics disciplines. In the last topic, we discussed knowledge graphs as the core of text analysis. And if knowledge graphs are the core of the data’s context, NLP is the transition to understanding the data.

What is natural language processing?

Natural language processing is an artificial intelligence (AI) technology that automates the analysis of mined, unstructured text. It includes natural language understanding and natural language generation, which together simulate a human's ability to understand and create language. It combines computational linguistics with machine learning and deep learning models so that a machine can "read" text.

Where is natural language processing used?

Today, various industries use NLP, from email filters to virtual assistants and search engines to chatbots. Here's a list of common ways natural language processing is used:

Chatbots: Chatbots are computer programs that use NLP. They simulate human conversation by identifying a sentence's intent, determining suitable topics, keywords, and emotions, and calculating the best response based on the data's interpretation.
Email filters: Email filters apply machine learning using many data samples to sort emails into the right inbox.
Machine translation: Translation software like Google Translate or Microsoft Translator use NLP to translate text from one language to another, such as English to French.
Natural language generation (NLG): NLG, a subfield of NLP, builds applications or computer systems that can automatically produce natural language texts of various types by using a semantic representation as input. Applications of NLG include question answering and text summarization.
Predicting and autocorrecting text: Predictive text and autocorrect use NLP to recognize and recall commonly used words and names to make text suggestions and correct common errors.
Search engines: Search engines like Google Search use NLP and machine learning to interpret a searcher's intent and provide relevant results. They can even suggest subjects and topics related to the query that the searcher might be interested in.
Virtual and voice assistants: Virtual assistants like Apple's Siri or Amazon's Alexa use NLP technology to understand and respond to voice requests. Speech-to-text can dictate messages and notes, and speech recognition can control everything from smartphone apps and smart speakers to thermostats and home security systems.
Web sentiment analysis: Sentiment analysis automates classifying opinions in a text as positive, negative, or neutral. It's a method companies like SESAMm use to monitor sentiments like a brand's sentiment on the web and social media.

Why natural language processing is important to uncover financial-related alternative data

NLP is important because it helps resolve human language ambiguity in big datasets (big data). Languages are complex, diverse, and expressed in unlimited ways, from speaking hundreds of languages and dialects to having a unique set of grammar and syntax rules, slang, and terms for each. In text form, these variables are unstructured text. But with NLP, we can transform unstructured data into structured data and make sense of it.

Because of NLP's power, investors can research and analyze unstructured data from the web to gain insights into financial and ESG data. You can use this wealth of information to focus on systematic data processing, risk management, and alpha discovery through contexts, such as:

Major global indices sentiment
Euronext exchange sentiment
Private company sentiment
ESG risks for public and private companies worldwide

A quick overview of how natural language processing works at SESAMm

At SESAMm, we use named entity recognition (NER), which extracts the names of people, places, and other entities from text, and then named entity disambiguation (NED) to identify named entities based on their context and usage. For example, text referencing "Elon" could refer indirectly to Tesla through its CEO or to a university in North Carolina. NED considers the context when classifying entities for an accurate match. That makes it superior to simple pattern matching, which limits the number of possible matches, requires frequent manual adjustments, and can't distinguish homonyms.

*Process representation for NER and NED.*

When identifying entities and creating actionable insights, SESAMm uses three other NLP tools: lemmatization and stemming, embeddings, and similarity. Lemmatization normalizes a word into its base form (morphology) to help identify and aggregate entities. Embedding assigns the entity a numerical representation to help analyze how words change meaning depending on context and to understand the subtle differences between words that refer to the same concept. Similarity measures whether two words, sentences, or objects are close to one another in meaning.
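To make the idea of similarity concrete, here is a minimal sketch that assumes embeddings are numeric vectors and uses cosine similarity, one common way to compare them. The source doesn't say which measure SESAMm uses, and the three-dimensional vectors below are made up for illustration; real embeddings typically have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, not output from any real model.
vec_a = [1.0, 2.0, 0.0]
vec_b = [2.0, 4.0, 0.0]  # same direction as vec_a
vec_c = [0.0, 0.0, 3.0]  # orthogonal to vec_a

print(cosine_similarity(vec_a, vec_b))  # close to 1.0: similar meaning
print(cosine_similarity(vec_a, vec_c))  # 0.0: unrelated
```

Because cosine similarity ignores vector length, two texts can score as similar even when one is much longer than the other, which is usually the desired behavior when comparing meaning.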

*Representation of nodes in a knowledge graph.*

Of course, NLP couldn't function without the core of the text analytics process: knowledge graphs. A knowledge graph is a digital representation of a network of real-world entities, the foundation of a search engine or question-answering service. This structured data model puts the schema in context through semantic metadata and linking, providing a framework for analytics, data integration, sharing, and unification. In other words, it's like a map and legend, with the legend labeling the concepts, entities, and events and the map connecting and identifying their relationships. These details are stored in a graph database and visualized as a graph representation, hence the term knowledge graph.

SESAMm's natural language processing platform for investment research and analysis

SESAMm is the leading provider of natural language processing and machine learning solutions and analytics for investment firms and corporations.

Our AI and NLP platform, TextReveal®:

Analyzes text in billions of web-based articles and messages
Generates investment insights and ESG analysis used in systematic trading, fundamental research, risk management, and sustainability analysis
Enables a more quantitative approach to leveraging the value of web data that's less prone to human bias
Addresses a growing need in public and private investment sectors for robust, timely, and granular sentiment and ESG data

For a personal demo, contact us today.

ESG | NLP | Risk Alerts

S&P 500 ESG Index Drops Tesla: This Analysis Supports the Decision

July 6, 2022

•

5 mins read

May 2, 2022. The S&P 500 ousts Tesla, Inc. from the S&P 500 ESG Index. Tesla is widely recognized as the firm that ushered electric vehicle making into the mainstream. So the index’s move seems unreasonable or possibly made in error to many, raising some interesting questions:

How does an environmentally-friendly corporation like Tesla get dropped from an ESG index?
Why does a potentially non-environment-friendly company like Exxon make the ESG index and remain on it?
What do these moves mean about the integrity and validity of ESG scores and ratings?

Before we go on, let’s bring some context.

Why did the S&P 500 ESG Index drop Tesla?

May 18, 2022. In an S&P blog post, "The (Re)Balancing Act of the S&P 500 ESG Index," a spokesperson announces and explains their decision. Here are the bullet points:

Global industry group peers pushed Tesla’s S&P DJI ESG Score further down the ranks in the GICS industry group: Automobiles & Components.
A decline in criteria level scores related to Tesla’s low carbon strategy and codes of business conduct contributed to its 2021 S&P DJI ESG Score.
A media and stakeholder analysis identified "two separate events centered around claims of racial discrimination and poor working conditions at Tesla’s Fremont factory."
The analysis also highlights "the handling of the NHTSA investigation after multiple deaths and injuries were linked to its autopilot vehicles, affecting the company’s S&P DJI ESG Score at the criteria level, and its overall score."


Companies, including Tesla, left out of the S&P 500 ESG Index post-rebalance. Image courtesy of Indexology Blog.

The S&P blog post summarizes their case about dropping Tesla, "While Tesla may be playing its part in taking fuel-powered cars off the road, it has fallen behind its peers when examined through a wider ESG lens." And in this statement lies the crux of why the index dropped Tesla and why others are still on.

Analyzing Tesla’s web data

SESAMm’s TextReveal® insights suggest that the S&P 500’s decision to remove Tesla could be justified based on increasing controversy levels concerning discrimination, ethical standards, and work health and safety. By analyzing text related to ESG topics across the web, we picked up trends for the following subtopics:

climate_change_atmospheric_pollution
ethical_standards
discrimination_racism_sexism
labor_standards
health_and_safety_at_work
general_environmental_impact

Tesla’s ESG scores (six subtopics)

ESG scores, 1-year moving average, Tesla, all source types — *Figure 1: Tesla ESG scores for volumes and sentiments (1-year moving average), all source types.*

Regarding the volume features (Figure 1), we observed a significant increase in the scores related to ethical standards, discrimination, and atmospheric pollution for Tesla before the controversy. The conclusions are mostly the same for ESG sentiment (negative) scores. An interesting note is that the negative score of health and safety at work slightly increased in the months before the removal of Tesla from the index.

ESG scores, 1-year moving average, Tesla, all source types, select subtopics — *Figure 2: Tesla ESG scores for volumes and sentiments (1-year moving average), all source types, select subtopics.*

Comparing Tesla’s sentiment with other S&P 500 ESG Index companies

To see how Tesla’s ESG sentiment scores compared with other companies, we must rescale them with respect to a large universe of companies. This process means that, for a given company, we rescale each subtopic’s ESG score using the percentiles of that score’s distribution across the S&P 500 ESG constituents list after the 2022 rebalancing. Rescaling allows us to compare the companies with each other because the rescaled score indicates how bad the company is compared to the others, according to a specific ESG subtopic.
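A percentile-based rescaling of this kind can be sketched as follows. This is a simplified illustration, not SESAMm's implementation: the function name `percentile_rank`, the "at or below" convention, and the sample scores are assumptions.

```python
def percentile_rank(value, universe):
    """Percentage (0 to 100) of universe scores that are at or below value."""
    if not universe:
        raise ValueError("universe must not be empty")
    at_or_below = sum(1 for score in universe if score <= value)
    return 100.0 * at_or_below / len(universe)


# Hypothetical raw scores for one ESG subtopic across a reference universe.
universe_scores = [0.1, 0.2, 0.3, 0.4, 0.5]

# Hypothetical raw score for the company being compared.
company_score = 0.4
rescaled = percentile_rank(company_score, universe_scores)
print(rescaled)  # 80.0
```

Under this convention, a rescaled score of 80 means the company's raw score is at or above that of 80% of the universe for that subtopic. When several subtopics are considered together, the rescaled scores can then be averaged, as the graphs that follow do.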

The following graphs show different sets of subtopics, plotting the mean of the respective rescaled scores if several topics are considered. Here are the companies considered.

Companies removed from the index:

Tesla
Delta Air Lines
Chevron Corporation

Companies that joined the index after the 2022 rebalancing:

American International Group
Expedia Group

Companies still part of the index:

Exxon Mobil
Apple
Amazon

Tesla, Delta, Chevron, AIG, and Expedia compared

Rescaled scores: Apple, Amazon, and Exxon — *Figure 3: Six-subtopic rescaled scores for Tesla, Delta, Chevron, AIG, and Expedia.*

Apple, Amazon, and Exxon compared

The S&P 500’s choice is reasonable

Our analysis shows that the S&P 500’s decision to oust Tesla from the ESG index is reasonable. We found significant subtopic volumes and negative sentiment that support the S&P 500’s claims of racial discrimination, poor working conditions, and other controversies.

Thanks for reading this quick analysis. For a more detailed report, including Chevron’s and Delta’s ESG scores, reach out to a representative today.

SESAMm’s ready-to-use alternative data

Leverage our alternative data streams to incorporate systematic insights into your alpha signals or to monitor risk across your entire portfolio. From tracking global sentiment to analyzing retail communities like WallStreetBets and integrating ESG alternative data into your systems, our solutions will make generating value from web insights easy.

ESG | NLP | Alternative Data

Alternative Data Trends: The U.S. Baby Formula Shortage

June 23, 2022

•

5 mins read

Imagine finding out you've run out of milk immediately after pouring a bowl of cereal. Or maybe realizing you don't have eggs while in the middle of baking a cake. We've all been there, and it's frustrating, to say the least. And this scene has been playing around the globe over the last couple of years for many foods and products. One day it's microchip shortages, and the next, it's baby formula.

Unfortunate as it is, it's one thing for consumers to cope with an empty car lot because of chip shortages. It's another to cope with a hungry infant because store shelves that once contained baby formula are now bare. For those parents and caretakers, their emotions are beyond feeling frustrated. They feel anger and panic, the sort of emotions that they share with their friends and colleagues on social media and forums. The kind of expression that can change the public's sentiment about a company, which in turn can move markets.

This Alternative Data Trends post will examine web data concerning the baby formula shortage. We'll analyze articles, social media, and forum conversations culminating in the U.S. crisis as the news reaches national exposure. We'll also highlight red flags investors could've seen had they monitored the situation with an AI-powered text analysis tool like SESAMm's TextReveal®.

Early warnings: When baby formula supplies began to run dry vs. when it became a national crisis

If we compare absolute and relative volumes—relative being mentions about the topic compared to our entire data lake—the term "formula milk market" yields parallel results. Mentions spike in May when the crisis reaches national coverage (see Figure 1).

SESAMm line chart formula milk market mention volumes — *Figure 1: Absolute and relative mention volumes for “formula milk market” match.*

However, comparing absolute and relative volumes for the term "formula milk shortage," we find red flags as early as January 2022, four months before the crisis receives national attention (see Figure 2). Relative mentions spike on three occasions before absolute volumes register any significant noise. The fourth instance matches a ripple on the absolute chart.
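The difference between the two measures is easy to see in a small sketch. Here relative volume is taken to be topic mentions divided by the total number of documents collected in the same period, which is one reading of "compared to our entire data lake"; the function name and all counts are hypothetical.

```python
def relative_volume(topic_mentions, total_documents):
    """Per-period share of all collected documents that mention the topic."""
    if len(topic_mentions) != len(total_documents):
        raise ValueError("series must have the same length")
    return [
        mentions / total if total else 0.0
        for mentions, total in zip(topic_mentions, total_documents)
    ]


# Hypothetical monthly counts, not real data.
topic_mentions = [10, 10, 40]         # absolute volume: flat, then a jump
total_documents = [1000, 500, 2000]   # everything collected in the same months

shares = relative_volume(topic_mentions, total_documents)
print(shares)  # the share doubles in month two while the absolute count is unchanged
```

In the second month the absolute count doesn't move, yet the topic takes up twice the share of the conversation. That is the kind of early signal a relative measure can surface before absolute volumes register anything unusual.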

SESAMm line chart formula milk shortage mention volumes — *Figure 2: Relative mention volumes for “formula milk shortage” show possible controversies.*

These articles provide an example of the content published around the times of those rises in mentions:

Jan. 12, 2022: "Baby Formula Is Hard to Find. Brands and Stores Are Divided Over Why.," WSJ
Feb. 9, 2022: "Baby formula shortage has some families scrambling," WCAX-CBS
Mar. 7, 2022: "Baby formula shortage and recalls affecting families," KGUN-ABC
Apr. 11, 2022: "A shortage of baby formula is worsening and causing some stores to limit sales," WHYY-PBS

Analyzing the sentiment and polarity of the formula milk market

In short, the e-reputation of the formula milk market has been negative since the beginning of 2022 (see Figure 3). Positive sentiment drops and mirrors the opposing negative sentiment almost exactly until May, when the news about the crisis breaks. Likewise, polarity trends downward over the same period.

Note: Polarity represents a company's aggregate of positive and negative sentiment (opinions, reviews), ranging from -1 to 1. A zero score means that there is as much positive as negative sentiment. High e-reputation brands can have polarity scores of more than 0.5.
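As a rough illustration of the note above, one common way to get a score with these properties is to take the difference between positive and negative mentions and divide by their sum. This formula is an assumption made for the sketch. The note only states the range and the meaning of zero, not how TextReveal actually computes polarity.

```python
def polarity(positive: int, negative: int) -> float:
    """Net sentiment in [-1, 1]; 0 means positive and negative are balanced.

    Assumed formula: (positive - negative) / (positive + negative).
    """
    total = positive + negative
    if total == 0:
        return 0.0  # no opinionated mentions: treat as neutral
    return (positive - negative) / total


# Made-up mention counts for illustration.
print(polarity(50, 50))   # balanced sentiment
print(polarity(75, 25))   # mostly positive
print(polarity(20, 80))   # mostly negative
```

Under this assumed formula, a brand needs three positive mentions for every negative one to reach the 0.5 level the note associates with a high e-reputation.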

*Figure 3: “Formula milk market” sentiment analysis and polarity moved negatively over time.*

The web data shows that some articles coincide with the changes in sentiment and polarity. For instance, in February, AboutLawsuits.com published "Similac and Enfamil Baby Formula Shortages Create Infant Feeding Challenges." And in May, HealthDay, a prominent syndicator of health news, published "U.S. Baby Formula Shortage Worsens."

Analyzing formula milk brands

In the U.S., four brands produce the bulk of formula milk: Abbott, Mead Johnson, Nestlé, and Perrigo. Abbott and Nestlé hold the largest share of the formula milk market.

Figure 4: Abbott gains more than 75% of mention volume share in Q1 2022.

When we group these four brands' mentions from January 2021 to June 2022, we can see how their mention volumes compare (Figure 4). For example, at the start of the period, Abbott and Nestlé have more mention volume relative to their market share. At the end of 2021, however, Mead Johnson and Abbott experience spikes in mentions due to lawsuits against their formulas. Then, in Q1 2022, Abbott's mentions increase drastically after its formulas are recalled due to possible contamination, and the company takes more than 75% of the mention volume.
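The share-of-mentions view behind Figure 4 comes down to simple arithmetic: each brand's mentions divided by the total for the group in the same period. A minimal sketch follows, with made-up counts chosen only to show a brand crossing the 75% mark. The function name and the numbers are hypothetical.

```python
from typing import Dict


def mention_share(mentions: Dict[str, int]) -> Dict[str, float]:
    """Each brand's fraction of the group's total mentions in one period."""
    total = sum(mentions.values())
    if total == 0:
        return {brand: 0.0 for brand in mentions}
    return {brand: count / total for brand, count in mentions.items()}


# Illustrative quarterly counts, not SESAMm data.
q1_mentions = {"Abbott": 800, "Mead Johnson": 100, "Nestlé": 60, "Perrigo": 40}

share = mention_share(q1_mentions)
for brand, value in sorted(share.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{brand}: {value:.0%}")
```

Because the shares always sum to 100%, a surge for one brand pushes the others down even if their own mention counts stay flat, which is worth keeping in mind when reading a stacked chart like this one.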

Analyzing ESG risks by company

Evaluating the top four brands' ESG risks reveals red flags as early as May 2021, when Abbott was fined for including a banned flavoring in its baby formula (see Figure 5). And most red flags were raised before the national shortage crisis was publicized widely.

Figure 5: Abbott ESG risk rises above the 5% threshold as early as May 2021.

Besides Abbott, most brands' risks remain low during 2021. However, ESG risks increase during Q1 2022 for all four major formula milk brands. The FDA warns consumers against using specific Abbott formulas during an investigation of infant illnesses possibly caused by those formulas. Perrigo settles a lawsuit from the State of California over lead levels in its baby formula. And a panel of federal judges centralizes many baby formula lawsuits into one federal court, including cases against Mead Johnson. Nestlé stays below the 5% threshold throughout, with few or no controversies found.
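The threshold check described above can be sketched in a few lines. This example assumes each company has a monthly ESG risk value expressed as a fraction, and flags the months that rise above 5%. How TextReveal derives the risk value is not covered in this post, so the input series, company labels, and function names here are all hypothetical.

```python
from typing import Dict, List, Optional

ESG_RISK_THRESHOLD = 0.05  # the 5% threshold referenced in Figure 5


def breaches(risk_by_month: Dict[str, float],
             threshold: float = ESG_RISK_THRESHOLD) -> List[str]:
    """Months (in chronological order) where risk is above the threshold."""
    return [m for m in sorted(risk_by_month) if risk_by_month[m] > threshold]


def first_breach(risk_by_month: Dict[str, float],
                 threshold: float = ESG_RISK_THRESHOLD) -> Optional[str]:
    """Earliest month above the threshold, or None if it is never crossed."""
    flagged = breaches(risk_by_month, threshold)
    return flagged[0] if flagged else None


# Made-up monthly risk values for two placeholder companies.
company_a = {"2021-04": 0.02, "2021-05": 0.07, "2021-06": 0.03, "2022-02": 0.12}
company_b = {"2021-04": 0.01, "2021-05": 0.02, "2021-06": 0.01, "2022-02": 0.04}

print("Company A first breach:", first_breach(company_a))
print("Company B first breach:", first_breach(company_b))
```

A fixed threshold is easy to explain but blunt. In practice you would still open the underlying articles for each flagged month before acting on it.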

Three tactics and a summary

The baby formula market in the U.S. has been volatile for many reasons, which we won't get into in this article. However, this volatility could have been seen and planned for. Here are three tactics you can use to minimize your investment risks:

Employ a tool like SESAMm’s TextReveal to evaluate web data for insights into your investments. With premier NLP technology, you can uncover sentiment and ESG insights about your industry, portfolio companies, or current investments.
Expand your research term for deeper insights. In this study, the term "formula milk market" had matching absolute and relative volumes. From this view, nothing looks out of place, and there aren't any red flags. However, when we expanded our research with the term "formula milk shortage," we found many controversies before the crisis gained national attention.
Dig into the controversies' causes. It's not enough to acknowledge a red flag. You also need to look into what is behind it. Is the controversy caused by external factors or internal ones? Maybe both? Is the issue a one-time occurrence, or is it a pattern? Answering these questions means avoiding black-box tools. With solutions such as TextReveal, you can access the underlying articles triggering the red flags.

Stay in touch with SESAMm

Thanks for reading this issue of Alternative Data Trends. Be sure to catch the next issue by subscribing to our blog. And if you'd like a TextReveal demo, send us a message via the form.