Predicting stock price movements using news and social media data

Antonio Banda | August 10, 2022

Tokio Marine & Nichido Fire Insurance Co., Ltd. (TMNF) tapped SESAMm for a joint research venture to predict future stock price movements. The research produced two key findings:

  1. NLP data from news and social networking websites can have strong relationships with investor behavior. Thus, it’s possible to forecast investors’ rational reactions to changes in data and price movements based on those relationships.

  2. NLP data helped anticipate tail events. The stock market performed well in the macroeconomic environment of the last 10 years, and in that context, investors proved sensitive to negative narratives in times of uncertainty, such as the 2015 market sell-off, the U.S.-China trade war, the coronavirus pandemic, and the start of the Russia-Ukraine war, and posted their concerns online.

Providing safety and security since 1879

Tokio Marine Insurance Company was first established in 1879. Over the years, it has added products and services, acquired other businesses, and merged with other companies to eventually become Tokio Marine & Nichido Fire Insurance Co., Ltd. Commonly called Tokio Marine Nichido today, the company is a property and casualty insurance subsidiary of Tokio Marine Holdings, the largest non-mutual private insurance group in Japan. Its products and services provide safety and security to its clients and partners, contributing to more fulfilling lifestyles and business development.

One of the company’s philosophies is to be a good corporate citizen and fulfill its social responsibilities, including protecting the global environment, promoting human rights, creating a responsible working environment, and contributing to society and individual local communities. Recently, the Emperor of Japan awarded Tokio Marine Holdings, Inc. the Medal with Dark Blue Ribbon for donating to the Japan Student Services Organization to support students who face financial difficulty during the COVID-19 pandemic. Individuals, corporations, or organizations are awarded the Medal with Dark Blue Ribbon for their outstanding contributions to the public.

Transforming and accepting the challenge to grow

According to TMNF, “The business environment surrounding the insurance industry is changing at a faster pace than ever due to changes in demographics, advances in technologies, such as autonomous driving and AI, and longer-term trends, such as the intensification and frequent occurrence of natural disasters, as well as further progress in digitalization due to the COVID-19 pandemic.”

“While these changes in the business environment pose a threat, we consider them to be excellent opportunities for transformation and the creation of new value.” So the company has adopted the concept “Transformation (“X”) and Challenge to Growth 2023: Aiming to be the company most chosen for quality and its passion.” Ultimately, it strives to support customers and local communities in times of need while fulfilling its social responsibilities. The five social issues it will prioritize are:

  • Global climate change and the increase in natural disasters
  • The increased burden of long-term care and healthcare due to the aging of society and advances in medical technology
  • Technological innovation and its effects on the environment
  • Symbiotic society and responding to the novel coronavirus
  • Industrial infrastructure and how it supports economic growth and innovation

Leveraging a partner with the right technology

To secure and protect its clients’ assets while elevating social issues, Tokio Marine Nichido sought an edge in the stock market. TMNF discovered SESAMm in 2020 through the Plug and Play Japan program, a platform with an event that connects Japan to markets abroad. There, SESAMm presented its NLP alternative data solution, TextReveal®. TMNF considered the platform for access to alternative data and sought a collaboration with the SESAMm team on a research project.

“SESAMm has the technology to extract text sentiment from news data with a neural network.”
– Tokio Marine & Nichido Fire Insurance Co., Ltd. representative

Extracting relationships between NLP data and the financial market

In 2021, Tokio Marine Nichido began collaborating with SESAMm to develop an AI analytics model for alternative data. The model captures the impact of news and social networking data on investor behavior in the stock and bond markets, transforming text information into knowledge usable by TMNF. For instance, when the model detects a negative narrative raising uncertainty in the market, investors can use this signal to reduce their risk exposure.
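To make the idea of a risk-reduction signal concrete, here is a minimal sketch in Python. It is not the TMNF and SESAMm model, whose internals the case study does not describe. The function name, the trailing window, the z-score threshold, the reduced exposure level, and the toy sentiment values are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def exposure_from_sentiment(sentiment, window=5, threshold=-1.0, reduced=0.5):
    """Return a target exposure (1.0 = fully invested) for each observation.

    Hypothetical rule, not the model from the case study: cut exposure to
    `reduced` when the current sentiment reading sits more than |threshold|
    standard deviations below the mean of the preceding `window` readings.
    """
    exposures = []
    for i, value in enumerate(sentiment):
        history = sentiment[max(0, i - window):i]
        if len(history) < window:
            exposures.append(1.0)  # not enough history yet: stay invested
            continue
        mu = mean(history)
        sigma = pstdev(history)
        z = (value - mu) / sigma if sigma > 0 else 0.0
        exposures.append(reduced if z < threshold else 1.0)
    return exposures


# Toy daily sentiment scores (made up): stable, then a sharp negative narrative.
daily_sentiment = [0.20, 0.25, 0.15, 0.22, 0.18, 0.19, -0.30, -0.35, 0.05]
exposures = exposure_from_sentiment(daily_sentiment)
print(exposures)
```

In this toy series, the rule stays fully invested while sentiment is stable and halves exposure on the two days when sentiment drops sharply. A production signal would need calibration and back-testing, which this sketch does not attempt.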

Predicting future stock price movements from news and social media data

Tokio Marine Nichido and SESAMm’s joint research found that natural language data from news and social networking sites effectively predict future stock price movements. In the case of the pandemic, for example, there was a time lag of as long as a month between the time COVID-19 became news and the time it affected the U.S. stock market (Figure 1). By using SESAMm’s technology to analyze news data during this period, the team found that U.S. news and social networking sentiment had already deteriorated sharply before stock prices reacted. This deterioration reflected fear of the effect the spread of the coronavirus would have on the global economy. With the S&P 500 at an all-time high, U.S. investors did not initially consider this risk. In comparison, companies in the Hang Seng Index (HSI) were closer to the risk of the coronavirus spreading, so HSI investors reacted ahead of those in the U.S.

Chart comparing U.S. news sentiment to Hang Seng Index and S&P 500 Index

Figure 1: In 2020, U.S. news sentiment falls ahead of the stock market in response to COVID-19 concerns.
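One common way to check whether a sentiment series leads a market series, as Figure 1 suggests, is to correlate sentiment at time t with the market at time t plus a lag and see which lag fits best. The sketch below illustrates that general technique only. The case study does not say how the team measured the lag, and the function names and toy numbers are assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def lagged_correlations(sentiment, market, max_lag):
    """Correlate sentiment[t] with market[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        xs = sentiment[:len(sentiment) - lag]
        ys = market[lag:]
        result[lag] = pearson(xs, ys)
    return result


# Toy data (made up): the index level echoes sentiment two periods later.
sentiment = [0.5, 0.4, 0.1, -0.3, -0.5, -0.2, 0.1, 0.3, 0.4, 0.5]
index_level = [100 + 10 * sentiment[max(0, t - 2)] for t in range(len(sentiment))]

correlations = lagged_correlations(sentiment, index_level, max_lag=4)
best_lag = max(correlations, key=correlations.get)
print(best_lag, correlations)
```

Because the toy index is built as a two-period echo of sentiment, the correlation peaks at a lag of two. With real data the peak is rarely this clean, and a high lagged correlation alone does not establish predictive power.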

The model can calculate sentiment for each company by analyzing the news of individual companies. It’s also possible to create a composite to measure the sentiment related to a stock index. The sentiment data also helps management and investor relations because it provides a quantitative means of understanding the extent to which investors are concerned about certain news about their company.
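The case study does not explain how the index-level composite is built. One plausible approach, shown here purely as an assumption, is a weighted average of company-level sentiment using index weights. The tickers, scores, and weights below are hypothetical.

```python
def composite_sentiment(company_sentiment, index_weights):
    """Weighted average of company sentiment over the index constituents
    that have a sentiment score; weights are renormalized over those names.

    This weighting scheme is an illustrative assumption, not the method
    used in the research.
    """
    covered = [name for name in index_weights if name in company_sentiment]
    total_weight = sum(index_weights[name] for name in covered)
    if total_weight == 0:
        raise ValueError("no index constituent has a sentiment score")
    weighted_sum = sum(company_sentiment[name] * index_weights[name] for name in covered)
    return weighted_sum / total_weight


# Hypothetical constituents, sentiment scores in [-1, 1], and index weights.
company_sentiment = {"AAA": 0.4, "BBB": -0.2, "CCC": 0.1}
index_weights = {"AAA": 0.5, "BBB": 0.3, "CCC": 0.2}

index_sentiment = composite_sentiment(company_sentiment, index_weights)
print(round(index_sentiment, 4))
```

Renormalizing over the covered names keeps the composite on the same scale as the company scores when some constituents have no news coverage on a given day.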

Verifying the results

Verification using Japanese data has revealed that the troughs and peaks of text sentiment precede those of stock prices. The collaborating team compared the performance of:

  • A model that uses only orthodox financial and economic data as inputs
  • A model that considers NLP data alongside financial and economic data

The comparison confirmed that the latter could generate higher alpha.

SESAMm equity model performance and characteristics chart and tables

Figure 2: Back-testing confirms that SESAMm’s equity model can predict a market downturn, capturing changes in text sentiment and reducing positions ahead of market crashes.

Since sentiment is mean-reverting by nature, the TMNF team believes it provides good support for position management during rallies and crashes. It’s also valuable for avoiding forced loss-cutting at the bottom, when liquidity temporarily evaporates and the market crashes.

Expanding the research to other use cases

In addition to analyzing the stock market, Tokio Marine Nichido expanded the scope of the research to include R&D on using natural language data in trading U.S. high-yield bonds. The research shows that NLP data can help provide a hedging signal for the negatively skewed high-yield market (Figure 3) by capturing deteriorating text sentiment (Figure 5). For example, these signals can prompt investors to reduce positions before the market reacts.

U.S. High-Yield Total Return index monthly returns chart

Figure 3: NLP data can help provide a hedging signal by capturing deteriorating text sentiment.

High-yield model performance comparison chart

Figure 4: An NLP-informed high-yield strategy can outperform the U.S. high-yield total return index and a strategy without NLP. All three back-tests are run at the same volatility level.

High-yield back-test performance during major drawdowns

Figure 5: The NLP-informed high-yield strategy delivered positive returns when the U.S. HY market sold off, providing a hedge to a traditional approach.

TMNF is also applying the research to estimate the Fed’s stance (hawkish or dovish) using natural language data. It hypothesizes that the market will focus on the Fed’s stance on interest rate hikes over the next few years.

“The model developed in collaboration with SESAMm is simple in structure, yet, it’s an orthodox and robust model that uses valid data as input.”

Summarizing the collaboration

In developing models, Tokio Marine Nichido believes it is essential to think carefully about what data to use and to keep the model simple, and the joint project followed both tenets. The model developed in collaboration with SESAMm is simple in structure, yet it’s an orthodox and robust model that uses valid data as input, which is preferable to risking overfitting by increasing complexity.

A representation of the collaborative NLP alternative data model

Figure 6: The joint Tokio Marine Nichido and SESAMm NLP alternative data model: Simple yet robust.

Get in touch with SESAMm

To learn more about Tokio Marine Nichido’s case study or to request a TextReveal demo, reach out to the SESAMm team.


Note: The contents of this document do not constitute an offer or solicitation to buy services or shares in any fund. The information in this document does not constitute investment advice or an offer to invest or to provide management services and is subject to correction, completion, and amendment. Past performance is not indicative of future results.