Natural Language Processing – Part III: Feature Engineering

31 Jan, 2020

Author Frank Zhao
Theme Custom
Segment Investment Management
Tags Xpressfeed

Highlights

The latest research from S&P Global Market Intelligence’s Quantamental Research team applies natural language processing (NLP) using domain knowledge to capture alpha from transcripts.

This newest publication, the third in the NLP series, introduces new stock selection ideas in the areas of I) Topic Identification, II) Call Transparency and III) Call Sentiment using more advanced NLP techniques.

The new signals complement the existing suite of signals that is offered in our Textual Data Analytics (TDA) product, an off-the-shelf NLP solution for both quantitative and fundamental analysts.

Unstructured data¹ is largely underexplored in equity investing due to its higher costs². As a result, its information content remains largely untapped and offers an investment edge³ to discerning investors who are adept at extracting those insights. One particularly valuable unstructured data set is S&P Global Market Intelligence’s machine-readable earnings call transcripts.

This newest publication, the third in the series (following NLP I and NLP II), introduces new stock selection ideas in the areas of I) Topic Identification, II) Call Transparency and III) Call Sentiment using more advanced NLP techniques. The new signals complement the existing suite of signals offered in our Textual Data Analytics (TDA)⁴ product, an off-the-shelf NLP solution for both quantitative and fundamental analysts. The high-level U.S. findings are:

Exhibit 5: MSFT’s October 26, 2017 Call Transcript - Executives’ Prepared Remarks & Q&A

Note: The exhibit contains five columns of binary flags, a subset of all available binary flags in the analysis.

Source: S&P Global Market Intelligence Quantamental Research, as of 12/01/2019.
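The binary flags mentioned in the exhibit note can be pictured with a small sketch. The version below is only an illustration under stated assumptions: it flags a topic when any keyword from a hypothetical dictionary appears in a block of text. The topic names, keywords, helper function and sample sentences are all invented for this example and are not taken from the report or from the MSFT transcript; the report's own method is described in its Data & Methodology section.

```python
import re

# Hypothetical topic dictionary: these topics and keywords are assumptions
# made for illustration, not the ones used in the report.
TOPIC_KEYWORDS = {
    "cloud": ["cloud", "azure"],
    "guidance": ["guidance", "outlook"],
    "margin": ["margin", "margins"],
}


def topic_flags(text, topic_keywords=TOPIC_KEYWORDS):
    """Return {topic: 0 or 1} for one block of transcript text."""
    # Lowercase and split into alphabetic tokens so matching is on whole words.
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {
        topic: int(any(word in tokens for word in words))
        for topic, words in topic_keywords.items()
    }


# Invented sample text standing in for the two parts of a call.
segments = {
    "prepared_remarks": "Commercial cloud revenue grew and gross margin improved.",
    "q_and_a": "Can you comment on the outlook for next quarter?",
}

# One row of binary flags per call segment.
flags = {name: topic_flags(text) for name, text in segments.items()}
for name, row in flags.items():
    print(name, row)
```

Whole-word keyword matching is the simplest possible choice here; it is easy to audit, but it misses synonyms and context, which is one reason more advanced techniques may be preferred in practice.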

¹ Unstructured data sets are non-numerical data sets, such as text, audio and images, that come from a primary source. Primary source data in this context i) are furthest up the information chain and contain the most relevant and timely information, ii) are easily mapped to publicly traded firms and iii) have good historical and cross-sectional coverage.

² The exploration of non-numerical data sets requires more advanced technical infrastructure and tools, and more time from researchers to vet the data. The probability of finding additive signals is also lower.

³ Other examples may be more sophisticated modeling (e.g., non-linear) and execution efficiency.

⁴ See Section 5, Data & Methodology, in the paper for details.
