Alternative data is changing the world of investing.
When it comes to the stock market, alternative data means anything other than price and company fundamentals — satellite imagery, location data from mobile devices, POS transaction data, or social media data posted to platforms like Twitter and StockTwits — that can be used to generate insights that drive investment decisions.
The new opportunities for increased returns that alternative data brings come with obstacles. These data sets are often unstructured, with high volume and velocity, which makes alternative data indigestible for most investors.
Consider satellite images of retail store parking lots.
A data set like this consists of millions of photos, indexed to hundreds of retail stocks, each with thousands of stores. Relating these images to one another requires a complex data infrastructure, and extracting useful information requires deep technical knowledge of image processing and computer vision.
This data contains unique investment insights that help investors better estimate company sales and revenue in real time, but only the most quantitatively driven investment operations have the tools and resources to deal with this information at scale.
Advancements in Big Data, Data Science and Machine Learning breed the misconception that the most complex solutions are the best.
Complexity is better only when it leads to a deeper understanding of the underlying system and that understanding yields better outcomes. But complexity is a spectrum, not a binary, and even small incremental moves toward more complexity can drive better outcomes, which is the real goal.
Organizations now have the ability to productize complex data in ways that are more digestible for users.
We started tackling this with the development of MarketLex, our lexicon- and semantic-rules-based sentiment model for social finance. MarketLex draws on our expertise in how investors talk in the social finance-verse to better understand how an individual piece of content relates to the market.
We can understand in real time that a user posting “$AAPL has some really strong support at $145, I’m buying here” is bullish (believes that the price will increase), while a user posting “$AAPL looks like it is topping here, could be time to get out” is bearish (believes that the price will decrease).
While sentiment can seem obvious for an individual piece of content, the model is invaluable for dealing with the scale and velocity of the content produced.
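To make the idea of a lexicon and rules based classifier concrete, here is a minimal sketch in Python. It is not MarketLex itself: the phrase list, the weights, the single negation rule, and the function names (`extract_cashtags`, `score_message`, `classify`) are all assumptions made up for illustration, and a production model would need a far larger vocabulary and many more rules.

```python
import re

# Hypothetical toy lexicon: phrase -> weight (positive = bullish, negative = bearish).
# These entries are illustrative assumptions, not the actual MarketLex vocabulary.
LEXICON = {
    "strong support": 1.0,
    "buying": 1.0,
    "breakout": 1.0,
    "topping": -1.0,
    "get out": -1.0,
    "selling": -1.0,
}

# A single, deliberately simple semantic rule: a negation word directly
# before a lexicon phrase flips the sign of that phrase.
NEGATIONS = {"not", "no", "never"}

# Cashtags are a dollar sign followed by 1-5 letters, so "$145" is ignored.
CASHTAG = re.compile(r"\$([A-Za-z]{1,5})\b")


def extract_cashtags(text):
    """Return the sorted, upper-cased tickers mentioned in a message."""
    return sorted({ticker.upper() for ticker in CASHTAG.findall(text)})


def score_message(text):
    """Sum the weights of every lexicon phrase found in the message."""
    normalized = text.lower().replace("\u2019", "'")  # treat curly apostrophes as straight
    tokens = re.findall(r"[a-z']+", normalized)
    score = 0.0
    for phrase, weight in LEXICON.items():
        words = phrase.split()
        n = len(words)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] != words:
                continue
            previous = tokens[i - 1] if i > 0 else ""
            negated = previous in NEGATIONS or previous.endswith("n't")
            score += -weight if negated else weight
    return score


def classify(text):
    """Map the numeric score to a bullish / bearish / neutral label."""
    score = score_message(text)
    if score > 0:
        return "bullish"
    if score < 0:
        return "bearish"
    return "neutral"


if __name__ == "__main__":
    for message in (
        "$AAPL has some really strong support at $145, I'm buying here",
        "$AAPL looks like it is topping here, could be time to get out",
    ):
        print(extract_cashtags(message), classify(message))
```

Even a toy like this shows why the approach scales: each message is scored with a handful of string comparisons, so the cost of labeling one post stays constant no matter how many arrive.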
The true informational advantage comes in understanding the aggregate sentiment of our community as a whole. We now capture this via an index which represents the relative bullishness of a security today compared to its recent history — adding a layer of structure around an otherwise unstructured data set.
More importantly, it gives us the building blocks to productize our data to fit any investing style, getting our data into the hands of investors from across the quantitative complexity spectrum.
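One way to picture an index of relative bullishness is as a z-score: today's share of bullish messages for a security, measured against the mean and spread of its own recent history. The sketch below assumes that definition purely for illustration. The exact formula behind our index is not described here, and the function names, the neutral default of 0.5, and the sample numbers are hypothetical.

```python
from statistics import mean, pstdev


def bullish_ratio(bullish, bearish):
    """Share of labeled messages that are bullish; 0.5 (neutral) if there are none."""
    total = bullish + bearish
    return 0.5 if total == 0 else bullish / total


def relative_bullishness(history, today):
    """Z-score of today's bullish ratio against a trailing window of daily ratios.

    Assumption: "relative to recent history" is modeled as a z-score.
    A percentile rank or a min-max scale would be equally plausible choices.
    """
    if len(history) < 2:
        raise ValueError("need at least two days of history")
    spread = pstdev(history)
    if spread == 0:
        return 0.0  # flat history: no meaningful scale to compare against
    return (today - mean(history)) / spread


if __name__ == "__main__":
    # Hypothetical daily (bullish, bearish) message counts for one security.
    daily_counts = [(40, 40), (55, 45), (30, 45), (50, 50), (48, 52)]
    history = [bullish_ratio(up, down) for up, down in daily_counts]
    today = bullish_ratio(90, 30)
    print(round(relative_bullishness(history, today), 2))
```

A positive value means the community is more bullish on the security than it has been lately, a negative value means less bullish, and values near zero mean nothing unusual. Because the output is a single number per security per day, it can feed a simple screen just as easily as a quantitative model.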
Our goal is to create opportunities for all investors, not just those with the most technological know-how, to incorporate some form of social data into their process to make better investing decisions. Though this goal is unique to our niche corner of the finance world, the idea at its core is ubiquitous.
As technological advances continue to add complexity to the world, there is an opportunity to deliver better outcomes to more people by making this complexity more broadly accessible.