What To Invest: Autoencoder For Recommendation + ChatGPT-Based Rationale

Tarapong Sreenuch
6 min readJun 5, 2023

--

Affinity: Either None Or Have One (But Too Close)

For a retail investor, financial forums are often the place for an investment tip. He/she might also look out for what his/her broker will be posting every so often. The tip as expected is universal and probably not quite related to his/her affinity. However, the recommendation will always come with an investment rationale to back it up. If the forum is quite influential, then, partly because of positive feedback — I thought, the stock might really turn out to be a good one at least in the short term.

On the contrary, Collaborative Filtering, popularised by Netflix and Amazon, can create highly personalised recommendations. Its recommendation is created based on the listed equities that are commonly bought by a small set of retail traders with past investing patterns similar to a particular investor. If I am myself a trader with low return (or even loss), then Wouldn’t I be ending up following the other low-performance traders like myself? Surely, I wouldn’t want to remain poor.

CopyTrader, a feature offered by eToro, is a portfolio recommendation where a retail investor can clone a high-performing portfolio. This certainly will not be personalised, but still, the user can choose a portfolio closest to his/her affinity. Now, What if I cannot afford the whole portfolio? Indeed, many amateur investors will be able to only buy one or a few. Also, What is the rationale behind the portfolio? and this is something we do not know. This makes it hard to just blindly buy a stock that is part of the portfolio.

What We Need: Opinionated Ones

Let us narrow down the scope and focus on retail-type of investors. Moreover, what is developed must be able to scale up to support increasing platform users with close to zero marginal cost. To make it manageable, we shall leave the aspects of temporal decision-making and portfolio constructions out at this point.

Affinity often dictates how amateurs invest. User’s affinity includes but is not limited to, industry (or sector), sustainability and holding time. Most users on the platform are amateur, the experienced ones are not that many in proportion. To follow the Pereto rule (80:20), we shall prioritise the 80. The recommended equities must reflect the user’s affinity, and in addition, a high rate of return is expected for the equities in the list. How things will turn out, but they are considered profitable at the time of shortlisting.

The platform must be able to provide a sound rationale (and elaborate if needed) on why it is recommending a certain equity. We are paying from our own pocket. To support millions of users, it has to be automated. It will not scale with human experts. Moreover, we would also like a contextual rationale from the bot, not a typical pre-generated text.

Underpinning The Intelligence

Autoencoder-Based Recommender

Take a step away from ML, and let’s not forget why we invest, which is indeed wealth. This is so fundamental why collaborative filtering, commonly used in digital platforms, will not work. The return (i.e. performance) does matter (a lot) in our case. There is also the Cold Start problem, where equities can be high in return but less popular. This intuitively points us to Content-Based Filtering. Here, we can build an embedding model using a deep autoencoder to capture the user’s affinity given the equity features, he/she has traded. The features can be industry, volatility, volume, dividend and so on. However, we exclude the rate of return to avoid clustering on investors’ performance.

To drive wealth, we filter equities listed on the broker based on what is known as good investment, e.g. high return, and potential gain. The candidates are mapped into our embedding space. K Nearest Neighbour search (KNN) is then used to efficiently retrieve the closest (affinity-wise) equities to recommend. We could also use approximate (A) instead of K. This way we are addressing both the requirements of high return and being aligned with the user’s trade patterns.

Reasoning With ChatGPT

From an automation standpoint, coming up with a list of equities, both performing and also aligning with the user’s affinity, is relatively straightforward. However, what was difficult before is to come up with sound reasoning without the help of domain experts. This is probably what differentiates private banking from retail ones. With ChatGPT, this has become possible. With its underlying (or trained on) massive corpus of text data, ChatGPT is just like a knowledgeable person and highly capable of coming up with sound reasoning — It really does better than me, admittedly. The key now is how we create a prompt for it to reason.

To rationalise why to invest in specific equity, we will minimally need its current indicators and ratios, industry outlook, economic indicators and more importantly good knowledge of investing. I am not an expert, but that is what I can think of logically. To make it comprehensive, there will surely be other information and indicators. The latter, i.e. investment body of knowledge (BoK), is something ChatGPT already has. The former 3s are what we have to provide as input to the prompt. With these, we are good with a sound rationale. Moreover, we can control how to elaborate the rationale will be through the word limit and also the variation in the response through its temperature setting.

PoC: Hofu LINE Bot

The Tech stack used for this PoC, depicted in the cover image, is composed of ChatGPT, scikit-learn, Keras, eToro, LINE, Dialogflow and Flex Message. Here, I tapped on eToro’s API to gather user trading data in an anonymised manner for training our content-based recommender. To be precise, it is an affinity embedding model, which is then used to generate feature vectors for the kNN search with the list to recommend. For a platform (or broker), we could think of an add-on service developed by leveraging the data that it owns.

To rationalise equity, I integrated with ChatGPT via OpenAI’s Python-based API. Here, prompt creation and chat completion are carried out using API. This way we can automate and scale to million users with ease. As alluded to earlier, the prompt comprises equity metadata, readily available from eToro API, and industrial outlook which I have to scrape from Yahoo Finance. The latter part can also also be from other investment or economic portals.

The broker I use also has an app on LINE (an IM App widely used in East Asia), but sadly it is neither personalised nor conversational. For PoC, it is natural to piggyback (as an extension) on a super App like LINE. This is where most people spend their time on. Moreover, I can leverage the existing functionalities from the platform and replace UX elements with dialogue, i.e. conversational and probing, built by Dialogflow.

In this PoC, I embedded what I bought in the past (from the broker I use, I don’t trade on eToro), and retrieved the recommended equities which are available to trade on eToro. The list is about right, they are the type of companies that I usually buy. Of course, not all I knew before, this is what we would like to have, which is to find potential fits both in terms of affinities and also performance.

Closing Thoughts

Can we go beyond the affinity-based recommendation to include temporal decision-making and portfolio construction? Yes, certainly. However, we will have to augment what we have now with either technical indicators or a trading algorithm to help with when to buy or sell. The risk engine will also be needed for a portfolio evaluation. However, we have to rethink how recommending as a set not just one equity will look like, probably another embedding space where a set of equities collectively mapped into one feature vector.

I shall close with ChatGPT. I have mixed feelings of optimism and also frightening. ChatGPT’s body of knowledge is incredible. As Andrew Ng put it, in many NLP-related cases we can now develop an ML-based product feature without training data. What was thought to be only possible by Subject Matter Experts (SMEs), we are now able to automate them. This has become a matter of prompting it with related input data for it to conclude. Now I begin to think: Using ‘Few-Shot Prompting’, it might be possible for ChatGPT to capture the trading affinities of a user and come up with a recommendation.

--

--