Predicting Stock Movements With ChatGPT

Daniel Carter
6 min read · Apr 28, 2023


Large Language Models, such as GPT, are undeniably impressive in their ability to generate eloquent and engaging writing from a sequence of words supplied by users. Their proficiency at this task has even led some to wonder whether these models possess a degree of sentience.

As I thought about the capabilities of these models, it occurred to me that GPT’s sequence-predicting abilities could potentially be applied to forecasting stock prices. The ups and downs of the stock market across time can also be considered a sequence, so I embarked on an exploration to see if ChatGPT could predict future stock movements.

To replicate my work, you will need a fundamental understanding of Python and access to the OpenAI API. I highly recommend using Google Colab for simple reproducibility, but other environments will suffice.

Begin by installing the OpenAI API package, followed by importing the necessary modules and configuring the settings accordingly.

# the code below uses openai.ChatCompletion, which was removed in openai 1.0
!pip install "openai<1.0"
# imports
import pandas as pd
import openai
import yfinance as yf
from tqdm import tqdm
from sklearn import metrics

# pandas dataframe settings
pd.set_option("display.max_colwidth", 100)
pd.set_option("display.max_rows", None)
pd.set_option("display.max_columns", None)

# insert your OpenAI API key below
openai.api_key = ''

Let’s grab some stock market data from the Yahoo Finance API. Initially, I used more data, but to avoid overloading the GPT model, I reduced it. Also, to minimize day-to-day market noise, I decided to use weekly data.

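# get stock data from yahoo finance api
# auto_adjust=False keeps the 'Adj Close' column, which recent yfinance releases drop by default
stk = yf.download("AAPL", start="2021-04-21", end="2023-04-21", auto_adjust=False)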

# only get adjusted close column
stk = stk[['Adj Close']].copy()

# turn daily data into weekly data
stk = stk.resample('W').mean()
stk['Adj Close'] = stk['Adj Close'].round(2)
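If the resampling step feels opaque, here is a minimal sketch of what resample('W').mean() does. It uses made-up prices rather than real AAPL data. Each week is labeled with the Sunday that ends it, and the value is the mean of that week's daily prices.

```python
import pandas as pd

# ten made-up daily prices on business days, starting Monday 2023-01-02
days = pd.bdate_range("2023-01-02", periods=10)
daily = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0], index=days)

# 'W' groups rows into weeks ending on Sunday and labels each group with that Sunday
weekly = daily.resample("W").mean()

print(weekly)
```

The two weekly values come out as 3.0 and 8.0, labeled 2023-01-08 and 2023-01-15. Keep in mind that this is an average over the week, not the Friday close.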

Run the following code to get a glimpse of the data:

stk.head()

Let’s transform this ordinary stock market data into data that can be used to make and evaluate predictions.

# price sequence length
num_weeks = 24

# dictionary for data storage
data_dict = {
'start_date': [],
'end_date': [],
'pred_date': [],
'price_seq': [],
'last_week_price': [],
'next_week_price': [],
'y_true': [],
}

# step through stock data with num_weeks window
for i in range(len(stk) - num_weeks):

    # cut stock data with 25 week window
    # 1 extra week so we know if next week was higher or lower than last week in sequence
    stk_cut = stk.iloc[i:i+num_weeks+1]

    # get data necessary for prediction and evaluation
    start_date = stk_cut.index.tolist()[0]
    end_date = stk_cut.index.tolist()[-2]
    pred_date = stk_cut.index.tolist()[-1]

    price_seq = stk_cut['Adj Close'].tolist()[:-1]
    last_week_price = stk_cut['Adj Close'].tolist()[-2]
    next_week_price = stk_cut['Adj Close'].tolist()[-1]

    # label 1 if next week's price was higher, else label 0
    y_true = 1 if next_week_price > last_week_price else 0

    # store data
    data_dict['start_date'].append(start_date)
    data_dict['end_date'].append(end_date)
    data_dict['pred_date'].append(pred_date)
    data_dict['price_seq'].append(price_seq)
    data_dict['last_week_price'].append(last_week_price)
    data_dict['next_week_price'].append(next_week_price)
    data_dict['y_true'].append(y_true)

# make pandas dataframe with data
df = pd.DataFrame(data_dict)

Now, let’s take a closer look at what we’ve just created. Among the useful data we’ve stored, the most important are the price_seq and y_true columns. The former holds a sequence of 24 weekly prices, which we will feed into GPT to predict whether the next price will be higher or lower than the last price in the sequence. The latter tells us what actually happened, with 1 indicating that next week’s price was higher and 0 indicating that it was not.

df.head()
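To make the windowing logic easier to follow, here is a small sketch of the same idea on a toy price list. The prices and the 3-week window are made up for illustration (the real code uses 24 weeks), and the variable names are my own.

```python
# toy weekly prices (made up) and a short window so the output is easy to read
prices = [10.0, 11.0, 10.5, 12.0, 11.5, 11.5]
num_weeks = 3

windows = []
for i in range(len(prices) - num_weeks):
    # take num_weeks prices plus 1 extra week that serves as the answer
    window = prices[i:i + num_weeks + 1]
    price_seq = window[:-1]
    last_week_price = window[-2]
    next_week_price = window[-1]

    # label 1 only if the next price is strictly higher; ties fall into 0
    y_true = 1 if next_week_price > last_week_price else 0
    windows.append((price_seq, y_true))

for price_seq, y_true in windows:
    print(price_seq, "->", y_true)
```

Six prices with a 3-week window give three examples. The last one ends in a tie (11.5 followed by 11.5), which the labeling rule counts as 0.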

By running the following code, we can see that the distribution between higher and lower is almost even:

df['y_true'].value_counts(normalize=True)

If you have used ChatGPT in the past, you know that it does not like to make financial predictions. It’s unfortunate, but I get it. OpenAI does not want to be responsible for an r/wallstreetbets Ape who YOLOs his life savings into a bad investment because a chat bot told him to. So, in the prompt, I don’t mention stocks or finance at all. GPT also gets a little shy when it comes to making uncertain predictions, but I urge it to take a guess, even if it’s feeling clueless. Finding a good prompt took some tinkering, and I’m sure there are better prompts out there waiting to be discovered.

Six months ago, I never would have thought that I’d be writing paragraphs of natural language in my code to coax a chat bot into doing my bidding, but here we are!

By running the code below, we can harness the power of GPT to make predictions on stock price movements by analyzing sequences of historical data.

# function to get prediction out of reply string
def get_pred_from_reply(reply):
    if 'higher' in reply.lower():
        return 1
    elif 'lower' in reply.lower():
        return 0
    else:
        return -1

# prompt for GPT -- convinces GPT to make guess even if it can't find a pattern
prompt_template = """
Guess if the next number in the following sequence is lower or higher than the last number in the sequence.
Even if there is no recognizable pattern you should still make a guess.
There is no penalty for being wrong, but you should attempt to be right.\n
"""

# set max message history
# GPT can only handle so many tokens (chunks of text), so we limit the number of messages stored
max_history_length = 15

message_history = [] # message history storage
preds = [] # prediction storage

# loop through all price sequences
for ps in tqdm(df['price_seq']):

    # turn price sequence list into a string
    price_seq_str = ", ".join(str(x) for x in ps)

    # add price sequence string to end of prompt template to create prompt
    prompt = prompt_template + price_seq_str

    # add prompt to message history
    message_history.append({'role': 'user', 'content': prompt})

    # keep only the most recent max_history_length messages in the list
    message_history = message_history[-max_history_length:]

    # get GPT to make a guess with message history for context
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=message_history
    )

    # retrieve GPT's reply
    reply = completion.choices[0].message.content

    # add GPT's reply to message history
    message_history.append({'role': 'assistant', 'content': reply})

    # store GPT's prediction
    preds.append(get_pred_from_reply(reply))

# add prediction column to dataframe
df['y_pred'] = preds
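If you want to check the loop mechanics without spending API credits, a dry run along these lines might help. It is only a sketch: fake_model is a hypothetical stand-in that cycles through canned replies instead of calling OpenAI, and the toy sequences and the history limit of 3 are made up.

```python
from itertools import cycle

# same parsing rule as above: 1 for "higher", 0 for "lower", -1 if neither appears
def get_pred_from_reply(reply):
    text = reply.lower()
    if 'higher' in text:
        return 1
    elif 'lower' in text:
        return 0
    return -1

# hypothetical stand-in for the API call: cycles through canned replies
canned_replies = cycle(["Higher.", "I would guess lower.", "I cannot tell."])
sent_lengths = []  # number of messages included in each fake request

def fake_model(messages):
    sent_lengths.append(len(messages))
    return next(canned_replies)

toy_sequences = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7], [6, 7, 8]]
max_history_length = 3
message_history = []
preds = []

for seq in toy_sequences:
    prompt = "Guess the next move: " + ", ".join(str(x) for x in seq)
    message_history.append({'role': 'user', 'content': prompt})

    # trim before sending, so a request never exceeds max_history_length messages
    message_history = message_history[-max_history_length:]

    reply = fake_model(message_history)
    message_history.append({'role': 'assistant', 'content': reply})
    preds.append(get_pred_from_reply(reply))

print(preds)
print(sent_lengths)
```

The run shows two things: a reply with neither keyword is stored as -1, and no request ever carries more than max_history_length messages, because the trim happens right before the call.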

Get ready for the exciting part. On this sample, GPT predicted the direction of the next week’s price move with 67% accuracy. Given how hard stock market prediction is, this is an impressive result.

# evaluate performance (replies parsed as -1 never match y_true, so they count as misses)
accuracy = metrics.accuracy_score(df['y_true'], df['y_pred'])

print(f"Accuracy: {round(accuracy * 100)}%")
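One way to put an accuracy figure like this in context is to score a couple of naive guessers on the same labels. The sketch below is my own addition and runs on made-up toy data, not on the article's results. The helper names accuracy and momentum_guess are hypothetical. With the real dataframe, you would pass df['price_seq'] and df['y_true'] instead.

```python
def accuracy(y_true, y_pred):
    # share of predictions that match the true labels
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)

def momentum_guess(price_seq):
    # guess "higher" (1) if the last move in the sequence was up, else "lower" (0)
    return 1 if price_seq[-1] > price_seq[-2] else 0

# toy sequences and labels (made up for illustration)
price_seqs = [
    [10.0, 11.0, 12.0],
    [11.0, 12.0, 11.5],
    [12.0, 11.5, 11.0],
    [11.5, 11.0, 11.8],
]
y_true = [0, 0, 1, 1]

always_higher = [1] * len(y_true)
momentum = [momentum_guess(seq) for seq in price_seqs]

print("always higher:", accuracy(y_true, always_higher))
print("momentum:", accuracy(y_true, momentum))
```

If GPT's accuracy on your data is clearly above both of these, that is a more convincing signal than the raw percentage alone.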

I tested it on another stock in a different timeframe to make sure I wasn’t just getting lucky.

stk = yf.download("PG", start="2015-03-31", end="2017-03-31", auto_adjust=False)

After running the same code with Procter & Gamble stock data, the accuracy was 63%. Although not as high as the Apple example, this is still an impressive result.

While more runtime and money will be required to fully explore this concept, it holds immense promise for integration into more sophisticated trading systems. Of course, it’s not a magic wand that guarantees instant riches. But it’s a tool with the potential to help us make informed trading and investment decisions. What’s most exciting about this concept is that it showcases the versatility of GPT and its potential to transform the way we approach finance.

Note: The information provided in this article is for educational and informational purposes only. It should not be construed as financial or investment advice. Before making any trading or investment decisions, it is important to conduct your own research, carefully consider your objectives, financial situation, and risk tolerance, and consult with a licensed financial advisor if necessary. Any trades or investments made based on the information in this article are done at your own risk and you are solely responsible for the outcomes of such decisions.
