The (Artificially) Intelligent Investor

Apr 20 2020

Shakespeare plays, Stephen King novels, and even Kanye lyrics have all served as training data for recurrent neural networks (RNNs) that generate text. While these projects are intriguing, I wanted to find a more practical use for text generation and decided to explore whether RNNs could form coherent investing advice. After considering a few options, I chose to train the model on a copy of The Intelligent Investor by Benjamin Graham, which Warren Buffett has called “the best book on investing ever written”. As we’ll see later, the model’s output certainly didn’t reveal the secret to beating the market, but it’s still interesting to ponder whether AI will one day be capable of offering sound financial advice.

RNNs: A Quick Overview

RNNs are analogous to human learning. When humans think, we don’t start our thinking from scratch each second. For example, in the sentence “Bob plays basketball”, we know that Bob is the person who plays basketball because we retain information about past words while reading sentences. Similarly, RNNs are neural networks with feedback loops, which allow them to use past information before arriving at a final output. However, plain RNNs can only connect recent information and struggle to connect older information as the time gap grows. Gated Recurrent Units (GRUs) are an improved version of RNNs that overcome this short-term memory issue through an update gate, which decides what information is relevant to pass on, and a reset gate, which decides what past information is irrelevant.

Author’s Note: The definition of RNNs is reused from my previous article, “Machine Learning to Predict Stock Prices”.
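To make the gating idea more concrete, here is a minimal NumPy sketch of a single GRU step. This is an illustration only, not the Keras implementation: the weight names (W_z, U_z, and so on) and sizes are made up for the demo, and real implementations differ in details such as how the old and new hidden states are blended.

import numpy as np
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
def gru_step(x_t, h_prev, W_z, U_z, b_z, W_r, U_r, b_r, W_h, U_h, b_h):
    z = sigmoid(W_z @ x_t + U_z @ h_prev + b_z)  # update gate: how much new information to let in
    r = sigmoid(W_r @ x_t + U_r @ h_prev + b_r)  # reset gate: how much past information to forget
    h_candidate = np.tanh(W_h @ x_t + U_h @ (r * h_prev) + b_h)  # candidate hidden state
    return (1 - z) * h_prev + z * h_candidate  # blend the old state with the candidate
# Toy example: 4-dimensional input, 3-dimensional hidden state, random weights
rng = np.random.default_rng(0)
x_t, h_prev = rng.normal(size=4), np.zeros(3)
shapes = [(3, 4), (3, 3), 3, (3, 4), (3, 3), 3, (3, 4), (3, 3), 3]
params = [rng.normal(size=s) for s in shapes]
print(gru_step(x_t, h_prev, *params))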

Imports/Loading Data

To start off, we make the necessary imports: TensorFlow, NumPy, and os.

import tensorflow as tf
import numpy as np
import os

The next step is to load our data, a .txt file of The Intelligent Investor. I removed the preface, index, and a few graphs from the file to help our model generate more relevant text. After uploading the file, we take a look at how many total characters it contains.

from google.colab import files
files.upload()
text = open('The_Intelligent_Investor copy.txt', 'rb').read().decode(encoding='utf-8')
print('Length of text: {} characters'.format(len(text)))

Preprocessing

Let’s take a look at how many unique characters exist in the file.

vocab = sorted(set(text))
print ('{} unique characters'.format(len(vocab)))

Our model can’t understand letters, so we have to vectorize the text. Each unique character is mapped to an integer for the model to work with, and integers are mapped back to characters so we can later decode the model’s output.

#Maps characters to ints
char2int = {char: num for num, char in enumerate(vocab)}
#Maps ints to characters
int2char = np.array(vocab)
#Intelligent Investor text represented as ints.
text_as_int = np.array([char2int[char] for char in text])
print(char2int)
print(int2char)

We train the RNN model with the goal of teaching it to predict the most likely next character after a given sequence of characters. To do this, we break the text into chunks and split each chunk into an example sequence and a target sequence. The target sequence is the example sequence shifted one character to the right, so each chunk has to be one character longer than the sequence length. For example, if our chunk of text is “Stocks”, the example sequence would be “Stock” and the target sequence would be “tocks”.

seq_length = 100
examples_per_epoch = len(text)//(seq_length+1)
# Create examples and targets sequences
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)
sequences = char_dataset.batch(seq_length+1, drop_remainder=True)
def split_input_seq(chunk):
    example_text = chunk[:-1]
    target_text = chunk[1:]
    return example_text, target_text
dataset = sequences.map(split_input_seq)
#look at the first example and target sequence
for example_text, target_text in dataset.take(1):
    print('Example data:', repr(''.join(int2char[example_text.numpy()])))
    print('Target data:', repr(''.join(int2char[target_text.numpy()])))

We shuffle the data and segment it into batches before we train our model. Shuffling improves the model’s performance by helping it avoid overfitting, which is when the model learns the training data too closely and can’t generalize well to unseen text.

batch_size = 64
buffer_size = 10000
dataset = dataset.shuffle(buffer_size).batch(batch_size, drop_remainder=True)

Building/Training our Model

With the data prepared for training, we create our model with three layers.

  1. The Embedding layer is our input layer, which maps the integer representation of each character to a dense 256-dimensional vector.

  2. The GRU layer is our hidden layer with 1024 RNN units.

  3. The Dense layer is our output layer with 109 units, one for each of our 109 unique characters. It outputs raw logits rather than softmax probabilities, since the loss function we define below works directly with logits.

model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(len(vocab), 256, batch_input_shape=[batch_size, None]))
model.add(tf.keras.layers.GRU(1024, return_sequences=True, stateful=True, recurrent_initializer='glorot_uniform'))
model.add(tf.keras.layers.Dense(len(vocab)))
#summary of our model
model.summary()

Now, we compile our model with the Adam optimizer and the sparse categorical cross-entropy loss function.

def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
model.compile(optimizer='adam', loss=loss)

Before we train our model, we set up checkpoints so that the weights are saved during training. By saving checkpoints, we can quickly recreate our model with a different batch size and restore the saved weights instead of training it again.

checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")
#Make sure the weights are saved
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix, save_weights_only=True)
history = model.fit(dataset, epochs=30, callbacks=[checkpoint_callback])

Generating Text

Next, we rebuild our model with the batch size changed to 1, which makes prediction simpler, and load the saved weights.

model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(len(vocab), 256, batch_input_shape=[1, None]))
model.add(tf.keras.layers.GRU(1024, return_sequences=True, stateful=True, recurrent_initializer='glorot_uniform'))
model.add(tf.keras.layers.Dense(len(vocab)))
#load weights from previous model
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([1, None]))
#summary of our model
model.summary()


Here comes the moment of truth: our model finally reveals its investing advice! The temperature parameter affects the outputs we receive: a lower temperature results in a more conservative output while a higher temperature results in a more creative output that is prone to making more errors. We’ll see examples of this below.

#try any temperature in the range of 0.1 to 1
def generate_text(model, start_string, temperature):
    # Number of characters to generate
    num_generate = 1000
    # Converting our start string to numbers (vectorizing)
    input_eval = [char2int[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)
    # Empty list to store our results
    text_generated = []
    model.reset_states()
    for i in range(num_generate):
        predictions = model(input_eval)
        # remove the batch dimension
        predictions = tf.squeeze(predictions, 0)
        predictions = predictions / temperature
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()
        input_eval = tf.expand_dims([predicted_id], 0)
        text_generated.append(int2char[predicted_id])
    return (start_string + ''.join(text_generated))
print(generate_text(model, start_string="Advice: ", temperature=.5))

Here’s the advice our model generates at a temperature of 0.1.

While the output isn’t even remotely close to any investing advice you should follow, it does a decent job of mimicking The Intelligent Investor. Due to the low temperature, our model doesn’t attempt to be creative and plays it safe by sticking to standard sentences in paragraph format. Let’s see how this changes as we increase the temperature.

At a temperature of 0.5, we can begin to see differences in the output. Our model tries to be more inventive and makes more errors as a result. An example of this is in the second to last line where parentheses are used incorrectly.

Now, the differences at a temperature of 1 are very apparent as our model attempts to generate tables. However, the tradeoff for this increased creativity is that the output becomes mostly incomprehensible. I’ve included a few more outputs at the various temperatures for reference.
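To see concretely what the temperature parameter does, here is a small, self-contained illustration. The logits below are made-up numbers standing in for the model’s scores over a few characters, not actual model output: dividing them by a low temperature concentrates almost all of the probability on the top character, while a temperature of 1 leaves the distribution flatter, so less likely characters get sampled more often.

import numpy as np
logits = np.array([2.0, 1.0, 0.2, -1.0])  # hypothetical scores for four characters
def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()
for temperature in [0.1, 0.5, 1.0]:
    probs = softmax(logits / temperature)
    print(temperature, np.round(probs, 3))
# At 0.1 nearly all probability sits on the top character; at 1.0 the others get a real chance.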

Conclusion

As we saw, RNNs aren’t anywhere close to replacing investment advisors anytime soon. With that being said, here are some ways we could try to improve our model’s output.

  • Increase the number of epochs

  • Get a better training dataset (some formatting was messed up when converting The Intelligent Investor from a PDF to a .txt file)

  • Use an LSTM layer instead of the GRU layer (LSTMs are another improved type of RNN); a quick sketch of this swap is shown after this list
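As a rough sketch of that last idea, the only change to the model definition would be swapping the GRU layer for tf.keras.layers.LSTM with the same hyperparameters. This is untested on this dataset and simply illustrates the swap; everything else (vocab, batch_size, and the loss function) stays as defined earlier.

#same model as before, but with an LSTM layer in place of the GRU
model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(len(vocab), 256, batch_input_shape=[batch_size, None]))
model.add(tf.keras.layers.LSTM(1024, return_sequences=True, stateful=True, recurrent_initializer='glorot_uniform'))
model.add(tf.keras.layers.Dense(len(vocab)))
model.compile(optimizer='adam', loss=loss)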


Don’t leave yet!

I’m Roshan, a 16-year-old passionate about the intersection of artificial intelligence and finance. If you’re further interested in RNNs applied to finance, check out this article: https://towardsdatascience.com/predicting-stock-prices-using-a-keras-lstm-model-4225457f0233

Reach out to me on Linkedin: https://www.linkedin.com/in/roshan-adusumilli-96b104194/
