LSTM based Recurrent Neural Networks for Stock Market Price Prediction

In this post, we are going to discuss in depth Recurrent Neural Networks (RNNs) in deep learning and their use in stock market price prediction and forecasting with Python.

Prerequisites: Python programming basics, Keras basics, and a basic understanding of Artificial Neural Networks and backpropagation.

Structure of the Post:

  • Introduction to Recurrent Neural Networks (RNN)
  • What are Long Short Term Memory (LSTM) Networks?
  • What is the Structure of LSTM?
  • How do LSTM Networks work?
  • What are Different Types of LSTM Networks available?
  • Implementation of LSTM Algorithm for Stock Market Price Prediction

What are Recurrent Neural Networks?

Recurrent Neural Networks are a special type of neural network capable of storing previous data: they predict the output by considering information from both past and current samples. A recurrent neural network finds the output by analyzing a set of previous samples, whereas an Artificial Neural Network considers only the current input sample for both training and prediction. Let me explain RNNs with an example.

“Let us assume you are watching a movie. The information in the movie is presented as scenes or frames. To understand what is happening in the current part of the movie, you need to remember the previous frames or scenes. Generally, you give more importance to the recent frames than to the opening scenes. By considering the previous frames you can understand what is happening in the present situation, and sometimes you can even predict what the next scene will be. Recurrent neural networks use the same strategy.”

What are Long Short Term Memory Networks?

Long Short Term Memory is a memory cell used to store information from previous samples. In contrast to Artificial Neural Networks, which are feedforward in nature, Long Short Term Memory Networks have feedback structures. LSTM networks are designed to process a sequence of data at a time. An LSTM cell has built-in gates and feedback connections that help it make better predictions.

“The LSTM was proposed by Sepp Hochreiter and Juergen Schmidhuber in 1997.” Follow the link to read the original paper on the LSTM architecture:

-> Long Short Term Memory Networks Research Paper

Architecture of LSTM

The single-cell LSTM architecture is shown below with all the feedback connections and functions incorporated in a Long Short Term Memory (LSTM) cell.

LSTM networks accept only three-dimensional input of shape (number of samples, timeframe, number of features). The timeframe indicates how many past samples are taken into consideration when predicting the next value with the LSTM.
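To make the shape requirement concrete, here is a minimal sketch (using NumPy and illustrative numbers, not the actual stock data) of turning a 1D price series into the 3D shape an LSTM layer expects:

```python
import numpy as np

# A toy 1D series of 10 "prices" (illustrative values only)
prices = np.arange(10, dtype=float)

timeframe = 3  # number of past samples used to predict the next value
X, y = [], []
for i in range(timeframe, len(prices)):
    X.append(prices[i - timeframe:i])  # the 3 previous values
    y.append(prices[i])                # the value to predict
X = np.array(X).reshape(-1, timeframe, 1)  # (samples, timeframe, features)
y = np.array(y)

print(X.shape)  # (7, 3, 1)
print(y.shape)  # (7,)
```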

How does LSTM work in prediction?

In general, Recurrent Neural Networks suffer from the vanishing gradient problem: while training the model, the gradient becomes so small that it no longer contributes any change to the training process. LSTMs were introduced to avoid this problem, as they store information from previous and current nodes by treating hidden nodes as storage units. An LSTM cell uses a few functions, such as tanh, sigmoid, and feedback connections, which help it process and store sequential data in the form of timeframes. To read more about how each part of the LSTM works, refer to the following link:

-> Understanding of LSTM
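As a rough illustration of the gating described above, a single LSTM cell step can be sketched in plain NumPy (the weights here are random placeholders, not a trained model):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_in, n_hid = 1, 4  # one input feature, four hidden units

# Random placeholder weights for the four gates: input, forget, output, candidate
W = {g: rng.normal(size=(n_hid, n_in)) for g in "ifog"}
U = {g: rng.normal(size=(n_hid, n_hid)) for g in "ifog"}
b = {g: np.zeros(n_hid) for g in "ifog"}

def lstm_step(x, h, c):
    i = sigmoid(W["i"] @ x + U["i"] @ h + b["i"])   # input gate
    f = sigmoid(W["f"] @ x + U["f"] @ h + b["f"])   # forget gate
    o = sigmoid(W["o"] @ x + U["o"] @ h + b["o"])   # output gate
    g = np.tanh(W["g"] @ x + U["g"] @ h + b["g"])   # candidate cell state
    c = f * c + i * g          # new cell state: keep some memory, add some new
    h = o * np.tanh(c)         # new hidden state exposed to the next layer
    return h, c

h = np.zeros(n_hid)
c = np.zeros(n_hid)
for x in ([0.5], [0.1], [0.9]):      # a tiny 3-step input sequence
    h, c = lstm_step(np.array(x), h, c)
print(h.shape)  # (4,)
```

This is only a sketch of the gate arithmetic; Keras handles the weights, batching, and backpropagation for us.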

Different LSTM Architectures

  • Bi-directional LSTM
  • Vanilla LSTM
  • Stacked LSTM
  • Convolutional LSTM

All of these are different types of LSTM networks, available to use for different problems.

Implementation of an LSTM Model for Stock Market Price Prediction

Now, we will go step by step through the process of predicting stock market prices with Deep Neural Networks and Long Short Term Memory Networks.

Part 1 : Data Collection

For this project, we are going to use Google stock price data for the financial year 2020–2021 (April 2020 to March 2021). The data is collected from the Yahoo Finance (https://in.finance.yahoo.com/) historical stock prices database. The Google historical stock prices can be found here (https://in.finance.yahoo.com/quote/GOOG?p=GOOG&.tsrc=fin-srch).

You can directly download the stock prices file in comma-separated values (CSV) format, as that is more convenient and easy to manage.

Part 2 : Data Pre-processing and Preparation

Every Machine Learning and Deep Learning algorithm needs preprocessing to extract more information from the input features, which helps to increase prediction accuracy. Stock market data is numerical, so we apply numerical data preprocessing techniques such as MinMaxScaler, normalization, standardization, and rank transformations. For this model, we are going to use the MinMaxScaler preprocessing technique.
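MinMaxScaler rescales each feature to the range (0, 1) using x_scaled = (x - min) / (max - min). A quick sketch with NumPy and made-up prices shows what the transform does:

```python
import numpy as np

prices = np.array([[1200.0], [1350.0], [1500.0], [1275.0]])  # made-up values

# Equivalent of sklearn's MinMaxScaler(feature_range=(0, 1)).fit_transform(prices)
scaled = (prices - prices.min()) / (prices.max() - prices.min())

print(scaled.ravel())  # values lie in [0, 1]; the min maps to 0, the max to 1
```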

The data has Date, Open, Close, Volume, Adjusted Close, and High features, and we are going to choose the Open feature for model training and price prediction. Now we have only 1D data, but LSTM networks accept only 3D input, so we need to transform the one-dimensional data into three dimensions before feeding it to the LSTM layers.

Importing Libraries and Loading the data :
import math
import matplotlib.pyplot as plt
import keras
import pandas as pd
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Bidirectional
from keras.layers import LSTM
from keras.layers import Dropout
from keras.layers import *
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from keras.callbacks import EarlyStopping
df=pd.read_csv("/content/GOOG.csv")
Splitting the data into training and testing :
df.shape
training_set = df.iloc[:172, 1:2].values
test_set = df.iloc[172:, 1:2].values 
Preparing the training data in 3D format (samples, timesteps, features):
# Feature Scaling
sc = MinMaxScaler(feature_range = (0, 1))
training_set_scaled = sc.fit_transform(training_set)
# Creating a data structure with 60 time-steps and 1 output
X_train = []
y_train = []
for i in range(60, 172):
    X_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
Part 3 : LSTM Model Development

To build the model, we use Keras and TensorFlow to implement and add the neural network layers (input, hidden, and output):

model = Sequential()
#Adding the first LSTM layer and some Dropout regularisation
model.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
model.add(Dropout(0.2))
# Adding a second LSTM layer and some Dropout regularisation
model.add(LSTM(units = 50, return_sequences = True))
model.add(Dropout(0.2))
# Adding a third LSTM layer and some Dropout regularisation
model.add(LSTM(units = 50, return_sequences = True))
model.add(Dropout(0.2))
# Adding a fourth LSTM layer and some Dropout regularisation
model.add(LSTM(units = 50))
model.add(Dropout(0.2))
# Adding the output layer
model.add(Dense(units = 1))

# Compiling the RNN
model.compile(optimizer = 'adam', loss = 'mean_squared_error')

# Fitting the RNN to the Training set
history=model.fit(X_train, y_train, epochs = 100, batch_size = 32)

We use the Adam optimizer and the mean squared error loss function. The model has four LSTM layers; the first three have return_sequences=True so that each passes a full sequence to the next LSTM layer, while the last returns only its final output. The model is trained for 100 epochs with a batch size of 32.

Part 4 : Visualization

A plot of the model loss against the number of epochs shows how the model optimizes its loss with the help of backpropagation.
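The loss curve can be drawn from the `history` object returned by `model.fit`. The snippet below uses a dummy loss list only so that it is self-contained; in the real script, plot `history.history['loss']` instead:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# In the real script, use history.history['loss'] from model.fit(...)
loss_per_epoch = [0.05, 0.02, 0.012, 0.009, 0.008]  # dummy values for illustration

plt.plot(loss_per_epoch)
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.savefig('loss.png')  # or plt.show() in a notebook
```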

Part 5 : Prediction on test data
Preparation of the test data and performing the model validation:
# Getting the predicted stock prices for the test set
dataset_train = df.iloc[:172, 1:2]
dataset_test = df.iloc[172:, 1:2]
dataset_total = pd.concat((dataset_train, dataset_test), axis = 0)
inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values
inputs = inputs.reshape(-1,1)
inputs = sc.transform(inputs)
X_test = []
for i in range(60, len(inputs)):
    X_test.append(inputs[i-60:i, 0])
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
print(X_test.shape)

predicted_stock_price = model.predict(X_test)
predicted_stock_price = sc.inverse_transform(predicted_stock_price)

true_y=dataset_test.values

plt.xlim([0, 50])
plt.ylim([1400, 1800])
plt.plot(predicted_stock_price, label='Predicted')
plt.plot(true_y, label='Original')
plt.xlabel('Trading Day')
plt.ylabel('Stock Price')
plt.title('Original and Predicted Stock Prices')
plt.legend()
plt.show()
Part 6 : Results and Comparison of Predicted and Original Stock Prices :
Part 7 : Performance Analysis of LSTM

After implementing any Machine Learning or Deep Learning algorithm, we generally evaluate its performance on the data. To measure the performance of this model we can use Mean Squared Error (MSE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE).

import math
from sklearn.metrics import mean_squared_error
error=math.sqrt(mean_squared_error(true_y, predicted_stock_price))
error
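For reference, the three metrics can also be computed by hand with NumPy; the toy arrays below are illustrative only, not outputs of the model:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.0])   # made-up "actual" prices
y_pred = np.array([2.5, 5.0, 3.0])   # made-up "predicted" prices

mse = np.mean((y_true - y_pred) ** 2)     # Mean Squared Error
mae = np.mean(np.abs(y_true - y_pred))    # Mean Absolute Error
rmse = np.sqrt(mse)                       # Root Mean Squared Error

print(mse, mae, rmse)
```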
Complete Code :
import math
import matplotlib.pyplot as plt
import keras
import pandas as pd
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Bidirectional
from keras.layers import LSTM
from keras.layers import Dropout
from keras.layers import *
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from keras.callbacks import EarlyStopping
df=pd.read_csv("/content/GOOG.csv")
df.shape
training_set = df.iloc[:172, 1:2].values
test_set = df.iloc[172:, 1:2].values
# Feature Scaling
sc = MinMaxScaler(feature_range = (0, 1))
training_set_scaled = sc.fit_transform(training_set)
# Creating a data structure with 60 time-steps and 1 output
X_train = []
y_train = []
for i in range(60, 172):
    X_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))

model = Sequential()
#Adding the first LSTM layer and some Dropout regularisation
model.add(Bidirectional(LSTM(units = 50, return_sequences = True), input_shape = (X_train.shape[1], 1)))
model.add(Dropout(0.2))
# Adding a second LSTM layer and some Dropout regularisation
model.add(Bidirectional(LSTM(units = 50, return_sequences = True)))
model.add(Dropout(0.2))
# Adding a third LSTM layer and some Dropout regularisation
model.add(Bidirectional(LSTM(units = 50, return_sequences = True)))
model.add(Dropout(0.2))
# Adding a fourth LSTM layer and some Dropout regularisation
model.add(Bidirectional(LSTM(units = 50)))
model.add(Dropout(0.2))
# Adding the output layer
model.add(Dense(units = 1))

# Compiling the RNN
model.compile(optimizer = 'adam', loss = 'mean_squared_error')

# Fitting the RNN to the Training set
history=model.fit(X_train, y_train, epochs = 100, batch_size = 32)

plt.plot(history.history['loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.show()

dataset_test = df.iloc[172:, 1:2]
dataset_test.shape

# Getting the predicted stock prices for the test set
dataset_train = df.iloc[:172, 1:2]
dataset_test = df.iloc[172:, 1:2]
dataset_total = pd.concat((dataset_train, dataset_test), axis = 0)
inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values
inputs = inputs.reshape(-1,1)
inputs = sc.transform(inputs)
X_test = []
for i in range(60, len(inputs)):
    X_test.append(inputs[i-60:i, 0])
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
print(X_test.shape)

predicted_stock_price = model.predict(X_test)
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
true_y=dataset_test.values


plt.xlim([0, 50])
plt.ylim([1400, 1800])
plt.plot(predicted_stock_price, label='Predicted')
plt.plot(true_y, label='Original')
plt.xlabel('Trading Day')
plt.ylabel('Stock Price')
plt.title('Original and Predicted Stock Prices')
plt.legend()
plt.show()


predicted_stock_price.shape
true_y.shape
import math
from sklearn.metrics import mean_squared_error
error=math.sqrt(mean_squared_error(true_y, predicted_stock_price))
error
Conclusion

In this post, we have gone through the details of LSTM and how it can be used to predict stock market prices with TensorFlow and Keras. Recurrent Neural Networks are an advanced type of network that is especially useful in real-world problem solving, such as:

  • Next Word Prediction ( like Google Search, Yahoo Search, YouTube )
  • Video Action Recognition
  • Image Analysis
  • Speech Processing

Advice / Tip: As these are advanced networks, you need to put in some work to understand LSTMs and Recurrent Neural Networks clearly. Try downloading the Apple stock dataset from Yahoo Finance, apply the LSTM model to it, and find the predictions. Ask yourself questions such as: why am I writing this line of code, what is it needed for, and what is its output? This will help you learn more about Long Short Term Memory Networks and Recurrent Neural Networks.

Comment with your query, and let us discuss more there!

I am Nagaraju, and I am currently working as a Visiting Researcher at Mirai Research Innovation Institute, Japan. I love to share information on Machine Learning and Artificial Intelligence.
