
Stock Trend Prediction using Neural Networks




  1. Stock Trend Prediction using Neural Networks Seminar Presentation Marco Witzmann (11V101009) Ravi Vishvakarma (09005075) Rishabh Singhal (09005005) Supervisor: Prof. Pushpak Bhattacharyya

  2. Outline • Introduction • Price Fluctuation • Classical Methods • Fundamental Analysis • Technical Analysis • Multiple linear regression • Limits to classical Methods • Neural networks • Implementation • Feed Forward with Back Propagation of errors • Initialization • Random initialization • Multiple linear regression initialization • Algorithm • Steepest descent • Conjugate gradient • Experimental comparison • Recurrent networks • Network comparison • Limits of Neural Networks • Conclusion

  3. Introduction • What is a Share? • Smallest unit of ownership of a company • Traded in stock markets • What is Stock? • The sum of all shares of a company • Price of a Share • Depends on supply and demand in the market

  4. Price Fluctuation • Supply ≈ constant • Demand depends on • Company’s performance • CEO change • Profits & Losses • Market developments in company’s sector • Stock Market Speculations • Financial bubbles • Investor Psychology • Unforeseeable External factors • Climate • Govt. Policies

  5. Outline • Introduction • Price Fluctuation • Classical Methods • Fundamental Analysis • Technical Analysis • Multiple linear regression • Limits to classical Methods • Neural networks • Implementation • Feed Forward with Back Propagation of errors • Initialization • Random initialization • Multiple linear regression initialization • Algorithm • Steepest descent • Conjugate gradient • Experimental comparison • Recurrent networks • Network comparison • Limits of Neural Networks • Conclusion

  6. Fundamental Analysis • In-depth analysis of a company's performance • Market participation • Competition • Expansion strategy for the future → determine expected returns/profits • Systematic approach • Good for long-term investments • Assumption: stock price depends on the actual company value • Short-term predictions impossible

  7. Technical Analysis • Analysis of stock charts • Used by 90% of major stock traders • Does not take company fundamentals into account • Subjective pattern recognition and trend analysis • Extrapolates data to predict • Can be automated • Simple trading rules can be formulated • Assumption: history repeats itself • Exaggerates price indicators

  8. Linear Regression • Simple linear regression: y = β0 + β1·x + ε [figure: scatter plot of y against x with a fitted regression line]

  9. Multiple Linear Regression • Simple linear regression: • 1 scalar variable (y) • 1 explanatory variable (x) • Multiple linear regression: y = β0 + β1·x1 + … + βn·xn + ε • 1 scalar variable (y) • n explanatory variables (x1, x2, …, xn) • Modelling of non-linear inputs is possible by choosing e.g. x1 = t² (see the sketch below)
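A minimal sketch of fitting a multiple linear regression with NumPy, as described above. The data and variable names here are invented for illustration:

```python
import numpy as np

# n explanatory variables x1..xn for N observations (hypothetical data).
N, n = 100, 3
rng = np.random.default_rng(0)
X = rng.normal(size=(N, n))
X = np.hstack([np.ones((N, 1)), X])          # prepend intercept column
true_beta = np.array([1.0, 2.0, -0.5, 0.3])
y = X @ true_beta + rng.normal(scale=0.1, size=N)

# Least-squares estimate of the coefficients beta.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)

# A non-linear input can be modelled by adding a transformed column, e.g. t^2:
t = np.linspace(0, 1, N)
X_nl = np.hstack([X, (t ** 2)[:, None]])
```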

  10. Problems in Multiple Linear Regression • Many assumptions: • Linearity of inputs • Homoskedasticity (constant variance of errors), which is not true for stocks: • Company value $1,000,000 → ±$200,000 is a plausible variance • Company value $100,000 → ±$200,000 is an implausible variance • Independence of errors • Anscombe's Quartet: four data sets with identical regression statistics but very different distributions [figure not included in transcript]

  11. Limits to Classical Methods • Rely on simplistic models • Fundamental Analysis: ignores market psychology • Since traders base decisions on psychology, it won't work for short-term prediction • Technical Analysis: ignores information outside the stock market • Man-made rules must be simplistic • Multiple linear regression depends on assumptions, such as the powers of the inputs, to be accurate • Need for a system with a more holistic approach because: • Mathematical models can't be formulated • It should be able to learn from the past and be optimized

  12. Outline • Introduction • Price Fluctuation • Classical Methods • Fundamental Analysis • Technical Analysis • Multiple linear regression • Limits to classical Methods • Neural networks • Implementation • Feed Forward with Back Propagation of errors • Initialization • Random initialization • Multiple linear regression initialization • Algorithm • Steepest descent • Conjugate gradient • Experimental comparison • Recurrent networks • Network comparison • Limits of Neural Networks • Conclusion

  13. Neural Networks • Mathematical models can't be formulated • Stock market • Highly non-linear and dynamic system • Too many influencing parameters with different weightage • NNs don't rely on mathematical models • Should be able to learn from the past and be optimized → NNs can be trained and tested on past data for accurate prediction

  14. Implementation • Subjectively deciding on NN inputs • Fundamental: volume, price, etc. • Technical: moving averages, volume trends, etc. • Indices: NSE FMCG, gold metals, etc. • Exchange rates and interest rates • Economic statistics: GDP, exports and imports, etc. • Psychological inputs • Designing a model • Feed Forward (FF) vs. Recurrent Neural Network (RNN) • Number of hidden layers • Initial weightages • Connections between different nodes

  15. Implementation (II) • Training the model • Algorithm • FF Network with back propagation of errors • Prevent over-training • Testing • Validating prediction on existing data • Optimizing model • Adding more inputs to get better accuracy • Removing non-correlative inputs
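The train/test/optimize loop above can be outlined in code. This is a hypothetical sketch, assuming a `model` object with `train_one_epoch`, `mse`, and `weights` members; early stopping on held-out data is one standard way to prevent the over-training mentioned on the slide:

```python
# Hypothetical training pipeline with early stopping (all names are ours).
def fit(model, train_data, val_data, max_epochs=500, patience=10):
    best_err, best_weights, bad_epochs = float("inf"), None, 0
    for epoch in range(max_epochs):
        model.train_one_epoch(train_data)        # back-propagation pass
        err = model.mse(val_data)                # validate on existing data
        if err < best_err:
            best_err, best_weights, bad_epochs = err, model.weights.copy(), 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:           # stop before over-training
                break
    model.weights = best_weights                 # keep the best model seen
    return model
```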

  16. Outline • Introduction • Price Fluctuation • Classical Methods • Fundamental Analysis • Technical Analysis • Multiple linear regression • Limits to classical Methods • Neural networks • Implementation • Feed Forward with Back Propagation of errors • Initialization • Random initialization • Multiple linear regression initialization • Algorithm • Steepest descent • Conjugate gradient • Experimental comparison • Recurrent networks • Network comparison • Limits of Neural Networks • Conclusion

  17. Feed Forward with Back Propagation • Information always moves in the forward direction • Error is measured in a forward pass through the network • Error is propagated backwards and weights are adjusted to reduce the overall error • Risk of being trapped in a local minimum • Initialization of weights also becomes an important issue • Two methods to initialize weights will be discussed: • Random weight initialization • Multiple linear regression
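As a concrete sketch of the forward and backward passes described above, here is a minimal one-hidden-layer network in NumPy. Sigmoid activations, squared error, and the weight ranges are our assumptions; the slides do not fix the architecture:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hid, n_out, lr = 10, 5, 1, 0.1
W1 = rng.uniform(-0.5, 0.5, (n_in, n_hid))   # random weight initialization
W2 = rng.uniform(-0.5, 0.5, (n_hid, n_out))

def step(x, t):
    """One forward pass and one backward (weight-adjusting) pass."""
    global W1, W2
    h = sigmoid(x @ W1)                          # forward: hidden activations
    o = sigmoid(h @ W2)                          # forward: network output
    delta_o = (o - t) * o * (1 - o)              # output-node delta
    delta_h = (delta_o @ W2.T) * h * (1 - h)     # hidden-node delta
    W2 -= lr * np.outer(h, delta_o)              # backward: adjust weights
    W1 -= lr * np.outer(x, delta_h)
    return 0.5 * np.sum((o - t) ** 2)            # squared error for this sample
```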

  18. Random Initialization • Weights in the NN are initialized to random values • Starting point for all initialization methods • Minimum amount of time is spent • Leads to a different result in each training session of the network • Most popular method due to its simplicity • Doesn't guarantee a good starting point • May initialize weights near saddle points of the error surface or local minima • Can cause a slow learning rate

  19. Multiple Linear Regression Initialization • Weights between the input layer and hidden layer are initialized randomly • Weights between the hidden layer and output layer are obtained by multiple linear regression: y = Σj wj·hj, where y is the output, the wj are the unknown weights, and the hj are the inputs to the output layer (the hidden-layer activations)

  20. Multiple Linear Regression Initialization • The equation derived above is a typical linear regression model • The hj's are regressors and the wj's can be estimated by standard regression methods • Weights are initialized near the global optimum • May result in a faster convergence rate • The improvement is not worth the time spent on coding and initializing the start state in most cases
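A sketch of the MLR initialization in NumPy, under the notation above: input-to-hidden weights are random, and the hidden-to-output weights wj are obtained by regressing the targets on the hidden activations hj. The data here is hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))            # training inputs (hypothetical)
y = rng.normal(size=(200, 1))             # training targets (hypothetical)

W1 = rng.uniform(-0.5, 0.5, (10, 5))      # random input->hidden weights
H = sigmoid(X @ W1)                        # hidden activations = regressors
# Least-squares estimate of the hidden->output weights W2, so that H @ W2 ≈ y.
W2, *_ = np.linalg.lstsq(H, y, rcond=None)
```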

  21. Steepest Descent • First-order steepest descent technique • Weights are modified in the direction of the negative gradient of the error surface • Zigzag search motion • Sensitive to parameters like the learning rate • Simple to implement • Gives acceptable results in most cases • Zigzag search motion may spoil a good starting point • Slow convergence
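Steepest descent itself fits in a few lines. This generic sketch assumes a `grad_f` function returning the gradient of the error surface at the current weights:

```python
# Move the weight vector w against the gradient until it (nearly) vanishes.
def steepest_descent(w, grad_f, lr=0.1, max_iter=10_000, tol=1e-6):
    for _ in range(max_iter):
        g = grad_f(w)
        w = w - lr * g                 # step in the negative gradient direction
        if (g ** 2).sum() < tol:       # stop when the gradient is near zero
            break
    return w
```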

  22. Conjugate Gradient • Algorithm • 1. Set k = 1. Initialize the weight vector w1. • 2. Compute the gradient g1 = ∇f(w1). • 3. Set the search direction d1 = −g1. • 4. Compute the step size αk by line search, where αk = arg min f(wk + α·dk). • 5. Update the weight vector by wk+1 = wk + αk·dk. • 6. If the network error is less than a pre-set minimum value or the maximum number of iterations has been reached, stop; else go to step 7. • 7. If k+1 > n, then set w1 = wk+1, k = 1, and go to step 2; else • a. set k = k+1 • b. compute gk = ∇f(wk) • c. compute βk = (gkᵀgk) / (gk−1ᵀgk−1) • d. compute the new direction dk = −gk + βk·dk−1 • e. go to step 4
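The algorithm above can be sketched in Python as follows. Here `f` and `grad_f` (the objective and its gradient, defined on the next slide) are assumed, SciPy's `minimize_scalar` performs the line search of step 4, and we use the Fletcher-Reeves form of βk as one common choice:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def conjugate_gradient(w, f, grad_f, tol=1e-6, max_iter=1000):
    n = w.size
    g = grad_f(w)                                  # step 2
    d = -g                                         # step 3
    k, it = 1, 0
    while f(w) > tol and it < max_iter:            # step 6 stopping test
        alpha = minimize_scalar(lambda a: f(w + a * d)).x   # step 4
        w = w + alpha * d                          # step 5
        it += 1
        if k + 1 > n:                              # step 7: restart every n steps
            k = 1
            g = grad_f(w)
            d = -g
        else:
            k += 1
            g_new = grad_f(w)                      # step 7b
            beta = (g_new @ g_new) / (g @ g)       # step 7c (Fletcher-Reeves)
            d = -g_new + beta * d                  # step 7d
            g = g_new
    return w
```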

  23. Conjugate Gradient • For calculating the gradient in steps 2 and 7b, the objective function f is defined as f = (1/2) Σp=1..N Σj (opj − tpj)², where opj is the actual output, tpj the expected output, and N the number of training samples • The gradient is ∇f = (∂f/∂w1, …, ∂f/∂wm) • For output nodes, δj = (oj − tj)·f′(netj), where f′ is the derivative of the activation function

  24. Conjugate Gradient • Converges faster • Orthogonal search motion doesn't spoil good points • Fewer iterations are required • Requires more time per iteration • Implementation is not simple • For hidden nodes, δj = f′(netj)·Σk δk·wjk

  25. Experimental Comparison • Daily trading data of 11 companies in 1994-1996 was collected from the Shanghai Stock Exchange for technical analysis • The first 500 entries were used as training data • The remaining 150 were used as testing data • Raw data was processed into various technical indicators • Ten technical indicators were selected as inputs to the neural network. Some of them are: • Relative strength index on day t-1 (RSI(t-1)) • Moving average convergence-divergence on day t-1 (MACD(t-1)) • MACD signal line on day t-1 (MACD Signal Line(t-1)) • Lagging inputs of the past 5 days' changes in the exponential moving average (EMA), etc. • Stochastic %D and %K on day t-1 (%K(t-1) and %D(t-1))

  26. Experimental Comparison • EMA gives an average value of the data, with greater weight given to the latest data • 12-day EMA and 26-day EMA are used • RSI is an oscillator that measures the strength of up moves versus down moves over a certain time interval • A higher RSI value means a stronger market, and vice versa • MACD is the difference between two moving averages of the price • The MACD signal line smooths the MACD • Stochastic is an oscillator that tracks the relationship of each closing price to the recent high-low range. It has 2 lines: • %K is the raw stochastic • %D smooths %K
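The indicators described above follow standard formulas; here is a pandas sketch. The 12/26-day EMAs come from the slide, while the RSI/stochastic periods and smoothing windows are our assumptions:

```python
import pandas as pd

def ema(close, span):
    # Exponentially weighted mean: greater weight to the latest data.
    return close.ewm(span=span, adjust=False).mean()

def rsi(close, period=14):
    # Average up moves versus average down moves over the period.
    delta = close.diff()
    up = delta.clip(lower=0).rolling(period).mean()
    down = (-delta.clip(upper=0)).rolling(period).mean()
    return 100 - 100 / (1 + up / down)

def macd(close):
    line = ema(close, 12) - ema(close, 26)   # 12-day EMA minus 26-day EMA
    signal = ema(line, 9)                    # signal line smooths the MACD
    return line, signal

def stochastic(high, low, close, period=14):
    lo, hi = low.rolling(period).min(), high.rolling(period).max()
    k = 100 * (close - lo) / (hi - lo)       # %K: the raw stochastic
    return k, k.rolling(3).mean()            # %D smooths %K
```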

  27. Experimental Comparison • Indicators were normalized, as NNs can't handle a wide range of values • Predicting the price change allows greater error tolerance than predicting the exact price • A three-layered network architecture was used. The required number of hidden nodes is estimated by: number of hidden nodes = (M + N) / 2, where M and N are the numbers of input and output nodes respectively
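For example, min-max normalization maps each indicator into [0, 1], and with M = 10 indicator inputs and N = 1 output the rule of thumb above gives about 5 hidden nodes. The normalization scheme is our assumption; the slide only says the inputs were normalized:

```python
# Min-max normalization of one indicator series into [0, 1].
def normalize(series):
    return (series - series.min()) / (series.max() - series.min())

M, N_out = 10, 1                       # 10 indicator inputs, 1 output node
hidden_nodes = (M + N_out) // 2        # (M + N) / 2  ->  5 hidden nodes
```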

  28. Experimental Comparison • The following scenarios were examined: • Conjugate gradient with random initialization (CG/RI) • Conjugate gradient with multiple linear regression initialization (CG/MLRI) • Steepest descent with random initialization (SD/RI) • Steepest descent with multiple linear regression initialization (SD/MLRI) • In the steepest descent algorithm: • The learning rate was set to 0.1 • Training is terminated when the mean square error (MSE) < 0.5%

  29. Experimental Comparison [chart of predicted vs. actual values; figure not included in the transcript]

  30. Experimental Comparison [chart of predicted vs. actual values; figure not included in the transcript]

  31. Experimental Comparison • In the figures above, predicted and actual values show relatively greater deviation in some regions • After transformation back to exact price values, the deviation between the actual and predicted prices is small • The selection of inputs was appropriate

  32. Experimental Comparison • Conclusions • All scenarios except SD/MLRI achieve a similar average MSE and percentage of correct predictions • All scenarios perform satisfactorily • Conjugate gradient learning requires significantly fewer iterations than the steepest descent algorithm • MLR initialization requires fewer iterations in the CG algorithm. This shows that regression provides a better starting point. • In the case of SD, MLRI requires more iterations because the good starting point is spoiled by the zigzag motion of the steepest descent algorithm • The efficiency of classical back-propagation can be improved by conjugate gradient learning with multiple linear regression weight initialization

  33. Recurrent Neural Networks • Connections between perceptrons form a directed cycle • They have a feed-back mechanism which allows them to exhibit dynamic temporal behaviour • Take into account the sequence in which the input is presented, so they can predict prices according to recent history more closely • Most RNNs have had scaling issues. In particular, RNNs cannot be easily trained for large numbers of neuron units or input units. Successful training has mostly been on time series problems with few inputs.
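A minimal Elman-style recurrent step illustrates the feed-back mechanism: the hidden state is carried from one time step to the next, so the output depends on the order of the inputs. A sketch with invented weight names:

```python
import numpy as np

def rnn_step(x, h, W_in, W_rec, W_out):
    # h feeds back through W_rec, giving the network dynamic temporal behaviour.
    h_new = np.tanh(x @ W_in + h @ W_rec)
    y = h_new @ W_out                        # prediction from the current state
    return y, h_new

# Usage: iterate over a price sequence, carrying h forward from day to day.
```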

  34. Network Comparison • The main difference between FFNs and RNNs is the existence of the feed-back mechanism in RNNs • RNNs can't be trained for large numbers of inputs/neurons, whereas FFNs can be trained in both cases • RNNs learn spatiotemporal information from the training data, whereas FFNs learn spatial relationships only

  35. Limits of Neural Networks • Highly subjective choice of inputs and model • Computational power • Less accessible • Costly • Black box • Cannot handle a wide range of values • If implemented on a large scale, it creates self-fulfilling predictions

  36. Conclusion • Share prices are predictable under reasonable constraints • Accuracy of prediction increases from Fundamental Analysis (FA) to Technical Analysis (TA) to Neural Network based prediction: FA < TA < NN • NNs take into account much more information than classical methods • Using a feed-forward network is better than a recurrent neural network due to the higher difficulty of training RNNs • A feed-forward network with back-propagation, trained using the conjugate gradient method with weights initialized by multiple linear regression, gives better accuracy than the classical back-propagation model

  37. References • Ramon Lawrence. Using Neural Networks to Forecast Stock Market Prices. University of Manitoba, 1997. • A "Neural" Approach to the Market. Bloomberg BusinessWeek, Businessweek.com. • Jason E. Kutsurelis. Forecasting Financial Markets Using Neural Networks: An Analysis of Methods and Accuracy. Naval Postgraduate School, Monterey, California, 1998. • Charles Duhigg. Artificial Intelligence Applied Heavily to Picking Stocks. International Herald Tribune, Nytimes.com, 2006. • Chan Man-Chung, Wong Chi-Cheong, Lam Chi-Chung. Financial Time Series Forecasting by Neural Network Using Conjugate Gradient Learning Algorithm and Multiple Linear Regression Weight Initialization. Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong, 1996. • Jovina Roman, Akhtar Jameel. Backpropagation and Recurrent Neural Networks in Financial Analysis of Multiple Stock Market Returns. Department of Computer Science, Xavier University of Louisiana, 1996. • en.wikipedia.org/wiki/Recurrent_neural_network • en.wikipedia.org/wiki/Neural_network

  38. Thank you!
