A Neural Network Approach to Predicting Stock Performance John Piefer ECE/CS 539 Project Presentation
Presentation Outline • Introduction • Problem Description • Neural Network Design • Data Format • Program Description (code in Appendix B of report) • Default Network Parameters • Selected Results • Discussion • Limitations of my model • Comparison with model from www.stock100.com • Conclusion
Introduction • Predicting stock performance is a very large and profitable area of study • Many companies have developed stock predictors based on neural networks • This technique has proven successful in aiding the decisions of investors • Can give an edge to beginning investors who don’t have a lifetime of experience
Problem Description • Collect a sufficient amount of historical stock data • Using this data, train a neural network • Once trained, the neural network can be used to predict stock behavior • Need some way to gauge the value of the results – I will compare with www.stock100.com as well as with what actually happened • Advantages • Neural network can be trained with a very large amount of data: years, decades, even centuries • Able to consider a “lifetime” worth of data when making a prediction • Completely unbiased • Disadvantages • No way to predict unexpected factors, e.g. natural disasters, legal problems, etc.
Neural Network Design • I will use back propagation learning on an MLP with one hidden layer • This decision is based on previous success • Fairly easy to use, which matters given time constraints • This will be approached as an approximation problem, but with classification in mind • I will predict an exact value, but we really care about whether the network predicts the general behavior correctly because that is how we make or lose money • I believe the actual output will give a better idea of what to expect. If it is close to 0, that doesn’t tell us much. But if it is +5, that gives a better indication that it will increase – the network is more “sure” • Training will be done using MATLAB programs, including a modified version of ‘bpappro.m’ written by Professor Yu Hen Hu • Network will have one output: the predicted value for the next day
Data Format • Stock data for various companies will be downloaded from the S&P 500 data page • I will write a C++ program (datagen.cpp) to put the data into matrix form to be used by MATLAB (see Appendix A of report for code) • Only the stock price will be considered • The data will be input as the change in stock price from open to close • Ex: $5 open, $6 close: +1 is the actual value used • Ex: $5 open, $4 close: -1 is the actual value used • A certain number of samples will be used as inputs, e.g. 20 samples; the user can specify this in datagen.cpp
Data Format, cont’d • The target value will be the last column, corresponding to the change in price the next day • Target values will become inputs in subsequent samples as follows • Ex: [+1 -1 +0.5 -2]: -2 is the target • next sample is [-1 +0.5 -2 +3] • next sample is [+0.5 -2 +3 …] etc. • This allows for more training samples (approx. 250 for one year’s worth of data)
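The data layout described on the two Data Format slides can be sketched as follows. This is a minimal illustration in Python, not the datagen.cpp program from the report (which is C++ and is not reproduced here). The function names `price_changes` and `make_samples` are hypothetical.

```python
def price_changes(opens, closes):
    """Daily change in price from open to close (close minus open)."""
    return [c - o for o, c in zip(opens, closes)]


def make_samples(changes, n_inputs):
    """Slide a window of n_inputs + 1 values over the daily changes.

    Each row holds n_inputs inputs followed by the target in the last
    column (the change on the following day). Consecutive rows overlap,
    so each target becomes an input of the next rows.
    """
    width = n_inputs + 1
    return [changes[i:i + width] for i in range(len(changes) - width + 1)]


# The two examples from the slide: $5 -> $6 gives +1, $5 -> $4 gives -1.
example_changes = price_changes([5.0, 5.0], [6.0, 4.0])

# The window example from the slide, with 3 inputs and 1 target per row.
example_rows = make_samples([1, -1, 0.5, -2, 3], 3)
```

With one year of daily changes and 20 inputs per row, this windowing yields one row per trading day after the first 20, which is in line with the approximate sample count quoted on the slide.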
Default Network Parameters • These are the default network parameters, determined by running experiments • Hidden Neurons (H): 18 • Learning Rate (alpha): 0.4 • Momentum Constant (mom): 0.75 • Max Epochs (nepoch): 2000 • Epoch Size (K): 24
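To show how these parameters fit together, here is a rough Python sketch of back propagation with momentum on a one-hidden-layer MLP. It is not bpappro.m or the modified version used in the project. The tanh hidden units, linear output, weight initialization range, and the reading of K as the number of samples per weight update are all assumptions, and the function name `train_mlp` is hypothetical. Whether the default learning rate is stable depends on how the data are scaled.

```python
import math
import random


def train_mlp(samples, H=18, alpha=0.4, mom=0.75, nepoch=2000, K=24, seed=0):
    """Train a one-hidden-layer MLP and return a predict(inputs) function.

    samples: rows of inputs followed by one target (last column),
    assumed to be scaled to roughly [-1, 1].
    """
    rng = random.Random(seed)
    n_in = len(samples[0]) - 1
    # Each hidden row holds n_in weights plus a bias; w2 holds H weights plus a bias.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(H)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(H + 1)]
    v1 = [[0.0] * (n_in + 1) for _ in range(H)]  # momentum terms, hidden layer
    v2 = [0.0] * (H + 1)                         # momentum terms, output layer

    def forward(x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + row[-1])
                  for row in w1]
        out = sum(w * h for w, h in zip(w2, hidden)) + w2[-1]
        return hidden, out

    for _ in range(nepoch):
        # Assumption: each epoch uses K randomly chosen training rows.
        batch = rng.sample(samples, min(K, len(samples)))
        g1 = [[0.0] * (n_in + 1) for _ in range(H)]
        g2 = [0.0] * (H + 1)
        for row in batch:
            x, target = row[:-1], row[-1]
            hidden, out = forward(x)
            err = target - out
            for j in range(H):
                g2[j] += err * hidden[j]
                # Back-propagate the error through the tanh derivative.
                delta = err * w2[j] * (1.0 - hidden[j] ** 2)
                for i in range(n_in):
                    g1[j][i] += delta * x[i]
                g1[j][n_in] += delta
            g2[H] += err
        scale = alpha / len(batch)
        for j in range(H):
            for i in range(n_in + 1):
                v1[j][i] = mom * v1[j][i] + scale * g1[j][i]
                w1[j][i] += v1[j][i]
        for j in range(H + 1):
            v2[j] = mom * v2[j] + scale * g2[j]
            w2[j] += v2[j]

    return lambda x: forward(x)[1]
```

Calling `train_mlp(rows)` on a matrix of scaled price changes would use the defaults listed on the slide. Because the weights start from random values, different seeds give different predictions, which is why repeated trials are informative.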
Program Description – pred.m • Built on ‘bpappro.m’ by Yu Hen Hu • Trains neural network and predicts the next day(s), giving an exact value for each prediction • Usage: [Predicted,flagpct,flag1,flag2] = pred(Stockdata, H, alpha, mom, nepoch, K, days) • Predicted: vector of the predictions - [day1 day2 …] • flagpct: the percent of times it predicted the wrong behavior on the training set • flag1: the number of “type 1 flags” – predicted an increase but it actually decreased the next day • flag2: the number of “type 2 flags” – predicted a decrease but it actually increased the next day • Stockdata: the matrix generated by datagen.cpp • days: the number of days to predict (default: 1) • All other inputs are the network parameters specified on the previous page • Also outputs some statistics about the training set
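The flag statistics returned by pred can be illustrated with a short Python sketch. This is not the pred.m code. The function name `flag_stats` is hypothetical, and treating a predicted or actual change of exactly 0 as neither flag type is an assumption.

```python
def flag_stats(predicted, actual):
    """Count wrong-direction predictions on a set of samples.

    Type 1 flag: predicted an increase, but the price actually decreased.
    Type 2 flag: predicted a decrease, but the price actually increased.
    Returns (flagpct, flag1, flag2), with flagpct as a percentage.
    """
    pairs = list(zip(predicted, actual))
    flag1 = sum(1 for p, a in pairs if p > 0 and a < 0)
    flag2 = sum(1 for p, a in pairs if p < 0 and a > 0)
    flagpct = 100.0 * (flag1 + flag2) / len(pairs)
    return flagpct, flag1, flag2


# Four made-up predictions against four made-up actual changes.
example_flags = flag_stats([1.0, -1.0, 2.0, -0.5], [-1.0, 1.0, 3.0, -2.0])
```

In this made-up example, the first prediction is a type 1 flag and the second is a type 2 flag, so half of the predictions are flagged.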
Program Description – sp.m • Driver program that calls ‘pred’ • User inputs the number of trials to run, sp calls pred that many times, each time getting a new prediction • Outputs some statistics about all the trials to be used for making the decision • Also important to look at results from individual trials for any odd behavior
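The driver's role can be sketched in Python as below. This is not sp.m. The names `run_trials` and `predict_once` are hypothetical, and the summary fields are assumed from the statistics quoted on the results slides (number of predicted increases and the overall average).

```python
def run_trials(predict_once, n_trials):
    """Call predict_once(trial_index) n_trials times and summarize.

    predict_once stands in for one train-and-predict run and returns
    the predicted next-day change for that trial.
    """
    predictions = [predict_once(t) for t in range(n_trials)]
    return {
        "predictions": predictions,  # kept so individual trials can be inspected
        "increases": sum(1 for p in predictions if p > 0),
        "decreases": sum(1 for p in predictions if p < 0),
        "average": sum(predictions) / len(predictions),
    }


# A stand-in predictor that alternates between +1.0 and -0.5.
example_summary = run_trials(lambda t: 1.0 if t % 2 == 0 else -0.5, 4)
```

Keeping the individual predictions alongside the totals makes it possible to look for odd behavior in single trials, as the slide recommends.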
Selected Results • Walmart (actual next day: -1.375) • Over 50 trials, I predicted increases 32 times • Overall average of +0.6727 • Type 1 flags: 12.3065% of the time • Type 2 flags: 6.2137% of the time • Discussion • I would have recommended investing based on these results – would have lost money • AT&T Corp (actual next day: +1.875) • Over 40 trials, I predicted increases 29 times • Overall average of +0.8563 • Type 1 flags: 9.5959% of the time • Type 2 flags: 8.4884% of the time • Discussion • I would have recommended investing based on these results – would have made money
Selected Results, cont’d • America Online (actual next day: -1.125) • Over 50 trials, I predicted increases 25 times and decreases 25 times • Overall average value of +0.6429 • Type 1 flags: 5.2618% of the time • Type 2 flags: 8.051% of the time • This is a very good overall classification rate (86.6872%) • Discussion • Not consistent enough to make a decision • No majority of predicted increases or decreases • Overall value is close to 0, could easily drop below 0 • See report for six more companies and more detailed analysis of results • Also see Appendix A of report for sample graphs for each company
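The overall classification rate quoted for America Online follows from the two flag rates, assuming every training sample is either classified correctly or raises exactly one flag:

```latex
\begin{align*}
\text{classification rate}
  &= 100\% - (\text{type 1 rate} + \text{type 2 rate}) \\
  &= 100\% - (5.2618\% + 8.051\%) \\
  &= 86.6872\%
\end{align*}
```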
Discussion • Overall results • Predictions were generally not uniform over all trials; less consistency means the models are not very “sure” based on the data • This leads to riskier decisions • Shows a lot of promise: good classification rates (predicted increases/decreases correctly) • Showed a tendency to predict more increases, even when actual behavior was a decrease (see report for more discussion on this) • Limitations of my model • Only considers stock price, and only one year’s worth of data • Only one output; it may be better to predict more than one day ahead • Comparison with www.stock100.com • Walmart: both predicted increase, actually decreased • AOL: I said uncertain, they predicted increase • AT&T: both predicted increase, actually increased
Conclusion • My model shows promise, but needs improvement before becoming an effective aid • Needs more data, possibly more types of data • No human or computer can perfectly predict the volatile stock market • Under “normal” conditions, in most cases, a good neural network will outperform most other current stock market predictors and be a very worthwhile and potentially profitable aid to investors • Should be used as an aid only!